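Efficient Sampling of the
Equilibrium Ensembles of Solvated Polypeptides:
New Methods and Their Applications
(Original title: Effizientes Abtasten der Gleichgewichtsensembles gelöster Polypeptide: Neue Methoden und ihre Anwendungen)
Robert Denschlag
Munich 2010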
Dissertation
at the Faculty of Physics
of the Ludwig-Maximilians-Universität
München
submitted by
Robert Denschlag
from Worms
Munich, June 2010
First reviewer: Prof. Dr. Paul Tavan
Second reviewer: Prof. Dr. Martin Zacharias
Date of the oral examination: 21 July 2010
Contents
Summary
1 Introduction
1.1 Light-induced dynamics
1.2 Composition and structure of proteins
1.3 Computer simulations of biomolecules
1.3.1 Molecular mechanics force fields
1.3.2 Simulation methods
1.3.3 Weaknesses of MM computer simulations
1.4 Generalized-ensemble sampling techniques
1.4.1 Simulated Tempering
1.4.2 Replica Exchange
1.4.3 Solute Tempering
1.5 The structure of this thesis
2 Efficiency reduction through RE simulations
3 Optimized replica exchange protocols
3.1 Optimal temperature ladders
3.2 Comparison of different exchange schemes
4 Simulated Solute Tempering
5 Relaxation of a light-switchable peptide
6 Résumé and outlook
Bibliography
Acknowledgments
Summary
Meaningful molecular dynamics (MD) simulations of biomolecules in the condensed phase place high demands on the accuracy of the force field and on the length of the simulated time span. The latter must be long enough for the time average of the observables of interest to agree with the corresponding ensemble average. Furthermore, the required accuracy usually makes it necessary to describe the solvent at atomic resolution.
Despite the rapid progress of computer technology, the associated computational cost still prevents ergodic sampling of the conformational spaces of even small peptides as long as only conventional MD simulation methods are used. Modern MD techniques, by contrast, can mitigate the sampling problem by using so-called "generalized ensembles". A popular approach is the "replica exchange" (RE) method, whose generalized ensemble consists, in the simplest case, of a set of simulation systems (replicas) that are identical except for their temperature. The interpretation, optimization, and further development of such techniques are at the center of this thesis.
To this end, it presents four publications from the past two years /3-6/. First, a simple model of a β-hairpin peptide is used to show that, and under which conditions, RE methods can even reduce the sampling efficiency /3/. In this context, the connection between the kinetics of simulated folding-unfolding equilibria and the speed at which the generalized ensemble mixes is given a first discussion. Follow-up works /4,5/ take up this question and show how the mixing speed can be maximized, so that the sampling efficiency is optimized by a suitably chosen temperature ladder. Finally, a new sampling method is proposed /6/, which enables a considerably more efficient simulation of the equilibrium ensembles of solvated polypeptides. The new "simulated solute tempering" (SST) method turns the "replica exchange with solute tempering" (REST) approach proposed by Liu et al. (Proc. Natl. Acad. Sci. USA 102, 13749, 2005) into a sequential scheme. SST inherits from REST the temperature ladders, which are already sparser than those of RE. As demonstrated for an octapeptide in solution, the sampling efficiency of SST is not only higher than that of RE but also clearly superior to that of REST.
In addition, this dissertation demonstrates the usefulness of the methods discussed by means of an application /9/. This application deals with a cyclic, light-switchable model peptide called cAPB, whose equilibrium ensembles at room temperature are computed with RE methods. Building on this, the relaxation of the peptide backbone triggered by cis/trans photoisomerization of the covalently integrated azobenzene dye is simulated and kinetically characterized. The simulation results are used to analyze experimental data. For the slowest relaxation kinetics, which has so far been experimentally inaccessible, the simulations predict a value of 23 ns. Further applications /1,7,8/ and methodological contributions /2/ were not integrated into the text, mainly in order to emphasize the main concern of this dissertation, which is the analysis and optimization of RE methods.
List of publications produced with my participation
/1/ TE Schrader, WJ Schreier, T Cordes, FO Koller, G Babitzki, R Denschlag, C Renner,
S-L Dong, M Löweneck, L Moroder, P Tavan, and W Zinth (2007). Light triggered
β-hairpin folding and unfolding. Proc. Natl. Acad. Sci. USA 104, 15729-15734.
/2/ M Lingenheil, R Denschlag, R Reichold, and P Tavan (2008). The "Hot-Solvent/Cold-Solute" problem revisited. J. Chem. Theory Comput. 4, 1293-1306.
/3/ R Denschlag, M Lingenheil, and P Tavan (2008). Efficiency reduction and pseudo-convergence in replica exchange sampling of peptide folding-unfolding equilibria. Chem. Phys. Lett. 458, 244-248.
/4/ R Denschlag, M Lingenheil, and P Tavan (2009). Optimal temperature ladders
in replica exchange simulations. Chem. Phys. Lett. 473, 193-195.
/5/ M Lingenheil, R Denschlag, G Mathias, and P Tavan (2009). Efficiency of exchange schemes in replica exchange. Chem. Phys. Lett. 478, 80-84.
/6/ R Denschlag, M Lingenheil, P Tavan, and G Mathias (2009). Simulated solute
tempering. J. Chem. Theory Comput. 5, 2847-2857.
/7/ G Babitzki, R Denschlag, and P Tavan (2009). Polarization effects stabilize bacteriorhodopsin’s chromophore binding pocket: A molecular dynamics study. J. Phys.
Chem. B 113, 10483-10495.
/8/ M Lingenheil, R Denschlag, and P Tavan (2010). The polarity of the environment
steers the stability of PrPc helix 1. Eur. Biophys. J. DOI: 10.1007/s00249-009-05706.
/9/ R Denschlag, WJ Schreier, B Rieff, TE Schrader, FO Koller, L Moroder, W
Zinth, P Tavan (2010). Relaxation time prediction for a light switchable peptide
by molecular dynamics. Phys. Chem. Chem. Phys. 12, 6204-6218.
The works highlighted in bold, /3/ to /6/ and /9/, are incorporated into the text of the dissertation and reprinted there.
1 Introduction
Accounting for more than 50% of the dry mass, proteins are the dominant class of macromolecules found in a cell [1]. There they perform a wide variety of functions. For example, as structural elements they determine the properties of tissue, as enzymes they control vital biochemical reactions, as membrane channels they regulate the activity of nerve cells, as packaging material they transport important substances, as hormones they control processes in the body, and as antibodies they serve the immune system [1, 2]. The function of a protein is governed by its three-dimensional structure. This structure depends not only on the chemical composition of the protein but also on the environment in which it resides. The structure can therefore change through refolding controlled by the environment. It is precisely this controllability of protein structure by the physico-chemical environment that enables proteins to act as biomolecular machines in the cell.
The development and application of methods that provide deeper insight into the structure and dynamics of proteins can lead to a better understanding of how proteins work and are therefore a subject of current research. This overarching goal was also pursued by the Collaborative Research Center (Sonderforschungsbereich, SFB) 533 "Lichtinduzierte Dynamik von Biopolymeren" (light-induced dynamics of biopolymers), which provided most of the funding for my work.
1.1 Light-induced dynamics
At the beginning of my doctoral work I was given the task of continuing a research project that had been started by Heiko Carstens and was embedded in subproject C1, "Theorie und Computersimulation der Konformationsdynamik von Peptiden und Proteinen in natürlicher Umgebung" (theory and computer simulation of the conformational dynamics of peptides and proteins in their natural environment), of SFB 533. This project was concerned with the in silico investigation of the conformational dynamics of light-switchable model peptides [3]. Because these model peptides are comparatively small molecules, the preliminary studies by Heiko Carstens had already shown that they are excellently suited for coordinated theoretical and experimental investigations [4, 5].
Particularly for macromolecules such as polypeptides, which are complexly ordered yet flexible, a precise understanding of the processes involved can only be gained through close cooperation between experiment and theory [6–8]. On the one hand, unambiguous interpretations of experimental data are often difficult without additional information. Here, theoretical approaches in the form of computer simulations can supply important missing information. On the other hand, comparison with experiment reveals the quality of the models underlying the computer simulations and can thereby inspire their refinement and further development.
Figure 1.1: Azopeptides. On the left, the cyclic peptide cAPB [9] is shown in an α-helical structure that was temporarily adopted during extended computer simulations [10]. The helical structure is highlighted by a grey ribbon. On the right is the β-hairpin peptide, whose two strands form a hairpin structure [11]. Both conformations (helix and hairpin) require the azobenzene dye to be in the cis configuration.
My investigations focused on the two model peptides shown in Figure 1.1. These were a peptide called cAPB [9] (left), cyclized by an azobenzene derivative called APB [(4-AminoPhenyl)azoBenzoic acid], and a β-hairpin peptide [11] (right), whose turn structure is formed by the azobenzene derivative AMPP [3-(3-AminoMethylPhenylazo)Phenylacetic acid]. After absorbing light of wavelength λ ≈ 500 nm, the azobenzene dyes covalently integrated into the respective peptide backbones can switch ultrafast (≈ 300 fs) by photoisomerization from the cis configurations shown in the figure to the corresponding trans states.
The change in dye geometry associated with this cis-trans isomerization is highlighted in Figure 1.2. It consists both of a lengthening of the dye by about 3 Å and of a change in the angle between the two phenyl groups to which the respective peptide strands are covalently bound. The light-induced cis-trans isomerization therefore initially exerts strong pulling forces on the adjacent peptide residues, from which they propagate into the rest of the peptide. Overall, the conformational dynamics driven in this way steers the molecules from the now perturbed equilibrium ensembles of the cis states into the relaxed equilibrium ensembles of the trans states.
Figure 1.2: Change in the geometry of azobenzene upon photoisomerization. Irradiation with light can switch azobenzene from the cis to the trans isomer and vice versa, which entails large changes in geometry.
This raises the questions of how the conformational ensembles in question are composed, which paths through the high-dimensional configuration spaces are taken in the course of the light-induced cis-trans relaxation, and on which time scales the relaxation processes take place. The main goal of this dissertation was to answer these questions by means of molecular dynamics (MD) simulation [7, 8, 12], in close cooperation with ongoing spectroscopic investigations in the Zinth group. Despite the strongly reduced complexity of the two rather small model peptides, it became clear fairly quickly in the course of my work that the computational effort required to answer these questions could hardly be managed, even with the available, highly efficient, parallelized MD simulation programs EGO/MMII [13] and GROMACS [14] and with modern Linux computer clusters. In particular, I was able to show [15] that the computational effort posed by the folding reaction of our β-hairpin peptide cannot be handled with present-day MD simulation.
This folding process can be triggered by trans-cis photoisomerization with light of wavelength λ ≈ 400 nm and, according to time-resolved spectroscopy [16], is complete only on a time scale of about 30 µs. Such time scales are currently out of reach for MD simulations with explicit solvent, especially since dozens of simulations are needed for statistically valid conclusions. For comparison, the recently achieved "world record" for the duration of a single simulation of a peptide in solution is 10 µs [17]. I therefore concentrated on the unfolding dynamics triggered by cis-trans photoisomerization, which takes place on much shorter time scales. However, a simulation-based description of these dynamics first requires computing the stationary cis and trans conformational ensembles of the β-hairpin peptide dissolved in methanol, in order (i) to know the start and end points of the light-induced dynamics and (ii) to be able to measure the progress of the relaxation. Because statistically valid sampling of the conformational ensembles by conventional MD is hopeless given the folding times mentioned above, generalized-ensemble techniques of the replica exchange (RE) type [18–20] were considered as a solution to the sampling problem. This is where my above-mentioned work, reprinted in Chapter 2, came in, in which I was able to show that in the present case of β-hairpin peptides RE techniques unfortunately offer no advantage over conventional simulation techniques.
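To put these time scales in perspective, the following back-of-the-envelope estimate may help. It assumes the integration time step of Δt = 1 fs quoted in Section 1.3.2; the symbol N_steps is introduced here only for illustration.

$$
N_{\mathrm{steps}} = \frac{30\ \mu\mathrm{s}}{1\ \mathrm{fs}} = \frac{30 \times 10^{-6}\ \mathrm{s}}{10^{-15}\ \mathrm{s}} = 3 \times 10^{10}
$$

Under this assumption, a single folding trajectory would require about 3 × 10^10 force evaluations of the full peptide-solvent system, three times as many as a 10 µs run with the same time step, and dozens of such trajectories would be needed for statistics.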
What has been said so far gives the impression that the project proposed by my doctoral advisor and outlined above was doomed to fail from the outset. In part, namely for the β-hairpin peptide, this was indeed the case. Nevertheless, the few results obtained for this hairpin peptide by MD simulations contributed to a paper published in a high-ranking journal [16] and also led to a better understanding of RE sampling techniques [15], which in brief is that raising the temperature does not necessarily accelerate sampling.
Because of the difficulties described, I used the computing resources available to me to describe the cyclic cAPB peptide. The sampling problem also exists for this peptide when conventional MD is used, which is why my predecessor, Heiko Carstens, had resorted to computing the ensembles of equilibrium conformers by drastically raising the temperature to 500 K. His approach was based on the assumption that the sampling problem for cAPB is caused by enthalpic barriers between the individual conformations, and that these barriers are therefore crossed more quickly at elevated temperature, so that sampling of the conformational space is considerably accelerated [5].
Despite good agreement between the conformer mixtures simulated at 500 K and NMR data obtained at 300 K [9], there was reason to fear that the computed high-temperature ensembles deviate strongly from the simulation results to be expected at room temperature. The reason is [10] that the observed agreement [5] with the NMR data is a necessary but not a sufficient criterion for the realism of an ensemble of peptide structures determined by simulation. As I explained above, however, precise knowledge of the conformational ensembles at room temperature is a crucial prerequisite for a simulation-based description of the light-induced transition from one ensemble (e.g. cis) to the other (e.g. trans). Given the relatively good convergence of the 500 K simulations [5], there was a well-founded hope in the case of cAPB that the 300 K conformational ensembles could be computed in a statistically valid way with RE techniques, since these methods use high-temperature simulations to accelerate sampling of the conformational space at room temperature.
For these reasons, the specific task I faced at the beginning of my dissertation was to make the RE techniques proposed in the literature technically available to the research group so that they could then be applied to cAPB and other systems. This task gradually developed into a project of its own, in which, beyond surveying and implementing existing RE methods, their improvement and further development were on the agenda. Owing to the successes achieved, the results of these investigations form the cornerstone of my cumulative dissertation. To make the publications reprinted in the main part of this thesis easier to understand, I would first like to introduce some basic concepts concerning the composition and structure of proteins and their theoretical description by computer simulations. I will then give an overview of the so-called generalized-ensemble sampling techniques and present in detail the foundations of two popular representatives of these methods, simulated tempering and replica exchange.
Figure 1.3: a) Peptide synthesis: two amino acids combine, releasing water, to form a dipeptide. The torsionally stable and polar peptide plate created by the peptide bond is marked by the diamond-shaped dashed frame. The polarity of the plate is indicated by the dipole drawn as an arrow. The peptide backbone is shaded grey. b) The φ/ψ dihedral angles determine the relative position of the peptide plates. They are the essential degrees of freedom of the peptide backbone.
1.2 Composition and structure of proteins
The molecular building blocks of proteins are the amino acids, whose chemical composition is shown in Figure 1.3a. In addition to the central Cα atom, an amino acid consists of an amino group (NH₃⁺), a carboxyl group (COO⁻), a hydrogen atom (H), and an amino-acid-specific side group (R), the so-called residue. Through the peptide bond shown in Figure 1.3a, in which the two functional groups COO⁻ and NH₃⁺ take part, amino acid monomers can be successively linked, with release of water, into a polypeptide; peptides with more than about 100 amino acids are generally called proteins [2].
The peptide bond gives rise to the flat and torsionally stable peptide unit shaded grey in Fig. 1.3a, the so-called peptide plate. Another particularly important property of the peptide plate is its large polarity, which is caused by the high electronegativity of oxygen. The resulting charge distribution produces an electric dipole moment, which is indicated in Fig. 1.3a by a vertical arrow. Because the peptide plates are stiff, the structure of the peptide backbone is essentially determined by the relative orientation of the plates. This relative orientation is described by the φ/ψ dihedral angles shown in Fig. 1.3b. If the side groups are neglected, these angles are the essential degrees of freedom of the peptide.
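To make the notion of a dihedral angle concrete, the following minimal sketch computes the signed dihedral angle defined by four atomic positions. It is an illustration only and is not taken from the thesis: the function name `dihedral_deg` and the helper functions are hypothetical, and the atom assignment mentioned in the comments (C, N, Cα, C for φ and N, Cα, C, N for ψ) is the common textbook convention, assumed here.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral angle (degrees) about the p2-p3 axis.

    For a backbone angle phi one would pass the positions of
    C(i-1), N(i), CA(i), C(i); for psi those of N(i), CA(i), C(i), N(i+1)
    (assumed textbook convention, not taken from the thesis).
    """
    b1 = _sub(p2, p1)          # first bond vector
    b2 = _sub(p3, p2)          # central bond (rotation axis)
    b3 = _sub(p4, p3)          # last bond vector
    n1 = _cross(b1, b2)        # normal of the plane (p1, p2, p3)
    n2 = _cross(b2, b3)        # normal of the plane (p2, p3, p4)
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))

# Four points with the outer bonds eclipsed (cis) give an angle of 0 degrees,
# anti-periplanar (trans) points give +/-180 degrees.
cis_angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans_angle = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
```

The `atan2` form is used instead of an `acos` of the normalized dot product because it returns the sign of the angle and remains well conditioned near 0° and 180°.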
In cells, proteins are synthesized in the ribosomes [2]. Nature draws on a set of 20 different amino acids, whose sequence defines the primary structure of the protein and is encoded in the genome [2]. During and after synthesis, a protein folds into its native three-dimensional structure, the so-called tertiary structure, which determines its function [21]. The tertiary structure is usually the structure of minimal free energy (examples of exceptions can be found in [22, 23]) and is uniquely determined by the primary structure and by the environment in which the protein is embedded [24–26]. Beyond the tertiary structure, several proteins can also assemble into a protein complex. The structure of the complex, composed of different tertiary structures, is called the quaternary structure. Because of the high protein concentration in the cell mentioned at the outset, there is a risk of uncontrolled aggregation of metastable folding intermediates, particularly during the assembly of very complex proteins [27]. The associated misfolding [21, 28] can cause cancer, Alzheimer's disease, Parkinson's disease, and other severe diseases [28]. To reduce the risk of misfolding, the folding process in the cell is therefore assisted by helper proteins, the so-called chaperones [29].
All native protein structures have in common that they are composed of rigid subunits, the so-called secondary structure elements. The two most common secondary structures are the α-helix and the antiparallel β-sheet, which we already encountered in the previous section as possible structures of the cAPB and the hairpin model peptides. Both structural elements were proposed in the early 1950s by Linus Pauling and colleagues as possible structural subunits of proteins because of the particularly favorable electrostatic arrangement of the dipoles [30, 31]. Figure 1.4 shows these two secondary structures schematically with a view to the arrangement of the dipoles. Whereas the dipoles of the sheet (Fig. 1.4, right) point alternately in opposite directions along the backbone and interact attractively with dipoles separated by many residues, the dipoles of an α-helix (Fig. 1.4, left) align in parallel and bind to dipoles that are only a few residues away along the backbone. This "local" versus "nonlocal" interaction of the dipoles in the helix and in the sheet, respectively, affects the time scales on which these secondary structure elements form. Whereas the folding of a disordered peptide chain into an α-helix can occur in less than 0.1 µs, the formation of sheets takes considerably longer, at least 1 µs [21, 32, 33]. Accordingly, proteins fold at different speeds depending on their helix and sheet content. Short folding times of less than 100 µs are possible for helix-rich proteins [34–36], whereas proteins of more heterogeneous composition usually fold on time scales from tens of milliseconds to minutes [37–39].
Figure 1.4: Secondary structure motifs. a) Schematic representation of an α-helix. The collinear-parallel arrangement of the dipoles stabilizes this structural motif and gives rise to a macroscopic dipole moment. b) Antiparallel β-sheet. Here, the collinear-parallel dipoles of different peptide strands and the axial-antiparallel dipoles of neighboring peptide groups within a strand are energetically stabilizing.
In many cases the three-dimensional structure of a protein can be determined quite accurately under controlled conditions outside the cell by experimental methods such as X-ray crystallography [40–42] or nuclear magnetic resonance (NMR) spectroscopy [42–44]. By contrast, the detailed elucidation of dynamic processes such as folding and refolding proves extremely difficult despite a wide range of experimental methods [45–52]. This is due to the high demands that must be placed on temporal and spatial resolution in order to obtain a detailed picture of protein dynamics. In principle, computer experiments can meet these demands and thus provide important insights into the structure and dynamics of proteins. For an unknown protein structure, for example, computer simulations can provide information on the stability of candidate molecular structures [53–55] or hints about folding dynamics from which possible folding scenarios [42] can be derived [56, 57].
1.3 Computer simulations of biomolecules
As I indicated in the previous section, computer simulations can provide insights into the structure and dynamics of biomolecules at a combined spatial and temporal resolution that experimental methods have not yet achieved [58]. In 1977, for instance, the then widespread notion that proteins in their native state are rather rigid structures was refuted by a molecular dynamics (MD) simulation of a protein [12, 59]. This simulation was based on a model in which the interactions of the atoms were described by a semi-empirical energy function [59, 60]. The design of this energy function, also called a molecular mechanics (MM) force field, is essentially still found today in modern MM force fields such as CHARMM [61, 62], AMBER [63], GROMOS [64], or OPLS [65] and is explained in more detail in the following section.
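1.3.1 Molecular mechanics force fields
In an MM force field, the interactions between the atoms are described by an energy function of the form [59, 60] (see footnote 1)
$$
\begin{aligned}
E(R; A) ={}& \sum_{\mathrm{bond}} k_b^{(ij)} \left(r_{ij} - \hat{r}_{ij}\right)^2
 + \sum_{\mathrm{angle}} k_\theta^{(ijk)} \left(\theta_{ijk} - \hat{\theta}_{ijk}\right)^2 \\
&+ \sum_{\mathrm{torsion}} k_{\phi,h}^{(ijkl)} \left[1 - \cos\left(n_h \phi_{ijkl} - \hat{\phi}_{ijkl}\right)\right] \\
&+ \sum_{\mathrm{vdW}} \left[\left(\frac{A_{ij}}{r_{ij}}\right)^{12} - \left(\frac{B_{ij}}{r_{ij}}\right)^{6}\right]
 + \sum_{\mathrm{coul.}} \frac{q_i q_j}{r_{ij}}
\end{aligned}
\tag{1.1}
$$
Here $R \equiv (\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_n)$ denotes the configuration of the $n$ atoms of the simulation system. The expressions $r_{ij} \equiv |\mathbf{r}_j - \mathbf{r}_i|$, $\theta_{ijk} \equiv \theta(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k)$, and $\phi_{ijkl} \equiv \phi(\mathbf{r}_i, \mathbf{r}_j, \mathbf{r}_k, \mathbf{r}_l)$ denote the internal coordinates shown in Figure 1.5. The symbol $A$ stands for a set of force-field-specific parameters that are obtained either empirically from experimental data or ab initio from quantum mechanical calculations.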
Apart from a constant that is irrelevant for the force calculation, the first three sums in Equation (1.1) determine the bonded part of the potential energy $E(R; A)$. This part comprises the interaction energy of covalently bonded atoms that are separated by at most three bonds. The parameters $\hat{r}_{ij}$, $\hat{\theta}_{ijk}$, and $\hat{\phi}_{ijkl}$ define the equilibrium geometry. Deviations of the bond lengths $r_{ij}$ from the equilibrium lengths $\hat{r}_{ij}$ are energetically "penalized" by the first sum. The second and third sums penalize deviations of the bond angles $\theta_{ijk}$ and of the dihedral angles $\phi_{ijkl}$, respectively, from the corresponding equilibrium angles $\hat{\theta}_{ijk}$ and $\hat{\phi}_{ijkl}$. The strength of the energetic penalty is set by the respective force constants $k_b^{(ij)}$, $k_\theta^{(ijk)}$, and $k_{\phi,h}^{(ijkl)}$. The force constants $k_b^{(ij)}$ and $k_\theta^{(ijk)}$ of the bond lengths and bond angles are quite large, so that in these degrees of freedom the molecular geometry can deviate only slightly from the equilibrium geometry. In particular, the breaking of bonds is entirely ruled out by the harmonic potentials $k_b^{(ij)} (r_{ij} - \hat{r}_{ij})^2$. In contrast to $k_b^{(ij)}$ and $k_\theta^{(ijk)}$, the force constants $k_{\phi,h}^{(ijkl)}$ of some dihedral angles have small to moderate values, which, together with the bounded terms $[1 - \cos(n_h \phi_{ijkl} - \hat{\phi}_{ijkl})]$ in the third sum, can lead to great flexibility in the corresponding dihedral angles. The φ/ψ dihedral angles of the peptide backbone discussed above (cf. Fig. 1.3) are an example of very flexible dihedral angles.
Footnote 1: The dependence of the index $h$ on $i, j, k, l$ has been suppressed for clarity.
Figure 1.5: Internal coordinates of an MM force field. Representing the bond lengths $r_{ij}$, bond angles $\theta_{ijk}$, and dihedral angles $\phi_{ijkl}$, the figure shows the bond length $r_{12}$ (left), the bond angle $\theta_{123}$ (middle), and the dihedral angle $\phi_{1234}$ (right).
The interaction energies of two atoms that are separated within a molecule by more than three covalent bonds, or that belong to different molecules, are captured by the last two sums in Equation (1.1) and constitute the nonbonded part of the potential energy $E(R; A)$. The short-range van der Waals interaction is described by a Lennard-Jones potential with the parameters $A_{ij}$ and $B_{ij}$. Whereas individual, non-ionized atoms are uncharged, the different electronegativities of the atoms bonded to one another in a molecule cause larger or smaller shifts of the electron density. In the simplest case this is described summarily by so-called partial charges $q_i$ located at the nuclear positions $\mathbf{r}_i$. The electrostatic energy of two partially charged atoms $i$ and $j$ then follows from the Coulomb interaction, which is accounted for by the last sum in Equation (1.1).
Possible simulation techniques based on the force field $E(R; A)$ just described are the MD method already mentioned and the Monte Carlo (MC) method. In the following I will present the basic principles of these two simulation techniques, beginning with the MD method.
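The individual contributions to Equation (1.1) can be illustrated by the following minimal sketch. It is not code from EGO/MMII or GROMACS; the function names are hypothetical, all quantities are in arbitrary reduced units, and the Coulomb term is written without a prefactor, exactly as in Equation (1.1).

```python
import math

def bond_energy(k_b, r, r_eq):
    """Harmonic bond term k_b * (r - r_eq)^2; grows without bound."""
    return k_b * (r - r_eq) ** 2

def angle_energy(k_theta, theta, theta_eq):
    """Harmonic bond-angle term k_theta * (theta - theta_eq)^2."""
    return k_theta * (theta - theta_eq) ** 2

def torsion_energy(k_phi, n, phi, phi_eq):
    """Periodic dihedral term; bounded between 0 and 2 * k_phi."""
    return k_phi * (1.0 - math.cos(n * phi - phi_eq))

def lennard_jones(a_ij, b_ij, r):
    """van der Waals term in the (A/r)^12 - (B/r)^6 form of Eq. (1.1)."""
    return (a_ij / r) ** 12 - (b_ij / r) ** 6

def coulomb(q_i, q_j, r):
    """Coulomb term q_i * q_j / r (no prefactor, as in Eq. (1.1))."""
    return q_i * q_j / r

# Stretching a bond keeps raising the energy (no bond breaking) ...
stretched = [bond_energy(2.0, r, 1.0) for r in (1.5, 2.0, 4.0)]
# ... whereas the torsion term never exceeds 2 * k_phi.
torsion_max = torsion_energy(1.5, 3, math.pi / 3, 0.0)
```

The contrast between the unbounded harmonic terms and the bounded cosine term is what the text refers to: bonds cannot break, while dihedral angles with small force constants can rotate fairly freely.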
1.3.2 Simulation methods
In MD simulations, the dynamics of the nuclei (and thus of the atoms) of a molecular system is described classically by Newton's equations of motion
$$
m_i \frac{d^2}{dt^2} \mathbf{r}_i(t) = -\nabla_i E[R(t); A]
\tag{1.2}
$$
where $m_i$ denotes the mass of atom $i$ and $-\nabla_i E[R(t); A]$ the force acting on it. These equations of motion can be solved numerically in an efficient and accurate way by the so-called Verlet algorithm [66]. In the Verlet scheme, the new position
$$
\mathbf{r}_i(t + \Delta t) = 2\mathbf{r}_i(t) - \mathbf{r}_i(t - \Delta t) - \nabla_i E[R(t); A]\,(\Delta t)^2 / m_i + O(\Delta t^4)
\tag{1.3}
$$
of atom $i$ at time $t + \Delta t$ is determined from the current and previous positions $\mathbf{r}_i(t)$ and $\mathbf{r}_i(t - \Delta t)$ with an accuracy up to third order in $\Delta t$. The time step $\Delta t$ is chosen as one femtosecond so that even the fast CH stretching vibrations are still discretized sufficiently smoothly [67].
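The Verlet update of Equation (1.3) can be sketched for a single particle in one dimension as follows. This is an illustrative toy, not the integrator of any MD package: the function `verlet` and its arguments are hypothetical, units are arbitrary, and the first backward position is bootstrapped from a Taylor expansion, which is one common choice among several.

```python
import math

def verlet(force, mass, x0, v0, dt, n_steps):
    """Integrate m x'' = force(x) with the position Verlet scheme.

    `force(x)` must return -dE/dx, so that the update
    x(t+dt) = 2 x(t) - x(t-dt) + force(x(t)) dt^2 / m
    matches Eq. (1.3).
    """
    # Bootstrap x(-dt) from a second-order Taylor expansion (assumption).
    x_prev = x0 - v0 * dt + 0.5 * force(x0) / mass * dt ** 2
    x = x0
    trajectory = [x0]
    for _ in range(n_steps):
        x_next = 2.0 * x - x_prev + force(x) / mass * dt ** 2
        x_prev, x = x, x_next
        trajectory.append(x)
    return trajectory

# Harmonic oscillator E(x) = x^2 / 2 with unit mass: exact solution cos(t).
dt = 0.01
traj = verlet(lambda x: -x, 1.0, 1.0, 0.0, dt, 628)
deviation = abs(traj[-1] - math.cos(628 * dt))
```

For this test case the numerical trajectory stays close to the analytical solution over a full oscillation period, which illustrates why a time step that resolves the fastest vibration sufficiently finely is adequate.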
Apart from numerical inaccuracies, the energy $E$ is conserved in an MD simulation. To avoid surface effects in the description of protein-solvent systems and to be able to control the pressure, the finite simulation system is usually continued periodically. Hence, in addition to the energy, the simulation volume $V$ and the particle number $N$ are constant, so that such an MD simulation generates a microcanonical ($NVE$) ensemble. This relies on the so-called ergodic hypothesis, which states that in the limit of infinitely long simulation times the time average coincides with the ensemble average. Accordingly, sufficiently long MD simulations allow not only dynamic system properties but also ensemble properties to be computed. In most MD simulations, however, the temperature $T$ rather than the energy $E$ is kept constant, because proteins in body cells are approximately in a canonical ($NVT$) ensemble.
If one is interested not in the dynamical properties but exclusively in ensemble properties of a simulation system, MC simulation is an alternative to MD. In each step of an MC simulation a new system configuration $R_{new} = R_{old} + \delta R$ is generated by a random change $\delta R$; it replaces the old configuration $R_{old}$ with a probability $P$ that remains to be determined. If a canonical ensemble is to be simulated, this transition probability follows from the Metropolis criterion [68]

$$P = \min\left\{1, \exp\left[\frac{E(R_{old}; A) - E(R_{new}; A)}{k_B T}\right]\right\}, \qquad (1.4)$$

where $k_B$ denotes the Boltzmann constant.
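A minimal sketch of an MC simulation driven by Eq. (1.4) is given below. The one-dimensional harmonic model energy, the step width, the value of $k_B$ in kcal/(mol K), and all names are assumptions made for illustration only.

```python
import math
import random

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K); assumed value

def metropolis_probability(e_old, e_new, temperature):
    """Acceptance probability of Eq. (1.4)."""
    exponent = (e_old - e_new) / (K_B * temperature)
    # Downhill moves are always accepted; this also avoids overflow in exp().
    return 1.0 if exponent >= 0.0 else math.exp(exponent)

def mc_sample(energy, x0, temperature, step, n_steps, rng):
    """Sample a one-dimensional coordinate from the canonical ensemble."""
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step, step)   # random change delta R
        e_new = energy(x_new)
        if rng.random() < metropolis_probability(e, e_new, temperature):
            x, e = x_new, e_new                # new configuration replaces old
        samples.append(x)                      # rejected moves count the old one again
    return samples

# Hypothetical model: E(x) = 0.5 * x**2 in kcal/mol, sampled at 300 K.
samples = mc_sample(lambda x: 0.5 * x * x, 0.0, 300.0, 1.0, 50000,
                    random.Random(1))
variance = sum(s * s for s in samples) / len(samples)
```

For this model the canonical variance is $k_B T / k$, about 0.6 in the chosen units, which the sampled `variance` should approach for long runs.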
In fact, the generalized-ensemble sampling techniques presented later in this thesis make it possible to combine the MD and MC methods. The aim is to improve the statistical significance and thus the quality of computer simulations. Before turning to these techniques, I discuss in the following section the factors that limit the quality of MM computer simulations.
1.3.3 Weaknesses of MM computer simulations
An MM force field defined according to Eq. (1.1) rests on numerous approximations. The use of harmonic potentials for the bond stretching and bond angle vibrations is rather unproblematic, even though at room temperature the fastest stretching vibrations are, according to quantum mechanics, in their ground state, so that the corresponding degrees of freedom are quantum mechanically "frozen". This fact is often taken into account, however, by suppressing deviations of the bond lengths from their equilibrium values with the SHAKE algorithm [69]. More problematic is the approximation of the charge distribution in a molecule by fixed partial charges $q_i$, because charge shifts in polarizable molecules, which are induced in a specific way by the complex, ordered environments generally found in proteins, must remain unaccounted for [7]. Furthermore, an adequate description of the φ/ψ dihedral potentials by trigonometric functions can be difficult. For the CHARMM force field, for instance, it is known that the parametrization of the dihedral potentials used there erroneously favors the so-called π-helix structure. To correct this behavior, the parametrization of the CHARMM dihedral potentials was extended by a correction map called CMAP [70].
Besides the accuracy of the force field, the statistical quality of a simulation is decisive for its significance. A computer simulation is statistically meaningful exactly if its results do not change, or change only insignificantly, when the computing time is extended arbitrarily. This convergence requirement is a great challenge. Particularly for very large simulation systems with many thousands of atoms it often cannot be met even with modern computers and months of computing time. Depending on the system size, about $10^8$ to $10^{10}$ simulation steps are feasible today [17, 71–73], which for an MD simulation with a femtosecond time step corresponds to a simulation time of one tenth of a microsecond to ten microseconds. Accordingly, only very fast dynamics can be simulated with high statistical quality by many repetitions.
Reaching simulation times beyond ten microseconds requires further approximations that are not contained in the above model. Here one has to note that a realistic computer simulation of a protein must include the surrounding solvent. If this is done explicitly, the simulation system typically consists of more than 90% solvent atoms [7, 58]. The computational effort can therefore be reduced considerably if the electrostatic interaction energy caused by the solvent is described implicitly [7, 58].
A popular method for the implicit description of the solvent is the Generalized Born (GB) method [74, 75]. This method is known, however, to generate free energy landscapes that differ from those of simulations with explicit solvent [76, 77] and to overstabilize salt bridges [78]. Moreover, the missing solvent viscosity can lead to wrong time scales of dynamical processes [79]. In addition, the GB continuum treatment of the solvent is not based on the Poisson equation [80], so that the method lacks physical rigor [81]. Nevertheless, the simulation times of by now more than a millisecond [75] that can be reached with continuum electrostatics are very impressive, which is why our group has also been working for several years on an implicit solvent method based on the Poisson equation [82–84]. This development is not yet complete, however, and the method is therefore not yet applicable.
For these reasons I refrained from applying implicit solvent methods in my work and concentrated on methods that increase the statistical quality of a simulation without requiring additional approximations. The following section addresses these methods in more detail.
1.4 Generalized-ensemble sampling techniques
The simulation times required to determine the structural $NVT$ equilibrium ensembles of peptides and proteins by conventional MD (or MC) are strongly influenced by the shape of the free energy landscape, because the rate at which a free energy barrier is crossed at temperature $T$ is proportional to the so-called Arrhenius factor $\exp[-\Delta F / k_B T]$ with

$$\Delta F = \Delta E - T \Delta S \qquad (1.5)$$

[85]. Here $T \Delta S$ denotes the entropic and $\Delta E$ the enthalpic contribution to the free energy barrier $\Delta F$, and $k_B$ the Boltzmann constant. High barriers therefore cause long residence times of the simulation system in conformational states of low free energy. Correspondingly long simulation times may be required for the essential conformations of the system to be visited sufficiently often, which is a basic prerequisite for ergodic sampling of the conformational ensemble.
In fact, the sampling speed can be increased by a trick: the simulation is carried out at a changed temperature and/or with a force field $E(R; A')$ with modified parameters $A'$. The changes have to be made such that the dimensionless free energy barriers $\Delta F / k_B T$ of the simulation system are reduced. Instead of the original (physical) Boltzmann ensemble, however, the modified simulation samples a new, more or less unphysical ensemble, which is called a generalized ensemble [86]. The properties of the physical Boltzmann ensemble can be calculated at the end of the simulation from the data of the generalized ensemble by so-called reweighting [87]. In this "back calculation" the statistical errors of the generalized ensemble grow the more strongly the force field and/or the temperature were modified. Reweighting therefore gives good results only if (i) a high statistical quality was reached in the generalized ensemble and (ii) the generalized ensemble is not "too far" from the physical ensemble. In particular, the essential conformations of the original Boltzmann ensemble must also be essential in the generalized ensemble, because even reweighting cannot provide statistical statements about regions that were not sampled by the simulation.

Temperature   2 kcal/mol   4 kcal/mol   6 kcal/mol   8 kcal/mol   10 kcal/mol
300 K         1.0          1.0          1.0          1.0          1.0
400 K         2.3          5.3          12.3         28.5         65.8
500 K         3.8          14.6         55.6         212.3        810.6
600 K         5.3          28.5         151.9        810.6        4325
Table 1.1: Factors by which the sampling speed increases for various temperatures and enthalpic barriers. The reference temperature is 300 K.
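The factors in Table 1.1 follow from the ratio of two Arrhenius factors for a purely enthalpic barrier, $\exp[\Delta E / k_B \cdot (1/T_{ref} - 1/T)]$. The short sketch below recomputes them; the numerical value of $k_B$ is an assumption, which is why the results agree with the tabulated values only to within a few percent.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K); assumed value

def arrhenius_speedup(barrier, temperature, reference=300.0):
    """Ratio exp(-dE/kB T) / exp(-dE/kB T_ref) for an enthalpic barrier dE."""
    return math.exp(barrier / K_B * (1.0 / reference - 1.0 / temperature))

barriers = [2, 4, 6, 8, 10]            # kcal/mol, columns of Table 1.1
temperatures = [300, 400, 500, 600]    # K, rows of Table 1.1
table = {T: [arrhenius_speedup(dE, T) for dE in barriers]
         for T in temperatures}

for T in temperatures:
    print(f"{T} K  " + "  ".join(f"{x:8.1f}" for x in table[T]))
```

The entropic contribution $T \Delta S$ is deliberately left out here, in line with the caption of the table, which refers to enthalpic barriers only.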
In the past three decades numerous generalized-ensemble sampling techniques [18–20, 88–103] have been developed.1 Despite the variety of approaches, all of these techniques can be assigned to the two strategies mentioned above: increasing the sampling speed by (i) modifying the force field [88–97] or by (ii) raising the temperature [18–20, 98–103].
Strategy (i) was first implemented about 30 years ago with the umbrella sampling (US) method of Torrie and Valleau [88]. Considerably younger is the first sampling technique based on raising the temperature, the simulated tempering (ST) method [99, 100]. In contrast to simple high-temperature simulations it remains practicable even for simulation systems with many degrees of freedom, and the very popular replica exchange (RE) method [18–20] later developed from it. The advantage of ST and RE over US is that no reweighting is necessary at the end of a simulation, because it takes place automatically during the simulation through the implemented temperature change or exchange process. Moreover, US relies on a suitable modification of the force field, whereas the original ST and RE schemes rely solely on raising the temperature, which is much easier to implement. For this reason I concentrated in my work on the investigation and application of the latter two methods, whose mode of operation I describe below.
1 A very comprehensive list and classification of the sampling techniques developed so far can be found in a paper by Hansen and Hünenberger [97].
Figure 1.6: Illustration of the simulated tempering strategy. Raising the temperature enables the simulation system to cross energy barriers $\Delta E$ faster and thus to sample the conformational space K more rapidly.
1.4.1 Simulated Tempering
If the free energy barriers $\Delta F = \Delta E - T \Delta S$ of a simulation system are dominated by their enthalpic contributions $\Delta E$, raising the temperature considerably reduces the quotient $\Delta F / k_B T$, which, as explained above, increases the sampling speed. To give an idea of how raising the temperature from initially 300 K affects the sampling speed for different enthalpic barriers, Table 1.1 lists the speed-up expected from the Arrhenius factor for several scenarios. The table shows that raising the temperature is particularly effective for high barriers $\Delta E$. Simulated tempering (ST) exploits exactly this fact by allowing the simulation system to switch back and forth during the simulation between the temperatures of a predefined temperature ladder $T_0 < T_1 < \dots < T_{N-1}$, in order to eventually generate improved ensemble statistics at the target temperature $T_0$. To this end the system is subjected to an MD (or MC) simulation whose temperature $T_i$ may be replaced by a new temperature $T_j$ at predefined time intervals. Owing to frequent temperature changes the system occasionally also resides at high temperatures, at which it can cross enthalpic barriers more easily, as illustrated in Figure 1.6.
For a temperature change from $T_i$ to $T_j$ to have no influence on the canonical ensemble statistics generated by the simulation at the individual temperatures (in particular at $T_0$), this process must constitute a flux equilibrium between the state before the change, $(R, T_i)$, and the state after the change, $(R, T_j)$ [104]. This requirement is ensured by the so-called detailed balance

$$P_i(R)\, P_{i \to j} = P_j(R)\, P_{j \to i}, \qquad (1.6)$$

where $P_{i \to j}$ denotes the acceptance probability for a temperature change from $T_i$ to $T_j$.1 $P_i(R)$ is the Boltzmann probability of finding the configuration $R$ at the temperature $T_i$:

$$P_i(R) = \frac{\exp[-E(R; A) / k_B T_i]}{Z}. \qquad (1.7)$$

Here $Z = \sum_k Z_k$ denotes the canonical partition function of the generalized ensemble, which is composed of the individual canonical ensembles $k$ ($k \in \{0, \dots, N-1\}$) with the partition functions $Z_k = \sum_{R'} \exp[-E(R'; A) / k_B T_k]$. Acceptance probabilities consistent with the balance equation (1.6) can be given by means of the Metropolis criterion [68] as follows:

$$P_{i \to j} \equiv \min\{1, P_j(R) / P_i(R)\} \quad \text{and} \quad P_{j \to i} \equiv \min\{1, P_i(R) / P_j(R)\}. \qquad (1.8)$$
The ST method still has a catch, however. To see the problem one has to look more closely at the probabilities $P_k = \sum_{R'} P_k(R')$ of finding the simulation system at a temperature $T_k$. These probabilities are just the quotients $Z_k / Z$ of the partition functions $Z_k$ and $Z$. The temperature change process does not alter these "occupation probabilities" as long as the required balance equation (1.6) is fulfilled. Since the individual partition functions $Z_k$ generally differ greatly, so do the probabilities $P_k$, which can have the effect that during a simulation the system is de facto never at the target temperature $T_0$ ($P_0 \approx 0$). In such a case there would be no simulation data at $T_0$ and the simulation would be useless.
Fortunately, the probabilities $P_k$ can be readjusted by a temperature-specific adaptation of the force field $E(R; A)$. For each temperature $T_k$ a new force field

$$\tilde{E}_k(R; A) \equiv E(R; A) + w_k k_B T_k \qquad (1.9)$$

is introduced, which results in a simple way from the original force field $E(R; A)$ and a constant $w_k k_B T_k$ that is irrelevant for the computation of the forces. The constants $w_k$ are called weights. They are usually chosen as $w_k = \ln Z_k$, such that the partition functions $\tilde{Z}_k = \sum_{R'} \exp[-\tilde{E}_k(R'; A) / k_B T_k] = Z_k e^{-w_k}$ of the new generalized ensemble all take the value 1 [105]. Accordingly, the new "occupation probabilities" $\tilde{P}_k$ all have the same value $1/N$, which leads to a uniform sampling of the temperature space. Since the temperature-specific force field modifications change the Boltzmann probabilities $P_k(R)$ of Eq. (1.7) to

$$\tilde{P}_k(R) = \frac{\exp[-\tilde{E}_k(R; A) / k_B T_k]}{\tilde{Z}}, \qquad (1.10)$$

the acceptance probabilities of Eq. (1.8) change as well. By means of Eq. (1.10) one obtains the new acceptance probabilities

$$\tilde{P}_{i \to j} = \min\{1, \exp(\Delta_{i \to j})\} \quad \text{with} \quad \Delta_{i \to j} = \frac{\tilde{E}_i(R; A)}{k_B T_i} - \frac{\tilde{E}_j(R; A)}{k_B T_j}, \qquad (1.11)$$

where, by Eq. (1.9), $\Delta_{i \to j}$ can be brought into the form

$$\Delta_{i \to j} = \left( \frac{1}{k_B T_i} - \frac{1}{k_B T_j} \right) E(R; A) - (w_j - w_i). \qquad (1.12)$$

1 Accordingly, a temperature change from $T_i$ to $T_j$ is decided by a random draw with the probability $P_{i \to j}$ and thus constitutes an MC step.

[Figure 1.7. Left panel: potential energy (kcal/mol) versus simulation time (ns) at 300 K and 320 K. Right panel: frequencies versus potential energy (kcal/mol).]
Figure 1.7: Fluctuations of the potential energy during an MD simulation. Left: Time evolution of the potential energy of a simulation system consisting of about 1000 water molecules and an octapeptide dissolved in them, simulated at two different system temperatures (300 K, 320 K). Right: Distributions of the potential energies of the 300 K simulation (blue) and of the 320 K simulation (red), shown as histograms. Despite the comparatively small temperature difference the two distributions overlap only slightly.
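The role of the weights in Eqs. (1.9) to (1.12) can be checked on a toy system. The sketch below assumes a hypothetical set of discrete energy levels whose partition functions can be summed exactly; with the sign convention of Eq. (1.9), normalizing every $\tilde{Z}_k$ to 1 requires $w_k = \ln Z_k$. All names and numbers are illustrative assumptions.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K); assumed value

def partition_function(energies, temperature):
    """Z_k as an explicit sum over all configurations of the toy system."""
    return sum(math.exp(-e / (K_B * temperature)) for e in energies)

def st_delta(energy, t_i, t_j, w_i, w_j):
    """Delta_{i->j} of Eq. (1.12) for a change from T_i to T_j."""
    return (1.0 / (K_B * t_i) - 1.0 / (K_B * t_j)) * energy - (w_j - w_i)

def st_acceptance(energy, t_i, t_j, w_i, w_j):
    """Acceptance probability of Eq. (1.11)."""
    delta = st_delta(energy, t_i, t_j, w_i, w_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

energies = [0.5 * n for n in range(40)]   # hypothetical levels in kcal/mol
ladder = [300.0, 350.0, 400.0]            # hypothetical temperature ladder

z = [partition_function(energies, t) for t in ladder]
raw_occupancy = [zk / sum(z) for zk in z]          # P_k = Z_k / Z without weights
weights = [math.log(zk) for zk in z]               # w_k = ln Z_k
z_tilde = [zk * math.exp(-wk) for zk, wk in zip(z, weights)]  # all equal to 1
```

In this toy system the unweighted occupation probabilities differ only moderately; for solvated peptides the partition functions differ by many orders of magnitude, which is the situation the weights are meant to repair.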
Having presented the theory of ST in detail, I conclude with some remarks on details of the method that I have tacitly passed over so far but that are important for understanding the method and for its practical implementation. Regarding the implementation of an ST simulation it is useful to recognize that the force field modifications of Eq. (1.9) affect the simulation only during the temperature changes, through modified acceptance probabilities. Between the change attempts the constants $w_k k_B T_k$ in Eq. (1.9) have no influence on the dynamics of the simulation system, so that the original force field $E(R; A)$ may still be used there.
So far only the potential energy $E(R; A)$ has been considered instead of the total energy. For MC simulations this is no problem, since there the potential energy and the total energy are identical. This does not hold for MD simulations, however, since there the kinetic energy is responsible for the dynamics of the system. In a seminal paper, Sugita and Okamoto [20] showed that the kinetic energy plays no role in a change process as long as the temperature change, for example from $T_i$ to $T_j$, is accompanied by a scaling of the atomic velocities by $\sqrt{T_j / T_i}$.1
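The velocity scaling that accompanies a temperature change can be sketched in a few lines. The data layout (a list of velocity vectors) and the function names are assumptions for illustration; the point is only that scaling by $\sqrt{T_j / T_i}$ multiplies the kinetic energy by $T_j / T_i$.

```python
import math

def rescale_velocities(velocities, t_old, t_new):
    """Scale all atomic velocities by sqrt(T_new / T_old)."""
    factor = math.sqrt(t_new / t_old)
    return [[factor * component for component in v] for v in velocities]

def kinetic_energy(masses, velocities):
    """Total kinetic energy 0.5 * sum_i m_i * |v_i|**2."""
    return 0.5 * sum(m * sum(c * c for c in v)
                     for m, v in zip(masses, velocities))
```

After `rescale_velocities(v, 300.0, 320.0)` the kinetic energy is larger by the factor 320/300, so the kinetic energy distribution matches the new temperature immediately after the change.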
A further point not yet addressed is the choice of the temperature ladder, which must not be made entirely arbitrarily. Rather, the distributions of the potential energies at neighboring temperatures $T_i$ and $T_{i \pm 1}$ must overlap markedly, because this overlap is a measure of the acceptance probabilities $P_{i \to i \pm 1}$ [106]. For this reason one usually restricts change attempts to neighboring temperatures, since changes between non-neighboring temperatures are considerably less probable.
How sensitively the distributions of the potential energies react to temperature changes is shown by Figure 1.7. There a temperature increase of only $\Delta T = 20$ K already shifts the energy distribution so far that the overlap between the energy distributions almost vanishes. The acceptance probabilities would be correspondingly small in this case. Yet with about 3000 atoms the simulation system to which Figure 1.7 refers is comparatively small. For simulation systems with a much larger particle number $n$ the overlap of the potential energy distributions is even smaller, because the shift $\Delta \bar{E}$ of the mean potential energy $\bar{E}$ is proportional to $n$, whereas the width $\sigma_E$ of the distribution grows only with $\sqrt{n}$.2 In ST simulations with sufficiently large acceptance probabilities the temperature differences between neighboring rungs are therefore usually only a few kelvin, which makes a large number $N$ of rungs necessary for the generalized ensemble to cover a sufficiently large temperature range. A large number of rungs $N$, however, means that only a fraction of the information simulated in the generalized ensemble contributes to the ensemble statistics at $T_0$, which impairs the sampling efficiency.
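The scaling argument can be made explicit. The following is a sketch under the additional assumption that the heat capacity $C$ is independent of temperature over the range of the ladder and proportional to $n$; it uses only the two relations quoted in the accompanying footnote.

```latex
\begin{align}
  \Delta\bar{E} &\approx C\,\Delta T,
  \qquad
  \sigma_E = T\sqrt{k_B C}, \\
  \frac{\Delta\bar{E}}{\sigma_E}
  &\approx \frac{\Delta T}{T}\sqrt{\frac{C}{k_B}}
  \;\propto\; \frac{\Delta T}{T}\,\sqrt{n}.
\end{align}
```

Keeping the overlap, and thus the ratio $\Delta\bar{E}/\sigma_E$, fixed therefore requires $\Delta T \propto T / \sqrt{n}$. Under the stated assumption the number of rungs needed to span a range from $T_0$ to $T_{max}$ then grows roughly as $\sqrt{n}\,\ln(T_{max}/T_0)$.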
A possible compromise between acceptance probabilities that are not too small and a number of temperature rungs that is not too large is to choose the temperature ladder such that the mean time the simulation system needs to travel from $T_0$ to $T_{N-1}$ and back (the so-called round-trip time) is minimized. This strategy rests on the assumption that minimizing the round-trip time maximizes the sampling speed at $T_0$. Chapter 3 of this dissertation deals with the question of how, and under which conditions, such a temperature ladder can be constructed for a simulation system. In fact, a suitable modification of the force field going beyond the energy shift of Eq. (1.9) allows the number $N$ of temperature rungs to be reduced and thus the sampling efficiency of the ST method to be increased. In this context Chapter 4 introduces a new ST variant, simulated solute tempering (SST). That chapter also explains in detail how the a priori unknown weights $w_k = \ln Z_k$ (strictly speaking the corresponding differences $w_j - w_i$) can be estimated by preliminary simulations at the individual temperatures $T_k$.
1 Although the paper by Sugita and Okamoto refers directly to the replica exchange method described in the next section, the theory presented there carries over directly to ST.
2 This argument rests on the fact that $n$ is proportional to the extensive heat capacity $C \approx \Delta \bar{E} / \Delta T$ and that in the canonical ensemble the relation $\sigma_E = T \sqrt{k_B C}$ holds [107].
Although it has turned out that the weights can be estimated quite well by rather short preliminary simulations [105], some residual uncertainty always remains, especially since the success of an ST simulation depends very sensitively on the weights. In the next section I therefore show how an almost natural extension of the ST methodology leads to the replica exchange method, in which, in contrast to ST, the weights $w_k$ no longer play a role.
1.4.2 Replica Exchange
The replica exchange (RE) method [18–20], which is sometimes also called parallel tempering [108], adopts the basic idea of ST, namely to enable the simulation system to change its temperature along a temperature ladder $T_0 < T_1 < \dots < T_{N-1}$. In contrast to ST, where only one simulation travels through temperature space, an RE simulation consists of $N$ identical simulation systems (replicas), each of which is simulated exclusively at one temperature $T_i$ ($i \in \{0, \dots, N-1\}$). At predefined time intervals pairs of replicas communicate with each other by exchanging, if the exchange is accepted, their temperatures $T_i$ and $T_j$.1 Thus, after an exchange each temperature of the ladder is still populated by exactly one replica. Instead of an acceptance probability $P_{i \to j}$ for a temperature change one now deals with an acceptance probability $P_{ij}$ for a mutual temperature exchange, which, analogously to ST, is required to satisfy the detailed balance

$$P_i(R_i) P_j(R_j) P_{ij} = P_i(R_j) P_j(R_i) P_{ji}, \qquad (1.13)$$

where $P_i$ and $P_j$ denote Boltzmann probabilities analogous to Eq. (1.7). Again the Metropolis criterion [68]

$$P_{ij} = \min\left\{1, \frac{P_i(R_j) P_j(R_i)}{P_i(R_i) P_j(R_j)}\right\} \quad \text{and} \quad P_{ji} = \min\left\{1, \frac{P_i(R_i) P_j(R_j)}{P_i(R_j) P_j(R_i)}\right\} \qquad (1.14)$$

provides a solution of the balance equation (1.13). Inserting the Boltzmann probabilities $P_i$ and $P_j$ yields

$$P_{ij} = \min\{1, \exp(\Delta_{ij})\} \qquad (1.15)$$

with

$$\Delta_{ij} = (1 / k_B T_i - 1 / k_B T_j)\,[E(R_i) - E(R_j)]. \qquad (1.16)$$

How closely RE is related to the ST method is shown by the simple relation between $\Delta_{ij}$ of Eq. (1.16) and $\Delta_{i \to j}$ of Eq. (1.12),

$$\Delta_{ij} = \Delta_{i \to j} + \Delta_{j \to i}, \qquad (1.17)$$

where, to determine $\Delta_{i \to j}$ and $\Delta_{j \to i}$, the energy $E(R, A)$ in Eq. (1.12) has to be replaced by $E(R_i, A)$ and $E(R_j, A)$, respectively. The interpretation of this relation becomes particularly simple if one assumes acceptance probabilities smaller than one, so that $P_{ij} = \exp(\Delta_{ij})$ and $P_{i \to j} = \exp(\Delta_{i \to j})$ hold. Eq. (1.17) then gives $P_{ij} = P_{i \to j} P_{j \to i}$, i.e. the acceptance probability $P_{ij}$ for a temperature exchange between two replicas equals the product of the two change probabilities $P_{i \to j}$ and $P_{j \to i}$. Hence $P_{ij} \le P_{i \to j}$, from which one can infer that the replicas of an RE simulation have longer round-trip times than the simulation system of a corresponding ST simulation. The publication reprinted in Chapter 4 of this dissertation introduces the SST method and also examines how different round-trip times affect the sampling speed. Besides SST, the replica exchange with solute tempering method [101] is applied there, which is the RE counterpart of SST. Both methods rely on the solute tempering concept introduced by Liu and co-authors [101], which I briefly address below, since this concept has been applied many times in our group with the help of my expertise [16, 55, 102, 109].
1 $T_j$ is usually a neighboring temperature $T_{i \pm 1}$.
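The exchange criterion of Eqs. (1.15) and (1.16) and its relation (1.17) to the ST criterion can be verified numerically. In the sketch below the two energies are hypothetical values chosen to resemble the centers of the two distributions in Figure 1.7, the value of $k_B$ is assumed, and the function names are illustrative.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K); assumed value

def re_delta(e_i, e_j, t_i, t_j):
    """Delta_ij of Eq. (1.16); e_i and e_j are the potential energies of the
    replicas currently simulated at T_i and T_j."""
    return (1.0 / (K_B * t_i) - 1.0 / (K_B * t_j)) * (e_i - e_j)

def re_acceptance(e_i, e_j, t_i, t_j):
    """Exchange acceptance probability of Eq. (1.15)."""
    delta = re_delta(e_i, e_j, t_i, t_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)

def st_delta(energy, t_from, t_to, w_from, w_to):
    """Delta of Eq. (1.12) for a single temperature change."""
    return (1.0 / (K_B * t_from) - 1.0 / (K_B * t_to)) * energy - (w_to - w_from)

# Hypothetical energies near the centers of the distributions in Figure 1.7.
delta_re = re_delta(-11000.0, -10750.0, 300.0, 320.0)
# Arbitrary weights: they cancel in the sum of Eq. (1.17).
delta_sum = (st_delta(-11000.0, 300.0, 320.0, 0.3, 1.7)
             + st_delta(-10750.0, 320.0, 300.0, 1.7, 0.3))
```

With these numbers the exponent is strongly negative, so an exchange between typical configurations at 300 K and 320 K is practically never accepted, whereas an exchange in which the colder replica holds the higher energy is always accepted.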
1.4.3 Solute Tempering
The original idea of the solute tempering concept is to subject only the dissolved molecule (the solute), but not the solvent, to a temperature increase. A temperature increase then acts on only comparatively few atoms of the simulation system. Since identical simulation systems that differ only in the temperature of a few atoms have a considerably larger energy overlap than systems in which the same temperature difference affects all atoms, this strategy permits ST and RE simulations whose temperature ladders need considerably fewer rungs to cover a given temperature range and that nevertheless have large acceptance probabilities.
An intuitive but naive implementation of this strategy could consist in simply bringing the solute to the desired temperature $T_i$ with a thermostat while a second thermostat keeps the solvent at the target temperature $T_0$. A simulation run in this way does not, however, generate a canonical ensemble at the temperature $T_i$ and thus violates the basic assumption underlying the ST and RE methods. Accordingly, this procedure, embedded in an ST or RE simulation, would not generate a canonical ensemble at the target temperature $T_0$.
Liu and co-authors [101] therefore took a different route, which rests on a simple fact: a simulation system subject to a scaled force field $\tilde{E}(R; A) = E(R; A) \cdot T / T_0$ generates, because of

$$\exp(-\tilde{E}(R; A) / k_B T) = \exp(-E(R; A) / k_B T_0),$$

the same Boltzmann statistics at the temperature $T$ as the unscaled system at the temperature $T_0$. The scaled system is thus statistically "cold" even at the elevated temperature $T > T_0$. Following this approach, the authors proposed, specifically for application in an RE simulation, a temperature-dependent scaling

$$E_k(R; A) = \lambda_{k,0} E^{PP}(R; A) + \lambda_{k,1} E^{LL}(R; A) + \lambda_{k,2} E^{PL}(R; A) \qquad (1.18)$$

of the protein-protein ($E^{PP}$), solvent-solvent ($E^{LL}$, with L for the German Lösungsmittel) and protein-solvent ($E^{PL}$) interaction contributions of the unscaled force field $E = E^{PP} + E^{LL} + E^{PL}$ with the scaling factors

$$\lambda_{k,0} = 1, \quad \lambda_{k,1} = \frac{T_k}{T_0} \quad \text{and} \quad \lambda_{k,2} = \frac{T_0 + T_k}{2 T_0} \qquad (1.19)$$

and called this scheme replica exchange with solute tempering (REST) [101]. Accordingly, a replica simulated at the temperature $T_k > T_0$ with the modified force field $E_k(R; A)$ has an effectively cold ($T_0$) solvent, whereas the protein, which is untouched by the scaling, has the temperature $T_k$. Since the force field remains unscaled at $T_0$, REST generates the desired Boltzmann ensemble at the target temperature $T_0$, while the remaining ensembles ($T_k > T_0$) are unphysical Boltzmann ensembles because of the scaling. The $\Delta_{ij}$ of Eq. (1.16) have to be adapted, however, because the replicas now exchange their temperature-specific energy functions in addition to their temperatures. REST is therefore counted among the family of so-called Hamiltonian replica exchange methods [94, 95] [10]. The $\Delta_{ij}$ of Eq. (1.16) now has to be replaced by

$$\Delta_{ij} = \frac{1}{k_B T_i} \cdot [E_i(R_i; A) - E_i(R_j; A)] + \frac{1}{k_B T_j} \cdot [E_j(R_j; A) - E_j(R_i; A)]. \qquad (1.20)$$

For $E_i = E_j = E$ this expression reduces to Eq. (1.16). The relation can again be derived from a balance equation analogous to Eq. (1.13).1
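A short sketch can make the effect of the scaling (1.19) on the exchange criterion concrete. It assumes that the three energy contributions of each configuration are available as separate numbers; the function names, the value of $k_B$, and the sign convention (chosen so that the expression reduces to Eq. (1.16) for identical energy functions) are assumptions of this illustration.

```python
import math

K_B = 0.0019872  # Boltzmann constant in kcal/(mol K); assumed value

def rest_lambdas(t_k, t_0):
    """Scaling factors of Eq. (1.19) for the PP, LL and PL contributions."""
    return 1.0, t_k / t_0, (t_0 + t_k) / (2.0 * t_0)

def rest_energy(parts, t_k, t_0):
    """Scaled energy E_k of Eq. (1.18); parts = (E_PP, E_LL, E_PL)."""
    l0, l1, l2 = rest_lambdas(t_k, t_0)
    e_pp, e_ll, e_pl = parts
    return l0 * e_pp + l1 * e_ll + l2 * e_pl

def hamiltonian_re_delta(e_i_ri, e_i_rj, e_j_ri, e_j_rj, t_i, t_j):
    """Exchange exponent for replicas with their own energy functions E_i, E_j;
    reduces to Eq. (1.16) when E_i and E_j are identical."""
    return ((e_i_ri - e_i_rj) / (K_B * t_i)
            + (e_j_rj - e_j_ri) / (K_B * t_j))

def rest_delta(parts_i, parts_j, t_i, t_j, t_0):
    """Exchange exponent for two REST replicas at T_i and T_j."""
    return hamiltonian_re_delta(
        rest_energy(parts_i, t_i, t_0), rest_energy(parts_j, t_i, t_0),
        rest_energy(parts_i, t_j, t_0), rest_energy(parts_j, t_j, t_0),
        t_i, t_j)
```

Because $\lambda_{k,1} / k_B T_k = 1 / k_B T_0$ for every rung, the solvent-solvent energy drops out of the exchange exponent, which is why far fewer rungs are needed than in temperature RE.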
I found, however, that implementing the scaling (1.19) proposed by Liu and co-authors involves considerable programming effort. This effort can be avoided entirely if the geometric mean of $\lambda_{k,0}$ and $\lambda_{k,1}$ is used for $\lambda_{k,2}$ instead of the arithmetic mean. Accordingly, our group uses the alternative solute tempering scaling

$$\lambda_{k,0} = 1, \quad \lambda_{k,1} = \frac{T_k}{T_0} \quad \text{and} \quad \lambda_{k,2} = \sqrt{T_k / T_0}, \qquad (1.21)$$

which can be brought about by a simple modification of force field parameters.2 For example, scaling the partial charges of the solvent atoms by the factor $\sqrt{T_k / T_0}$ automatically gives all three electrostatic energy contributions $E^{PP}_{k,elec}$, $E^{LL}_{k,elec}$ and $E^{PL}_{k,elec}$ the scalings (1.21). Further information can be found in the methods section of the publication reprinted in Chapter 5.
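That a single change of solvent partial charges produces all three scalings of Eq. (1.21) can be checked with point charges. The sketch below uses a hypothetical four-atom configuration and an approximate Coulomb constant; the data layout and function names are assumptions for illustration and do not reflect the actual implementation described in Chapter 5.

```python
import math
from itertools import combinations

COULOMB = 332.06  # approximate Coulomb constant in kcal Å/(mol e^2); assumed

def coulomb_parts(atoms):
    """Split the Coulomb energy into (E_PP, E_LL, E_PL).

    atoms: list of (charge, (x, y, z), is_solvent) tuples.
    """
    e_pp = e_ll = e_pl = 0.0
    for (q1, r1, s1), (q2, r2, s2) in combinations(atoms, 2):
        e = COULOMB * q1 * q2 / math.dist(r1, r2)
        if s1 and s2:
            e_ll += e          # solvent-solvent pair
        elif s1 or s2:
            e_pl += e          # protein-solvent pair
        else:
            e_pp += e          # protein-protein pair
    return e_pp, e_ll, e_pl

def scale_solvent_charges(atoms, t_k, t_0):
    """Multiply the partial charges of all solvent atoms by sqrt(T_k / T_0)."""
    factor = math.sqrt(t_k / t_0)
    return [(q * factor if s else q, r, s) for q, r, s in atoms]

# Hypothetical configuration: two solute atoms and two solvent atoms.
atoms = [(0.5, (0.0, 0.0, 0.0), False), (-0.5, (1.5, 0.0, 0.0), False),
         (-0.8, (0.0, 3.0, 0.0), True), (0.4, (0.8, 3.5, 0.0), True)]
base = coulomb_parts(atoms)
scaled = coulomb_parts(scale_solvent_charges(atoms, 400.0, 300.0))
```

Each solvent-solvent pair contains the scaled charge twice and each protein-solvent pair once, which yields the factors $T_k / T_0$ and $\sqrt{T_k / T_0}$ without touching the pair loop itself.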
1 Alternatively one uses Eq. (1.17) from the previous section, in which $\Delta_{i \to j}$ is given by Eq. (1.11), where $\tilde{E}_i(R; A)$ has to be replaced by $E_i(R_i; A) + w_i k_B T_i$ (and analogously for $\Delta_{j \to i}$).
2 Formally this means that a parameter set $A_k$ can be found such that $E_k(R_k; A) \equiv E(R_k; A_k)$ holds.
1.5 The structure of this thesis
Chapters 2 to 5 form the main part of my dissertation. Each of these chapters reprints one publication, with the exception of Chapter 2, which comprises two of my publications. Each publication is preceded by a summary, written in German, that presents the background and the aims of the publication that follows. Chapters 2 and 3 deal with methodological aspects of RE and ST simulations. The solute tempering concept presented in the preceding section and insights from Chapter 3 then led to the development of a new ST variant, which is introduced in Chapter 4 and whose sampling efficiency is compared there with established sampling techniques. The last chapter of the main part focuses on the investigation of non-equilibrium dynamics, for which the accurate determination of the associated structural initial and target ensembles by RE simulations was fundamental. Chapter 6 forms the final part of my thesis; it summarizes the results of the main part once more and, where appropriate, gives an outlook on future developments.
2 Efficiency reduction by RE simulations
The application of the replica exchange method rests on the expectation that it increases the sampling efficiency of the computer simulation. The article reprinted below,1
Robert Denschlag, Martin Lingenheil, Paul Tavan:
"Efficiency reduction and pseudo-convergence in replica exchange sampling of peptide folding-unfolding equilibria"
Chem. Phys. Lett. 458, 244-248 (2008),
which I wrote together with Martin Lingenheil and Paul Tavan, shows for the first time that this is not necessarily the case. To this end the folding-unfolding dynamics of a three-state system is simulated which, owing to the chosen free energy differences between the individual states, is a model of a β-hairpin. It turns out that raising the temperature slows the folding dynamics of the model system so much that, overall, the folding-unfolding dynamics slows down as well and, as a consequence, the application of the replica exchange method reduces the sampling efficiency. In addition, the article answers the questions of how the quality of the convergence of a replica exchange simulation can be assessed and which mistakes can be made in the convergence analysis.
¹ Reprinted with kind permission of Elsevier (license number: 2415941044663).
Chemical Physics Letters 458 (2008) 244–248
Efficiency reduction and pseudo-convergence in replica exchange sampling of
peptide folding–unfolding equilibria
Robert Denschlag, Martin Lingenheil, Paul Tavan *
Lehrstuhl für Biomolekulare Optik, Ludwig-Maximilians-Universität, Oettingenstr. 67, 80538 München, Germany
Article info
Article history:
Received 20 March 2008
In final form 25 April 2008
Available online 3 May 2008
Abstract
Replica exchange (RE) molecular dynamics (MD) simulations are frequently applied to sample the folding–unfolding equilibria of β-hairpin peptides in solution, because efficiency gains are expected from this technique. Using a three-state Markov model featuring key aspects of β-hairpin folding we show that RE simulations can be less efficient than conventional techniques. Furthermore we demonstrate that one is easily seduced to erroneously assign convergence to the RE sampling, because RE ensembles can rapidly reach long-lived stationary states. We conclude that typical REMD simulations covering a few tens of nanoseconds are by far too short for sufficient sampling of β-hairpin folding–unfolding equilibria.
© 2008 Elsevier B.V. All rights reserved.
Replica exchange molecular dynamics (REMD) is considered to be a method for the efficient canonical sampling of biomolecular properties. Therefore, the REMD simulation technique has been frequently applied to generate equilibrium ensembles of biomolecules in solution [1–12]. Here, systems of particular interest were peptides folding into a β-hairpin. Reported simulation times covered the range from a few nanoseconds up to a few tens of nanoseconds [9–12]. As an important result, free energy differences $\Delta F$ between various conformational states were given [9–11]. A reliable calculation of these values requires that the REMD simulations generate approximately the associated conformational equilibrium ensembles.
It is not easy to judge whether a necessarily finite simulation time was actually long enough to sample the most important regions of conformational space. As an indication of convergence one may take the observation that the calculated $\Delta F$ values do not change too much upon a substantial elongation of the simulation time. Particularly for a β-hairpin peptide, which folds on time scales of µs at ambient temperatures [13,14], one may ask whether the equilibrium folding–unfolding ensemble can be sampled by REMD within simulation times spanning a few tens of nanoseconds. In this Letter, we will address this question by constructing a Markovian three-state model for β-hairpin folding–unfolding and by executing extended replica exchange Monte Carlo (REMC) simulations for this model. In addition, we check to what extent replica exchange can help to speed up the sampling in the given case.
Our three-state model system consists of a folded state f, a transition state t, and an unfolded state u. For a β-hairpin peptide in solution, f will cover a small and compact region in the peptide’s configurational space. Correspondingly, f is characterized by a small entropy $S_f$ and a low energy $U_f$, which accounts for the hydrogen bonds linking the β-strands. We assume that the configurational space covered by the transition state t is not much larger. Thus, $S_t$ is also small. However, the energy $U_t$ must be higher because some inter-strand hydrogen bonds will be broken. The unfolded state u is extended and highly flexible with a large entropy $S_u$ and a large energy $U_u$ because of the broken hydrogen bonds.
* Corresponding author. Fax: +49 89 2180 9220.
E-mail address: tavan@physik.uni-muenchen.de (P. Tavan).
0009-2614/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.cplett.2008.04.114
The concepts sketched above can be immediately transferred into the set-up of a three-state Markov model for β-hairpin folding–unfolding. Table 1 lists our choice for the model parameters $S_i$ and $U_i$, $i \in \{f, t, u\} \equiv \{1, 2, 3\}$. According to this choice, the free energies $F_f$ and $F_u$ are equal at the temperature $T = T_0$. Note that the equilibrium probability $P_i(T)$ of a state $i$ at temperature $T$ is $\exp(-F_i/k_B T)/[\sum_k \exp(-F_k/k_B T)]$, where $k_B$ is the Boltzmann constant. Consequently, $P_f(T_0)$ equals $P_u(T_0)$.
Fig. 1 specifies the structure of the transition matrix $\hat{T} = (t_{ij})$ defining the Markov model and illustrates the energetics of the model at a temperature $T > T_0$. Direct transitions from f to u and vice versa are forbidden. Therefore, $t_{13}$ and $t_{31}$ vanish. Furthermore, we have chosen $t_{22} = 0$ implying that the transition state t cannot be occupied for more than one Markov step. We have chosen $t_{12} = t_{32} = 1/2$, which is plausible for all temperatures $T > (2/3) T_0$ because at these temperatures $F_f$ and $F_u$ are smaller than $F_t$ (cf. Table 1). We assume detailed balance implying that $t_{21} P_f = t_{12} P_t$ and $t_{23} P_u = t_{32} P_t$, where $P_f$, $P_t$, and $P_u$ are the equilibrium probabilities of the states $i$. One gets $t_{21} = \exp[(F_f - F_t)/k_B T]/2$ and $t_{23} = \exp[(F_u - F_t)/k_B T]/2$. $\hat{T}$ becomes a stochastic matrix for $t_{11} = 1 - t_{21}$ and $t_{33} = 1 - t_{23}$. The associated stationary distribution is then $(P_f, P_t, P_u)^T$ at all $T > 2T_0/3$.
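The construction of the transition matrix can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes reduced units (energies in $k_B T_0$, temperature as `theta` $= T/T_0$), uses the value $E_u = 18 k_B T_0$ chosen in the letter, and all function names are hypothetical. It builds the matrix from the free energies of Table 1 and verifies that the Boltzmann distribution is left unchanged by one Markov step.

```python
import math

EU = 18.0  # unfolding energy E_u in units of k_B*T_0 (value chosen in the letter)

def free_energies(theta):
    """(F_f, F_t, F_u) from Table 1 in units of k_B*T_0; theta = T/T_0."""
    return (0.0, 2.0 * EU / 3.0, EU * (1.0 - theta))

def transition_matrix(theta):
    """t[i][j] = probability of a Markov step from state j to state i.

    State order: 0 = f, 1 = t, 2 = u (indices 1, 2, 3 in the text).
    """
    f_f, f_t, f_u = free_energies(theta)
    t21 = 0.5 * math.exp((f_f - f_t) / theta)  # f -> t, from detailed balance
    t23 = 0.5 * math.exp((f_u - f_t) / theta)  # u -> t, from detailed balance
    return [
        [1.0 - t21, 0.5, 0.0],
        [t21, 0.0, t23],
        [0.0, 0.5, 1.0 - t23],
    ]

def boltzmann(theta):
    """Equilibrium probabilities (P_f, P_t, P_u)."""
    weights = [math.exp(-f / theta) for f in free_energies(theta)]
    z = sum(weights)
    return [w / z for w in weights]

def markov_step(matrix, p):
    """Apply the column-stochastic matrix to a probability vector."""
    return [sum(matrix[i][j] * p[j] for j in range(3)) for i in range(3)]

for theta in (1.0, 1.5, 2.0):
    p = boltzmann(theta)
    p_next = markov_step(transition_matrix(theta), p)
    print(theta, max(abs(a - b) for a, b in zip(p, p_next)))
```

The printed deviations should be at the level of floating-point round-off, which is what detailed balance implies for this three-state chain.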
The transition matrix $\hat{T}$ determines the folding and unfolding rates $k_f = t_{12} t_{23}$ and $k_u = t_{32} t_{21}$, respectively, at a temperature $T$. One gets

$$k_f(T) = \exp[\beta E_u (1/3 - T/T_0)]/4 \qquad (1)$$

and

$$k_u(T) = \exp(-2 \beta E_u/3)/4, \qquad (2)$$

with $\beta = 1/k_B T$. Starting at state u, the average number $\tau_f$ of Markov steps to reach the folded state f is $1/k_f$. Likewise, the unfolding time $\tau_u$ is $1/k_u$. These times are specified by our choice for the unfolding energy $E_u = 18 k_B T_0$.

Table 1
Energy terms of a three-state Markov model for β-hairpin folding–unfolding
State    F = U − TS          U          TS
f        0                   0          0
t        2E_u/3              2E_u/3     0
u        E_u(1 − T/T_0)      E_u        E_u T/T_0
The free energy F, the energy U, and the entropy S are given in terms of the unfolding energy $E_u$ and a temperature $T_0$ defining the mid-point of the unfolding transition through $F_f = F_u = 0$.

Fig. 1. Illustration of the transition matrix $\hat{T}$ defining our Markov model for β-hairpin folding–unfolding. States are ordered according to their free energies at a temperature $T > T_0$. The state energies are specified in Table 1 and the matrix elements $t_{ij}$ in the text.

Fig. 2 shows the resulting times $\tau_f$, $\tau_u$, and $\tau_f + \tau_u$ as functions of the temperature ratio $T/T_0$. Because of the purely energetic barrier $2E_u/3$ between the states f and t (cf. Table 1), the unfolding time $\tau_u$ decreases strongly at large temperatures $T/T_0$. In contrast, the folding time $\tau_f$ increases at elevated temperatures and converges to the limiting value $4 \exp(E_u/k_B T_0) \approx 2.6 \times 10^8$ determined by the entropic difference $E_u/T_0$ between the states u and t. The increase of folding times for increasing temperatures has been termed as non- or anti-Arrhenius behavior [15,16]. The time $\tau_u + \tau_f$ for a complete unfolding and refolding process increases with temperature for $T/T_0 > 18/(18 - \ln 2) \approx 1$. Thus, in contradiction to the common expectation of enhanced sampling, the sampling of the folding–unfolding equilibrium slows down at higher temperatures.
At the mid-point temperature $T/T_0 = 1$, the folding and unfolding times are equal, i.e. $\tau_f = \tau_u \approx 6.5 \times 10^5$. This folding time can be used as a bridge into real time if one somewhat arbitrarily assumes that each Markov step represents a peptide dynamics lasting 10 ps. Then $6.5 \times 10^5$ Markov steps would correspond to 6.5 µs, which is a typical folding time for β-hairpin peptides at $T_0 = 300$ K [13,14]. Our model parameters then seem to catch the temporal behavior of real β-hairpin peptides. On the other hand, our three-state system is too simple to reproduce the observed temperature dependence of the folding time $\tau_f(T)$ for all temperatures [14]. According to the quoted experimental data $\tau_f$ is a convex function of $T$ with a minimum at a critical temperature, whereas it monotonically increases with $T$ in our model (cf. Fig. 2).

Fig. 2. Average times $\tau_f$ for folding (dot-dashed), $\tau_u$ for unfolding (dashed), and their sum $\tau_f + \tau_u$ (solid) as functions of the normalized temperature $T/T_0$.

We have applied REMC simulations [17] to our three-state Markov model. Such a REMC simulation propagates $N + 1$ copies (replicas) of the investigated system at different temperatures $(T_0 < T_1 < \dots < T_N)$. The set of trajectories of these replicas constitutes a so-called generalized ensemble. After a fixed number of simulation steps, one applies a probabilistic check whether the configuration of a replica at a higher temperature can be exchanged with the configuration of a replica at a lower temperature. If this exchange process satisfies detailed balance, the canonical ensemble at each temperature is preserved. Detailed balance is guaranteed if the exchange probability $W_{i,j}$ between the replicas $i$ and $j$ is calculated by the Metropolis criterion

$$W_{i,j}(X_i \rightleftharpoons X_j) = \min\{1, \exp([\beta_i - \beta_j][U(X_i) - U(X_j)])\}. \qquad (3)$$

Here, $U(X_i)$ denotes the energy $U$ of the state $X \in \{f, t, u\}$ of replica $i$ and $\beta_i = 1/k_B T_i$.
In the time span between exchanges, each replica has to follow the pathways dictated by Fig. 1. But whenever an exchange attempt happened to be successful, states can appear in the trajectory at $T_i$ which are not linked by a Markov step with their predecessor state. For instance, exchange can put the state u directly after f without setting a transition state t in between. Thus, the trajectories of the replicas are characterized by the temperatures $T_i$ and by occasional exchanges of configurations. For this reason we will call such trajectories “configurational exchange” (CE) replicas.
There is an equivalent though different point of view according to which replicas exchange temperatures instead of configurations and corresponding trajectories are exclusively composed of state sequences following existing pathways in Fig. 1. We call such trajectories “temperature exchange” (TE) replicas. The CE and TE replicas are thus characterized either by discontinuities in state (i.e. configuration) space (CE) or in temperature (i.e. momentum) space (TE). As we will see, each of these two viewpoints should be considered when discussing the convergence behavior of replica exchange sampling.
As stated in our introductory remarks, free energy differences are quantities of interest and their convergence behavior is a matter of concern. For REMC simulations of our three-state model, the corresponding quantity is the time dependent free energy difference

$$\Delta F(t) = k_B T_0 \ln[p_f(t)/p_u(t)] \qquad (4)$$

between the states u and f calculated from the CE replica at temperature $T_0$. $\Delta F(t)$ is given in terms of the relative frequencies $p_f(t)$ and $p_u(t)$ at which the states f and u occurred during the simulation time span $[0, t]$. For $\Delta F(t)$ the discussion of convergence is simple because its long-time limit vanishes by construction. In simulations of more complex systems the desired free energies are unknown, of course.
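Returning to Eqs. (1) and (2), the characteristic times quoted in the discussion of Fig. 2 can be re-evaluated with a few lines of code. This is a minimal sketch under stated assumptions: reduced units with `theta` $= T/T_0$ and $E_u = 18 k_B T_0$, so that $\beta E_u$ equals `EU / theta`; the function names are hypothetical and not part of the letter.

```python
import math

EU = 18.0  # E_u in units of k_B*T_0

def tau_f(theta):
    """Mean folding time 1/k_f from Eq. (1), in Markov steps."""
    return 4.0 * math.exp(-(EU / theta) * (1.0 / 3.0 - theta))

def tau_u(theta):
    """Mean unfolding time 1/k_u from Eq. (2), in Markov steps."""
    return 4.0 * math.exp(2.0 * EU / (3.0 * theta))

def tau_total(theta):
    """Time for one complete unfolding and refolding process."""
    return tau_f(theta) + tau_u(theta)

# Reduced temperature at which tau_f + tau_u is minimal, as quoted in the text.
theta_min = EU / (EU - math.log(2.0))

print(f"tau_f(T_0)       = {tau_f(1.0):.3g} steps")
print(f"tau_u(T_0)       = {tau_u(1.0):.3g} steps")
print(f"tau_f(T -> inf)  = {tau_f(1.0e6):.3g} steps")
print(f"T_M / T_0        = {theta_min:.4f}")
```

With 10 ps per Markov step, as assumed in the text, `tau_f(1.0)` translates into a folding time of a few microseconds.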
Nevertheless, even for such cases one can define observables with known long-time limits. An example is the ratio $q(t)$ of folding and unfolding events occurring in the generalized ensemble during the time span $[0, t]$. The long-time limit of $q(t)$ is obviously one, independently of the specific system studied. If $n_f(t')$ denotes the total number of folded states observed in the generalized ensemble at time $t'$, the ratio $q(t)$ is given by

$$q(t) = \frac{\sum_{t'=1}^{t} \max[0, n_f(t') - n_f(t'-1)]}{\epsilon + \sum_{t'=1}^{t} \max[0, n_f(t'-1) - n_f(t')]}, \qquad (5)$$

with $0 < \epsilon \ll 1$ guaranteeing that $q(t)$ is always defined. The ratio $q(t)$ can be seen as a measure for the sampling quality of the folding–unfolding equilibrium of the generalized ensemble. Therefore, one can expect that the convergence of $\Delta F(t)$ (for the CE replica at $T_0$) requires the convergence of $q(t)$ (in the generalized ensemble). Note that the simple mixing of existing states f and u among the CE replicas, due to the exchange process, does not change the composition of the generalized ensemble. Instead, transitions within the TE replicas will be necessary for the convergence of $q(t)$ and, thus, of $\Delta F(t)$.
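Eq. (5) translates directly into a running count of increases and decreases of $n_f$. The sketch below is illustrative only; the function name, the default value of `eps`, and the example time series are assumptions made here, not data from the letter.

```python
def folding_unfolding_ratio(n_f, eps=1e-9):
    """Running ratio q(t) of Eq. (5) for t = 1, ..., len(n_f) - 1.

    n_f[t] is the number of folded states in the generalized ensemble at
    step t. Increases of n_f count as folding events (numerator), decreases
    as unfolding events (denominator); eps keeps the ratio defined while no
    unfolding event has occurred yet.
    """
    folding = 0
    unfolding = 0
    q = []
    for previous, current in zip(n_f, n_f[1:]):
        folding += max(0, current - previous)
        unfolding += max(0, previous - current)
        q.append(folding / (eps + unfolding))
    return q

# Hypothetical time series: two initial unfolding events, then fluctuations.
example = [5, 4, 3, 3, 4, 3, 4]
print(folding_unfolding_ratio(example))
```

As long as only unfolding events have occurred, the ratio stays at zero, which is the signature the letter later uses to detect missing convergence.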
To study the convergence of $q(t)$ and $\Delta F(t)$, we have carried out several REMC simulations of our three-state model differing from each other with respect to the initial conditions. Independently of the chosen initial conditions, all these simulations showed the same type of general behavior. Therefore, we present the results of a typical simulation. In this simulation the first five CE replicas $(T_0 < \dots < T_4)$ started at state f and the remaining five at u.
The number of 10 replicas followed from the requirements that (i) a temperature range from $T_0$ up to $2T_0$ should be covered (which is a typical range for REMD simulations with $T_0 = 300$ K [9,10]) and that (ii) the upward exchange probability $W_{i,i+1}(f_i \rightleftharpoons u_{i+1})$ for a folded state at $T_i$ with an unfolded state at $T_{i+1}$ should be $1/e$. By Eq. (3), the requirement (ii) leads to the recursive definition $T_{i+1} = E_u T_i/(E_u - k_B T_i)$ of a temperature ladder with the property that $T_9 = 2T_0$. Note that, by Eq. (3), the downward exchange probability $W_{i,i+1}(u_i \rightleftharpoons f_{i+1})$ is 1. In the REMC simulation, an exchange was attempted every second MC step as described by Sugita and Okamoto [18].
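The temperature ladder and the two exchange probabilities stated above follow from Eq. (3) and can be verified in a short sketch. It assumes reduced units ($T$ in units of $T_0$, energies in $k_B T_0$, $E_u = 18 k_B T_0$); the names `exchange_probability` and `temperature_ladder` are hypothetical.

```python
import math

EU = 18.0  # E_u in units of k_B*T_0
ENERGY = {"f": 0.0, "t": 2.0 * EU / 3.0, "u": EU}  # column U of Table 1

def exchange_probability(theta_i, state_i, theta_j, state_j):
    """Metropolis acceptance probability of Eq. (3); beta = 1/theta."""
    delta = (1.0 / theta_i - 1.0 / theta_j) * (ENERGY[state_i] - ENERGY[state_j])
    return 1.0 if delta >= 0.0 else math.exp(delta)

def temperature_ladder(n_replicas=10):
    """Recursion T_{i+1} = E_u*T_i / (E_u - k_B*T_i), starting at T_0."""
    ladder = [1.0]
    for _ in range(n_replicas - 1):
        theta = ladder[-1]
        ladder.append(EU * theta / (EU - theta))
    return ladder

ladder = temperature_ladder()
print([round(theta, 4) for theta in ladder])
print("upward   f_i <-> u_(i+1):", exchange_probability(ladder[0], "f", ladder[1], "u"))
print("downward u_i <-> f_(i+1):", exchange_probability(ladder[0], "u", ladder[1], "f"))
```

The recursion is equivalent to equal steps in inverse temperature, $\beta_i - \beta_{i+1} = 1/E_u$, which is why the upward acceptance is the same for every neighboring pair.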
Fig. 3a shows the total number of folded states $n_f$ in the generalized ensemble as a function of the simulation time. Immediately after the start of the REMC simulation, $n_f$ decreases from 5 to 3 and reaches 1 after about $10^5$ time steps. Thus, 4 unfolding events occurred in this initial period of the REMC simulation. Subsequently, $n_f$ is seen to fluctuate around the value one by occasional folding and unfolding events. The temporal average of $n_f$ will thus approximate the expectation value $\sum_i P_f(T_i) = 0.89$ increasingly better. Altogether, 50 folding and unfolding events were counted within the simulation time of $5 \times 10^6$ steps (on average one expects $5 \times 10^6 \sum_i 2/[\tau_f(T_i) + \tau_u(T_i)] = 40$ events for our REMC simulation). For a conventional MC (CMC) simulation at the temperature $T_0$ one expects on average one folding or unfolding event every $6.5 \times 10^5$ time steps (cf. the discussion of Fig. 2). Thus, the observation of 77 such events would be expected in a CMC simulation at $T_0$, if this simulation takes the same computational effort ($10 \times 5 \times 10^6$ MC steps) as the REMC simulation. Therefore, in the given case the sampling provided by REMC is actually worse than that provided by a CMC simulation.
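The two expected event counts compared above can be recomputed from Eqs. (1) and (2) and the temperature ladder. The following sketch assumes reduced units ($E_u = 18 k_B T_0$, `theta` $= T/T_0$), uses the closed form $T_i/T_0 = E_u/(E_u - i\,k_B T_0)$ of the ladder recursion, and introduces hypothetical names throughout.

```python
import math

EU = 18.0          # E_u in units of k_B*T_0
STEPS = 5_000_000  # MC steps per replica in the REMC run
REPLICAS = 10

def event_rate(theta):
    """Folding plus unfolding events per MC step, 2/(tau_f + tau_u)."""
    tau_f = 4.0 * math.exp(EU - EU / (3.0 * theta))    # 1/k_f, Eq. (1)
    tau_u = 4.0 * math.exp(2.0 * EU / (3.0 * theta))   # 1/k_u, Eq. (2)
    return 2.0 / (tau_f + tau_u)

ladder = [EU / (EU - i) for i in range(REPLICAS)]  # T_i / T_0

# REMC: every replica contributes with the rate at its own temperature.
events_remc = STEPS * sum(event_rate(theta) for theta in ladder)

# CMC: the same total effort (REPLICAS * STEPS) is spent at T_0 alone.
events_cmc = REPLICAS * STEPS * event_rate(1.0)

print(f"expected events, REMC: {events_remc:.1f}")
print(f"expected events, CMC : {events_cmc:.1f}")
```

The comparison rests on equal total numbers of MC steps, which is the measure of computational effort used in the text.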
The time evolution of the ratio $q(t)$ is depicted in Fig. 3b. Apparently, the convergence of $q(t)$ is satisfactory after $5 \times 10^6$ time steps. Thus, the sampling of the folding–unfolding equilibrium in the generalized ensemble seems to be ergodic, and one expects convergence also for the free energy difference $\Delta F(t)$ at $T_0$. In fact, Fig. 3c demonstrates that $\Delta F(t)$ has converged excellently at the end of the simulation.
Fig. 3. (a) Number of folded states $n_f$, (b) ratio $q$, and (c) free energy difference $\Delta F$ as functions of the simulation time $t$. The dashed lines indicate the respective expectation values. See the text for a discussion.
Recalling the above critique concerning the efficiency of REMC sampling one now may ask as to whether the noted convergence of $\Delta F$ can be equivalently achieved at a smaller computational effort using a CMC simulation. Therefore we have calculated the average number of MC steps required to determine $\Delta F$ with an error smaller than one $k_B T_0$. From two sets of 100 REMC and 100 conventional MC simulations, respectively, we found average durations of about $950 \times 10^3$ steps for REMC and of about $5000 \times 10^3$ steps for the conventional one. However, since REMC employs 10 replicas a CMC simulation is by a factor of about 1.9 more efficient than a REMC simulation. This measured factor closely reproduces the corresponding expectation value which is obtained by dividing the number of folding and unfolding events expected for a CMC simulation (77) by the number (40) expected for REMC. Thus, the convergence of $\Delta F$ is directly determined by the number of folding/unfolding events observed in the generalized ensemble.
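For readers who want to retrace the factor of 1.9, the arithmetic behind the two numbers can be written out as follows. This only restates the figures given above: the total REMC effort is the per-replica duration times the 10 replicas.

```latex
\[
  \frac{\text{REMC effort}}{\text{CMC effort}}
  = \frac{10 \times 950 \times 10^{3}}{5000 \times 10^{3}}
  = 1.9,
  \qquad
  \frac{\text{expected CMC events}}{\text{expected REMC events}}
  = \frac{77}{40} \approx 1.9 .
\]
```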
The latter statement follows from the plausible assumption that the average time which is needed by a TE replica to visit all temperatures is much shorter than the timescale on which the number $n_f(t)$ of folded states of the generalized ensemble changes. Under this condition, which we call the “mixing condition”, a change of $n_f(t)$ affects all CE replicas before $n_f(t)$ changes again. If the mixing condition applies to a given RE simulation (as one can usually assume for β-hairpin folding–unfolding), the discussion of the RE efficiency becomes quite simple. Then one solely has to compare the number of folding–unfolding events expected for the generalized ensemble in a RE simulation with that expected for a CMC simulation. These expectation values can be easily calculated from the temperature dependence of the folding–unfolding time $\tau_f + \tau_u$ displayed in Fig. 2.
If the mixing condition applies, the dependence of the RE efficiency on the chosen model parameters is easily identified. One might ask, for instance, as to how a larger number of CE replicas covering the temperature range $[T_0, T_N]$ will affect the RE efficiency. Upon inspecting Fig. 2 one predicts a negligible effect (if the mixing condition applies also to the larger number of replicas and if the temperatures of the CE replicas are analogously distributed). For example, if we add nine temperatures $(T_i + T_{i+1})/2$ to the 10 existing ones, we expect $5 \times 10^6 \times (10/19) \sum_{i=0}^{18} 2/[\tau_f(T_i) + \tau_u(T_i)] = 40$ folding and unfolding events (i.e. 20 folding–unfolding events), which is (up to a difference of 0.06) the same number as the one obtained above for 10 replicas.
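The 19-replica estimate can be reproduced in the same way. The sketch below is an illustration under assumptions (reduced units with $E_u = 18 k_B T_0$, hypothetical names, total effort of $10 \times 5 \times 10^6$ MC steps kept fixed); it inserts the nine midpoint temperatures into the 10-rung ladder and compares the two expected event counts.

```python
import math

EU = 18.0  # E_u in units of k_B*T_0

def event_rate(theta):
    """Folding plus unfolding events per MC step at theta = T/T_0."""
    tau_f = 4.0 * math.exp(EU - EU / (3.0 * theta))    # 1/k_f, Eq. (1)
    tau_u = 4.0 * math.exp(2.0 * EU / (3.0 * theta))   # 1/k_u, Eq. (2)
    return 2.0 / (tau_f + tau_u)

ladder10 = [EU / (EU - i) for i in range(10)]
midpoints = [(low + high) / 2.0 for low, high in zip(ladder10, ladder10[1:])]
ladder19 = sorted(ladder10 + midpoints)

budget = 10 * 5_000_000  # total MC steps summed over all replicas
events10 = budget / 10 * sum(event_rate(theta) for theta in ladder10)
events19 = budget / 19 * sum(event_rate(theta) for theta in ladder19)

print(f"10 replicas: {events10:.2f} expected events")
print(f"19 replicas: {events19:.2f} expected events")
```

Because the per-step event rate varies smoothly with temperature, refining the ladder changes the average rate, and hence the event count at fixed effort, only marginally.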
By the same type of reasoning one can show that the RE efficiency depends on the location of the target temperature and of the RE temperature range relative to a certain reference temperature $T_M$, at which the temperature dependent folding–unfolding time $\tau_f + \tau_u$ assumes its minimum [the minimum shown in Fig. 2 is located at $T_M = 18 T_0/(18 - \ln 2)$]. There are three cases: the target temperature can be (i) smaller, (ii) equal, or (iii) larger than $T_M$. For the cases (i) and (iii) RE guarantees a more efficient sampling if the temperature $T_N$ of the CE replica $N + 1$ is chosen just equal to $T_M$. In the case (iii) this choice is quite unusual because then $T_N$ is smaller than the target temperature (“cool replica sampling”). In the case (ii) however, which is closely matched by our example ($T_0 \approx T_M$), an efficiency reduction cannot be avoided, because the folding–unfolding times become larger both towards higher and lower temperatures, i.e. the fastest sampling is achieved by CMC at $T_M$. Unfortunately, the temperature $T_M$ is generally unknown so that an optimal set-up of an RE simulation is hard to guess. Thus, by implicitly assuming $T_M$ to be much larger than the target temperature, one usually chooses the conventional “hot replica sampling” of case (i).
Despite the simplicity of our three-state model the limited space of a letter prevents an exhaustive discussion of the parameter space. For instance, one could discuss a manifold of different models by changing the energies and entropies of the states and, thereby, aim at other systems than just hairpins. For example, consider a model in which all three states have the same entropy, the states u and f have identically low energies, and the transition state t has a much higher energy. In such a purely “enthalpic” case one finds that $T_M \to \infty$. Therefore, the conventional RE set-up is always more efficient than CMC (if the mixing condition applies). Note, however, that with many replicas, the mixing condition can become invalid if the energy barriers measure only a few kcal/mol. As argued in Ref. [19], in this case RE can become less efficient than CMC.
Note that Nymeyer discussed the efficiency of RE in a most recent publication [21], which strongly supports the above analysis.
Both analyses require that the mixing condition applies and that
the generalized ensemble samples the equilibrium. Nymeyer’s analytical arguments additionally rest on the assumption that the
number of replicas is large. In contrast, our arguments preferentially apply to small numbers of replicas, because in this case the
key mixing condition has a wider range of validity.
For our simulation, which covers a time of $5 \times 10^6$ MC steps, the conditions of mixing and equilibrium are obviously met (cf. Fig. 3). The $5 \times 10^6$ MC steps approximately correspond to a time span of 50 µs in a real-time picture. This time span is by at least three orders of magnitude larger than those REMD simulation times, which have been spent to describe the folding–unfolding equilibria of β-hairpin peptides in solution [9–11]. In these REMD simulations, the generalized ensemble will be initially far away from equilibrium. The necessity of an initial relaxation towards equilibrium likewise applies to our REMC approach. Therefore, we can address with our simulations the additional question to what extent short-time REMD simulations can show convergence towards equilibrium. We will now look at this issue by scrutinizing the first $10^4$ REMC steps (approximately modeling 100 ns of REMD).
Fig. 4a shows the time evolution of $n_f$ during these first $10^4$ MC steps. Here, the initial decay from five to three folded states occurring within the first 500 time steps is resolved. For the following 9500 time steps, neither unfolding nor folding events were observed. Thus, apart from a fast initial relaxation towards the expectation value of 0.89, no further relaxation processes happened to occur. The next unfolding event in the generalized ensemble occurred much later, i.e. after about $10^5$ MC steps corresponding to 1 µs (cf. Fig. 3a).
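A short REMC run of the kind discussed here can be sketched in a few dozen lines. The code below is not the authors' program. It assumes reduced units ($E_u = 18 k_B T_0$, `theta` $= T/T_0$), the ladder and initial condition described in the text, and, as an explicit assumption of this sketch, an exchange scheme that alternates between the neighbor pairs (0,1), (2,3), ... and (1,2), (3,4), ... at successive exchange attempts. Replicas that momentarily occupy the transition state t are counted as not folded, which is also an assumption. Because the run is stochastic, individual trajectories will differ from those shown in Figs. 3 and 4.

```python
import math
import random

EU = 18.0                                            # E_u in units of k_B*T_0
ENERGY = {"f": 0.0, "t": 2.0 * EU / 3.0, "u": EU}    # column U of Table 1
THETAS = [EU / (EU - i) for i in range(10)]          # ladder T_i / T_0

def markov_step(state, theta, rng):
    """One step of the three-state chain at reduced temperature theta."""
    if state == "t":                                 # t12 = t32 = 1/2, t22 = 0
        return "f" if rng.random() < 0.5 else "u"
    if state == "f":                                 # t21 = exp[(F_f - F_t)/k_B T]/2
        p_leave = 0.5 * math.exp(-2.0 * EU / (3.0 * theta))
    else:                                            # t23 = exp[(F_u - F_t)/k_B T]/2
        p_leave = 0.5 * math.exp(EU * (1.0 / 3.0 - theta) / theta)
    return "t" if rng.random() < p_leave else state

def try_exchange(states, i, rng):
    """Metropolis exchange, Eq. (3), between the neighbors i and i + 1."""
    delta = (1.0 / THETAS[i] - 1.0 / THETAS[i + 1]) * (
        ENERGY[states[i]] - ENERGY[states[i + 1]]
    )
    if delta >= 0.0 or rng.random() < math.exp(delta):
        states[i], states[i + 1] = states[i + 1], states[i]

def run_remc(n_steps, seed=0):
    """Return n_f(t), the number of folded CE replicas, for t = 0..n_steps."""
    rng = random.Random(seed)
    states = ["f"] * 5 + ["u"] * 5                   # initial condition of the text
    n_f = [states.count("f")]
    for step in range(1, n_steps + 1):
        states = [markov_step(s, th, rng) for s, th in zip(states, THETAS)]
        if step % 2 == 0:                            # attempt every second MC step
            first = (step // 2) % 2                  # ASSUMPTION: alternate pair sets
            for i in range(first, len(states) - 1, 2):
                try_exchange(states, i, rng)
        n_f.append(states.count("f"))                # 't' counts as not folded
    return n_f

n_f = run_remc(10_000, seed=1)
print("n_f at start:", n_f[0], " n_f after 10^4 steps:", n_f[-1])
```

The returned series can be fed into an implementation of Eq. (5) to monitor $q(t)$ alongside $n_f(t)$.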
Fig. 4b shows the associated time evolution of $\Delta F$. Since the unfolded state u did not appear in the CE replica at $T_0$ within the first 500 simulation steps, $\Delta F$ is not defined during this initial period. After about 2000 time steps, $\Delta F$ becomes stationary at about $3 k_B T_0$. After $10^4$ time steps, more than 450 transitions between the states f and u were counted in the CE replica at $T_0$ due to the configurational exchange. Nevertheless, convergence is by no means reached because $\Delta F$ would then have to be zero (cf. Table 1) as indicated by the dashed line in Fig. 4b. Thus, after $10^4$ steps the simulation is still far away from a satisfactory sampling of the folding–unfolding equilibrium ensemble.
In a usual simulation, in which the expectation value of $\Delta F$ is unknown, the stationarity observed after a fast equilibration process and the frequent changes between f and u in the CE replica at $T_0$ could lead to the erroneous assumption that $\Delta F$ is already converged. We call such an observed stationarity “pseudo-convergence”. However, the erroneous interpretation of pseudo-convergence can be avoided by inspecting the ratio $q(t)$ or a related observable whose expectation value is a priori known. In our case, this ratio is zero instead of one during the first $10^4$ simulation steps indicating the absence of real convergence.
On the other hand, one cannot be seduced to assume convergence of REMD simulations for peptide folding–unfolding equilibria, if one compares in advance experimental folding times with durations of the applied simulations. Whenever the folding time $\tau_f$ has a given large value (e.g. µs) for all temperatures covered by the replica exchange setting, one has to take care that the simulation times are at least of the same order of magnitude as $\tau_f$ for convergence of $\Delta F(t)$, independently of how the efficiency of replica exchange compares with a conventional simulation.
Fig. 4. Initial phase of the REMC simulation characterized by Fig. 3. Time evolutions of (a) $n_f$ and (b) $\Delta F$. See the text and the caption to Fig. 3 for further information.
Summarizing, we would like to stress the most important results
of our analysis: (i) the replica exchange technique can offer a reduced
sampling efficiency for peptide folding–unfolding equilibria; (ii) a
sufficient sampling of the folding–unfolding equilibrium of a β-hairpin can hardly be reached with simulation times in the range of ns
[9–11]; (iii) replica exchange bears the danger of misinterpreting
certain stationarities as convergence which can be avoided by the
tracking and counting of transitions within the TE replicas and the
observation of recurrent states within these replicas.
Concerning the efficiency reduction that can be caused by replica exchange the key effect seems to be an anti-Arrhenius folding
behavior, which typically occurs if the energy of the low-entropy
transition state is below the energy of the high-entropy unfolded
state [16]. In the case of anti-Arrhenius folding behavior (and
Arrhenius-like unfolding behavior), there is a specific temperature $T_M$ for which the combined folding–unfolding time $\tau_f + \tau_u$ is minimal. An efficiency reduction by the usage of replica exchange is guaranteed if the temperature of interest is close to $T_M$.
Although our discussion was based on a model mimicking the folding–unfolding behavior of β-hairpin peptides, the phenomenon of pseudo-convergence is more general. Pseudo-convergence may occur in replica exchange whenever the simulation time is longer than the time between exchange trials (≈ 1 ps) but shorter than the timescale on which the TE replicas switch between the relevant states (conformations). More frequent exchange trials solely accelerate the appearance of pseudo-convergence. Thus, pseudo-convergence is expected to be a widespread phenomenon in short REMD simulations. It bears the danger of wrong estimates of ensemble averages and may falsely lead to the impression of a high efficiency of replica exchange sampling.
Of course, the alleged merits of the replica exchange method
have already been scrutinized by various other authors. For example,
Zheng et al. [16] recently showed for a so-called kinetic network RE model that the efficiency of the replica exchange
approach breaks down, if the corresponding temperatures are too
far in the anti-Arrhenius temperature range. In the context of enhanced barrier crossing, Zuckerman and Lyman expressed doubt
whether, specifically for biomolecules, the commonly expected
gain in sampling speed at the target temperature T 0 can overcompensate the additional effort of simulating a large number of replicas [19]. Furthermore, Rhee and Pande [20] argued that short
REMD simulations generally cannot yield correct ensemble averages if folding times are much larger than the applied simulation
times. The sample system constructed by us now proves the correctness of their arguments.
Acknowledgement
This work was supported by the Deutsche Forschungsgemeinschaft (SFB 533/C1 and SFB 749/C4).
References
[1] U.H.E. Hansmann, Chem. Phys. Lett. 281 (1997) 140.
[2] K.Y. Sanbonmatsu, A.E. García, Proteins 46 (2002) 225.
[3] F. Rao, A. Caflisch, J. Chem. Phys. 119 (2003) 4035.
[4] J.W. Pitera, W. Swope, Proc. Natl. Acad. Sci. USA 100 (2003) 7587.
[5] M. Cecchini, F. Rao, M. Seeber, A. Caflisch, J. Chem. Phys. 121 (2004) 10748.
[6] G.S. Jas, K. Kuczera, Biophys. J. 87 (2004) 3786.
[7] A. Villa, G. Stock, J. Chem. Theory Comput. 2 (2006) 1228.
[8] A. Villa, E. Widjajakusuma, G. Stock, J. Phys. Chem. 112 (2008) 134.
[9] R. Zhou, B.J. Berne, R. Germain, Proc. Natl. Acad. Sci. USA 98 (2001) 14931.
[10] W.Y. Yang, J.W. Pitera, W.C. Swope, M. Gruebele, J. Mol. Biol. 336 (2004) 241.
[11] P.H. Nguyen, G. Stock, E. Mittag, C.-K. Hu, M.A. Li, Proteins 61 (2005) 795.
[12] T.E. Schrader, et al., Proc. Natl. Acad. Sci. USA 104 (2007) 15729.
[13] C.D. Snow, E.J. Sorin, Y.M. Rhee, V.S. Pande, Annu. Rev. Biophys. Biomol. Struct. 34 (2005) 43.
[14] D. Du, Y. Zhu, C.-Y. Huang, F. Gai, Proc. Natl. Acad. Sci. USA 101 (2004) 15915.
[15] M. Karplus, J. Phys. Chem. 104 (2000) 11.
[16] W. Zheng, M. Andrec, E. Gallicchio, R.M. Levy, Proc. Natl. Acad. Sci. USA 104 (2007) 15340.
[17] K. Hukushima, K. Nemoto, J. Phys. Soc. Jpn. 65 (1996) 1604.
[18] Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 314 (1999) 141.
[19] D.M. Zuckerman, E. Lyman, J. Chem. Theory Comput. 2 (2006) 1200.
[20] Y.M. Rhee, V.S. Pande, Biophys. J. 84 (2003) 775.
[21] H. Nymeyer, J. Chem. Theory Comput. 4 (2008) 626.
3 Optimized Replica Exchange Protocols
The previous chapter examined a system whose folding slows down when the temperature is raised. This so-called anti-Arrhenius behavior is caused by the free energy barrier between the folded and the unfolded state, which, seen from the unfolded state, is purely entropic. For systems whose dynamics is dominated by enthalpic barriers, however, the replica exchange method is well suited to increase the sampling efficiency of a simulation. The extent of the efficiency gain depends essentially on the parameter settings of the replica exchange method.
3.1 Optimal Temperature Ladders
The article reprinted below¹
Robert Denschlag, Martin Lingenheil, Paul Tavan:
“Optimal temperature ladders in replica exchange simulations”
Chem. Phys. Lett. 473, 193-195 (2009),
which I wrote together with Martin Lingenheil and Paul Tavan, derives a formula for an optimal temperature ladder that maximizes the diffusion speed of the replicas in temperature space, assuming a constant heat capacity and the use of the deterministic “even-odd” (DEO) exchange algorithm.
¹ Reprinted with kind permission of Elsevier (license number: 2415950020121).
Chemical Physics Letters 473 (2009) 193–195
Optimal temperature ladders in replica exchange simulations
Robert Denschlag, Martin Lingenheil, Paul Tavan *
Lehrstuhl für Biomolekulare Optik, Ludwig-Maximilians-Universität, Oettingenstr. 67, 80538 München, Germany
Article info
Article history:
Received 19 January 2009
In final form 20 March 2009
Available online 25 March 2009
Abstract
In replica exchange simulations, a temperature ladder with N rungs spans a given temperature interval.
Considering systems with heat capacities independent of the temperature, here we address the question
of how large N should be chosen for an optimally fast diffusion of the replicas through the temperature
space. Using a simple example we show that choosing average acceptance probabilities of about 45% and
computing N accordingly maximizes the round trip rates r across the given temperature range. This result
differs from previous analyses which suggested smaller average acceptance probabilities of about 23%.
We show that the latter choice maximizes the ratio $r/N$ instead of $r$.
© 2009 Published by Elsevier B.V.
1. Introduction
At given computer resources, the benefit of replica exchange [1–3] (RE) simulations crucially depends on the choice of certain parameters. Having chosen a temperature range $[T_{min}, T_{max}]$, which should be covered by the RE simulation, the optimal form of the temperature ladder $(T_1 = T_{min}, T_2, \dots, T_N = T_{max})$ is an important issue [4–10]. Aiming at a minimal average round trip time of the replicas in the temperature space and assuming a constant heat capacity $C$, which should approximately apply to explicit solvent simulations [11], Nadler and Hansmann [9] have derived a formula

$$N \approx 1 + 0.594 \sqrt{C} \ln(T_{max}/T_{min}) \qquad (1)$$

for the number $N$ of rungs in the temperature ladder. In Eq. (1) the (extensive) heat capacity $C$ is given in units of the Boltzmann constant $k_B$ and refers to the potential energy part of the total energy. As suggested by Okamoto et al. [12], from $N$ one can determine the temperature rungs $T_i$, $i = 1, \dots, N$, in the ladder by

$$T_i = T_{min} (T_{max}/T_{min})^{(i-1)/(N-1)}. \qquad (2)$$
This choice is generally expected [3] to provide equal exchange probabilities $p_{\mathrm{acc}}(T_i, T_{i+1}) = p_{\mathrm{acc}}(N)$ along the N-rung ladder. Defining the function
$$a(N) \equiv (T_{\max}/T_{\min})^{1/(N-1)}, \tag{3}$$
one immediately finds that the temperature rungs are given by the recursion
$$T_{i+1} = T_i\,a(N). \tag{4}$$
Thus, for a given N, the ratio $T_{i+1}/T_i$ is the constant $a(N)$. For such a ladder and normally distributed potential energies, which are, along with a constant heat capacity, typical for explicit solvent simulation systems, the average acceptance probabilities are very well approximated [6] by
$$p_{\mathrm{acc}}(N) = \mathrm{erfc}\!\left(\sqrt{C}\,\frac{a(N)-1}{a(N)+1}\right), \tag{5}$$
where $\mathrm{erfc}(x') = (2/\sqrt{\pi}) \int_{x'}^{\infty} \exp(-x^2)\,dx$ is the complementary error function.
* Corresponding author. Fax: +49 89 2180 9220.
E-mail address: tavan@physik.uni-muenchen.de (P. Tavan).
doi:10.1016/j.cplett.2009.03.053
In summary, for a ladder spanning the temperature range $[T_{\min}, T_{\max}]$ by the exponential spacing law Eq. (2), the temperature rungs $T_i$ are uniquely given by N. Assuming a constant heat capacity and normally distributed potential energies, such a ladder then actually provides equal average acceptance probabilities [Eq. (5)]. Therefore, temperature ladders obeying Eqs. (2) and (5) are uniquely specified by choosing either a certain number N of rungs or a certain average acceptance probability $p_{\mathrm{acc}}$.
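The one-to-one relation between N and the average acceptance probability can be made concrete with a few lines of Python. The following is a minimal sketch, not code from the Letter: the function names `ladder` and `acceptance` are chosen here for illustration, and C is taken in units of the Boltzmann constant as in Eq. (1).

```python
import math


def ladder(t_min, t_max, n_rungs):
    """Geometric temperature ladder of Eq. (2): T_i = T_min * a**(i-1)."""
    a = (t_max / t_min) ** (1.0 / (n_rungs - 1))  # constant ratio a(N), Eq. (3)
    return [t_min * a**i for i in range(n_rungs)]


def acceptance(heat_capacity, t_min, t_max, n_rungs):
    """Average acceptance probability of Eq. (5); C in units of k_B."""
    a = (t_max / t_min) ** (1.0 / (n_rungs - 1))
    return math.erfc(math.sqrt(heat_capacity) * (a - 1.0) / (a + 1.0))
```

Calling `ladder(300.0, 800.0, 5)` and `acceptance(50.0, 300.0, 800.0, 5)` evaluates the two expressions for a system with C = 50 on the interval [300 K, 800 K].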
2. Methods and simulation set-up
To check whether the formula given in Eq. (1) and suggested by Nadler and Hansmann [9] actually yields RE temperature ladders with minimal round trip times, we have designed simple test systems suited for computationally inexpensive RE Monte Carlo (REMC) simulations. The systems consist of d independent one-dimensional and harmonic oscillators in the canonical ensemble (we have chosen the same potential $E = x^2$ for all oscillators). At each REMC step the coordinates of all d oscillators in a replica are randomly drawn from the associated normal distributions, the total energy $E_i$ of the system at temperature $T_i$ is calculated, and an exchange of systems at neighboring [3] temperatures is attempted with the Metropolis probability [13] $p(i, i+1) = \min\{1, \exp[(1/k_B T_{i+1} - 1/k_B T_i)(E_{i+1} - E_i)]\}$. We employed the standard exchange scheme [1], which alternately attempts exchanges between 'even' $(T_{2i}, T_{2i+1})$ and 'odd' $(T_{2i-1}, T_{2i})$ replica pairs. Below we call this RE scheme, which combines the standard exchange with the standard Metropolis criterion, the standard RE set-up.
Note the important fact that the heat capacity of our test system is independent of the temperature and is given by $C = d/2$. Therefore, it matches the conditions assumed in the derivation of Eq. (1). Note furthermore that the force constants of the harmonic oscillator potentials are of no concern because the Metropolis probability solely depends on the overlaps of the energy distributions.
First we consider the case $d = 100$. As extremal temperatures we choose $T_{\min} = 300$ K and $T_{\max} = 800$ K. With these parameters, Eqs. (1), (2), and (5) yield $N = 5$, the temperature ladder (300, 383.4, 489.9, 626.0, 800), and the average acceptance probability $p_{\mathrm{acc}} \approx 22\%$, respectively. Previously also Kone and Kofke [6] and Rathore et al. [5] have suggested an acceptance probability of about 23% to be optimal. Thus, choosing the number of rungs through Eq. (1) seems to yield a reasonable acceptance probability.
The question as to whether the above choice actually entails minimal round trip times in REMC simulations can be addressed by comparing the set-up outlined above with alternatives defined by different choices of N. We tested ladders with $N \in \{3, 4, \ldots, 9, 10, 12, \ldots, 18, 20\}$, each spanning the same temperature range [300 K, 800 K]. Every associated REMC simulation covered S = 500 000 MC steps. From each of these REMC simulations we determined the number of round trips $M(N)$. Here, a round trip was counted whenever a selected replica that started at $T_{\min}$ subsequently reached $T_{\max}$ and eventually returned to $T_{\min}$. Considering instead of the round trip time $s(N)$ of a replica its inverse, the round trip rate $r(N) \equiv M(N)/S$, we asked which acceptance probability $p_{\mathrm{acc}}(N)$ [cf. Eq. (5)] belongs to the maximal rate $r(N)$ measured in any of our simulations.
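The test set-up described above can be sketched compactly. The code below is an illustration under stated assumptions rather than the authors' program: it works in units with $k_B = 1$, follows the single replica that starts at $T_{\min}$, and draws the total energy of d oscillators directly from a Gamma distribution (shape d/2, scale T), which is equivalent to summing d squared normal coordinates. The function name `remc_round_trips` and the `seed` parameter are hypothetical.

```python
import math
import random


def remc_round_trips(d, temps, n_steps, seed=0):
    """Count round trips of one replica in a toy REMC run with even/odd swaps.

    For d oscillators with E = x^2 each, the total energy at temperature T
    (k_B = 1) is Gamma-distributed with shape d/2 and scale T, so that the
    mean energy is d*T/2 and the heat capacity is C = d/2.
    """
    rng = random.Random(seed)
    n = len(temps)
    at_rung = list(range(n))   # at_rung[i] = label of the replica at rung i
    reached_top = False        # has the tracked replica visited T_max yet?
    trips = 0
    for step in range(n_steps):
        energy = [rng.gammavariate(d / 2.0, t) for t in temps]
        # Alternate between the pairs (0,1),(2,3),... and (1,2),(3,4),...
        for i in range(step % 2, n - 1, 2):
            delta = (1.0 / temps[i + 1] - 1.0 / temps[i]) * (energy[i + 1] - energy[i])
            if delta >= 0.0 or rng.random() < math.exp(delta):
                at_rung[i], at_rung[i + 1] = at_rung[i + 1], at_rung[i]
        rung = at_rung.index(0)  # replica 0 starts at the lowest rung
        if rung == n - 1:
            reached_top = True
        elif rung == 0 and reached_top:
            trips += 1
            reached_top = False
    return trips
```

Dividing the returned count by `n_steps` gives an estimate of the round trip rate $r(N)$ for the chosen ladder; runs with a few times $10^4$ steps are enough to see the trend, although far fewer than the 500 000 steps used in the text.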
3. Results
Fig. 1 shows the measured round trip rate r as a function $r(p_{\mathrm{acc}})$ of the average acceptance probability $p_{\mathrm{acc}}$. Two data points are additionally marked by the numbers of rungs in the associated ladders $(N = 5, N = 7)$. According to the graph the round trip rate r is maximal at $p_{\mathrm{acc}} \approx 0.42$ belonging to the $N = 7$ rung ladder. This result differs from the expectation voiced above that r should be maximal at $p_{\mathrm{acc}} \approx 0.22$ or $N = 5$, respectively.
This surprising result raises the question why Eq. (1) yields a prediction for the optimal N (or $p_{\mathrm{acc}}$) differing from the measured one. Nadler and Hansmann [9] started the derivation of Eq. (1) by assuming for the round trip rate the plausible relation $r(p_{\mathrm{acc}}, N) = k\,p_{\mathrm{acc}}/[N(N-1)]$ with a certain constant $k > 0$. Using this assumption, one predicts that the rate $r(0.22, 5) = k \cdot 0.22/[5(5-1)] \approx 0.011\,k$ should be larger than the rate $r(0.42, 7) = k \cdot 0.42/[7(7-1)] \approx 0.010\,k$, which is clearly at variance with the results of our simulation. Thus, the quoted relation does not yield the correct round trip rate $r(p_{\mathrm{acc}})$ and, correspondingly, the choice of the ladder size N through Eq. (1) does not maximize r, if the standard RE set-up is used.
Fig. 1. Measured round trip rates r as a function of the average acceptance probability $p_{\mathrm{acc}}$.
To understand how r depends on $p_{\mathrm{acc}}$ we introduce the average (and relative) temperature move
$$j(T_i, T_{i+1}) \equiv p_{\mathrm{acc}}(T_i, T_{i+1})\,\frac{T_{i+1} - T_i}{T_i} \tag{6}$$
of a replica per exchange trial (here one MC step). j measures the average velocity of the replicas in a properly scaled temperature space. This definition is motivated by our assumption that the average replica velocity j should be proportional to the round trip rate, i.e. that $j = f r$ with a constant $f > 0$. Thus, we expect the largest round trip rates r for the largest velocities j.
Using Eq. (4) and inverting Eq. (5) one finds for the average replica velocity
$$j(p_{\mathrm{acc}}) = p_{\mathrm{acc}}\,\frac{2\,\mathrm{erfc}^{-1}(p_{\mathrm{acc}})}{\sqrt{C} - \mathrm{erfc}^{-1}(p_{\mathrm{acc}})}, \tag{7}$$
where $\mathrm{erfc}^{-1}$ denotes the inverse of the complementary error function. Thus, in contrast to the impression evoked by the definition in Eq. (6), j is a constant within a given temperature ladder (because $p_{\mathrm{acc}}$ is a constant across each ladder).
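The inversion step leading to Eq. (7) can be written out explicitly. The abbreviation x is introduced only for this sketch and does not appear in the Letter.

```latex
\begin{align*}
x &\equiv \mathrm{erfc}^{-1}(p_{\mathrm{acc}}) = \sqrt{C}\,\frac{a-1}{a+1}
  && \text{[inverting Eq. (5)]} \\
a &= \frac{\sqrt{C}+x}{\sqrt{C}-x}
  \quad\Longrightarrow\quad
  a - 1 = \frac{2x}{\sqrt{C}-x} \\
j &= p_{\mathrm{acc}}\,\frac{T_{i+1}-T_i}{T_i} = p_{\mathrm{acc}}\,(a-1)
  = p_{\mathrm{acc}}\,\frac{2\,\mathrm{erfc}^{-1}(p_{\mathrm{acc}})}
    {\sqrt{C}-\mathrm{erfc}^{-1}(p_{\mathrm{acc}})}
  && \text{[Eqs. (4) and (6)]}
\end{align*}
```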
The lines in Fig. 2 are the graphs of the function $j(p_{\mathrm{acc}})$ given by Eq. (7) for systems with small and large heat capacities $C = d/2$. For $d = 100$ (solid) the velocity j becomes maximal at $p_{\mathrm{acc}} \approx 0.42$ (i.e. at $N = 7$) and for $d = 1000$ (dashed) at $p_{\mathrm{acc}} \approx 0.44$ (i.e. at $N = 22$). Fig. 2 additionally displays scaled round trip rates $f r$ measured for the small $(f = 9.71)$ and the large system $(f = 24.6)$, respectively. The good match of the scaled rates $f r(p_{\mathrm{acc}})$ (circles/squares) with the respective graphs $j(p_{\mathrm{acc}})$ verifies our assumption that the average velocity $j(p_{\mathrm{acc}})$ of the replicas in scaled temperature space is proportional to the round trip rate $r(p_{\mathrm{acc}})$. Note that the value of the optimal average acceptance rate, at which the round trip rate r becomes maximal, depends only weakly on the system size d.
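The location of the maximum of Eq. (7) can be checked numerically with the Python standard library. This is a rough sketch, not the authors' procedure: the inverse complementary error function is obtained from the normal quantile function, and the maximum is found by a plain grid search. The names `erfc_inv`, `velocity`, and `optimal_p_acc` are chosen here for illustration.

```python
import math
from statistics import NormalDist


def erfc_inv(p):
    """Inverse complementary error function via the normal quantile."""
    # erfc(x) = 2 * Phi(-x * sqrt(2))  =>  x = -Phi^{-1}(p / 2) / sqrt(2)
    return -NormalDist().inv_cdf(p / 2.0) / math.sqrt(2.0)


def velocity(p_acc, heat_capacity):
    """Average replica velocity j(p_acc) of Eq. (7)."""
    x = erfc_inv(p_acc)
    return p_acc * 2.0 * x / (math.sqrt(heat_capacity) - x)


def optimal_p_acc(heat_capacity, grid=1000):
    """Grid search for the acceptance probability that maximizes j."""
    candidates = [k / grid for k in range(1, grid)]
    return max(candidates, key=lambda p: velocity(p, heat_capacity))
```

Evaluating `optimal_p_acc(50.0)` and `optimal_p_acc(500.0)` corresponds to the cases d = 100 and d = 1000 discussed in the text.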
The noted weak dependence of the optimal average acceptance probability $p_{\mathrm{acc}}$ on the system size $d = 2C$ can be understood by considering the limit of large systems $(C \to \infty)$. Using Mathematica [14] we have determined the derivative $j'(p_{\mathrm{acc}}) \equiv dj/dp_{\mathrm{acc}}$ and its first order Taylor expansion $j'(p_{\mathrm{acc}}) = a_0 + a_1 p_{\mathrm{acc}} + O(p_{\mathrm{acc}}^2)$ at $p_{\mathrm{acc}} = 0.5$. In linear approximation $j'$ vanishes at $p^0_{\mathrm{acc}} = -a_0/a_1$, which is given by
$$p^0_{\mathrm{acc}} \approx 0.45 + g/\sqrt{C} \tag{8}$$
with the constant g being of the order of −0.1. Thus for large systems the location $p^0_{\mathrm{acc}}$ of the maximum of j approaches 0.45 from below. Note that we have also checked the limiting value 0.45 by numerically analyzing Eq. (7) for very large C.
Fig. 2. Average replica velocity j in scaled temperature space as a function of the average acceptance probability $p_{\mathrm{acc}}$. The lines are graphs of $j(p_{\mathrm{acc}})$ calculated from Eq. (7) for $d = 100$ (solid) and $d = 1000$ (dashed). The dots are the round trip rates r of Fig. 1 scaled by the factor $f = 9.71$. The squares are round trip rates scaled by $f = 24.6$ and resulting from REMC simulations of $d = 1000$ oscillators. Here, the rung numbers $N \in \{10, 14, 18, 22, 30, 40\}$ increase from left to right.
The thus established limiting value of the optimal average acceptance probability leads to a new estimate
$$N \approx 1 + \left(\frac{\sqrt{C}}{2 \cdot 0.534} - \frac{1}{2}\right)\ln(T_{\max}/T_{\min}) \tag{9}$$
for the optimal number of rungs in the associated temperature ladder. For computing Eq. (9) we have used Eq. (5), $\mathrm{erfc}^{-1}(0.45) \approx 0.534$, and $\ln[1 + 2 \cdot 0.534/(\sqrt{C} - 0.534)] \approx 2 \cdot 0.534/(\sqrt{C} - 0.534)$ for large C.
In summary, to maximize the round trip rate or minimize the round trip time, respectively, Eq. (9) has to be used instead of Eq. (1). From a practical point of view, however, instead of aiming at the maximal round trip rate, one may be content with a suboptimal rate if this choice is associated with a reduced computational effort. Instead of maximizing j one may therefore consider the quantity
$$k(p_{\mathrm{acc}}) \equiv j(p_{\mathrm{acc}})/N(p_{\mathrm{acc}}), \tag{10}$$
which exhibits a penalty linear in the number N of rungs. Fig. 3 shows the graphs of $k(p_{\mathrm{acc}})$ for $d = 100$ (solid line) and $d = 1000$ (dashed line). In both cases k is maximal at $p_{\mathrm{acc}} \approx 0.23$. Applying once again the reasoning used in the derivation of Eq. (8) one finds that the optimal $p_{\mathrm{acc}}$ approaches 0.234 for $C \to \infty$. This result leads to the estimate
$$N \approx 1 + \left(0.594\,\sqrt{C} - \frac{1}{2}\right)\ln(T_{\max}/T_{\min}) \tag{11}$$
for the number of rungs optimizing the specific compromise $k = j/N$ between the round trip rate r and the number N of replicas.
Fig. 3. $k = j/N$ as a function of the average acceptance probability $p_{\mathrm{acc}}$. The lines are the graphs of Eq. (10) for $d = 100$ (solid) and $d = 1000$ (dashed).
Note here that Eqs. (11) and (1) become identical for large C. Thus, the number of rungs resulting from Eq. (1) effectively maximizes k instead of j (or, equivalently, a ladder size penalized round trip rate r/N instead of the round trip rate r). That Nadler and Hansmann [9] have effectively maximized k instead of r can be alternatively understood by inserting the definition Eq. (6) into (10). Using subsequently Eq. (4) and the approximation $(T_{\max}/T_{\min})^{1/(N-1)} \approx 1 + \ln(T_{\max}/T_{\min})/(N-1)$, which holds for large N, one finds $k \approx p_{\mathrm{acc}} \ln(T_{\max}/T_{\min})/[N(N-1)]$ such that $k \propto p_{\mathrm{acc}}/[N(N-1)]$. Recall now that Nadler and Hansmann erroneously assumed a relation of this kind for the round trip rate r (i.e. for j).
We would like to remark that our results are transferable to simulated tempering [15,16] (ST) simulations. For ST, the average acceptance probability is given by
$$p^{\mathrm{ST}}_{\mathrm{acc}}(N) = \mathrm{erfc}\!\left(\sqrt{C/2}\,\frac{a(N)-1}{a(N)+1}\right), \tag{12}$$
which is obtained from the RE expression Eq. (5) through replacing C by C/2 [5,17]. In the limit of large systems also C/2 becomes large and the optimal value of $p^{\mathrm{ST}}_{\mathrm{acc}}$ is likewise at 45% or 23% depending on the optimized quantity. Similarly, the optimal N is given by Eqs. (9) or (11), respectively, through replacing C by C/2.
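The two ladder-size estimates, Eqs. (9) and (11), are simple enough to evaluate directly. The following lines are a minimal sketch with hypothetical function names; the numerical constants 0.534 and 0.594 are the ones quoted in the text, and C is again in units of $k_B$. For simulated tempering one would, following the remark above, pass C/2 instead of C.

```python
import math

X_MAX_RATE = 0.534       # erfc^{-1}(0.45), value quoted in the text
COEFF_PENALIZED = 0.594  # coefficient appearing in Eqs. (1) and (11)


def rungs_max_rate(heat_capacity, t_min, t_max):
    """Eq. (9): ladder size aiming at the maximal round trip rate r."""
    prefactor = math.sqrt(heat_capacity) / (2.0 * X_MAX_RATE) - 0.5
    return 1.0 + prefactor * math.log(t_max / t_min)


def rungs_penalized(heat_capacity, t_min, t_max):
    """Eq. (11): ladder size aiming at the maximal compromise r/N."""
    prefactor = COEFF_PENALIZED * math.sqrt(heat_capacity) - 0.5
    return 1.0 + prefactor * math.log(t_max / t_min)
```

Both functions return a real number, which in practice has to be rounded to an integer number of rungs.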
4. Summary
For simulations employing the standard RE set-up, we have derived with Eqs. (9) and (11) two formulas for the optimal sizes N of temperature ladders obeying Eqs. (2) and (5). Here, optimal means that either the round trip rate r or the compromise r/N with computational effort is maximized. We have furthermore shown that the suggestion Eq. (1) of Nadler and Hansmann [9] maximizes the compromise r/N and not r, as claimed by the authors. An optimal r is obtained with average acceptance probabilities of about 45%, whereas an optimal r/N requires values of 23% matching earlier suggestions [6,5]. As a practical consequence of our study one sees that average acceptance probabilities chosen in the range from 20% to 45% are definitely 'good' choices featuring, however, slightly different merits.
Acknowledgement
This work was supported by the Deutsche Forschungsgemeinschaft (SFB 533/C1 and SFB 749/C4).
References
[1] K. Hukushima, K. Nemoto, J. Phys. Soc. Jpn. 65 (1996) 1604.
[2] U.H.E. Hansmann, Chem. Phys. Lett. 281 (1997) 140.
[3] Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 314 (1999) 141.
[4] C. Predescu, M. Predescu, C.V. Ciobanu, J. Chem. Phys. 120 (2004) 4119.
[5] N. Rathore, M. Chopra, J.J. de Pablo, J. Chem. Phys. 122 (2005) 024111.
[6] A. Kone, D.A. Kofke, J. Chem. Phys. 122 (2005) 206101.
[7] S. Trebst, M. Troyer, U.H.E. Hansmann, J. Chem. Phys. 124 (2006) 174903.
[8] W. Nadler, U.H.E. Hansmann, Phys. Rev. E 75 (2007) 026109.
[9] W. Nadler, U.H.E. Hansmann, J. Phys. Chem. B 112 (2008) 10386.
[10] W. Nadler, J.H. Meinke, U.H.E. Hansmann, Phys. Rev. E 78 (2008) 061905.
[11] B. Paschek, H. Nymeyer, A.E. Garcia, J. Struct. Biol. 157 (2007) 524.
[12] Y. Okamoto, M. Fukugita, T. Nakazawa, H. Kawai, Protein Eng. 4 (1991) 639.
[13] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, E. Teller, J. Chem. Phys. 21 (1953) 1087.
[14] Wolfram Research, Inc., Mathematica Version 5.1, Champaign, IL, 2004.
[15] A.P. Lyubartsev, A.A. Martinovski, S.V. Shevkunov, P.N. Vorontsov-Velyaminov, J. Chem. Phys. 96 (1992) 1776.
[16] E. Marinari, G. Parisi, Europhys. Lett. 19 (1992) 451.
[17] C. Zhang, J. Ma, J. Chem. Phys. 129 (2008) 134112.
3.2 Comparison of Different Exchange Schemes
The rule for the optimal choice of temperatures developed in the preceding article presupposed the use of the DEO exchange algorithm. The following section is a reprint¹ of the article
Martin Lingenheil, Robert Denschlag, Gerald Mathias, Paul Tavan:
"Efficiency of exchange schemes in replica exchange"
Chem. Phys. Lett. 478, 80-84 (2009),
in which this exchange scheme is compared with other exchange algorithms proposed in the literature. It turns out that, among the methods examined, DEO achieves the highest diffusion speed of the replicas in temperature space. In addition, the diffusion constant $D_{\mathrm{DEO}}$ of the DEO scheme is derived analytically, which also shows formally that the DEO exchange algorithm causes an accelerated random walk of the replicas on the temperature ladder.
¹ Reprinted with kind permission of Elsevier (license number: 2415950243448).
Chemical Physics Letters 478 (2009) 80–84
Efficiency of exchange schemes in replica exchange
Martin Lingenheil, Robert Denschlag, Gerald Mathias, Paul Tavan *
Lehrstuhl für BioMolekulare Optik, Ludwig-Maximilians-Universität, Oettingenstr. 67, 80538 München, Germany
Article info
Article history: Received 1 April 2009; in final form 9 July 2009; available online 12 July 2009
Abstract
In replica exchange simulations a fast diffusion of the replicas through the temperature space maximizes
the efficiency of the statistical sampling. Here, we compare the diffusion speed as measured by the round
trip rates for four exchange algorithms. We find different efficiency profiles with optimal average acceptance probabilities ranging from 8% to 41%. The best performance is determined by benchmark simulations for the most widely used algorithm, which alternately tries to exchange all even and all odd
replica pairs. By analytical mathematics we show that the excellent performance of this exchange scheme
is due to the high diffusivity of the underlying random walk.
© 2009 Elsevier B.V. All rights reserved.
1. Introduction
The replica exchange (RE) method [1–3] has become a standard approach in molecular simulation to efficiently sample the rough energy landscapes of biomolecules in solution at a target temperature $T_1$ (see e.g. Ref. [4]). In RE simulations, N simulation systems (replicas) are propagated in parallel using Monte Carlo (MC) or molecular dynamics (MD) algorithms. For the standard temperature RE method in particular, the replicas $i \in \{1, \ldots, N\}$ are identical with the exception of the respective simulation temperatures $T_i$. At a predefined temporal spacing an exchange between two replicas i and j is attempted and is accepted with the Metropolis [5] probability
$$p_{ij} = \min\{1, \exp[(\beta_j - \beta_i)(E_j - E_i)]\}, \tag{1}$$
where $E_i$ and $E_j$ are the current potential energies of the replicas at the corresponding inverse temperatures $\beta_i = 1/k_B T_i$ and $\beta_j = 1/k_B T_j$, respectively. Here, $k_B$ denotes Boltzmann's constant. The exchange probability given by Eq. (1) satisfies the detailed balance condition and therefore guarantees that the ensembles sampled by the individual replicas remain undisturbed by the exchange.
Due to the exchanges, each replica performs a random walk through the temperature space $[T_1, T_N]$. During the high temperature phases of its trajectory, a replica crosses potential energy barriers more rapidly, leading in many cases (a relevant counter example has been given in Ref. [6]) to a faster convergence, compared to a straightforward simulation, of the statistical sampling at the lower temperatures. Here it is crucial for an optimal statistical sampling at the low temperatures that the replicas cycle between low and high temperatures as frequently as possible [7–11]. As we will demonstrate below, the sizes of these round trip rates strongly depend on the detailed algorithm by which the replica pairs are selected for attempting an exchange.
* Corresponding author. Fax: +49 89 2180 9220.
E-mail address: tavan@physik.uni-muenchen.de (P. Tavan).
doi:10.1016/j.cplett.2009.07.039
A widely used exchange scheme [12–14] divides the set $\mathcal{N} \equiv \{(i, i+1) \mid i = 1, \ldots, N-1\}$ of next neighbors in the temperature ladder into the two subsets $\mathcal{E} \subset \mathcal{N}$ and $\mathcal{O} \subset \mathcal{N}$, where $\mathcal{E}$ contains all 'even' pairs $(2j, 2j+1) \in \mathcal{N}$ and $\mathcal{O}$ contains the 'odd' pairs $(2j-1, 2j) \in \mathcal{N}$. Exchanges are attempted alternatingly for the members of $\mathcal{E}$ and $\mathcal{O}$. Because of the deterministic pattern of exchange trials, we will call this method the deterministic even/odd algorithm (DEO).
We will also consider a variant of DEO which, instead of alternatingly attempting exchanges among all even and all odd replica pairs, randomly chooses with equal probability one of the subsets $\mathcal{E}$ and $\mathcal{O}$. Due to the stochastic selection of exchange sets, we call this method the stochastic even/odd algorithm (SEO). As we will show, the SEO scheme was implicitly assumed by a number of authors when theoretically deriving rules for optimal temperature ladders [7,11]. Another reason for analyzing the SEO algorithm is that an exchange scheme equivalent to SEO is the straightforward choice when implementing simulated tempering [15].
Besides DEO also other exchange schemes have been discussed in the literature. The all-pair exchange (APE) method suggested by Brenner et al. [16] considers all possible exchange pairs including non-next neighbors. Finally, the very simple random next neighbor (RNN) algorithm [16,17] chooses with equal probability at every exchange step a single pair from the set $\mathcal{N}$ of next neighbors and attempts an exchange for this pair.
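The pair-selection rules of three of the four schemes can be summarized in code. The sketch below is illustrative only: rungs are numbered from 1 to N as in the text, the convention that odd steps use the odd pairs is an assumption made here, and APE is left out because the text above does not specify how its pairs are drawn. All function names are hypothetical.

```python
import random


def neighbor_pairs(n_replicas):
    """All next-neighbor pairs (i, i+1), with rungs numbered 1..N."""
    return [(i, i + 1) for i in range(1, n_replicas)]


def even_pairs(n_replicas):
    """Pairs (2j, 2j+1)."""
    return [(i, j) for (i, j) in neighbor_pairs(n_replicas) if i % 2 == 0]


def odd_pairs(n_replicas):
    """Pairs (2j-1, 2j)."""
    return [(i, j) for (i, j) in neighbor_pairs(n_replicas) if i % 2 == 1]


def select_pairs(scheme, step, n_replicas, rng=random):
    """Return the pairs for which an exchange is attempted at this step."""
    if scheme == "DEO":   # deterministic alternation of the two subsets
        return odd_pairs(n_replicas) if step % 2 == 1 else even_pairs(n_replicas)
    if scheme == "SEO":   # either subset, each with probability 1/2
        return odd_pairs(n_replicas) if rng.random() < 0.5 else even_pairs(n_replicas)
    if scheme == "RNN":   # one next-neighbor pair, chosen uniformly
        return [rng.choice(neighbor_pairs(n_replicas))]
    raise ValueError(f"unknown scheme: {scheme}")
```

Each returned pair would then be passed to the Metropolis test of Eq. (1).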
Note that the DEO scheme does not permit a reverse move
immediately after a successful replica swap in contrast to SEO,
APE, and RNN. Therefore, it does not satisfy detailed balance and
one may ask whether DEO can interfere with canonical sampling.
However, Manousiouthakis and Deem [23] have shown that the
equilibrium statistics, which is generated by the intermittent MD
or MC simulation, is preserved if the sampling procedure satisfies a
less strict ‘balance condition’. This condition holds if each individual exchange trial satisfies local detailed balance, which is implied
by the Metropolis criterion Eq. (1) for even and for odd exchanges.
Thus, DEO represents a valid sampling strategy.
In this Letter we will systematically check as to how the different exchange algorithms affect the diffusion of the replicas through
the temperature space. This check will provide a rule for the optimal setup of RE simulations. For this purpose we will first introduce basic notions of the RE approach and a benchmark MC
system. Using the benchmark system we will then compare the
round trip rates obtained with the four different exchange
schemes. Because of the practical importance of the DEO algorithm, we will subsequently identify the reasons for its superior
performance by analytical mathematics.
2. Theoretical basics
It is general consensus that the distances of the N rungs $T_i$ within the temperature ladder should be chosen to yield equal average acceptance probabilities $\langle p_{i,i+1} \rangle = p_{\mathrm{acc}}$ for the exchanges between neighboring replicas i and i + 1 [3] provided that the simulated system does not undergo a phase transition within the range of the temperature ladder [8,9]. If the system's heat capacity C is constant, which is approximately true for explicit solvent systems [18], then, following Okamoto et al. [19], the spacing law for equal average acceptance probabilities is
$$T_i = T_{\min}\,a^{i-1}, \tag{2}$$
with the minimal temperature $T_{\min}$ and a constant ratio $a = T_{i+1}/T_i$ of neighboring temperatures.
Given a certain temperature range $[T_{\min}, T_{\max}]$ to be spanned by a simulation, the choice of the number N of replicas automatically determines the temperature ratio a through
$$a(N) = (T_{\max}/T_{\min})^{1/(N-1)}. \tag{3}$$
Next we assume Gaussian probability distributions
$$\rho(E_i) = \frac{1}{\sqrt{2\pi C}\,T_i}\,\exp\!\left[-\frac{(E_i - C T_i)^2}{2 C T_i^2}\right] \tag{4}$$
for the potential energies $E_i$ at the various temperatures $T_i$ because these distributions are as typical for explicit solvent simulations as a constant heat capacity C. Note that, in Eq. (4), C denotes the (extensive) heat capacity in units of Boltzmann's constant $k_B$ and refers to the potential energy part of the total energy.
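As a plausibility check on this Gaussian model, one can draw energies for two neighboring rungs from Eq. (4), apply the Metropolis criterion of Eq. (1), and compare the resulting mean acceptance with the closed-form erfc expression that the Letter quotes from Kone and Kofke. The sketch below assumes $k_B = 1$; the function names and the sample size are choices made here, not taken from the Letter.

```python
import math
import random


def estimate_acceptance(heat_capacity, t_low, t_high, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the mean Metropolis acceptance for one pair.

    Energies follow the Gaussians of Eq. (4): mean C*T and standard
    deviation sqrt(C)*T, in units with k_B = 1.
    """
    rng = random.Random(seed)
    sd = math.sqrt(heat_capacity)
    d_beta = 1.0 / t_high - 1.0 / t_low
    total = 0.0
    for _ in range(n_samples):
        e_low = rng.gauss(heat_capacity * t_low, sd * t_low)
        e_high = rng.gauss(heat_capacity * t_high, sd * t_high)
        # Eq. (1): p = min{1, exp[(beta_j - beta_i)(E_j - E_i)]}
        total += math.exp(min(0.0, d_beta * (e_high - e_low)))
    return total / n_samples


def erfc_formula(heat_capacity, ratio):
    """Closed-form acceptance for a geometric ladder with ratio a."""
    return math.erfc(math.sqrt(heat_capacity) * (ratio - 1.0) / (ratio + 1.0))
```

For example, with C = 500 and the ratio `a = (800 / 300) ** (1 / 13)` of a 14-rung ladder, `estimate_acceptance(500.0, 300.0, 300.0 * a)` can be compared with `erfc_formula(500.0, a)`.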
With the potential energy distributions given by Eq. (4), the geometric temperature spacing by Eqs. (2) and (3), and the acceptance criterion by Eq. (1), the average acceptance probability according to Kone and Kofke [7] is
$$p_{\mathrm{acc}} = \mathrm{erfc}\!\left(\sqrt{C}\,\frac{a(N)-1}{a(N)+1}\right), \tag{5}$$
where $\mathrm{erfc}(x') = (2/\sqrt{\pi}) \int_{x'}^{\infty} \exp(-x^2)\,dx$ is the complementary error function.
For a predefined temperature range $[T_{\min}, T_{\max}]$, Nadler and Hansmann [11] recently derived a formula to optimize an RE simulation setup with respect to the round trip rate r, i.e., to the average number of round trips a replica performs per unit time. In this optimal ladder spanning the interval $[T_{\min}, T_{\max}]$, the average acceptance probability $p_{\mathrm{acc}}$ is about 23% [20]. Consistently, Kone and Kofke [7] obtained the same value for $p_{\mathrm{acc}}$ when optimizing the diffusion of a replica on the temperature ladder. Most recently, however, we observed in sample simulations employing the DEO scheme that the allegedly optimal value $p_{\mathrm{acc}} \approx 23\%$ led to suboptimal round trip rates [20]. This surprising observation sparked our curiosity and led us to compare different exchange schemes using a very simple benchmark simulation system.
3. Benchmark simulations
For each of the four algorithms, DEO, SEO, APE, and RNN, we performed several RE Monte Carlo (REMC) benchmark simulations with differing numbers N of replicas but with a fixed temperature range $T_{\min} = T_1 = 300$ K to $T_{\max} = T_N = 800$ K and with the temperatures $T_i$ spaced as given by Eq. (2). In these simulations we drew the potential energies $E_i$ of the replicas at each REMC step from the distributions given by Eq. (4) choosing $C = 500$ for the heat capacity. Then, one of the four algorithms was used to decide which exchanges should be considered, and the Metropolis criterion Eq. (1) was applied to evaluate the outcome of the exchange attempts. Every simulation comprised $S = 10^7$ REMC steps. A round trip was counted if one of the replicas had traveled the complete way from $T_1$ to $T_N$ and back again. With the total number R of round trips counted during a simulation, the round trip rate is $r \equiv R/NS$.
Fig. 1 presents the measured round trip rates r as functions of $p_{\mathrm{acc}}(N)$. The shown efficiency profiles $r(p_{\mathrm{acc}})$ of the various algorithms are markedly different. The simple RNN algorithm (diamonds in Fig. 1) shows by far the weakest performance and has its maximum round trip rate $r_{\max} \approx 10^{-4}$ at $p_{\mathrm{acc}} \approx 12\%$. Because the RNN algorithm chooses only one pair from the set $\mathcal{N}$ of next neighbors for an exchange trial and because this trial is successful with an average acceptance probability $p_{\mathrm{acc}}$, the average number $n_{\mathrm{ex}}$ of actual exchanges per REMC step is equal to $p_{\mathrm{acc}}$.
Compared with the RNN scheme, the more refined APE algorithm (squares in Fig. 1) yields much higher round trip rates with a maximal performance $r_{\max} \approx 6.8 \times 10^{-4}$ at $p_{\mathrm{acc}} \approx 9\%$. At this value of $p_{\mathrm{acc}}$, the average APE number of exchanges ($n_{\mathrm{ex}} = 0.55$) exceeds that of RNN ($n_{\mathrm{ex}} = 0.09$) by roughly a factor of 6. According to Fig. 1, here the APE round trip rate r is about 6.5 times higher than that of RNN. Hence, compared to RNN, the better performance of APE is mainly due to the larger value of $n_{\mathrm{ex}}$, and the inclusion of non-next neighbor exchanges within APE seems to be of minor importance.
At $p_{\mathrm{acc}} \approx 23\%$ the round trip rate of the SEO algorithm assumes its maximum value $r_{\max} = 6.4 \times 10^{-4}$ (triangles in Fig. 1). At this point SEO exchanges nearly three times more pairs ($n_{\mathrm{ex}} = 1.45$) per REMC step than APE at its respective $r_{\max}$. Nevertheless, the maximal APE rate is higher than that of SEO. Interestingly, for SEO the position of $r_{\max}$ in Fig. 1 perfectly agrees with the 23% acceptance probability predicted by the optimization formula of Nadler and Hansmann [11,20] and with the point of maximal diffusivity predicted by Kone and Kofke [7].
Fig. 1. The round trip rates r measured for the four different exchange schemes and a system with $C = 500$ as a function of the average acceptance probability $p_{\mathrm{acc}}$: RNN (diamonds), APE (squares), SEO (triangles), DEO (circles). The dotted lines connecting the symbols are a guide for the eye.
As demonstrated by the circles in Fig. 1, the closely related DEO algorithm performs everywhere better than SEO although both algorithms feature the same number $n_{\mathrm{ex}}$ of exchanges per REMC step for every choice of $p_{\mathrm{acc}}(N)$. This improved performance of DEO is particularly pronounced at large $p_{\mathrm{acc}}$, i.e. at large ladder sizes N. According to Fig. 1 the maximal DEO round trip rate $r_{\max} = 9.0 \times 10^{-4}$ is found at $p_{\mathrm{acc}} \approx 41\%$.
These findings suggest that the theory behind the optimizations performed by Kone and Kofke [7] as well as by Nadler and Hansmann [11] does apply to SEO but not to the established and widely used DEO scheme.
4. Diffusive properties
4.1. Elementary process
To understand why DEO performs better than SEO, we analyzed the associated random walks performed by the replicas in the temperature space. A random walk is a sequence of statistically independent random experiments which we will call its elementary processes (EPs). After n EPs the displacement $X \equiv \sum_{i=1}^{n} \Delta_i$ of the random walker is the sum of the displacements $\Delta_i$ in the individual EPs. Since X is a sum of n identically distributed, statistically independent random variables $\Delta_i$, its variance $\sigma^2(X)$ is given by $n\sigma^2(\Delta)$ [21], where $\sigma^2(\Delta)$ is the variance of the EP. If $\langle d \rangle$ is the average duration of the random walk's EP, then its diffusivity, i.e. the gain in variance per unit time, is given by
$$D = \sigma^2(\Delta)/\langle d \rangle. \tag{6}$$
4.2. Diffusivity of SEO
In the SEO scheme, the EP may have the following three outcomes: (i) The replica moves one step upward (D ¼ þ1) on the
temperature ladder with a total probability p pacc , where p ¼
0:5 is the probability of selecting the replica pair sets E or O,
respectively. (ii) The replica moves one step downward (D ¼ 1)
on the ladder with the same probability. (iii) The replica does not
move at all (D ¼ 0) with the rejection probability 1 pacc . Thus,
the variance of the EP is
h
i
r2 ðDÞ ¼ p pacc ðþ1Þ2 þ ð 1Þ2 þ ð1 pacc Þ 02 ¼ pacc
consists of alternating upward and downward marches of variable
lengths. Note that these lengths may also be zero if already the first
exchange fails.
We define the EP of DEO as a pair consisting of an upward
march and the following downward march. Let k 2 N0 denote the
0
number of successful upward steps, k 2 N0 the number of success0
ful downward steps, and l k þ k the sum of both. Fig. 2B shows a
typical example for such an EP. This EP causes a net displacement
0
0
Dk;l k k ¼ 2k l ¼ 1 and takes dk;l k þ k þ 2 ¼ l þ 2 ¼ 7
steps. The probability of the EP shown in Fig. 2B is the product of
all indicated probabilities belonging to the terminated upward
2
pacc ð1 pacc Þ and downward marches p3acc ð1 pacc Þ .
The general formula for the probability of k successful upward
0
steps, k successful downward steps, and two unsuccessful exchange trials is
0
pk;l pkþk
acc ð1
pacc Þ2 ¼ placc ð1
pacc Þ2 :
ð9Þ
Thus, the variance of the EP is
ð7Þ
and its average duration is $\langle d \rangle = 1$. According to Eq. (6), the diffusivity of the SEO algorithm is then
\[ D_{\mathrm{SEO}} = p_{\mathrm{acc}}. \tag{8} \]
Thus, in SEO a replica performs a simple random walk on the temperature ladder as is usually assumed in theoretical analyses of the RE approach [7,9,11,22].
Fig. 2. (A) DEO exchanges within a temperature ladder. The rungs $i, i+1, \ldots$ of the ladder are marked by black dots. The replica pairs which occupy the rungs joined by the blue double arrows are considered for exchange in odd REMC steps. Red double arrows refer to even REMC steps. (B) An example for an EP. The upward march consists of $k = 2$ successful steps, a failure in step 3, and of a downward march with $k' = 3$ successful steps terminated by a failure in step 7. The probability of each step is indicated at the corresponding arrow. (For interpretation of the references in colour in this figure legend, the reader is referred to the web version of this article.)
4.3. Diffusivity of DEO
In contrast, the random walk imposed on a replica by the DEO algorithm is more complex. Let us first assume an infinite temperature ladder. For this situation, Fig. 2A sketches the logic of the exchanges. Assume that, in the odd REMC step $j$, the replica at rung $i$ moves along the blue arrow upwards to rung $i+1$. Then, in the following even step $j+1$, it will be once again considered for an upward exchange toward rung $i+2$ because DEO deterministically selects the red arrows for this exchange trial. As long as the exchange trials are successful, the replica keeps moving upward. This movement stops with the first exchange failure and, subsequently, the direction of the DEO replica movement is reversed. Thus, from the perspective of a single replica, the random walk [...]
\[ \sigma^2(\Delta) = \sum_{l=0}^{\infty} \sum_{k=0}^{l} p_{k,l}\, \Delta_{k,l}^2 = \frac{2 p_{\mathrm{acc}}}{(1 - p_{\mathrm{acc}})^2}, \tag{10} \]
and the average duration of the EP is
\[ \langle d \rangle = \sum_{l=0}^{\infty} \sum_{k=0}^{l} p_{k,l}\, d_{k,l} = \frac{2}{1 - p_{\mathrm{acc}}}. \tag{11} \]
From Eq. (6) we finally obtain for the DEO diffusivity
\[ D_{\mathrm{DEO}} = \frac{p_{\mathrm{acc}}}{1 - p_{\mathrm{acc}}}. \tag{12} \]
A comparison with Eq. (8) demonstrates that $D_{\mathrm{DEO}}$ is by a factor of $1/(1 - p_{\mathrm{acc}}) > 1$ larger than $D_{\mathrm{SEO}}$.
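The two diffusivities can be checked numerically with a minimal sketch. It assumes an infinite ladder, a constant acceptance probability, and the single-replica picture described above: in SEO the trial direction is redrawn at random in every step, in DEO the direction is kept after a success and reversed after a failure. The function name `msd_rate` and all parameter names are illustrative and not taken from the paper; the diffusivity is estimated as the variance of the displacement divided by the number of steps.

```python
import numpy as np

def msd_rate(p_acc, scheme, n_walkers=20000, n_steps=400, seed=0):
    """Estimate the diffusivity (displacement variance per step) of one replica."""
    rng = np.random.default_rng(seed)
    pos = np.zeros(n_walkers, dtype=np.int64)
    direction = rng.choice([-1, 1], size=n_walkers)
    for _ in range(n_steps):
        if scheme == "SEO":
            # SEO: up or down trial chosen at random in every step
            direction = rng.choice([-1, 1], size=n_walkers)
        accept = rng.random(n_walkers) < p_acc
        pos += np.where(accept, direction, 0)
        if scheme == "DEO":
            # DEO: keep moving after a success, reverse after a failure
            direction = np.where(accept, direction, -direction)
    return pos.var() / n_steps

if __name__ == "__main__":
    p = 0.4
    print("SEO:", msd_rate(p, "SEO"), "expected", p)
    print("DEO:", msd_rate(p, "DEO"), "expected", p / (1.0 - p))
```

For long walks the two estimates should approach $p_{\mathrm{acc}}$ and $p_{\mathrm{acc}}/(1 - p_{\mathrm{acc}})$, respectively, up to statistical noise.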
4.4. Round trip rates
The above analysis has yielded diffusivities on infinite temperature ladders. However, what we want to know are round trip rates $r$ on finite ladders. Such a rate $r$ is the inverse of $2\tau$, where $\tau$ is the mean first passage time required for a replica to diffusively cross the complete ladder. Therefore, $r$ should be proportional [21] to the diffusivity $D$ as long as the EPs of the underlying random walk are commensurate with the ladder size $N$. For the SEO scheme with its step sizes of 0 or 1 this requirement is automatically fulfilled. However, DEO also features, at decreasing probabilities, EPs of arbitrarily increasing step sizes. Therefore, the proportionality $r_{\mathrm{DEO}} \propto D_{\mathrm{DEO}}$ is expected to hold only within a certain approximation.
M. Lingenheil et al. / Chemical Physics Letters 478 (2009) 80–84
Nadler and Hansmann [9] have calculated $r$ assuming a SEO random walk directly from the corresponding master equation [21]. Inspection of Eq. (8) demonstrates that their result for the rate
\[ r_{\mathrm{SEO}} = \frac{p_{\mathrm{acc}}}{2N(N-1)} \tag{13} \]
actually exhibits the expected proportionality $r_{\mathrm{SEO}} \propto D_{\mathrm{SEO}} = p_{\mathrm{acc}}$. If we assume such a proportionality also for DEO we get $r_{\mathrm{DEO}} = r_{\mathrm{SEO}} D_{\mathrm{DEO}}/D_{\mathrm{SEO}}$. Hence with Eqs. (8), (12), and (13), we expect for DEO the round trip rate
\[ r_{\mathrm{DEO}} = \frac{p_{\mathrm{acc}}}{(1 - p_{\mathrm{acc}})\, 2N(N-1)}. \tag{14} \]
Fig. 4. Ratio of a measured round trip rate $r_{\mathrm{RW}}$ (calculated with a fixed acceptance probability $p_{\mathrm{acc}}$ from extended DEO random walks on different ladder sizes $N$) to the rate $r_{\mathrm{DEO}}$ given by Eq. (14). Ratios are shown for $p_{\mathrm{acc}} = 0.2$ (circles), $p_{\mathrm{acc}} = 0.4$ (squares), $p_{\mathrm{acc}} = 0.6$ (diamonds). The dotted lines connecting the symbols serve to guide the eye.
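As a hedged illustration of Eqs. (13) and (14), the sketch below evaluates both rates and cross-checks the SEO expression against a mean first passage time obtained by solving the linear equations of a simple lazy random walk. The walk model is an assumption made for this example only: a replica hops up or down with probability $p_{\mathrm{acc}}/2$ each, a trial beyond the lowest rung is rejected, and the highest rung is absorbing. All function names are illustrative.

```python
import numpy as np

def r_seo(p_acc, n):
    """Round trip rate of Eq. (13)."""
    return p_acc / (2.0 * n * (n - 1))

def r_deo(p_acc, n):
    """Approximate DEO round trip rate of Eq. (14)."""
    return p_acc / ((1.0 - p_acc) * 2.0 * n * (n - 1))

def seo_rate_from_mfpt(p_acc, n):
    """1/(2*tau), with tau the mean first passage time from rung 0 to rung n-1."""
    q = 0.5 * p_acc            # hop probability per direction and step
    m = n - 1                  # transient rungs 0 .. n-2, rung n-1 absorbs
    a = np.zeros((m, m))       # matrix (I - P) restricted to transient rungs
    for i in range(m):
        a[i, i] = 2.0 * q if i > 0 else q
        if i > 0:
            a[i, i - 1] = -q
        if i + 1 < m:
            a[i, i + 1] = -q
    tau = np.linalg.solve(a, np.ones(m))[0]
    return 1.0 / (2.0 * tau)

if __name__ == "__main__":
    for n in (5, 10, 20):
        print(n, r_seo(0.4, n), seo_rate_from_mfpt(0.4, n), r_deo(0.4, n))
```

Under these assumptions the linear solve reproduces Eq. (13), while Eq. (14) differs from it only by the factor $1/(1 - p_{\mathrm{acc}})$.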
4.5. Comparison with measured rates
In Fig. 3, we compare the round trip rates $r(p_{\mathrm{acc}})$ thus analytically calculated with rates measured in our test simulations (see Fig. 1). The dashed line in Fig. 3 is the SEO prediction of Eq. (13), where we have additionally used the unique relation between $N$ and $p_{\mathrm{acc}}$ given by Eqs. (5) and (3). As expected, the agreement with the measured rates (triangles in Fig. 3) is perfect. Our expectations are also met for DEO: the approximation Eq. (14) (solid line) deviates only a little from the measured rates (circles).
The deviation between Eq. (14) and the measurements noted
for DEO must decrease with increasing ladder size, because for
large ladders all those EPs which try to reach beyond the ladder
boundary become less frequent. Fig. 4 illustrates this decrease for
three different average acceptance probabilities. The figure
compares a round trip rate $r_{\mathrm{RW}}$ measured in extended DEO random walks on ladders of increasing size $N$ with the rate $r_{\mathrm{DEO}}$ from Eq. (14). Due to the decrease, the depicted ratio $r_{\mathrm{RW}}/r_{\mathrm{DEO}}$ approaches the value one with growing $N$ for all values of $p_{\mathrm{acc}}$. Fig. 4 suggests that Eq. (14) is an upper limit for the actual round trip rate $r_{\mathrm{RW}}$ and that the overestimate of $r_{\mathrm{RW}}$ by $r_{\mathrm{DEO}}$ increases with $p_{\mathrm{acc}}$. We note that at $N = 10$ and $p_{\mathrm{acc}} = 0.4$ the deviation is less than 10%.
4.6. Optimal temperature ladders
The analytical approximation Eq. (14) for the DEO round trip rate $r$ together with the unique correspondence between $p_{\mathrm{acc}}$ and $N$ (Eqs. (3) and (5)) enables us now to determine temperature ladders with a maximal $r_{\mathrm{DEO}}$. Here we assume that a temperature range $[T_{\min}, T_{\max}]$ for an RE simulation and a system with known heat capacity $C$ are given. For our benchmark system, we find the maximum of $r_{\mathrm{DEO}}$ at the acceptance rate $p_{\mathrm{acc}} = 40.5\%$. This prediction closely agrees with our MC measurements (Figs. 1 and 3) which identified the maximal $r$ at the ladder size $N = 20$ corresponding to an average acceptance probability $p_{\mathrm{acc}} = 41.4\%$.
Fig. 3. Analytically calculated round trip rates $r_{\mathrm{SEO}}$ (Eq. (13), dashed line) and $r_{\mathrm{DEO}}$ (Eq. (14), solid line) as functions of the average acceptance probability $p_{\mathrm{acc}}$. Values measured in our test simulations are adopted from Fig. 1. The SEO values (triangles) perfectly match Eq. (13), whereas the DEO values (circles) slightly deviate from Eq. (14).
In a previous study [20], we found empirically for the same benchmark system that $r$ is quite well approximated by the expression
\[ r_\kappa \equiv c\,\kappa(p_{\mathrm{acc}}) \equiv c'\, p_{\mathrm{acc}} / [N(p_{\mathrm{acc}}) - 1] \tag{15} \]
with system dependent constants $c$ and $c'$. From Eqs. (15), (3), and (5) one finds that the maximum of $r_\kappa$ is at $p_{\mathrm{acc}} = 44.4\%$, which is also close to the MC value $p_{\mathrm{acc}} = 41.4\%$.
As we have seen, the two apparently different analytical approximations Eqs. (14) and (15) yield quite similar predictions for the optimal $p_{\mathrm{acc}}$. This puzzling result raises questions, which are resolved by Fig. 5. The figure shows the ratio $r_\kappa / r_{\mathrm{DEO}} = c'\, N(p_{\mathrm{acc}})(1 - p_{\mathrm{acc}})$ as a function of $p_{\mathrm{acc}}$. It demonstrates that for large $p_{\mathrm{acc}}$ the two expressions become equivalent and remain close at smaller $p_{\mathrm{acc}}$. This similarity is more pronounced for very small systems (dashed line) than for larger ones (solid line). We would like to stress that the solid line is representative for all large systems with heat capacities $C > 500$ (data not shown).
By Eq. (15), $r_\kappa$ is proportional to $1/(N - 1)$, which is characteristic for a directed motion with constant velocity, whereas $r_{\mathrm{DEO}}$ is by Eq. (14) proportional to $1/[N(N - 1)]$, which is typical for a diffusive motion. If the temperature ladders become so small that the EPs of the DEO random walk hit the ladder boundaries very frequently, then the approximation Eq. (14) is expected to become worse. For such small ladders one can expect that the constant velocity expression Eq. (15) becomes more accurate. To check this expectation, we calculated the $p_{\mathrm{acc}}$ of maximal $r$ by Eq. (14) and Eq. (15) for the small system ($C = 50$). For this system the constant velocity expression Eq. (15) predicts 42.6%, the diffusion expression Eq. (14) yields 43.5%, whereas the target value obtained through MC is 41.5% (corresponding to a ladder size $N = 7$). Thus, even for very small temperature ladders our analytical approximation $r_{\mathrm{DEO}}$ still predicts the location of the maximal round trip rate at nearly the same accuracy as the educated guess $r_\kappa$ put forward in Ref. [20], while it is somewhat better for larger systems.
Fig. 5. The ratio $r_\kappa / r_{\mathrm{DEO}}$ as a function of the average acceptance probability $p_{\mathrm{acc}}$ for two systems with heat capacities $C = 50$ (dashed line) and $C = 500$ (solid line), respectively.
Finally, we have shown that the educated guess Eq. (15), which
empirically had been determined to be a good optimization measure for the DEO round trip rates [20], yields results nearly equivalent to those of our analytical approximation Eq. (14).
5. Summary
We have investigated the round trip rate performance of four different exchange algorithms. Our results demonstrate that the DEO algorithm yields the highest round trip rates over a wide range of average acceptance probabilities $p_{\mathrm{acc}}$. Thus, using the DEO algorithm is not only a common but also a good practice in the application of replica exchange. This conclusion not only applies to replica exchange but also to simulated tempering [15]. Note, however, that according to our results (Fig. 1) the APE algorithm [16] might be an interesting alternative if one is interested in high round trip rates using minimally sized temperature ladders [20].
Examining the DEO random walk analytically we have shown that the reason for the higher DEO round trip rates compared with those of its randomized variant (SEO) is the intrinsically higher diffusivity of the corresponding random walk. Since the diffusivity advantage of DEO over SEO becomes larger with increasing $p_{\mathrm{acc}}$, the acceptance probability maximizing the round trip rate is shifted from $p_{\mathrm{acc}} \approx 20\%$ (SEO) to $p_{\mathrm{acc}} \approx 40\%$ (DEO).
Acknowledgement
This work was supported by the Deutsche Forschungsgemeinschaft (SFB 533/C1 and SFB 749/C4).
References
[1] K. Hukushima, K. Nemoto, J. Phys. Soc. Jpn. 65 (1996) 1604.
[2] U.H.E. Hansmann, Chem. Phys. Lett. 281 (1997) 140.
[3] Y. Sugita, Y. Okamoto, Chem. Phys. Lett. 314 (1999) 141.
[4] D.J. Earl, M.W. Deem, Phys. Chem. Chem. Phys. 7 (2005) 3910.
[5] N. Metropolis, A.W. Rosenbluth, M.N. Rosenbluth, A.H. Teller, E. Teller, J. Chem. Phys. 21 (1953) 1087.
[6] R. Denschlag, M. Lingenheil, P. Tavan, Chem. Phys. Lett. 458 (2008) 244.
[7] A. Kone, D.A. Kofke, J. Chem. Phys. 122 (2005) 206101.
[8] S. Trebst, M. Troyer, U.H.E. Hansmann, J. Chem. Phys. 124 (2006) 174903.
[9] W. Nadler, U.H.E. Hansmann, Phys. Rev. E 75 (2007) 026109.
[10] W. Nadler, J.H. Meinke, U.H.E. Hansmann, Phys. Rev. E 78 (2008) 061905.
[11] W. Nadler, U.H.E. Hansmann, J. Phys. Chem. B 112 (2008) 10386.
[12] T. Okabe, M. Kawata, Y. Okamoto, M. Mikami, Chem. Phys. Lett. 335 (2001) 435.
[13] M.J. Abraham, J.E. Gready, J. Chem. Theory Comput. 4 (2008) 1119.
[14] D. van der Spoel et al., Gromacs User Manual version 3.3, 2005. www.gromacs.org.
[15] R. Denschlag, M. Lingenheil, P. Tavan, G. Mathias, J. Chem. Theory Comput., submitted for publication.
[16] P. Brenner, C.R. Sweet, D. VonHandorf, J.A. Izaguirre, J. Chem. Phys. 126 (2007) 074103.
[17] F. Calvo, J. Chem. Phys. 123 (2005) 124106.
[18] B. Paschek, H. Nymeyer, A.E. Garcia, J. Struct. Biol. 157 (2007) 524.
[19] Y. Okamoto, M. Fukugita, T. Nakazawa, H. Kawai, Protein Eng. 4 (1991) 639.
[20] R. Denschlag, M. Lingenheil, P. Tavan, Chem. Phys. Lett. 473 (2009) 244.
[21] C.W. Gardiner, Handbook of Stochastic Methods, 2nd edn., Springer, Berlin, 1985.
[22] W. Nadler, U.H.E. Hansmann, Phys. Rev. E 76 (2007) 065701(R).
[23] V.I. Manousiouthakis, M.W. Deem, J. Chem. Phys. 110 (1999) 2753.
4 Simulated Solute Tempering
The speed at which the replicas of a replica exchange simulation travel through temperature space can be improved, beyond an optimal choice of parameters, by a suitable scaling of the force field. One possible scaling of the force field is the solute tempering concept described in the introduction, which was first introduced by Liu and co-authors in the form of the Replica Exchange with Solute Tempering (REST) method [101] and which considerably thins out the temperature ladder.
The preceding chapter shows that (i) the ST method requires a number of temperature rungs that is smaller by a factor of 1/√2 than that of RE and that (ii) the round-trip time grows quadratically with the number of temperature rungs. It follows that the round-trip time of an ST simulation is only half the round-trip time of a corresponding RE simulation. This insight motivated the publication reprinted below¹
Robert Denschlag, Martin Lingenheil, Paul Tavan, and Gerald Mathias:
"Simulated Solute Tempering"
J. Chem. Theory Comput. 5, 2847 (2009),
which I wrote together with Martin Lingenheil, Paul Tavan, and Gerald Mathias. In this work the REST-specific force field scaling is transferred to simulated tempering, which yields the new Simulated Solute Tempering (SST) method. In addition to a detailed description of the SST method, which includes the accurate calculation of the weights, a concrete MD simulation confirms the theoretical expectation that, owing to its shorter round-trip time, the SST sampling technique samples the conformational space more efficiently than the REST method.
¹ Reprinted with kind permission of the American Chemical Society.
J. Chem. Theory Comput. 2009, 5, 2847–2857
Simulated Solute Tempering
Robert Denschlag, Martin Lingenheil, Paul Tavan, and Gerald Mathias*
Lehrstuhl für Biomolekulare Optik, Ludwig-Maximilians-Universität,
Oettingenstrasse 67, 80538 München, Germany
Received May 26, 2009
Abstract: For the enhanced conformational sampling in molecular dynamics (MD) simulations,
we present “simulated solute tempering” (SST) which is an easy to implement variant of simulated
tempering. SST extends conventional simulated tempering (CST) by key concepts of “replica
exchange with solute tempering” (REST, Liu et al. Proc. Natl. Acad. Sci. U.S.A. 2005, 102,
13749). We have applied SST, CST, and REST to molecular dynamics (MD) simulations of an
alanine octapeptide in explicit water. The weight parameters required for CST and SST are
determined by two different formulas whose performance is compared. For SST only one of
them yields a uniform sampling of the temperature space. Compared to CST and REST, SST
provides the highest exchange probabilities between neighboring rungs in the temperature ladder.
Concomitantly, SST leads to the fastest diffusion of the simulation system through the
temperature space, in particular, if the “even-odd” exchange scheme is employed in SST. As a
result, SST exhibits the highest sampling speed of the investigated tempering methods.
Introduction
The generation of equilibrium ensembles for macromolecules
by all-atom Monte Carlo (MC) or molecular dynamics (MD)
simulations is a challenging task due to the huge computational effort which is generally necessary to guarantee ergodic
sampling of relevant observables. The required simulation
time depends on the number and on the depths of the local
minima in the free energy landscape because here the
simulation may get trapped for extended periods of time. If
the barriers between the minima are mainly of enthalpic
nature, generalized ensemble tempering techniques enable
faster barrier crossings and, therefore, alleviate the sampling
problem.1,2
* Corresponding author phone: +49-89-2180-9228; fax: +49-89-2180-9202; e-mail: gerald.mathias@physik.uni-muenchen.de. Corresponding author address: LMU München, Lehrstuhl für BioMolekulare Optik, Oettingenstrasse 67, D-80538 München, Germany.
Two important generalized ensemble tempering algorithms
are simulated tempering (ST) and replica exchange (RE).2
The RE method in its original form3-5 employs several
copies (replicas) of the investigated system at different
temperatures $T_0 < T_1 < \cdots < T_{N-1}$. Within this temperature
ladder, temperature exchanges between the replicas are
periodically attempted. The corresponding probability for an
exchange is given by a Metropolis criterion.6 On the other
hand, in simulated tempering,7,8 only one “replica” diffuses
along the temperature ladder, where temperature changes are
determined by a Metropolis criterion slightly modified with
respect to RE. Although both methods are closely related,9
RE has attracted much more attention than ST, which is
indicated by the large number of RE variants that have been
suggested.10-17 Furthermore, the RE efficiency has been the
subject of many studies,18-23 and numerous RE applications
to macromolecules have been presented.24-33 The main
reason for the apparent neglect of ST is that this approach
requires estimates for certain a priori unknown parameters,
the so-called weights, to ensure a uniform sampling of the
ensembles at all temperatures. In contrast, RE automatically
guarantees a uniform sampling of all temperatures and is,
therefore, much simpler to control.
10.1021/ct900274n CCC: $40.75 © 2009 American Chemical Society
Published on Web 08/24/2009
Without prior knowledge of the properties of the simulated
system an unbiased tempering algorithm should uniformly
cover the chosen temperature space to generate enhanced
statistics at the temperature of interest, which usually is the
lowest temperature $T_0$. Therefore, the average time to shuttle
a replica between $T_0$ and the maximal temperature $T_{N-1}$ should
be by orders of magnitude shorter than the simulation time.
Addressing this issue, Abraham and Gready recently examined
more than forty published RE simulations for their
ability to take full advantage of the tempering.34 In most
cases, they found that the simulation times were too short
for a sufficiently frequent shuttling of the replicas along the
temperature ladder. The reason for large shuttling times is
known: This time usually scales with the square of the
number of temperatures,35 which in turn grows with the
square root of the number of degrees of freedom3 (DOF) in
the simulation system. Thus, very long round trip times are
expected for simulation systems with many DOF. Typical
examples for such systems are macromolecules in explicit
solvent. Note that such systems may additionally undergo
slow phase transitions, like folding and unfolding, which may
drastically increase round trip times.
Various strategies aiming at increasing the shuttling
frequency were suggested for RE,14-16,35-38 which should
all be transferable to ST. A first class of strategies targets
the optimization of the parameters that characterize the setup
of an RE simulation. For example, Sindhikara et al.38 recently
recommended that the times between exchange trials should
be as small as possible, which, however, has been disputed.34
Additionally, rules for optimizing the temperature ladders
were suggested.35,39 A second class of strategies tries to
reduce the number of DOF that are relevant for the exchange
criterion, for example by switching from an explicit
to an implicit solvent description for the exchange trials16
or by tempering only the areas of interest.15 With such a
reduced number of DOF, the temperature steps within the
ladder can be chosen larger and, thus, a given temperature
range can be covered by fewer rungs. As a result, the round
trip times are drastically shortened. However, the quoted
methods of the second class are not rigorous in terms of
statistical physics because here the ensemble is modified in
a somewhat arbitrary fashion. Recently, Liu et al.14 have
presented a rigorous strategy called “replica exchange with
solute tempering” (REST), which largely eliminates the
influence of the solvent DOF on the exchange probabilities
by a temperature dependent scaling of the Hamiltonian. In
this approach only the Hamiltonian at the target temperature
remains unscaled and renders the desired physically meaningful ensemble. This restriction is, however, of minor
importance for many applications.
An additional strategy to achieve increased shuttling
frequencies is the optimal choice of the tempering method
itself. For example, with accurately known weight parameters, ST is more efficient than RE because it provides larger
acceptance probabilities on the same temperature ladder.40-42
At given conditions one may equivalently state that ST
requires fewer rungs in the temperature ladder than RE if both
techniques are tuned to the same average acceptance
probabilities.
The weights necessary for ST can be estimated by short
preparatory simulations bearing the risk, however, that these
weights are of insufficient accuracy. Addressing this issue,
a recent study by Park and Pande9 suggests that controlling
an ST simulation may be less difficult than previously
assumed by Mitsutake and Okamoto.43 In contrast to the
latter authors Park and Pande9 did not stick to the rough
estimates for the weights derived from a set of short
preparatory simulations but instead updated the weights
during the subsequent ST production run.
Here, inspired by the works of Park and Pande9 and of Liu
et al.,14 we suggest a variant of ST called simulated solute
tempering (SST), which shares key concepts with REST and
its sequential variant SREST.13 As we will demonstrate, SST
has the advantages of a most simple implementation and of
reducing the required number of rungs within the temperature
ladder. Note that SST should be easily transferable to hybrid
methods which combine different tempering techniques43 or
tempering with other enhanced sampling methods.44
We start in “Theory and Methods” with an introduction
to replica exchange and its variant REST, which employs a
temperature dependent scaling of the Hamiltonian. Then, we
present SST along with two procedures for calculating the
weights and corresponding update schemes. Subsequently,
these techniques are applied to an alanine octapeptide
(8ALA) solvated in water. The enthalpic barriers of 8ALA
are small, and, hence, the benefit of tempering techniques is
limited.22 However, the small barriers provide fast conformational sampling which makes 8ALA ideally suited to
compare different tempering strategies among each other with
high statistical accuracy. In particular, we investigate the
sampling efficiency and convergence of conventional ST
(CST), SST, and REST. In addition, we test two temperature
exchange schemes for SST to improve the method further.
After the presentation and discussion of the results, we
conclude the paper summarizing the key messages.
Theory and Methods
We begin by sketching the replica exchange method and the
concept of solute tempering, which will lead us, when
combined with simulated tempering, to the SST method.
Replica Exchange. Within both conventional temperature RE5 (CRE) and REST,14 $N$ copies (replicas) of the system are simulated at temperatures $T_0 < T_1 < \cdots < T_{N-1}$ and sample the associated canonical ensembles. The set of replicas constitutes a so-called generalized ensemble. After predefined time intervals a temperature exchange between pairs of replicas is tried. A Metropolis criterion6 determines the exchange probability
\[ P_{ij} = \min[1, \exp(\Delta_{ij})] \tag{1} \]
between replicas at $T_i$ and $T_j$ with
\[ \Delta_{ij} = \beta_i [E_i(x_i) - E_i(x_j)] + \beta_j [E_j(x_j) - E_j(x_i)] \tag{2} \]
to preserve the canonical ensembles. Here, $E_k(x_l)$ is the value of the potential energy function associated with the temperature $T_k$, which is evaluated at a configuration $x_l$ resulting from the sampling at $T_l$; $\beta_k = 1/k_B T_k$ is the inverse temperature where $k_B$ is the Boltzmann constant. For CRE the potential energy is independent of the temperature, and $\Delta_{ij}$ reduces to
\[ \Delta_{ij} = (\beta_i - \beta_j) \cdot [E(x_i) - E(x_j)] \tag{3} \]
In the case of REST, in contrast, the potential energy becomes temperature dependent and has the form
\[ E_k(x) = \lambda_{k,0} E^{pp}(x) + \lambda_{k,1} E^{ps}(x) + \lambda_{k,2} E^{ss}(x) \tag{4} \]
where $E^{pp}$, $E^{ps}$, and $E^{ss}$ are the solute-solute, solute-solvent, and solvent-solvent parts of the potential energy function at the target temperature $T_0$ of the sampling; $\lambda_{k,h}$ are parameters depending on the temperature $T_k$. We choose
\[ \lambda_{k,0} = 1, \quad \lambda_{k,1} = \sqrt{\beta_0/\beta_k}, \quad \lambda_{k,2} = \beta_0/\beta_k \tag{5} \]
where $\lambda_{k,1}$ is the geometric mean of $\lambda_{k,0}$ and $\lambda_{k,2}$ instead of the arithmetic mean $(\beta_0 + \beta_k)/2\beta_k$ originally proposed by Liu et al.14 With this choice, the required scaling of the electrostatic energy and of the corresponding forces at $T_k$ can be achieved by simply scaling the partial charges of the solvent by a factor $(\beta_0/\beta_k)^{1/2}$. Similar considerations hold for the Lennard-Jones interactions. Thus, this choice of the $\lambda_{k,i}$ is conveniently implemented.
The advantage of the REST approach becomes apparent after a few algebraic operations. Inserting eqs 4 and 5 into eq 2 one finds
\[ \Delta_{ij} = (\beta_i - \beta_j)[E^{pp}(x_i) - E^{pp}(x_j)] + (\sqrt{\beta_0 \beta_i} - \sqrt{\beta_0 \beta_j})[E^{ps}(x_i) - E^{ps}(x_j)] \tag{6} \]
Thus, the difference $\Delta_{ij}$, which determines the acceptance probability eq 1, is exclusively calculated from the solute-solute and solute-solvent energies, whereas the potential energy $E^{ss}$ of the solvent cancels.
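The cancellation of the solvent-solvent term can be illustrated with a short sketch. It assumes that the three energy contributions of each configuration are available as plain numbers; `delta_general` evaluates the logarithm of the ratio of generalized-ensemble Boltzmann weights after and before the swap from the fully scaled energies, and `delta_rest` evaluates the reduced form of eq 6. All function names are illustrative and not taken from the paper.

```python
import math

def rest_lambdas(beta0, beta_k):
    """Scaling parameters of eq 5 for the rung with inverse temperature beta_k."""
    return 1.0, math.sqrt(beta0 / beta_k), beta0 / beta_k

def scaled_energy(beta0, beta_k, e_pp, e_ps, e_ss):
    """Temperature dependent potential energy of eq 4."""
    l0, l1, l2 = rest_lambdas(beta0, beta_k)
    return l0 * e_pp + l1 * e_ps + l2 * e_ss

def delta_general(beta0, beta_i, beta_j, parts_i, parts_j):
    """Log weight ratio of the swap, using the full scaled energies."""
    e = lambda beta_k, parts: scaled_energy(beta0, beta_k, *parts)
    return (beta_i * (e(beta_i, parts_i) - e(beta_i, parts_j))
            + beta_j * (e(beta_j, parts_j) - e(beta_j, parts_i)))

def delta_rest(beta0, beta_i, beta_j, parts_i, parts_j):
    """Reduced form of eq 6: only solute-solute and solute-solvent terms."""
    (pp_i, ps_i, _), (pp_j, ps_j, _) = parts_i, parts_j
    return ((beta_i - beta_j) * (pp_i - pp_j)
            + (math.sqrt(beta0 * beta_i) - math.sqrt(beta0 * beta_j)) * (ps_i - ps_j))

if __name__ == "__main__":
    # made-up energies (E_pp, E_ps, E_ss) of two configurations
    x_i = (-120.0, -340.0, -50000.0)
    x_j = (-110.0, -300.0, -49000.0)
    print(delta_general(1.68, 1.5, 1.3, x_i, x_j))
    print(delta_rest(1.68, 1.5, 1.3, x_i, x_j))
```

Both functions should return the same value up to rounding, although the solvent-solvent energies of the two configurations differ by a large amount.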
One can quantify the benefit of REST compared with CRE by estimating the number of rungs eliminated from the temperature ladder. Assuming that the solvent DOF do not contribute to the exchange probability at all, the ratio $N_{\mathrm{REST}}/N_{\mathrm{CRE}}$ of the required rungs is estimated by the lower limit $(n_p/(n_s + n_p))^{1/2}$, where $n_s$ and $n_p$ count the DOF of the solvent and solute, respectively. In the following we discuss how simulated tempering can further reduce the required number of rungs.
Simulated Tempering. In simulated tempering7,8 (ST), a single system is simulated at a temperature $T_i$, which belongs to a given temperature ladder $T_0 < \cdots < T_{N-1}$. After given time intervals it is checked whether the system temperature $T_i$ can be changed to $T_j$, where $j$ is usually $i \pm 1$. For $i \in \{0, N-1\}$ the transition to $j = -1$ or $j = N$ is rejected. For other transitions $i \to j$, the acceptance probability41
\[ P_{ij} = \min[1, \exp(\Delta_{ij})] \tag{7} \]
with
\[ \Delta_{ij} = [\beta_i E_i(x) - w_i] - [\beta_j E_j(x) - w_j] \tag{8} \]
represents a Metropolis criterion similar to that of RE.
The weights $w_k$ introduced in eq 8 are commonly set to the configurational parts $\beta_k \tilde{F}_k = -\ln \int \exp[-\beta_k E(x)]\,dx$ of the dimensionless free energies $\beta_k F_k$ of the simulation system at the temperatures $T_k$.7,8 This choice leads to a uniform sampling of all rungs within a given temperature ladder because, in the ergodic limit, the expected ratio $\rho_k \equiv t_k/t$ of the time $t_k$ spent by the simulation at temperature $T_k$ to the total sampling time $t$ is given by the Boltzmann factor of the generalized ensemble
\[ \lim_{t \to \infty} \rho_k = \frac{\exp[-(\beta_k \tilde{F}_k - w_k)]}{\sum_j \exp[-(\beta_j \tilde{F}_j - w_j)]} \tag{9} \]
leading to $\lim_{t \to \infty} \rho_k = 1/N$ for $w_j = \beta_j \tilde{F}_j$. Since the $w_k$ are a priori unknown, one usually tries to estimate these weights from short preparatory simulations. Note that the $w_j$ can be chosen differently, if a nonuniform sampling is desired.42
In conventional simulated tempering (CST), the potential energy function $E_k(x)$ is independent of the temperature $T_k$, and eq 8 reduces to
\[ \Delta_{ij} = (\beta_i - \beta_j) E(x) - (w_i - w_j) \tag{10} \]
For the new SST method introduced here, we transfer the solute tempering concept of REST to ST and use the energy function given by eqs 4 and 5. Inserting these equations into eq 8 yields
\[ \Delta_{ij} = (\beta_i - \beta_j) E^{pp}(x) + (\sqrt{\beta_0 \beta_i} - \sqrt{\beta_0 \beta_j}) E^{ps}(x) - (w_i - w_j) \tag{11} \]
Thus, for SST the difference $\Delta_{ij}$ is only calculated from the solute-solute and solute-solvent energies, while the potential energy $E^{ss}$ of the solvent cancels. As we will show below, the solvent-solvent contributions cancel as well in the computation of the weight differences $w_i - w_j$.
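A single SST temperature move can be sketched as follows. The sketch assumes that the solute-solute and solute-solvent energies of the current configuration and the two weights are given as plain numbers; the SST exponent is obtained by inserting the scaled energy function into the general ST criterion, so that the solvent-solvent energy never has to be evaluated. The names `sst_delta` and `accept` are illustrative only.

```python
import math
import random

def sst_delta(beta0, beta_i, beta_j, e_pp, e_ps, w_i, w_j):
    """Exponent of the SST acceptance probability for a move from T_i to T_j."""
    return ((beta_i - beta_j) * e_pp
            + (math.sqrt(beta0 * beta_i) - math.sqrt(beta0 * beta_j)) * e_ps
            - (w_i - w_j))

def accept(delta, rng=random):
    """Metropolis decision for P = min[1, exp(delta)]."""
    return delta >= 0.0 or rng.random() < math.exp(delta)

if __name__ == "__main__":
    # made-up numbers for one exchange trial
    d = sst_delta(1.68, 1.68, 1.50, e_pp=-120.0, e_ps=-340.0, w_i=0.0, w_j=38.0)
    print(d, accept(d))
```

Only two energy terms enter the decision, which is what allows larger temperature steps than in CST for the same acceptance probability.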
At first glance, ST (CST/SST) seems less attractive than RE (CRE/REST) because ST requires the a priori unknown weights $w_i$. However, ST provides larger average acceptance probabilities than RE40-42 for a given temperature ladder because RE requires a simultaneous exchange of two replicas, whereas only one replica has to be considered for ST. As a result, ST needs only $1/\sqrt{2}$ times the number of rungs that RE needs to cover a given temperature range with the same acceptance probabilities.39 Now, we turn to different approaches to determine the required weights $w_i$.
Determination of the Weights $w_k$. As we have seen above, a uniform sampling of the various temperatures $T_k$ requires that the weights $w_k$ in eq 8 are the configurational parts $\beta_k \tilde{F}_k$ of the dimensionless free energies, which can be estimated from preparatory simulations. For CST, Park and Pande have presented a formula which yields surprisingly good estimates of the $w_k$ at a negligible computational effort (e.g., by executing a single 10 ps MD simulation at each $T_k$).9 These authors replaced the potential energy $E(x)$ in eq 10 by the average potential energy $\langle E \rangle_i$ at $T_i$ to get a “typical” $\Delta^{\mathrm{typ}}_{ij}$ and demanded that $\Delta^{\mathrm{typ}}_{ij} = \Delta^{\mathrm{typ}}_{ji}$, which leads to the estimate
\[ w_j - w_i \approx (\beta_j - \beta_i) \frac{\langle E \rangle_i + \langle E \rangle_j}{2} \tag{12} \]
and which has been further substantiated by Park.41 In the Appendix we show an alternative derivation of the “trapezoid rule” eq 12 that is based on the assumptions of a constant heat capacity $C_V$ and that $\ln[1 + (T_j - T_i)/T_i]$ can be approximated by $(T_j - T_i)/T_i$.
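As a rough illustration of the trapezoid rule, the sketch below accumulates the weight differences along a ladder, fixing $w_0 = 0$. The usage example assumes a hypothetical model system with constant heat capacity, $\langle E \rangle = C\,T$ in units with $k_B = 1$, for which the exact weight differences follow from integrating $\langle E \rangle$ over $\beta$ and equal $C \ln(\beta_k/\beta_0)$. Function and variable names are illustrative.

```python
import numpy as np

def trapezoid_weights(betas, mean_energies):
    """Weights w_k from the CST trapezoid rule, with w_0 set to zero."""
    betas = np.asarray(betas, dtype=float)
    e = np.asarray(mean_energies, dtype=float)
    # w_{k+1} - w_k = (beta_{k+1} - beta_k) * (<E>_k + <E>_{k+1}) / 2
    dw = (betas[1:] - betas[:-1]) * 0.5 * (e[:-1] + e[1:])
    return np.concatenate(([0.0], np.cumsum(dw)))

if __name__ == "__main__":
    temps = np.linspace(300.0, 500.0, 18)   # hypothetical ladder, k_B = 1
    betas = 1.0 / temps
    heat_capacity = 100.0                   # assumed constant heat capacity
    w = trapezoid_weights(betas, heat_capacity * temps)
    exact = heat_capacity * np.log(betas / betas[0])
    print(np.max(np.abs(w - exact)))
```

For this model the deviation from the exact weights stays small as long as neighboring temperatures are close, in line with the approximation named in the text.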
Replacing the temperature dependent potential energies $E_i(x)$ and $E_j(x)$ in eq 8 by averages $\langle E_i \rangle_i$ and $\langle E_j \rangle_i$ transfers the trapezoid rule of CST to SST. Here, $\langle E_j \rangle_i$ is the energy function $E_j$ evaluated for and averaged over the configurations sampled at $T_i$. Analogous to CST, also these SST averages yield “typical” differences $\Delta^{\mathrm{typ}}_{ij}$. Equation 5 and $\Delta^{\mathrm{typ}}_{ij} = \Delta^{\mathrm{typ}}_{ji}$ lead to
\[ w_j - w_i \approx \frac{(\beta_j - \beta_i)(\langle E^{pp} \rangle_i + \langle E^{pp} \rangle_j)}{2} + \frac{(\sqrt{\beta_0 \beta_j} - \sqrt{\beta_0 \beta_i})(\langle E^{ps} \rangle_i + \langle E^{ps} \rangle_j)}{2} \tag{13} \]
As mentioned in the section “Simulated Tempering”, $w_j - w_i$ does not depend on the solvent-solvent interactions $E^{ss}$. Note that one can choose $w_0 = 0$ because only differences $w_j - w_i$ matter in eq 8.
Whereas the computations of the weights $w_i$ in CST and SST by the trapezoid rules eqs 12 and 13, respectively, are approximate, the relation
\[ \exp(-w_i) = \sum_{k=0}^{N-1} \sum_{t=1}^{n_k} \frac{\exp\{-\beta_i \sum_{j=0}^{L-1} \lambda_{i,j} E_j[x_k(t)]\}}{\sum_{m=0}^{N-1} n_m \exp\{w_m - \beta_m \sum_{j=0}^{L-1} \lambda_{m,j} E_j[x_k(t)]\}} \tag{14} \]
presented earlier by Kumar et al.45 for the “weighted histogram analysis method” (WHAM) is “exact” for the already sampled ensemble and provides an unbiased estimator for the true dimensionless free energies.46 Here, $N$ is again the number of temperatures $T_k$, $n_k$ is the number of configurations $x_k(1), \ldots, x_k(n_k)$ sampled at $T_k$, and $L$ counts potential energy terms contributing to the Hamiltonian. For CST there is only one such term and one has $L = 1$, $E_0(x_k) \equiv E(x_k)$, and $\lambda_{i,0} = 1$. The Hamiltonian of SST eq 4 distinguishes $L = 3$ energy contributions $E_0(x_k) \equiv E^{pp}(x_k)$, $E_1(x_k) \equiv E^{ps}(x_k)$, and $E_2(x_k) \equiv E^{ss}(x_k)$. With the $\lambda_{i,j}$ chosen as given by eq 5 one finds that $E^{ss}$ cancels in eq 14 like it did in the trapezoid rule eq 13. Equation 14 has to be solved self-consistently and yields successively more and more accurate weights as the statistics is improved by an ongoing sampling. Note that Mitsutake and Okamoto40,43 previously suggested to compute the free energies required for ST through WHAM equations, which are based on energy histograms. In contrast, eq 14 computes the free energies directly from the sampled energies and, therefore, avoids the errors introduced by the histogram discretization.45-47
The WHAM formula eq 14 can be used to identify the errors $\Delta w_i$ of the trapezoid rules eq 12 for CST and eq 13 for SST. In the limit of ergodic sampling, the errors $\Delta w_i = w_i - w_i^{\mathrm{exact}}$ yield through eq 9 the ratios $\lim_{t \to \infty} \rho_i = \exp(\Delta w_i)/\sum_j \exp(\Delta w_j)$ of the time $t_i$ spent at $T_i$ to the total sampling time $t$. These ratios will deviate from $1/N$ and, therefore, measure deviations from the desired uniformity of the sampling along the ladder of temperatures $T_i$. To measure this deviation in our simulations, we introduce the quantities
\[ \chi_i \equiv N \frac{t_i}{t} \tag{15} \]
A uniform sampling corresponds to $\chi_i = 1.0$ at all $T_i$. Thus, the $\chi_i$ enable easy comparisons of the sampling uniformity achieved with differently sized temperature ladders. If the errors $\Delta w_i$ are known, the long time limit of $\chi_i$ is
\[ \hat{\chi}_i = N \frac{\exp(\Delta w_i)}{\sum_{j=0}^{N-1} \exp(\Delta w_j)} \tag{16} \]
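The long time limit of the uniformity measure can be evaluated directly from given weight errors. The following minimal sketch assumes the errors are supplied as a plain array; the function name `chi_hat` is illustrative.

```python
import numpy as np

def chi_hat(delta_w):
    """Long time limit of chi_i for given weight errors delta_w_i."""
    dw = np.asarray(delta_w, dtype=float)
    boltz = np.exp(dw - dw.max())   # shift by the maximum for numerical stability
    return dw.size * boltz / boltz.sum()

if __name__ == "__main__":
    print(chi_hat([0.0, 0.0, 0.0, 0.0]))   # exact weights: uniform sampling
    print(chi_hat([0.0, np.log(2.0)]))     # the second rung is visited twice as often
```

A weight error of $\ln 2$ on one of two rungs already shifts the occupancies to one third and two thirds, which shows how sensitively the sampling uniformity reacts to the weights.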
For finite simulations, the $\chi_i$ exhibit statistical fluctuations
which depend on the number of rungs $N$ in the temperature
ladder, on the average exchange probabilities $\bar{P}_{ij}$, and on the
total number of exchange trials. For each set of these
parameters, one can model an actual ST simulation by a
computationally inexpensive MC simulation of a random
walk along the N rungs of the temperature ladder. We will
use large numbers of such MC simulations to estimate the
standard deviations σi of the χi for the respective simulations.
These values show to what extent one may expect convergence of sampling uniformity.
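Such a random walk model can be sketched in a few lines. The version below is a simplified assumption, not the authors' implementation: it uses one common acceptance probability for all neighboring rungs, a random choice between upward and downward trials, and rejection of trials that would leave the ladder. It returns the mean and the standard deviation of the $\chi_i$ over many independent walks. All names are illustrative.

```python
import numpy as np

def chi_statistics(n_rungs, p_acc, n_trials, n_runs=200, seed=0):
    """Mean and standard deviation of chi_i over independent model random walks."""
    rng = np.random.default_rng(seed)
    chi = np.empty((n_runs, n_rungs))
    for run in range(n_runs):
        steps = rng.integers(0, 2, size=n_trials) * 2 - 1   # trial direction -1 or +1
        accepted = rng.random(n_trials) < p_acc
        visits = np.zeros(n_rungs)
        i = int(rng.integers(n_rungs))
        for step, ok in zip(steps, accepted):
            j = i + int(step)
            if ok and 0 <= j < n_rungs:   # trials beyond the ladder are rejected
                i = j
            visits[i] += 1
        chi[run] = n_rungs * visits / n_trials
    return chi.mean(axis=0), chi.std(axis=0)

if __name__ == "__main__":
    mean, sigma = chi_statistics(n_rungs=4, p_acc=0.5, n_trials=2000)
    print(mean, sigma)
```

Because the model walk has a uniform stationary distribution, the means should be close to one, and the standard deviations indicate how much scatter of the $\chi_i$ is compatible with correct weights for the chosen number of trials.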
Update Schemes for the Weights wi. Initial guesses for
the weights required in ST can be obtained from short
preparatory simulations using the formulas presented in the
previous paragraph. The correspondingly limited statistical
accuracy of the initial weights may entail a strongly
nonuniform sampling along the temperature ladder in
subsequent ST production runs. However, one may improve
the initial guesses by utilizing the information accumulated
in the course of the production run. For this purpose different
adaptation schemes were suggested.48-51
We used a procedure based on the following considerations: Up to the first update the sampling along the
temperature ladder is expected to be far from uniform.
Consequently, the potential energy distribution is poorly
sampled at some temperature rungs. To keep these poor
statistics from affecting the estimated weights, we
determine the first update by $w_i^{\mathrm{new}} = w_i^{\mathrm{old}} + \ln(t_0/t_i)$, where
we set the $t_i$ for rungs that have not been visited at all to a
full exchange period. Thus, badly sampled rungs will be
preferentially sampled until the next update step. By construction this strategy leads to uniform sampling but may
suffer from slow convergence. Therefore, we subsequently
switch to a periodic recomputation of the weights either
by the trapezoid rule (eqs 12 and 13) or by the WHAM
formula (eq 14). In these recomputations, which are executed
after each nanosecond of simulation, we exclusively consider
the data from the production run and discard those of the
preparatory simulation.
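The exceptional first update can be illustrated by a short sketch. It assumes that ti denotes the time the replica has spent at rung i since the start of the production run; the function name and the argument layout are hypothetical.

```python
import math


def first_weight_update(weights, residence_times, exchange_period):
    """Return w_i + ln(t_0 / t_i) for every rung i.

    residence_times[i] is the (assumed) time spent at rung i so far.
    Rungs that were never visited are assigned one full exchange period,
    which raises their weight and makes them preferentially sampled.
    """
    times = [t if t > 0 else exchange_period for t in residence_times]
    t0 = times[0]
    return [w + math.log(t0 / t) for w, t in zip(weights, times)]
```

Rungs visited more often than rung 0 receive a negative correction, rungs visited less often a positive one, so the subsequent sampling is pushed toward uniformity.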
Exchange Scheme. Having established the determination
of the weights steering the exchange probabilities, we now
sketch the exchange algorithms employed for ST. A straightforward exchange procedure is to choose randomly between
an upward or downward exchange trial with probabilities of
50%. We call this exchange procedure “stochastic even/odd”
(SEO) scheme because the corresponding exchange scheme
for replica exchange is a stochastic instead of a deterministic
choice between two groups of replica pairs.52 The first group
contains all “even” pairs $(T_{2n}, T_{2n+1})$ and the second one all
“odd” pairs $(T_{2n-1}, T_{2n})$.
Simulated Solute Tempering
J. Chem. Theory Comput., Vol. 5, No. 10, 2009 2851
Table 1. Overview of Simulations Conducted (a)

| label | trajectory time span/ns | temperature range/K | no. of rungs | solvent scaling | determination of weights | exchange scheme |
|---|---|---|---|---|---|---|
| CRE | 18 × 0.1 | 300-500 | 18 | no | - | DEO |
| CST | 27 | 300-500 | 18 | no | trapezoid | SEO |
| REST/A | 4 × 0.1 | 300-500 | 4 | yes | - | DEO |
| REST/B | 5 × 0.1 | 300-500 | 5 | yes | - | DEO |
| REST/C | 4 × 12 | 300-500 | 4 | yes | - | DEO |
| SST/A | 27 | 300-500 | 4 | yes | trapezoid | SEO |
| SST/B | 27 | 300-500 | 4 | yes | WHAM | SEO |
| SST/C | 12 | 300-500 | 4 | yes | WHAM | DEO |
| SST/D | 12 | 300-500 | 5 | yes | WHAM | DEO |

(a) The weights for the ST simulations have been determined either by the trapezoid rule, eq 12 for CST and eq 13 for SST, respectively,
or by the WHAM formula eq 14.
However, the standard exchange scheme used in replica
exchange simulations is characterized by alternate exchange
trials between these two groups of replica pairs,3,34,52,53
which we have called the “deterministic even/odd” (DEO)
scheme.52 Formally, one can express the DEO exchange
scheme by a relation which combines the involved temperature indices i and i′ at the exchange attempt step s by
$i' = i + (-1)^{i+s}$. In the framework of simulated tempering DEO
alternately tries to shift the single replica at $T_i$ to $T_{i+1}$
(upward) or $T_{i-1}$ (downward). In the case of a successful
exchange, however, the previous exchange direction (upward
or downward) is maintained for the next exchange trial and
so forth until a temperature exchange fails.
We apply both exchange schemes to investigate their
influence on the diffusion of the system through the temperature space. This diffusion can be measured in terms of
the average round trip time τ required to travel from $T_0$ to
$T_{N-1}$ and back to $T_0$. For the SEO scheme applied to a
temperature ladder with uniform average acceptance probabilities $\bar{P}_{ij} = \bar{P}$, τ in units of the time between exchange
trials is related to the average acceptance probability $\bar{P}$
by35,52,54,55
$$\tau_{\mathrm{SEO}} = \frac{2N(N-1)}{\bar{P}} \qquad (17)$$
For DEO, τ is given by
$$\tau_{\mathrm{DEO}} = \tau_{\mathrm{SEO}}\,(1 - \bar{P}) \qquad (18)$$
in the limit of large N.52 Assuming this limit, DEO always
provides shorter round trip times than SEO or, equivalently,
higher round trip rates τ-1. Thus, the question is to what
extent this expectation is confirmed for finite ladder sizes
N.
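Both schemes can be compared for a finite ladder with a simple random-walk model of a single ST replica. The sketch below assumes one uniform acceptance probability for all neighboring rungs and treats trials that would leave the ladder as rejected; the function name is hypothetical, and the DEO direction follows the relation i′ = i + (-1)^(i+s) given above.

```python
import random


def mean_round_trip_time(n_rungs, p_accept, n_trials, scheme, seed=0):
    """Mean round trip time (in exchange trials) of a single ST replica.

    scheme = "SEO": the trial direction is drawn at random at every step.
    scheme = "DEO": the direction at step s follows i' = i + (-1)**(i + s).
    A round trip runs from rung 0 to rung n_rungs - 1 and back to rung 0.
    """
    rng = random.Random(seed)
    rung = 0
    heading_up = True  # True while the walker still has to reach the top
    n_round_trips = 0
    last_completion = 0
    for s in range(n_trials):
        if scheme == "SEO":
            step = 1 if rng.random() < 0.5 else -1
        elif scheme == "DEO":
            step = 1 if (rung + s) % 2 == 0 else -1
        else:
            raise ValueError("scheme must be 'SEO' or 'DEO'")
        target = rung + step
        # Trials leading off the ladder count as failed exchanges.
        if 0 <= target < n_rungs and rng.random() < p_accept:
            rung = target
        if heading_up and rung == n_rungs - 1:
            heading_up = False
        elif not heading_up and rung == 0:
            heading_up = True
            n_round_trips += 1
            last_completion = s + 1
    if n_round_trips == 0:
        raise RuntimeError("no round trip completed; increase n_trials")
    return last_completion / n_round_trips
```

For N = 8 and an acceptance probability of 0.3, eq 17 gives about 373 exchange trials per SEO round trip; running the function with `scheme="SEO"` and `scheme="DEO"` on the same ladder shows how far the finite-N result for DEO departs from the large-N limit of eq 18.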
Simulation System and Force Field. We have used MD
simulations of a poly alanine octapeptide (8ALA), saturated
with an acetyl group at the N-terminus and an N-methyl
group at the C-terminus, to investigate the benefits of the
various algorithms introduced above. The peptide was
described by the CHARMM22 force field56 and solvated in
a periodic orthorhombic dodecahedron of 18 Å inscription
radius containing 1112 water molecules. For the water
molecules we employed the transferable three-point intermolecular potential (TIP3P),57 modified as suggested by
MacKerell et al.56 for usage with CHARMM22. The initial
8ALA structure was generated using the Molden software58
by setting the backbone dihedral angles to the values φ =
-58° and ψ = -47° to form an ideal α-helix.
MD Simulation Techniques. The software package EGO-MMVI59 was used for all MD simulations. The electrostatic
interactions were treated combining structure-adapted multipole expansions60 with a moving-boundary reaction-field
approach.59 Here, the cutoff radius for the explicit evaluation
of the electrostatic interactions was 18 Å. Beyond this radius,
a dielectric continuum was assumed with a static dielectric
constant εs = 80. The explicit van der Waals interactions
were calculated up to a distance of 10 Å and at larger
distances a mean-field approach was applied.61 A multiple-time-step integration scheme62 with a fastest time step of 2
fs was used. For bonds that include hydrogen atoms, the
corresponding bond lengths were constrained using the
M-SHAKE algorithm63 with a relative tolerance of 10⁻⁶.
System Preparation. Our simulation system was equilibrated for 100 ps with two Berendsen thermostats64 (coupling
times 0.1 ps) separately keeping the solute and the solvent
at 300 K. Additionally, a Berendsen barostat64 (coupling time
1 ps) steered the system to ambient pressure (1 bar). For the
subsequent tempering runs we switched from an NPT to an
NVT ensemble.
Simulation Runs. As listed in Table 1 we carried out three
short replica exchange simulations [CRE and REST/(A,B)]
serving to estimate the initial weights for the extended
simulated tempering simulations CST and SST/A-D. For
CRE and CST we used the temperature ladder 300 K, 308
K, 317 K, 326 K, 336 K, 346 K, 356 K, 367 K, 378 K, 390
K, 402 K, 415 K, 428 K, 442 K, 456 K, 470 K, 485 K, and
500 K. We found average acceptance probabilities $\bar{P}_{i,i+1}$
between 5% and 14% for CRE and between 21% and 31%
for CST. For REST/(A,C) and SST/A-C we used the ladder
300 K, 350 K, 415 K, and 500 K. The corresponding $\bar{P}_{i,i+1}$
range from 5% to 12% (REST) and from 19% to 28% (SST).
For REST/B and SST/D, the five temperatures 300 K, 340
K, 387 K, 440 K, and 500 K were used, yielding $\bar{P}_{i,i+1}$
between 15% and 19% (REST) and between 38% and 41%
(SST). Exchanges were tried every 0.5 ps in all simulations.
Tables S1 and S2 in the Supporting Information provide
details about the average exchange probabilities along the
various ladders. These values were used for setting up the
MC simulations mentioned further above.
The extended simulations REST/C, CST, and SST/(A,B)
serve us for comparisons of methods and address, in
particular, the applicability of different adaptation schemes
to SST. Furthermore, using the simulations SST/C and
SST/D we will study the effects of the chosen exchange
scheme (SEO vs DEO) and of the (overall) average exchange
probability $\bar{P} = \langle \bar{P}_{ij} \rangle$ on the round trip rates τ⁻¹ and the
sampling speeds.
Sampling Speed. The main objective of tempering
methods is to enhance the sampling speed of the simulation.
A corresponding measure for the sampling speed is given
by an algorithm recently suggested by Lyman and Zuckerman,65 which we will denote as LZA. LZA integrates the
“volume” in configurational space sampled by a trajectory.
The average volume sampled during a given simulation time
span provides a measure for the sampling speed.
For 8ALA we define the conformational space by the eight
dihedral angles ψi spanned by the backbone units $N_i - C_i^{\alpha} - C_i - N_{i+1}$. From a trajectory of the eight-dimensional
tuples (ψ1,...,ψ8), LZA randomly chooses one tuple and
removes it from the trajectory together with all other tuples
lying within the sphere of predefined radius r around the
chosen tuple. This procedure is repeated until all tuples of
the initial trajectory have been removed. The number of steps
required is a dimensionless measure for the configurational
volume Vc sampled by the trajectory. Because this algorithm
is nondeterministic, it is repeated m times, and the corresponding average number nlza of required steps is calculated.
For our analyses we choose $r = 25^\circ\sqrt{8}$ and m = 50. For a
fair comparison between ST and RE, we compute the
sampling speed per replica, i.e. one for ST and N for RE.
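The removal procedure described above can be sketched as follows. The sketch assumes a Euclidean distance in the eight-dimensional angle space with each angle difference wrapped to the 360° period; the distance definition and all function names are assumptions of this illustration, not details taken from the LZA reference.

```python
import math
import random


def angular_distance(a, b):
    """Euclidean distance between two tuples of angles given in degrees,
    with every component wrapped to the 360-degree period."""
    total = 0.0
    for x, y in zip(a, b):
        d = abs(x - y) % 360.0
        d = min(d, 360.0 - d)
        total += d * d
    return math.sqrt(total)


def lza_steps(tuples, radius, rng):
    """One pass: number of random sphere removals needed to empty the list."""
    remaining = list(tuples)
    steps = 0
    while remaining:
        center = rng.choice(remaining)
        # Drop the chosen tuple and everything inside its sphere.
        remaining = [t for t in remaining if angular_distance(t, center) > radius]
        steps += 1
    return steps


def lza_volume(tuples, radius, m=50, seed=0):
    """Average number of steps over m repetitions of the random pass."""
    rng = random.Random(seed)
    return sum(lza_steps(tuples, radius, rng) for _ in range(m)) / m
```

Applying `lza_volume` to trajectory segments of increasing length gives the sampled volume as a function of simulation time, from which a sampling speed can be read off.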
Table 2. Weights Determined from the CRE Simulation (a)

| i | Ti | wi (trapezoid) | wi (WHAM) | ∆wi | χ̂i |
|---|---|---|---|---|---|
| 0 | 300 K | 0.0 | 0.0 | -0.00 | 1.06 |
| 1 | 308 K | 482.58 | 482.59 | -0.01 | 1.05 |
| 2 | 317 K | 991.97 | 991.99 | -0.02 | 1.04 |
| 3 | 326 K | 1468.66 | 1468.69 | -0.03 | 1.03 |
| 4 | 336 K | 1963.47 | 1963.49 | -0.02 | 1.04 |
| 5 | 346 K | 2425.14 | 2425.17 | -0.03 | 1.03 |
| 6 | 356 K | 2856.65 | 2856.68 | -0.03 | 1.03 |
| 7 | 367 K | 3299.53 | 3299.56 | -0.03 | 1.03 |
| 8 | 378 K | 3712.36 | 3712.41 | -0.05 | 1.00 |
| 9 | 390 K | 4131.64 | 4131.73 | -0.09 | 0.97 |
| 10 | 402 K | 4521.49 | 4521.56 | -0.07 | 0.99 |
| 11 | 415 K | 4913.99 | 4914.07 | -0.08 | 0.98 |
| 12 | 428 K | 5278.33 | 5278.44 | -0.11 | 0.95 |
| 13 | 442 K | 5642.42 | 5642.50 | -0.08 | 0.98 |
| 14 | 456 K | 5980.14 | 5980.21 | -0.07 | 0.99 |
| 15 | 470 K | 6294.02 | 6294.09 | -0.07 | 0.99 |
| 16 | 485 K | 6606.46 | 6606.55 | -0.09 | 0.97 |
| 17 | 500 K | 6896.68 | 6896.80 | -0.12 | 0.94 |

(a) The weights wi determined by the trapezoid rule eq 12 and by
the WHAM formula eq 14 together with the deviations ∆wi and the
correspondingly predicted (cf. eq 16) uniformity measures χ̂i. The
WHAM weights were employed as starting values for the CST
simulation.
Results and Discussion
At the start of an ST simulation, initial estimates for the
weight parameters wi are needed. We determined these
estimates from preparatory simulations using both the
approximate trapezoid rule eq 12 and the asymptotically
unbiased WHAM formula eq 14. Now, a first issue is the
reliability of the trapezoid rule, which we check using the
100 ps CST simulation.
Reliability of the Trapezoid Rule for CST. Table 2
compares the initial CST weights wi determined by the
trapezoid rule eq 12 from the preparatory CRE simulation
with the asymptotically unbiased values calculated by the
WHAM formula eq 14. For all Ti the wi obtained by the two
formulas agree quite well. The errors ∆wi of eq 12 never
exceed 12% of kBTi, and, correspondingly, the uniformity
measures χˆ i are all close to 1.0. Hence, one expects a nearly
uniform sampling even if the weights are determined by eq
12. Thus, adaptation schemes, which are based on the
trapezoid rule and on the WHAM formula, respectively,
should be nearly equivalent for the given system. In the CST
simulation we, therefore, applied the trapezoid rule for the
periodical recomputation of the wi.
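A periodic recomputation of the weights by the trapezoid rule can be sketched as follows, using the form derived in the Appendix (eq 24) with the internal energy replaced by the average potential energy. The sketch assumes w0 = 0, energies in kcal/mol, and temperatures in K; the function name and the unit choice are assumptions of this illustration.

```python
K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def trapezoid_weights(temperatures, mean_energies, k_b=K_B):
    """Cumulative weights with w_0 = 0 from
    w_{i+1} - w_i = (beta_{i+1} - beta_i) * (<E>_i + <E>_{i+1}) / 2."""
    betas = [1.0 / (k_b * t) for t in temperatures]
    weights = [0.0]
    for i in range(len(temperatures) - 1):
        delta_beta = betas[i + 1] - betas[i]
        mean_e = 0.5 * (mean_energies[i] + mean_energies[i + 1])
        weights.append(weights[-1] + delta_beta * mean_e)
    return weights
```

Because the inverse temperature decreases along the ladder, strongly negative average potential energies yield weights that grow with temperature, which is the trend seen in Table 2.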
Representative for the eighteen weights, Figure 1(a) shows
the deviation of the weights w8 and w17 from their initial
values as a function of the simulation time. The exceptional
first update $w_k^{\mathrm{new}} = w_k^{\mathrm{old}} + \ln(t_0/t_k)$, which can be seen as a
special case of the update scheme proposed by Zhang and
Ma,51 sizably reduces both weights and reflects the nonuniform sampling within the preceding first nanosecond of the
Figure 1. Uniformity of the temperature sampling in the CST
simulation. (a) Time evolution of the weights w8 and w17 with
respect to their initial values. (b) Uniformity measures observed (χi, eq 15, circles) and predicted (χˆ i, eq 16, squares)
after 27 ns at the temperatures Ti. The standard deviations
were estimated from MC trial simulations (see text for further
details). The dotted lines serve as a guide for the eye.
CST simulation. Here, the temperatures T8 = 378 K and T17
= 500 K apparently have been visited more frequently than
T0 = 300 K. The following updates, which rely on eq 12,
lead to considerable changes of the weights, which, however,
become smaller toward the end of the simulation. After
27 ns the weights seem to be converged within roughly
±0.1. This is approximately the same magnitude of error as
the one introduced by the trapezoid rule.
Figure 1(b) shows the measured (circles) and expected
(squares) uniformity measures χi and χˆ i extracted from the
last 25 ns of the CST simulation as functions of the
temperature. In contrast, the uniformity data shown in Table
2 had been extracted from the much shorter preparatory CRE
simulation. According to eq 16 the observables χˆ i reflect the
average deviations ∆wi between the trapezoid weights used
during the CST simulation and WHAM weights calculated
a posteriori from that simulation. As one sees in the figure,
the expectation values χˆ i are close to one demonstrating that
the trapezoid rule induces only small errors into the weights.
These data confirm the claim9 that the trapezoid rule is
appropriate for choosing the weights in conventional ST
simulations.
Despite the expected nearly uniform
sampling along the temperature ladder, the values χi measured
for the CST simulation deviate substantially from one,
yielding a root-mean-square deviation (RMSD) from uniformity of 16%. Because the weights are calculated with
reasonable accuracy, this nonuniformity of the sampling must
be due to a too short CST simulation time. We checked this
issue by the simple MC model for the CST simulation
described in Theory and Methods, because here the expected
deviations from a uniform sampling can be reliably determined.
The gray bars in Figure 1(b) measure the standard
deviations σi of the χi resulting from 1000 MC model
simulations covering the same number of exchange trials as
our CST simulation. For normally distributed fluctuations one would expect that erf(1/√2) ≈
68% of the χi are found at smaller deviations than σi. In
fact, 12 of the 18 χi (67%) are within the corridor marked by the
σi, which nicely reproduces the expected statistics. Quite
clearly the standard deviations σi can be reduced by extending
the simulation time, which will then also lead to a CST
sampling close to uniformity. Now the question is whether
one can estimate the simulation time required for a reasonably uniform sampling. This issue can be addressed by
considering the round trip rate τ-1 given by eq 17.
CST Round Trip Rates. From the MC model simulations
we calculated an average round trip rate τ⁻¹ of 0.83 ns⁻¹
with a standard deviation of 0.1 ns⁻¹. Equation 17 gives an
exact expression for τ applying to the SEO exchange scheme
used in the CST simulation. This expression rests on the
assumption of identical acceptance probabilities $\bar{P}$ for
exchanges along the ladder. The $\bar{P}$ determined from the CST
simulation is about 26%, and the resulting value τ⁻¹ = 0.85
ns⁻¹ is very close to the MC result.
However, the round trip rate observed in the CST
simulation is sizably smaller measuring 0.64 ns-1. This
deviation suggests that the time interval of 0.5 ps between
subsequent exchange trials is too short to yield statistically
independent configurations, i.e. that the autocorrelation time
of the energy exceeds 0.5 ps. Thus, the system still has some
memory of the previous exchange trial, which, however, is
tolerable for most practical purposes. Furthermore, a round
trip rate of 0.64 ns-1 means that only 16 round trips were
counted during the CST simulation which is the main cause
for the observed 16% RMSD from uniform sampling. To
halve this RMSD, a 4-fold number of round trips and, thus, a
4-fold simulation time would be necessary. Accordingly, one
can estimate the number of round trips needed to achieve a
desired level of uniform sampling. In turn, one can a priori
estimate the required simulation time by multiplying this
number by the predicted round trip time given in eq 17.
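The a priori estimate can be written down in a few lines. The sketch evaluates eq 17 for a given ladder and exchange interval and assumes, in line with the 4-fold argument above, that the RMSD from uniformity scales with the inverse square root of the number of round trips; the function names are hypothetical.

```python
def round_trip_time_seo(n_rungs, p_accept):
    """Eq 17: mean round trip time in units of the exchange interval."""
    return 2.0 * n_rungs * (n_rungs - 1) / p_accept


def round_trip_rate_per_ns(n_rungs, p_accept, exchange_interval_ps):
    """Predicted SEO round trip rate in 1/ns."""
    tau_ps = round_trip_time_seo(n_rungs, p_accept) * exchange_interval_ps
    return 1000.0 / tau_ps


def required_time_ns(current_time_ns, current_rmsd, target_rmsd):
    """Simulation time needed for a target RMSD, assuming the RMSD
    scales as 1 / sqrt(number of round trips)."""
    return current_time_ns * (current_rmsd / target_rmsd) ** 2
```

With N = 18, an acceptance probability of 0.26, and an exchange interval of 0.5 ps, `round_trip_rate_per_ns` returns about 0.85 ns⁻¹, the value quoted above for CST; halving the 16% RMSD of the 27 ns run would, under the stated scaling assumption, require about 108 ns.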
Table 3. Weights Determined from the REST/A Simulation (a)

| i | Ti | wi (trapezoid) | wi (WHAM) | ∆wi | χ̂i |
|---|---|---|---|---|---|
| 0 | 300 K | 0.0 | 0.0 | 0.00 | 1.12 |
| 1 | 350 K | -16.91 | -16.89 | -0.02 | 1.09 |
| 2 | 415 K | -35.64 | -35.52 | -0.12 | 0.99 |
| 3 | 500 K | -56.16 | -55.83 | -0.33 | 0.80 |

(a) The weights wi were determined from the initial 100 ps of the
REST/A simulation by the trapezoid rule eq 13 and the WHAM
formula eq 14 together with the deviations ∆wi and the
corresponding uniformity measures χ̂i. The WHAM weights serve
as starting values for the SST/A and SST/B simulation.
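The tabulated χ̂i can be cross-checked against the deviations ∆wi with a short calculation. The sketch assumes that the predicted uniformity measure has the form χ̂i = N exp(∆wi) / Σj exp(∆wj); this form is an assumption of the illustration, chosen because it reproduces the last column of Table 3, and the function name is hypothetical.

```python
import math


def predicted_uniformity(delta_w):
    """chi_hat_i = N * exp(dw_i) / sum_j exp(dw_j) (assumed form)."""
    boltz = [math.exp(d) for d in delta_w]
    norm = sum(boltz)
    return [len(delta_w) * b / norm for b in boltz]


delta_w_rest_a = [0.00, -0.02, -0.12, -0.33]  # column "∆wi" of Table 3
chi_hat = predicted_uniformity(delta_w_rest_a)
print([round(c, 2) for c in chi_hat])  # → [1.12, 1.09, 0.99, 0.8]
```

The same function applied to the ∆wi column of Table 2 gives values between 0.94 and 1.06, in line with the nearly uniform sampling predicted for CST.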
Next, we will study the sampling behavior of SST for
which we will additionally examine the adaptation scheme
based on the WHAM formula.
Reliability of the Trapezoid Rule for SST. Table 3
shows initial weights wi determined from the short REST/A
simulation. Because the solvent-solvent interactions do not
contribute to the partition function of SST, these weights
are tiny compared to those given in Table 2. Furthermore,
the deviations ∆wi between the trapezoid rule eq 13 and the
WHAM formula eq 14 are much larger than those listed in
Table 2. The associated uniformity measures χˆ i predict that
errors of this size will lead to a considerable nonuniformity
of the SST sampling if the wi are calculated by the trapezoid
rule. Recall here that this rule can be derived based on two
assumptions: (i) the heat capacity at consecutive temperatures
Ti and Ti+1 is constant and (ii) the logarithm ln(1 + ∆Ti+1,i/
Ti) is well approximated by its first order Taylor expansion.
These conditions are harder to fulfill for SST than for CST
because here the temperature steps ∆Ti+1,i are larger.
Figure 2 compares the effects of applying the trapezoid
and WHAM rules, respectively, for updating the wi during
SST simulations. Figure 2(a) shows the deviation of the
weight w3 belonging to T3 ) 500 K during the SST/A (gray
line, trapezoid) and SST/B (black line, WHAM) simulations
from the initial value. The first update drastically changes
w3 in both simulations indicating that w3 has been poorly
estimated by the preparatory simulation. The subsequent
updates reduce the large initial change to a final deviation
of about 1.4 in both cases.
At first glance, the small difference of the resulting w3
values suggests that the errors of the trapezoid rule are much
smaller than predicted by the preparatory REST/A simulation. To check this issue, we have recalculated the weights
wi of the SST/A simulation a posteriori by the WHAM
formula. The resulting time evolution of w3 is depicted in
Figure 2(a) by the gray dotted line. The difference of
0.22 ± 0.02 between the dotted gray and the solid gray lines
is nearly constant during the simulation. Obviously, the
trapezoid rule systematically underestimates w3. A similar
underestimate appears already in the initial guess for ∆w3
given in Table 3. Because of this systematic error of the
trapezoid rule, the uniformity of the temperature sampling
is expected to be suboptimal in SST/A.
Figure 2(b) shows the measured (circles) and predicted
(squares) uniformity measures χi and χˆ i of the SST/A
simulation. As indicated by the squares, the average deviations ∆wi between the trapezoid and the WHAM rules predict
deviations of up to 13% from uniformity. The measured χi
Figure 2. Uniformity of temperature sampling in SST simulations. (a) Deviation of the weight w3 from its initial value in
simulation SST/A (gray) and SST/B (black), respectively. The
dotted line shows w3′ calculated a posteriori from SST/A using
the WHAM expression eq 14. The broken w3 axis serves to
simplify the comparison with Figure 1(a). (b) Measured and
predicted uniformity measures χi and χˆ i of SST/A and (c) of
SST/B.
(circles) essentially follow these expectations but show even
larger deviations from uniformity, yielding an RMSD of
14%. For instance, χ3 happens to deviate by about two
standard deviations from the respective expectation value χˆ 3.
Like for CST, the standard deviations shown as gray bars in
the figure were determined from additional MC simulations.
Figure 2(c) compares the uniformity measures of the
SST/B simulation. Because the WHAM formula is the
reference, the errors ∆wi vanish and eq 16 predicts a uniform
sampling χ̂i = 1.0 at all temperatures. In fact, the measured
χi are close to 1.0 and show an RMSD of only 3%. Thus,
SST/B exhibits an almost perfectly uniform sampling implying that the WHAM formula should be used in SST
simulations for updating the wi. The remaining deviations
from uniformity are consistent with the narrow range of the
statistical fluctuations estimated by our separate MC simulations. Compared with CST, the much smaller deviations of
the χi from the predictions χˆ i indicate that many more round
trips must have occurred during the SST simulations.
SST Round Trip Rates. Equation 17 predicts a round
trip rate of 20 ns⁻¹ for the simulations SST/(A,B), if the
measured average acceptance probability $\bar{P}$ = 24% is used.
Our MC models of SST/(A,B) have reproduced this rate.
For the MD simulations SST/A and SST/B, however, we
found round trip rates of only 15.8 ns⁻¹ and 16.5 ns⁻¹,
respectively. Thus, the SST simulations apparently display
Figure 3. Volumes Vc sampled within 0.25 ns, 0.5 ns, 1 ns,
and 2.5 ns by the simulations CST, SST/B, and REST/C. Due
to the linearity of the shown Vc(t) curves, the values at 1 ns
represent the sampling speeds S in units of nlza/ns. (a)
Volumes Vc(t) sampled by the trajectories at all temperatures.
According to the inset the Vc(t) curves of REST/C and SST/B
are so close that they cannot be distinguished in the main
plot. (b) Volume Vc(t) sampled at the target temperature T0 =
300 K. Note that the statistical errors of the measured volumes
are smaller than the symbol sizes.
the same memory effect which was already observed in the
CST simulation and which reduces the round trip rates by
20%. Nevertheless, the SST round trip rates are by a factor
of 26 larger than the CST rates. Correspondingly, the χi are
much better converged in SST than in CST.
The large round trip rates and the nearly uniform sampling
achieved in simulation SST/B suggest that
this simulation setting yields improved statistics
for a peptide in solution. We will now address this issue for our
sample peptide 8ALA in TIP3P water.
Sampling Speed of CST, SST, and REST. The purpose
of any tempering algorithm is to increase the sampling speed
within the conformational space of the studied system. For
our various simulation settings we determined the sampling
speed using the LZA algorithm described in the Theory and
Methods section. This iterative algorithm measures the
volume Vc(t) of the configuration space sampled within a
simulation time t through an average number nlza of iterations.
The simulation speed S(t) is then given by the time derivative
of Vc(t).
Figure 3(a) shows the volumes Vc sampled by the
simulations CST, SST/B, and REST/C at all temperatures
as functions of the simulation time t. The respective sampling
speeds S are the constants Vc(t)/t at t = 1 ns. The Vc curves
of REST/C and SST/B cannot be graphically distinguished
at the given scale as is documented by the inset in Figure
3(a). Apparently, CST provides the highest overall sampling
speed of the three simulations. The sampling speeds of SST/B
and REST/C are about 10% smaller, which may be caused
by the scaling of the solvent part of the Hamiltonian
corresponding to an effectively cooler environment.
We have checked the latter conjecture by two MD
simulations at 500 K (data not shown) with and without
solvent scaling. Here, the effectively cooler solvent indeed
reduces the sampling speed of 8ALA by about 20%. For
lower temperatures we expect this effect to be correspondingly smaller. However, this small effect is tolerable if the
sampling speed at the target temperature T0 = 300 K is
sufficiently enhanced, which is after all the aim of solute
tempering methods.
Figure 3(b) compares the sampling speeds at T0 = 300 K
for the three methods. In contrast to the sampling speed of
the generalized ensemble, at 300 K REST/C samples the
peptide conformations 2.1 times faster than CST, and SST/B
outperforms CST even by a factor of 2.8. Thus, SST/B
samples also faster than REST/C, although the two simulations employ the same temperature ladder and the same
solvent scaling. An explanation of this speedup is given by
the different round trip rates of 11.1 ns-1 for REST/C and
16.4 ns-1 for SST/B. Due to the higher rate, SST delivers
the structural information that is gathered at higher temperatures faster to the target temperature implying an enhanced
speed of conformational sampling at T0.
The reduced round trip rate of REST/C compared to SST/B
directly results from the fact that for a given temperature
ladder the average acceptance probabilities $\bar{P}_{\mathrm{RE}}$ of RE
methods (including their sequential versions13) are smaller
than the probabilities $\bar{P}_{\mathrm{ST}}$ of the corresponding ST methods.40-42
The reason is that in RE the configurations of two replicas
must simultaneously meet a certain energy criterion instead
of only one replica in ST. Therefore, the average acceptance
probability $\bar{P}_{\mathrm{RE}}$ should be approximately the square of $\bar{P}_{\mathrm{ST}}$.
For example, the average acceptance probability of SST/B
is about 26%. Thus, we expect a probability of 7% (0.26² ≈
0.07) for REST/C, which is close to the measured value of
9%.
Optimal Exchange Scheme. Because the acceptance
probability of REST/C is much smaller than that of SST/B,
eq 17 predicts likewise different round trip rates. Compared
with that expectation the round trip rate measured for
REST/C (11.1 ns⁻¹) seems to be too high relative to SST/B
(16.5 ns⁻¹). This large REST/C rate illustrates the advantage
of the employed DEO exchange scheme compared to the
SEO scheme of SST/B.52 Furthermore, the optimal exchange
probabilities $\bar{P}$ which yield the highest round trip rates are
different for these two schemes. For SEO the optimal $\bar{P}$ is
23%,52,66 whereas for DEO the optimal $\bar{P}$ is between 40%
and 45% depending on the ladder size.39,52
To investigate the effects of the exchange scheme and the
acceptance probability on the SST round trip rates we have
carried out the two 12 ns simulations SST/C and SST/D.
SST/C switches from SEO to DEO, and SST/D additionally
uses five instead of four rungs to span the temperature range
from 300 K up to 500 K (see Table 1). SST/D thereby
increases $\bar{P}$ to about 40%. For SST/C we found a round trip
rate of 20.0 ns⁻¹, which is about 20% larger than that of SST/B. Thus, the DEO exchange scheme indeed speeds up the
Figure 4. Sampled volumes for the simulations SST/B, SST/C, and SST/D at T0 = 300 K.
Table 4. Round Trip Rates and Sampling Speeds
Measured at Temperature T0 = 300 K

| label | round trip rate/ns⁻¹ | speed/(nlza/ns) |
|---|---|---|
| CST | 0.63 | 68 |
| REST/C | 11.1 | 146 |
| SST/A | 15.8 | 204 |
| SST/B | 16.5 | 190 |
| SST/C | 20.0 | 216 |
| SST/D | 23.8 | 183 |
round trips sizably. An additional increase of $\bar{P}$ in simulation
SST/D leads to a still larger round trip rate of 23.8 ns⁻¹.
Figure 4 shows the effects of the round trip rates, which
increase in the sequence SST/B, SST/C, and SST/D, on the
sampling speed at 300 K. The highest sampling speed is
achieved by SST/C. Compared to SST/B the DEO exchange
scheme increases the sampling speed by about 15%. This
increase corresponds to the enhancement of the round trip
rate. Interestingly, the sampling speed of SST/D is smaller
than that of SST/B despite the much larger round trip rate.
Here, the lower sampling speed is caused by the 25% reduced
sampling time at 300 K due to the additional temperature
rung, which is not compensated by the higher round trip rate.
Finally, Table 4 summarizes the round trip rates and
sampling speeds measured in our simulations. Using REST
instead of CST speeds up the sampling by a factor of 2,
whereas SST yields a speedup factor of 3. Note, however,
that these factors are conservative estimates because of the
particular choice of the 8ALA system. In this system the
enthalpic barriers are small, and, therefore, the benefit of
tempering methods is limited (see Introduction).22 Correspondingly, the sampling speeds do not increase very much
upon heating the system from the lowest to the highest
temperature. However, target applications of tempering
methods feature large enthalpic barriers14,67 for which the
sampling has to be accomplished mainly at high temperatures. Correspondingly, the sampling speed at T0 should
depend much more strongly on the round trip rates. As a key result,
SST in combination with the DEO exchange scheme should
generally show much better sampling properties than REST
or CST.
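The speedup factors quoted above follow directly from the sampling speeds in Table 4. The short sketch below recomputes them; the dictionary layout and the function name are choices made for this illustration.

```python
# label: (round trip rate in 1/ns, sampling speed in n_lza/ns), from Table 4
TABLE_4 = {
    "CST": (0.63, 68),
    "REST/C": (11.1, 146),
    "SST/A": (15.8, 204),
    "SST/B": (16.5, 190),
    "SST/C": (20.0, 216),
    "SST/D": (23.8, 183),
}


def speedup(label, reference="CST"):
    """Ratio of the sampling speeds at 300 K of two simulations."""
    return TABLE_4[label][1] / TABLE_4[reference][1]


for label in ("REST/C", "SST/B", "SST/C"):
    print(f"{label}: {speedup(label):.1f} x CST")
```

Calling `speedup("SST/C", reference="SST/B")` compares the DEO and SEO runs on the same four-rung ladder.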
Conclusion
We have introduced simulated solute tempering (SST) which
combines the (serial) simulated tempering method with solute
tempering, i.e. the key idea of the REST approach. SST poses
an efficient alternative to conventional simulated tempering
and replica exchange, including REST and its sequential
version SREST because it offers the largest acceptance
probabilities for a given temperature ladder.
From a practical point of view, it is gratifying to note that
SST can be easily implemented. For example, for rigid
models of the solvent molecules only the partial charges and
the van der Waals parameters have to be scaled to generate
the modified Hamiltonians at higher temperatures. Furthermore, SST enables a parallel sampling of many replicas even
on heterogeneous computer clusters, because all replicas
travel independently through temperature space.
The necessary ingredients of SST are the weights, i.e. the
dimensionless free energies of the system at the rungs of
the temperature ladder. The trapezoid rule recently suggested
by Park and Pande9 for the computation of the weights is
not accurate enough for SST but well suited for CST. Our
rederivation of this rule has shown that it is only accurate
for temperature ladders featuring small temperature differences, which is the case for CST. In SST a few rungs suffice
to span a large temperature range. Due to the failure of the
trapezoid rule, the SST weights should be updated using the
more complex but asymptotically unbiased WHAM formula
of Kumar et al.45 Then an almost perfectly uniform sampling
of the temperature rungs is guaranteed, if the simulation time
exceeds the average round trip time by about 2 orders of
magnitude.
Our comparison of different sampling methods (REST,
CST, SST) applied to an octapeptide in explicit water has
demonstrated that the SST sampling is the most efficient one
as was shown by the largest round trip rate and the highest
sampling speed at T0. Finally we have shown that the round
trip rates can be maximized by using the DEO instead of the
SEO exchange scheme and by choosing a temperature ladder
that provides acceptance probabilities close to 45%.
In conclusion, the sampling efficiency of SST as well as
its ease of implementation and application nourishes the hope
that simulated tempering will become more popular and that
we may see many exciting applications in the future.
Acknowledgment. This work was supported by the
Deutsche Forschungsgemeinschaft (Grants SFB 533/C1 and
SFB 749/C4). Computer time provided by Leibniz Rechenzentrum (project uh408) is gratefully acknowledged.
Supporting Information Available: Average acceptance probabilities $\bar{P}_{i,i\pm 1}$ for the various simulations (Tables S1 and S2). This material is available free of charge via the Internet at http://pubs.acs.org.

Appendix
For a canonical ensemble with a heat capacity $C_V$ independent of the temperature, eq 12 can be derived by the following physical considerations. Using the shorthand notation $\Delta X_{ji} \equiv X_j - X_i$, the entropy difference $\Delta S_{ji} = C_V \ln(1 + \Delta T_{ji}/T_i)$ can be estimated by a first order Taylor expansion of the logarithm as $\Delta S_{ji} \approx C_V \Delta T_{ji}/T_i$. With $C_V = \Delta U_{ji}/\Delta T_{ji}$ one gets

$$\Delta S_{ji} \approx \frac{\Delta U_{ji}}{T_i} \tag{19}$$

With the Helmholtz free energy $F = U - TS$, where $U$ denotes the internal energy, the free energy difference $\Delta F_{ji}$ between the systems at $T_j$ and $T_i$ can be written as

$$\Delta F_{ji} = \Delta U_{ji} - T_i \Delta S_{ji} - S_i \Delta T_{ji} - \Delta T_{ji} \Delta S_{ji} \tag{20}$$

Inserting eq 19 one immediately finds

$$F_j \approx F_i - S_i \Delta T_{ji} - \Delta U_{ji} \frac{\Delta T_{ji}}{T_i} \tag{21}$$

With eq 21, the dimensionless free energy difference $\Delta\varphi_{ji} = F_j/k_B T_j - F_i/k_B T_i$ can be written as

$$\Delta\varphi_{ji} \approx \Delta\beta_{ji} (U_i + \Delta U_{ji}) \tag{22}$$

Interchanging $i$ and $j$ one obtains an equally valid estimate

$$\Delta\varphi_{ji} \approx \Delta\beta_{ji} (U_j - \Delta U_{ji}) \tag{23}$$

where we have used $\Delta X_{ij} = -\Delta X_{ji}$. An even better approximation is then given by the arithmetic mean

$$\Delta\varphi_{ji} \approx \Delta\beta_{ji} \frac{U_i + U_j}{2} \tag{24}$$

where $\Delta U_{ji}$ cancels. Restricting the internal energy $U$ to its configurational part, i.e., to the average potential energy $\langle E \rangle$, yields the “trapezoid” rule eq 12.
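The accuracy of the arithmetic-mean estimate (eq 24) relative to the two one-sided estimates (eqs 22 and 23) can be checked numerically. The following is a minimal sketch, assuming a hypothetical model system with constant heat capacity, units with $k_B = 1$, and illustrative values for the temperatures and for `c_v`; none of these numbers are taken from the simulations reported here.

```python
import math

def phi_exact(T, c_v, u0=0.0, s0=0.0):
    """Dimensionless free energy F/(k_B T) for a constant heat capacity (k_B = 1)."""
    u = c_v * T + u0               # internal energy U(T)
    s = c_v * math.log(T) + s0     # entropy S(T), up to the constant s0
    return (u - T * s) / T

def delta_phi_estimates(T_i, T_j, c_v, u0=0.0):
    """Return the estimates of eqs 22, 23 and 24 for phi_j - phi_i."""
    delta_beta = 1.0 / T_j - 1.0 / T_i
    u_i = c_v * T_i + u0
    u_j = c_v * T_j + u0
    delta_u = u_j - u_i
    eq22 = delta_beta * (u_i + delta_u)    # one-sided estimate, eq 22
    eq23 = delta_beta * (u_j - delta_u)    # one-sided estimate, eq 23
    eq24 = delta_beta * 0.5 * (u_i + u_j)  # arithmetic mean, eq 24
    return eq22, eq23, eq24

# Illustrative (assumed) parameters: two neighbouring rungs and a large heat capacity.
T_i, T_j, c_v = 300.0, 323.0, 1000.0
exact = phi_exact(T_j, c_v) - phi_exact(T_i, c_v)
eq22, eq23, eq24 = delta_phi_estimates(T_i, T_j, c_v)
print(f"exact {exact:.3f}  eq 22 {eq22:.3f}  eq 23 {eq23:.3f}  eq 24 {eq24:.3f}")
```

For this model the one-sided estimates bracket the exact value, while their mean removes the leading error term, which is the reason the derivation prefers eq 24.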
(10) Sugita, Y.; Kitao, A.; Okamoto, Y. J. Chem. Phys. 2000,
113, 6042.
(11) Fukunishi, H.; Watanabe, O.; Takada, S. J. Chem. Phys. 2002,
116, 9058.
(12) Affentranger, R.; Tavernelli, I. J. Chem. Theory Comput.
2006, 2, 217.
Simulated Solute Tempering
J. Chem. Theory Comput., Vol. 5, No. 10, 2009 2857
(13) Hagen, M.; Kim, B.; Liu, P.; Friesener, R. A.; Berne, B. J. J.
Phys. Chem. B 2007, 111, 1416.
(40) Mitsutake, A.; Okamoto, Y. Chem. Phys. Lett. 2000, 332,
131.
(14) Liu, P.; Kim, B.; Friesner, R. A.; Berne, B. J. Proc. Natl.
Acad. Sci. U.S.A. 2005, 102, 13749.
(41) Park, S. Phys. ReV. E 2008, 77, 016709.
(15) Kubitzki, M. B.; de Groot, B. L. Biophys. J. 2007, 92, 4262.
(43) Mitsutake, A.; Okamoto, Y. J. Chem. Phys. 2004, 121, 2491.
(16) Xu, W.; Lai, T.; Yang, Y.; Mu, Y. J. Chem. Phys. 2008,
128, 175105.
(44) Bussi, G.; Gervasio, F. L.; Laio, A.; Parrinello, M. J. Am.
Chem. Soc. 2006, 128, 13435.
(17) Lyman, E.; Ytreberg, F. M.; Zuckerman, D. M. Phys. ReV.
Lett. 2006, 96, 028105.
(18) Rao, F.; Caflisch, A. J. Chem. Phys. 2003, 119, 4035.
(19) Zhang, W.; Wu, C.; Duan, Y. J. Chem. Phys. 2005, 123,
154105.
(20) Rick, S. W. J. Chem. Theory Comput. 2006, 2, 939.
(21) Periole, X.; Mark, A. E. J. Chem. Phys. 2007, 126, 014903.
(22) Zuckerman, D. M.; Lyman, E. J. Chem. Theory Comput.
2006, 2, 1200.
(42) Zhang, C.; Ma, J. J. Chem. Phys. 2008, 129, 134112.
(45) Kumar, S.; Bouzida, D.; Swendsen, R. H.; Kollman, P. A.;
Rosenberg, J. M. J. Comput. Chem. 1992, 13, 1011.
(46) Shirts, M. R.; Chodera, J. D. J. Chem. Phys. 2008, 129,
124105.
(47) Kobrak, M. N. J. Comput. Chem. 2003, 24, 1437.
(48) Berg, B. A. J. Stat. Phys. 1996, 82, 323.
(49) Bartels, C.; Karplus, M. J. Comput. Chem. 1997, 18, 1450.
(50) Park, S.; Ensign, D. L.; Pande, V. S. Phys. ReV. E 2006, 74,
066703.
(23) Denschlag, R.; Lingenheil, M.; Tavan, P. Chem. Phys. Lett.
2008, 458, 244.
(51) Zhang, C.; Ma, J. Phys. ReV. E 2007, 76, 036708.
(24) Zhou, R.; Berne, B. J.; Germain, R. Proc. Natl. Acad. Sci.
U.S.A. 2001, 98, 14931.
(52) Lingenheil, M.; Denschlag, R.; Mathias, G.; Tavan, P. Chem.
Phys. Lett. 2009. in press (doi:10.1016/j.cplett.2009.07.039).
(25) Sanbonmatsu, K. Y.; Garcı´a, A. E. Proteins 2002, 46, 225.
(53) Okabe, T.; Kawata, M.; Okamoto, Y.; Mikami, M. Chem.
Phys. Lett. 2001, 335, 435.
(26) Pitera, J. W.; Swope, W. Proc. Natl. Acad. Sci. U.S.A. 2003,
100, 7587.
(27) Cecchini, M.; Rao, F.; Seeber, M.; Caflisch, A. J. Chem. Phys.
2004, 121, 10748.
(28) Jas, G. S.; Kuczera, K. Biophys. J. 2004, 87, 3786.
(29) Yang, W. Y.; Pitera, J. W.; Swope, W. C.; Gruebele, M. J.
Mol. Biol. 2004, 336, 241.
(30) Nguyen, P. H.; Stock, G.; Mittag, E.; Hu, C.-K.; Li, M. A.
Proteins 2005, 61, 795.
(31) Villa, A.; Stock, G. J. Chem. Theory Comput. 2006, 2, 1228.
(32) Schrader, T. E.; Schreier, W. J.; Cordes, T.; Koller, F. O.;
Babitzki, G.; Denschlag, R.; Renner, C.; Lweneck, M.; Dong,
S.-L.; Moroder, L.; Tavan, P.; Zinth, W. Proc. Natl. Acad.
Sci. U.S.A. 2007, 104, 15729.
(33) Villa, A.; Widjajakusuma, E.; Stock, G. J. Phys. Chem. 2008,
112, 134.
(34) Abraham, M. J.; Gready, J. E. J. Chem. Theory Comput.
2008, 4, 1119.
(35) Nadler, W.; Hansmann, U. H. E. J. Phys. Chem. B 2008,
112, 10386.
(54) Gardiner, C. W. Handbook of Stochastic Methods, 2nd ed.;
Springer, Berlin, 1985.
(55) Nadler, W.; Hansmann, U. H. E. Phys. ReV. E 2007, 75,
026109.
(56) MacKerell, A. D.; et al. J. Phys. Chem. B 1998, 102, 3586.
(57) Jorgensen, W. L.; Chandrasekhar, J.; Madura, J. D.; Impey,
R. W.; Klein, M. L. J. Chem. Phys. 1983, 79, 926.
(58) Schaftenaar, G.; Noordik, J. J. Comput.-Aided Mol. Des.
2000, 14, 123.
(59) Mathias, G.; Egwolf, B.; Nonella, M.; Tavan, P. J. Chem.
Phys. 2003, 118, 10847.
(60) Niedermeier, C.; Tavan, P. J. Chem. Phys. 1994, 101, 734.
(61) Allen, M. P.; Tildesley, D. J. Computer Simulations of
Liquids; Oxford University Press: Oxford, 1987.
(62) Eichinger, M.; Grubmu¨ller, H.; Heller, H.; Tavan, P. J. Comput. Chem. 1997, 18, 1729.
(63) Kraeutler, V.; van Gunsteren, W. F.; Hu¨nenberger, P. H.
J. Comput. Chem. 2001, 22, 501.
(36) Calvo, F. J. Chem. Phys. 2005, 123, 124106.
(64) Berendsen, H. J. C.; Postma, J. P. M.; van Gunsteren, W. F.;
Dinola, A.; Haak, J. R. J. Chem. Phys. 1984, 81, 3684.
(37) Brenner, P.; Sweet, C. R.; VonHandorf, D.; Izaguirre, J. A.
J. Chem. Phys. 2007, 126, 074103.
(65) Lyman, E.; Zuckerman, D. M. Biophys. J. 2006, 91, 164.
(38) Sindhikara, D.; Meng, Y.; Roitberg, A. E. J. Chem. Phys.
2008, 128, 024103.
(39) Denschlag, R.; Lingenheil, M.; Tavan, P. Chem. Phys. Lett.
2009, 473, 193.
(66) Kone, A.; Kofke, D. A. J. Chem. Phys. 2005, 122, 206101.
(67) Reichold, R.; Fierz, B.; Kiefhaber, T.; Tavan, P. Submitted
for publication.
CT900274N
5 Relaxation of a light-switchable peptide
The preceding chapters focused on theoretical methods for determining the canonical equilibrium ensembles of peptide-solvent systems. As I already mentioned in the introduction, the motivation for these investigations was entirely practical: to describe the non-equilibrium dynamics of selected model peptides by simulation, I needed their equilibrium ensembles. The following section is a reprint¹ of the article
Robert Denschlag, Wolfgang J. Schreier, Benjamin Rieff, Tobias E.
Schrader, Florian O. Koller, Luis Moroder, Wolfgang Zinth, Paul Tavan:
"Relaxation time prediction for a light switchable peptide by molecular
dynamics"
Phys. Chem. Chem. Phys. 12, 6204-6218 (2010),
which I wrote together with Paul Tavan and the other authors named. Computer simulations and spectroscopic methods are used to study the structure and dynamics of a light-switchable peptide called cAPB. The work centers on characterizing the relaxation dynamics triggered by the cis-trans isomerization of the dye. Using the REST method², the structural cis and trans equilibrium ensembles are simulated; these subsequently serve as references for determining the progress of the light-induced relaxation. The results are compared with data from time-resolved IR spectroscopy, and for the first time the simulations cover time scales longer than those accessible to experiment. The computed relaxation time of 23 ns for the cis-trans transition is therefore a prediction of the theory.
¹ Reprinted with kind permission of the Royal Society of Chemistry.
² At the start of this project the SST sampling technique had not yet been worked out.
PAPER
www.rsc.org/pccp | Physical Chemistry Chemical Physics
Relaxation time prediction for a light switchable peptide by molecular dynamics†
Robert Denschlag,a Wolfgang J. Schreier,b Benjamin Rieff,a Tobias E. Schrader,b Florian O. Koller,b Luis Moroder,c Wolfgang Zinthb and Paul Tavan*a
Received 20th October 2009, Accepted 25th February 2010
First published as an Advance Article on the web 14th April 2010
DOI: 10.1039/b921803c
Phys. Chem. Chem. Phys., 2010, 12, 6204–6218

a Theoretische Biophysik, Department für Physik, Ludwig-Maximilians-Universität, Oettingenstr. 67, 80538 München, Germany. E-mail: tavan@physik.uni-muenchen.de; Fax: +49-89-2180-9202; Tel: +49-89-2180-9220
b Lehrstuhl für Biomolekulare Optik and Munich Center for Integrated Protein Science CIPSM, Ludwig-Maximilians-Universität, Oettingenstr. 67, 80538 München, Germany
c Max Planck Institut für Biochemie, Am Klopferspitz 18a, 82152 Martinsried, Germany
† Electronic supplementary information (ESI) available: 7 tables, 7 figures, and a text explaining the additional material. The convergence of the REST simulations is illustrated by Fig. S12 and the differences induced into the free-energy landscapes of cAPB at 300 K (cf. Fig. 3) by the two force fields are discussed. The proton distances relevant for comparisons of NMR and REST simulation data are listed in Tables S4 and S5. The force field employed for the APB chromophore and its linkage to the peptide is specified through Fig. S13 and Tables S6–S9. Fig. S14 and S15 provide illustrations for the arguments on the temperature dependence of the RMSV contained in a corresponding section. Fig. S16 documents the temperature independence of the helicity measure H2. Fig. S17 shows the results of our simulations on the cooling kinetics, and Table S10 adds quantitative data to the temporally resolved free energy landscapes in Fig. S11 by specifying the associated average helicities $\bar{H}_1(t)$ and $\bar{H}_2(t)$. Finally, Fig. S18 presents the relaxation data shown in Fig. 10 once again but now on a logarithmic time scale to more clearly reveal the fast processes. See DOI: 10.1039/b921803c

We study a monocyclic peptide called cAPB, whose conformations are light switchable due to the covalent integration of an azobenzene dye. Molecular dynamics (MD) simulations using the CHARMM22 force field and its CMAP extension serve to sample the two distinct conformational ensembles of cAPB, which belong to the cis and trans isomers of the dye, at room temperature. To gain sufficient statistics we apply a novel replica exchange technique. We find that the well-known NMR distance restraints are much better described by CMAP than by CHARMM22. In cAPB, the ultrafast cis/trans photoisomerization of the dye elicits a relaxation dynamics of the peptide backbone. Experimentally, we probe this relaxation at picosecond time resolution by IR spectroscopy in the amide I range up to 3 ns after the UV/vis pump flash. We interpret the spectroscopically identified decay kinetics using ensembles of non-equilibrium MD simulations, which provide kinetic data on conformational transitions well matching the observed kinetics. Whereas spectroscopy solely indicates that the relaxation toward the equilibrium trans ensemble is by no means complete after 3 ns, the 20 ns MD simulations of the process predict, independently of the applied force field, that the final relaxation into the trans-ensemble proceeds on a time scale of 23 ns. Overall our explicit solvent simulations cover more than 6 μs.

Introduction
Predicting the native states and the corresponding folding pathways of proteins from their amino acid sequences are the two key challenges in the computational biology of protein-folding.1 In principle, all-atom molecular dynamics (MD) simulations employing equilibrium and non-equilibrium settings can tackle these two problems with a degree of spatial and temporal resolution unreachable by other methods.
Unfortunately, the formation of the tertiary structures
building up the native state requires several microseconds in
the case of fast folding proteins and at least milliseconds in
other cases.2,3 These time scales are ten to twelve orders of
magnitude larger than a single MD integration step of about
1 fs used in standard simulations. Due to the progress of
computer power, sampling times of small proteins in
explicit solvent covering several microseconds are nowadays
achievable.4 However, because MD simulations are single
molecule experiments, a multiple of the expected folding time
has to be simulated for meaningful statistics.3,5
More statistics with the same computational effort can be
produced by modeling the solvent environment implicitly
instead of explicitly. Models of the ‘‘generalized Born’’
(GB) type6 are particularly popular representatives of such
approaches,7–9 but an implicit solvent description may entail
oversimplifications.10,11 For example, GB is known to overstabilize salt bridges12 and its lack of solvent viscosity can lead
to wrong estimates of timescales.9 Moreover, the foundation
of GB methods lacks theoretical rigor, because it definitely
does not provide for solvated proteins a solution of the
Poisson equation, which is required for well-founded continuum approaches.13–15
On the other hand, it is not clear whether explicit solvent descriptions, which should be more accurate than implicit solvent models,2,16,17 are precise enough for reliable predictions.17,18 To scrutinize the quality of the force fields employed in such all-atom MD simulations it is necessary to compare computational with experimental results. However, in order to actually bridge the gap between experiment and simulation, the sample system under consideration and the applied experimental techniques have to meet a series of conditions.
This journal is © the Owner Societies 2010
To be specific, the experimental techniques should offer a
temporal and spatial resolution close to that of MD (fs, Å) and
the sample system should be measured at (or at least near)
physiological conditions. As of today, there is no experimental
technique which combines a fs time resolution with an atomistic
spatial resolution. On the other hand, ultra-fast pump–probe
spectroscopy19 and NMR spectroscopy20 are techniques which
separately achieve either a temporal or a spatial resolution
close to that of MD. Here, ultrafast pump–probe spectroscopy
enables studies of non-equilibrium dynamics and NMR
spectroscopy yields structural information on equilibrium
ensembles.
For ultrafast spectroscopy to work at its best, one must be
able to elicit non-equilibrium peptide dynamics by a light pulse
thus providing a temporally well-defined starting point. An
established approach serving this purpose is a laser-induced
temperature jump.21–23 However, such a jump causes shock
waves and may thus entail non-physiological system conditions.24 A more direct way consists in covalently integrating
a fast light switch into the backbone of a peptide.25–32 The
ultrafast cis–trans photoisomerization (200 fs) and the large
geometric changes of azobenzene dyes qualify these molecules
as particularly suitable photoswitches.31 Below we will denote
the corresponding constructs as azopeptides.
The limited computational resources strongly restrict the
size and complexity of a chosen sample peptide, because the
simulation of the necessarily large33 solvent environment
demands a huge computational effort. Thus, nowadays MD
can deliver sufficient statistics only for quite simple peptides
in solution. A restricted complexity of the system is additionally
important for a clear-cut interpretation of experimental data.
Further below, we will also have to tackle the difficulties posed
by such interpretations.
The design and synthesis of different azopeptides26–30 has
provided the material for a series of experimental25–27,34–39 and
theoretical38–43 investigations. These studies aimed at characterizing (i) the conformational equilibria in the cis and trans
states of the azobenzene switch and (ii) the light-induced
dynamics transforming the perturbed cis ensemble into the
trans equilibrium ensemble (or vice versa).
However, not all azopeptides are equally suited for simulation descriptions. For example, in light-switchable β-hairpin
models29,32,44 the folding process can be induced by the trans
to cis photoisomerization of the switch. As in any β-hairpin
peptide, the folding process takes at least a few microseconds
until completion.38,45 Currently, such folding times cannot be
covered by MD descriptions with sufficient statistics (if one
employs an explicit solvent setting). Moreover, for β-hairpins
the use of enhanced sampling techniques like replica exchange46
may even slow down the sampling as recently shown by
Denschlag et al.47
Much better suited for simulation descriptions is a class of azopeptides, which contains octapeptide fragments cyclized by an azobenzoic derivative.26,28 One of these peptides is bicyclic.26 Here, a disulfide bridge additionally restricts the conformational flexibility thus reducing the computational effort required for a statistically sufficient MD sampling. Correspondingly the bicyclic models have been extensively studied by MD.41–43 However, when one searches for models of the conformational dynamics in native peptides, the lack of conformational flexibility clearly qualifies the bicyclic azopeptides as suboptimal. In this work we study the more flexible monocyclic azopeptide cAPB, whose chemical structure is depicted in Fig. 1.

Fig. 1 Chemical structure of the model azopeptide cAPB. Here, the sequence Ala-Cys-Ala-Thr-Cys-Asp-Gly-Phe, which is part of the binding pocket of thioredoxin reductase (residues 134–141), is cyclized by (4-amino)phenylazobenzoic acid (APB). The formation of a disulfide bridge is prevented by protective S-tert-butylthio-groups (StBu) at the two cysteines.
Previous investigations of cAPB by NMR26 and by MD40
showed that the isomeric state of the chromophore, which is
either cis or trans, defines two distinctly different structural
ensembles of the peptide backbone. In the trans ensemble the
backbone is restricted to extended conformations. In contrast,
the cis ensemble features a much larger number of conformational substates. Although MD predicted multiple openloop conformations for the cis ensemble, which were missing
in the refined NMR structures, the MD ensembles were shown
to comply with the NMR distance restraints. As a technical
critique we note that the MD simulations were performed at
500 K (and thus at a much larger temperature than the NMR
measurements) for the sole reason of collecting sufficient
statistics at a manageable computational effort.
Furthermore, the relaxation dynamics of cAPB during the
first nanosecond after cis/trans photoisomerization has been
monitored by time-resolved pump–probe spectroscopy in the
visible spectral range supplemented by an MD analysis of
the energy relaxation.39 In this study the UV/vis absorption of
the azobenzene switch was used as a probe for the relaxation
of the cyclic peptide. The study demonstrated fast and strongly
driven structural changes in the peptide chain after the isomerization. The kinetics of the energy relaxation obtained by MD
was shown to quantitatively agree with the experimental results.
For the first nanosecond, within which the main processes of
conformational dynamics occur near the switch, the optical
spectrum of this dye was shown to be a sensitive probe of the
ongoing relaxation. At a delay time of 1 ns the optical spectrum
of the dye closely resembles that of the relaxed target state. For
that point in time the MD simulations predicted that the
relaxation of the peptide backbone is still far from being
complete.39 Thus, UV/vis pump probe spectroscopy monitoring
the chromophore appears to lose its sensitivity after 1 ns.
In summary, the quoted earlier studies left several questions
unanswered. First, one may wonder whether the reported agreement between MD structures and NMR distance restraints
prevails, if the MD simulations serving to determine the conformational ensembles are carried out at a temperature of 300 K,
which matches the conditions of the NMR measurements much
more closely. Here one could speculate that lowering the
temperature in the MD simulations from 500 to 300 K leads to
more compact peptide structures thus modifying the prediction
of open-loops as the dominant motifs in the cis-cAPB ensemble.40 Concerning the relaxation of the peptide one may ask how
long it actually takes for the peptide to reach the equilibrium
ensemble of the trans-state when starting from the cis-ensemble.
Addressing these questions we have reinvestigated the cAPB
peptide by applying various novel or revised methods:
Instead of sampling the cis and trans ensembles by plain
MD at the elevated temperature of 500 K40 we have applied a
variant48 of the Hamiltonian replica exchange (HRE) approach
originally suggested by Liu et al.,49 which can yield the equilibrium ensembles at ambient temperature with a manageable
computational effort.
Instead of employing the total energy39 of the peptide as a
probe for the relaxation dynamics simulated by MD we now use
a structural observable sensitive for the backbone conformation. Here, the equilibrium ensembles of the HRE simulations
provide measures for judging the progress of the relaxation.
Instead of using the UV/vis spectrum of the chromophore
as a probe, we now apply ultrafast infrared (IR) absorption
spectroscopy in the region of the amide I band of the peptide
allowing us to directly and more sensitively monitor the
relaxation of the backbone.
As a more technical issue we additionally check to what
extent the MD results depend on the applied force field.
Instead of solely using the CHARMM22 force field,50 we
have also employed its CMAP extension.51 Comparison with
experimental data will then allow a judgement on the relative
performance of these two force fields. Altogether, we have
spent a simulation time of more than 6 μs on cAPB solvated in
dimethyl-sulfoxide (DMSO).
Theoretical methods
Replica Exchange
To sample the conformational space of the cis and trans ensembles of cAPB at 300 K we have applied a variant48 of replica exchange with solute tempering (REST).49 Within REST, N + 1 copies (replicas) of the system are simulated at different temperatures $T_0 < T_1 < \dots < T_N$. The canonical ensembles thus generated at the temperatures $T_i$ constitute a so-called generalized ensemble. After fixed time intervals, one checks whether the temperatures of replica pairs can be exchanged. To preserve the canonical ensembles, a Metropolis criterion52 determines the exchange probability

$$P_{ij} = \min[1, \exp(-\Delta_{ij})] \tag{1}$$

between replicas at $T_i$ and $T_j$ with

$$\Delta_{ij} = \beta_i [E(x_j, T_i) - E(x_i, T_i)] + \beta_j [E(x_i, T_j) - E(x_j, T_j)]. \tag{2}$$

Here, $E(x, T)$ is the potential energy of the configuration $x$ at the temperature $T$ and $\beta = 1/k_B T$, where $k_B$ is the Boltzmann constant.
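The exchange test of eqn (1) and (2) can be sketched in a few lines. This is a minimal illustration rather than the implementation used in the simulations; the function names, the choice of kcal/mol energy units, and the numerical energies are assumptions made for the example.

```python
import math
import random

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K); unit choice is an assumption

def exchange_delta(beta_i, beta_j, e_xi_Ti, e_xj_Ti, e_xi_Tj, e_xj_Tj):
    """Eqn (2): e_xa_Tb is the energy E(x_a, T_b) of configuration x_a at T_b."""
    return beta_i * (e_xj_Ti - e_xi_Ti) + beta_j * (e_xi_Tj - e_xj_Tj)

def acceptance_probability(delta):
    """Eqn (1): Metropolis probability min[1, exp(-delta)]."""
    return 1.0 if delta <= 0.0 else math.exp(-delta)

def attempt_exchange(delta, rng=random):
    """Draw a uniform random number and decide whether the swap is accepted."""
    return rng.random() < acceptance_probability(delta)

# Hypothetical example with a temperature-independent energy function,
# for which eqn (2) reduces to (beta_i - beta_j) * [E(x_j) - E(x_i)].
beta_i = 1.0 / (K_B * 300.0)
beta_j = 1.0 / (K_B * 323.0)
delta = exchange_delta(beta_i, beta_j,
                       e_xi_Ti=-100.0, e_xj_Ti=-95.0,
                       e_xi_Tj=-100.0, e_xj_Tj=-95.0)
p_accept = acceptance_probability(delta)
```

Passing all four energies explicitly keeps the sketch valid for Hamiltonian-type schemes, where the energy function itself depends on the rung.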
Within REST, the potential energy of a replica at the temperature $T_i$ is

$$E(x, T_i) = \lambda_{i,0} E^{pp}(x) + \lambda_{i,1} E^{ss}(x) + \lambda_{i,2} E^{ps}(x), \tag{3}$$

where $E^{pp}$, $E^{ss}$ and $E^{ps}$ are the solute–solute, solvent–solvent, and solute–solvent parts of the (unscaled) potential energy function at $T_0$ and $\lambda_{i,h}$ are suitable parameters depending on $T_i$. We choose

$$\lambda_{i,0} = 1, \quad \lambda_{i,1} = T_i/T_0, \quad \lambda_{i,2} = \sqrt{T_i/T_0}. \tag{4}$$

Note that the form of $\lambda_{i,2}$ slightly differs from the form used in the original work49 because our choice is particularly handy for implementations.48 When using the CHARMM force field50 and a rigid solvent model, e.g., solely the partial charges and the Lennard-Jones energies of the solvent molecules have to be multiplied by $\sqrt{T_i/T_0}$ and $T_i/T_0$, respectively, to achieve the scaling required by eqn (3) and (4), which then includes even the mean field contributions33,53 to the energy function.
The advantage of the REST approach described above
becomes apparent after a few algebraic operations. With
eqn (3) and (4), eqn (2) reduces to
$$\Delta_{ij} = (\beta_i - \beta_j) [E^{pp}(x_j) - E^{pp}(x_i)] + \left(\sqrt{\beta_0 \beta_i} - \sqrt{\beta_0 \beta_j}\right) [E^{ps}(x_j) - E^{ps}(x_i)]. \tag{5}$$
Thus, the difference Dij, which determines the acceptance
probability (1), is solely calculated from the solute–solute
and solute–solvent energies, while the potential energy Ess of
the solvent cancels. As a consequence, the number of replicas
needed to cover a given temperature range is drastically
reduced.
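The cancellation of the solvent–solvent energy can be verified numerically by evaluating eqn (2) with the scaled energy of eqn (3) and (4) and comparing the result with the reduced form of eqn (5). The sketch below assumes kcal/mol units and uses made-up energy components for two hypothetical configurations; only the scaling rules themselves are taken from the text.

```python
import math

K_B = 0.0019872041  # kcal/(mol K); unit choice is an assumption

def rest_energy(e_pp, e_ss, e_ps, T_i, T_0):
    """Scaled potential energy of eqn (3) with the parameters of eqn (4)."""
    return e_pp + (T_i / T_0) * e_ss + math.sqrt(T_i / T_0) * e_ps

def delta_full(x_i, x_j, T_i, T_j, T_0):
    """Eqn (2), evaluated with the full scaled energies; x = (e_pp, e_ss, e_ps)."""
    beta_i, beta_j = 1.0 / (K_B * T_i), 1.0 / (K_B * T_j)
    return (beta_i * (rest_energy(*x_j, T_i, T_0) - rest_energy(*x_i, T_i, T_0))
            + beta_j * (rest_energy(*x_i, T_j, T_0) - rest_energy(*x_j, T_j, T_0)))

def delta_reduced(x_i, x_j, T_i, T_j, T_0):
    """Eqn (5): only solute-solute and solute-solvent energies enter."""
    beta_0 = 1.0 / (K_B * T_0)
    beta_i, beta_j = 1.0 / (K_B * T_i), 1.0 / (K_B * T_j)
    d_pp = x_j[0] - x_i[0]
    d_ps = x_j[2] - x_i[2]
    return ((beta_i - beta_j) * d_pp
            + (math.sqrt(beta_0 * beta_i) - math.sqrt(beta_0 * beta_j)) * d_ps)

# Hypothetical energy components (e_pp, e_ss, e_ps) of two configurations.
x_i = (-120.0, -9500.0, -310.0)
x_j = (-112.5, -9450.0, -298.0)
full = delta_full(x_i, x_j, T_i=347.0, T_j=372.0, T_0=300.0)
reduced = delta_reduced(x_i, x_j, T_i=347.0, T_j=372.0, T_0=300.0)
```

Because the large solvent–solvent term drops out, the energy differences entering the exchange test stay small, which is what allows wider temperature spacings and hence fewer replicas.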
Simulation system and force-fields
The cAPB model peptide was studied by molecular dynamics
(MD) simulations. As starting structures for the cis and trans
states of cAPB we have chosen the respective NMR structures
of lowest energy.26 Each initial structure was placed into the
center of an orthorhombic dodecahedron of 24 Å inner radius
and was surrounded by 649 DMSO molecules. Thus each
system contained N = 2744 atoms. For the peptide moiety
either the CHARMM22 all-atom force-field,50 which we will
denote by the shortcut ‘‘C22’’ from now on, or its CMAP
extension51 were employed. Note here that the CMAP extension of C22 solely modifies the potentials of the φ/ψ dihedral
angles within the peptide backbone. The force field parameters required for
the chromophore and for its linkages to the peptide backbone
were adopted from ref. 54. These parameters are reproduced in
the electronic supplementary information (ESI). A united-atom
model was chosen for the DMSO molecules.55
MD simulation techniques
The software package EGO-MMVI33 was used for the MD simulations. The electrostatic interactions were treated combining structure-adapted multipole expansions56 with a moving-boundary reaction-field approach.33 Here, the electrostatic interactions were explicitly evaluated up to a distance of about 24 Å to fulfill the minimum image convention.53 Beyond this distance, a dielectric continuum was assumed with a static dielectric constant εs = 45.8. The van-der-Waals interactions were explicitly evaluated up to a distance of 10 Å; at larger distances a mean-field approach was applied.53 The dynamics was integrated by a multiple-time-step scheme57 building upon a basic time step of 1 fs. The lengths of all bonds involving hydrogen atoms were constrained using the M-SHAKE algorithm58 with a relative tolerance of 10⁻⁶. Temperature T and pressure p were controlled by a Berendsen59 thermostat and barostat with coupling constants of 0.5 and 5 ps, respectively.

System preparation
After minimizing the energy of the four simulation systems (cis and trans cAPB with and without CMAP) they were equilibrated for 400 ps by a three step MD simulation as follows: (i) within the first 100 ps the systems were tuned to ambient temperature (300 K) and pressure (1 bar) while the peptide was kept rigid so that the DMSO molecules could adapt to the solute; (ii) during subsequent 50 ps simulations the positions of the Cα atoms were softly constrained to the initial locations by harmonic restraints; (iii) the equilibrations were completed by 250 ps unconstrained simulations in which our Berendsen thermostat was separately applied to each of the two subsystems, the solute and the solvent. In all subsequent simulations serving for data collection we switched from the initial NpT ensemble to an NVT ensemble by deactivating the barostat and maintaining the resulting volume V. In these simulations we furthermore switched off the thermostat of the solute thus implementing a non-invasive solute tempering.60

Simulations
Table 1 summarizes our MD simulations. The equilibrium properties of cAPB were probed by four REST simulations: The simulations R/C/C22 and R/T/C22 served to explore the conformational ensembles of cis (C) and trans (T) cAPB as predicted by the standard C22 force field. In the simulations R/C/CMAP and R/T/CMAP the CMAP extension of that force field was employed. The REST simulations R/C/C22, R/T/C22, and R/T/CMAP covered a simulation time of 100.5 ns for each replica, whereas for the simulation R/C/CMAP this time was extended to 150.5 ns.
During the first 0.5 ns no exchanges were attempted. This preliminary period served to generate different initial structures for the replicas residing at the various rungs of the temperature ladder specified below. Subsequently, exchange trials were attempted every 20 ps using a deterministic even-odd exchange scheme.46 Data for analysis were saved every 2 ps. The REST simulations covered the temperature range from 300 K to 570 K by ten rungs with the following ladder: 300, 323, 347, 372, 399, 428, 460, 495, 532, 570 K. This temperature ladder leads to an average acceptance probability (AP) of about 40% (±5%) and, thus, yields the highest round trip rates possible within the highly efficient even-odd exchange scheme employed.61,62 In our REST simulations the replicas executed random walks yielding a nearly uniform sampling of the temperature ladder (data not shown). Thus, sampling problems like those observed earlier63 in relatively short (5 ns) REST simulations of other peptides were absent in our peptide-solvent systems.
To elucidate the relaxation dynamics initiated by the cis-to-trans isomerization of the chromophore, we have carried out two sets of non-equilibrium MD simulations denoted as I/C22 and I/CMAP (cf. Table 1). Each of these sets contains 50 separate runs differing by the initial structures randomly chosen from the REST simulations R/C/C22 and R/C/CMAP, respectively, and by the choice of the force field, of course. After a preparatory period of one picosecond, the chromophore was driven from cis to trans by activating a potential designed to mimic the photoisomerization.39,54 After 10 ps this potential was switched off without causing perturbations, because the chromophore reaches the trans state much more rapidly (<1 ps). The thus initiated relaxation dynamics of the backbone was monitored for the following 20 ns. Data for analysis were saved every 100 fs within the first 10 ps, every picosecond within the time span from 10 ps to 100 ps, and every 10 ps thereafter.

Table 1 Overview of the MD simulations

Label(a)     Replicas or runs   Duration(b)   Exchange(c)   Range(d)
R/C/C22      10                 100.5 ns      20 ps         300 K–570 K
R/C/CMAP     10                 150.5 ns      20 ps         300 K–570 K
R/T/C22      10                 100.5 ns      20 ps         300 K–570 K
R/T/CMAP     10                 100.5 ns      20 ps         300 K–570 K
I/C22        50                 20 ns         —             300 K
I/CMAP       50                 20 ns         —             300 K

(a) Name. (b) Simulation times. (c) Time between exchange trials. (d) Temperature range covered by the simulation.

Helicity elongation score
To characterize the conformational ensembles sampled by the peptide backbone of cAPB in the cis and trans states of the chromophore and to measure the progress of the peptide’s relaxation from the initial cis to the final trans ensemble, we employed the so-called helicity elongation score (HELO). This score can distinguish “extended” and “helical” conformations of the backbone. The latter class comprises, e.g., the 3₁₀-, α-, and π-helices as well as various types of turns, whereas the “extended” conformations cover, e.g., β-strands and polyproline I/II structures. Whenever we call a peptide structure α-helical, this narrower classification is based on applying the dictionary of protein secondary structure (DSSP).64 However, we will generally not show the corresponding DSSP data.
For an explanation of the HELO score we note that residues involved in an α-helical structure have ψ-angles typically near ψα = −47°, whereas in an extended β-strand the angles are near ψβ = ψα + 180° = 133°. Fig. 2 defines an α-β-scoring function helo(ψ), which linearly decreases from 1 to −1 as ψ changes from ψ = ψα to ψ = ψβ. As a measure for the helicity of a sequence portion A covering several residues i ∈ A we define the HELO score
$$\mathrm{HELO}(A) = \frac{1}{|A|} \sum_{i \in A} \mathrm{helo}(\psi_i) \tag{6}$$
mean square deviation, we introduce the root mean square
violation (RMSV)
$$\mathrm{RMSV}(t_0,t) = \sqrt{\frac{1}{|M|} \sum_{(ij) \in M} \max\left[0,\; d_{ij}(t_0,t) - d_{ij}^{\mathrm{exp}}\right]^2}, \qquad (8)$$
by which a simulation result violates the NOE distance
restraints.25 Here, M is the set of all proton pairs (ij), for
which distances $d_{ij}^{\mathrm{exp}}$ were experimentally determined, and |M|
is the number of these distances. Note that the ESI contains a
section explaining the properties and particularly the temperature dependence of the RMSV.
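The interaction distance of eqn (7) and the violation measure of eqn (8) can be sketched in a few lines. This is only an illustration under stated assumptions: the function names and the array layout (one distance per proton pair, in matching order) are hypothetical and not taken from the original analysis code.

```python
import numpy as np

def noe_distance(r):
    """Interaction distance of eqn (7): (<r^-6>)^(-1/6).

    r: proton-proton distances sampled over the time interval [t0, t].
    """
    r = np.asarray(r, dtype=float)
    return np.mean(r ** -6.0) ** (-1.0 / 6.0)

def rmsv(d_sim, d_exp):
    """Root mean square violation of eqn (8).

    d_sim: simulated interaction distances d_ij, one per proton pair in M.
    d_exp: experimental distances d_ij^exp, in the same order.
    Pairs with d_sim <= d_exp contribute zero because of max[0, .].
    """
    d_sim = np.asarray(d_sim, dtype=float)
    d_exp = np.asarray(d_exp, dtype=float)
    violation = np.maximum(0.0, d_sim - d_exp)
    return np.sqrt(np.mean(violation ** 2))
```

Because of the $r^{-6}$ weighting, a few short distances dominate the average: two samples at 2 and 6 give an interaction distance below 2.3, far less than their arithmetic mean of 4.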
Fig. 2 The α-β-scoring function helo($\psi$).
Experimental methods
as the average α-β-score of the set of $\psi$-dihedral angles
$\{\psi_i : i \in A\}$.
For our analysis of the cAPB peptide conformations we will
use the sets A1 = {1,2,8} and A2 = {3,4,5,6}, where 1 denotes
alanine, 2 cysteine and so on (cf. Fig. 1). Thus, H1 ≡ HELO(A1)
reflects the structure of the peptide in the vicinity of the
chromophore and H2 ≡ HELO(A2) the structure within
the core of the peptide moiety. Note that we have omitted
residue 7 (glycine) from A1 because of its high flexibility.
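A minimal sketch of the HELO score of eqn (6) follows. It assumes that helo($\psi$) depends only on the angular distance from $\psi_\alpha$ on the circle, so that it falls linearly from 1 at −47° to −1 at 133° on both sides; this symmetric, periodic form is an assumption made here and should be checked against Fig. 2. The function names and the dictionary layout are hypothetical.

```python
import numpy as np

PSI_ALPHA = -47.0  # degrees, helical reference angle from the text

def helo(psi_deg):
    """alpha-beta score: 1 at psi_alpha, -1 at psi_alpha + 180 degrees.

    Assumes a symmetric linear decrease in the wrapped angular distance.
    """
    psi = np.asarray(psi_deg, dtype=float)
    # wrap the difference to the interval [-180, 180)
    delta = (psi - PSI_ALPHA + 180.0) % 360.0 - 180.0
    return 1.0 - 2.0 * np.abs(delta) / 180.0

def helo_score(psi_by_residue, residues):
    """HELO(A): mean helo score over the residues in the set A."""
    return float(np.mean([helo(psi_by_residue[i]) for i in residues]))

A1 = (1, 2, 8)      # linkage region, as defined in the text
A2 = (3, 4, 5, 6)   # core of the peptide, as defined in the text
```

With all core angles at −47° the score `helo_score(psi, A2)` is 1 (helical), and with all at 133° it is −1 (extended).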
Sample preparation
Free energy maps
Stationary IR spectroscopy
The conformational coordinates H1 and H2 introduced above
were used for characterizing the conformational ensembles of
cis and trans cAPB by free energy maps G(H1,H2). To compute
these maps the rectangle [−1, 1] × [−1, 1] was divided into
20 × 20 bins and statistics over the HELO scores encountered
in the REST simulations yielded bin-counts m(H1,H2). Up to
an arbitrary constant, the free energy is given by G(H1,H2) =
−kBT ln[m(H1,H2)/Mmax], where Mmax is the maximal bin-count.
Note that this choice for G guarantees that the minimum of G
is zero. Because empty bins in the histogram density estimate
would lead to infinite free energies, an upper energy cutoff
Gmax = −kBT ln(1/Mmax) was introduced and G was set to
Gmax at all empty bins. A nearest neighbor smoothing was
applied before generating contour plots of G(H1,H2).
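The histogram construction described above can be sketched as follows. The function name, the choice of kcal/mol as energy unit, and the input arrays `h1` and `h2` are assumptions for illustration; the nearest neighbor smoothing is not specified in detail in the text and is therefore omitted.

```python
import numpy as np

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)

def free_energy_map(h1, h2, temperature=300.0, bins=20):
    """G(H1,H2) = -kB*T*ln[m/Mmax] on a bins x bins grid over [-1,1]^2.

    Empty bins are set to the cutoff Gmax = -kB*T*ln(1/Mmax).
    """
    counts, _, _ = np.histogram2d(
        h1, h2, bins=bins, range=[[-1.0, 1.0], [-1.0, 1.0]]
    )
    m_max = counts.max()
    kt = K_B * temperature
    g = np.full(counts.shape, kt * np.log(m_max))  # cutoff at empty bins
    filled = counts > 0
    g[filled] = -kt * np.log(counts[filled] / m_max)
    return g
```

By construction the most populated bin has G = 0, and a bin visited only once has the same value as the cutoff Gmax.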
Steady state absorption spectra were recorded using a Fourier
transform infrared (FTIR) spectrometer IFS66 from Bruker
(Ettlingen, Germany). No indications for sample degradation
were found from stationary FTIR spectroscopy during the
measurements. The trans-azo conformation is the thermally
stable form of cAPB. At room temperature the cAPB molecules
reach the trans-azo conformation on a timescale of several days.
As a consequence, dark adapted molecules (yielding a concentration of about 100% trans isomers) were used for the
study of the trans ensemble of cAPB. For the investigation of
the cis ensemble the sample was converted to the cis-azo
conformation by continuous UV illumination of the trans
ππ*-absorption band with the light of a HgXe arc lamp
emitting around 370 nm (LOT, Darmstadt, Germany). The
lamp was equipped with filters from Schott (UG11, WG320,
GG375). Taking the extinction coefficients for the cis and trans
isomers into account one can estimate that about 90% of all
molecules were in the cis form during the measurements.
Proton distances
To compare the equilibrium ensembles computed by our REST
simulations with the well-known NMR distance restraints
derived from NOEs of the cis and trans isomers of cAPB,25
we have calculated proton–proton interaction distances
dij from proton–proton distances rij sampled by the simulations. Because a NOE is caused by a dipole–dipole interaction, we have calculated the interaction distances dij by the
prescription65
$$d_{ij}(t_0,t) = \left(\langle r_{ij}^{-6} \rangle_{[t_0,t]}\right)^{-1/6}. \qquad (7)$$
Here, $\langle \cdot \rangle_{[t_0,t]}$ denotes the average over the simulation time
interval [t0,t] with t > t0. For chemically equivalent protons
{i1, ..., in} the geometrical center was used to compute the
distance rij to proton j. In analogy to the well known root
The cAPB peptides were prepared as described in ref. 26. The
sample was dissolved in dimethylsulfoxide (DMSO) from
Merck (Darmstadt, Germany) at a concentration of about
7 mM. In the time resolved pump probe experiments the sample
was circulated through home made flow cells (pathlength 0.1 mm)
with CaF2 windows. This closed cycle system ensures the exchange
of the illuminated sample volume between consecutive excitation
laser pulses.
Femtosecond IR spectroscopy
The structural dynamics of the cis to trans reaction have been
investigated by time resolved UV pump IR probe spectroscopy. A detailed description of the experimental setup is given
in ref. 66. In brief, we used the pump and probe technique
with single pulses from a Ti-sapphire laser-amplifier system
operated at 1 kHz. Second harmonic generation was used for
excitation at 404 nm with an energy of about 2 µJ. In order to
reduce excessive nonlinearities induced by the intense excitation pulses the duration of the pump pulses was increased by a
quartz rod (15 cm) in front of the sample. The pump pulses
had a duration of about 700 fs and were focused to a spot size
of about 150 µm (FWHM) at the sample position. The IR
probe pulses were generated using a two stage BBO optical
parametric amplifier. The resulting near infrared pulses were
used for difference frequency mixing in a AgGaS2 crystal66,67
yielding tuneable pulses in the mid IR. The resulting mid IR
probe pulses had a temporal width of about 150 fs and were
focused to a spot size of about 90 µm (FWHM). With a
spectral bandwidth of about 150 cm⁻¹ the probe pulses were
tuned to several discrete central wavelengths to cover the
range of the amide I and amide II bands between 1450 cm⁻¹
and 1800 cm⁻¹. After the sample the probe pulses were split
into two beams by a Ge plate and focused on the entrance slit
of two spectrographs. Transient spectra were recorded with
identical 32-element MCT detectors at a spectral resolution of
about 3 cm⁻¹. In the experiment transient absorption signals
with perpendicular and parallel polarizations were recorded
simultaneously. All pump–probe signals shown in the figures
correspond to magic angle polarization conditions calculated
from the parallel and perpendicular signals. For the measurement of the time dependence of the absorption change the
pump pulses were delayed with respect to the probe pulses
by means of an optical delay stage allowing delay times of up
to 3.5 ns.
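The text does not spell out how the magic angle signal is calculated from the two polarizations. A common choice, assumed here, is the isotropic combination (parallel + 2 × perpendicular)/3; the function name is hypothetical, and the actual procedure is the one described in ref. 66.

```python
import numpy as np

def magic_angle_signal(a_parallel, a_perpendicular):
    """Isotropic (magic angle) signal from parallel and perpendicular data.

    Assumes the standard combination (A_par + 2 * A_perp) / 3.
    """
    a_parallel = np.asarray(a_parallel, dtype=float)
    a_perpendicular = np.asarray(a_perpendicular, dtype=float)
    return (a_parallel + 2.0 * a_perpendicular) / 3.0
```

This combination removes the contribution of rotational reorientation, so the remaining time dependence reflects population and structural dynamics only.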
Results and discussion
As outlined in the Introduction, the relaxation dynamics of
the cAPB backbone, which is generated by the light-induced
cis/trans isomerization of the APB chromophore, is in the
focus of this study. The intended comparisons of a corresponding simulation description with time-resolved spectroscopic data obtained by pumping a femtosecond light flash
into a room temperature equilibrium ensemble of solvated
cis-cAPB peptides require a careful computational characterization of the equilibrium ensembles of the cAPB peptides
in their cis and trans states, respectively. Here, a statistically
valid description of the cis ensemble is required to have a proper
model for the experimental ensemble present before arrival of
the laser flash. A corresponding model of the trans equilibrium
ensemble is necessary to gain a reference, which will allow us to
judge the progress of the simulated non-equilibrium cis/trans
relaxation towards its final goal, i.e. the trans ensemble.39,40,54
Equilibrium ensembles at 300 K
As also mentioned in the Introduction, there has been a
previous attempt to characterize the cis and trans ensembles
of cAPB by MD simulation.40 For gaining at least some
statistics on the variety of conformational states sampled by
cAPB in DMSO, Carstens et al.40 were forced to elevate the
temperature to 500 K in their 50 ns MD simulations. The
resulting conformational ensembles showed a good agreement
with NOE data obtained from NMR measurements.25 Because
the quoted work applied the same force field for DMSO
and cAPB as we do in our approach (e.g. C22 for the
peptide portion of cAPB), we could have employed these
earlier structural ensembles for our much more extended nonequilibrium relaxation simulations.
However, it is an open question as to whether such 50 ns
MD simulations of cAPB in DMSO at 500 K actually
yield sufficiently accurate models for the experimental 300 K
ensembles. Furthermore, the quality of the force field is always
Fig. 3 Free energy landscapes G(H1,H2) of cis (left) and trans (right)
cAPB in DMSO at 300 K as obtained from the extended REST
simulations characterized in Table 1. G is represented on the plane
spanned by the helix-elongation measures Hi, i = 1,2, which are
defined by eqn (6). The top row refers to the C22 and the bottom
row to the CMAP force field, respectively. The measures Hi and
the computation of G(H1,H2) are explained in Methods. With the
nomenclature of Table 1 the subgraphs were extracted from the
REST simulations: (a) R/C/C22, (b) R/T/C22, (c) R/C/CMAP, and
(d) R/T/CMAP.
an issue of concern.17,68 In the case of the C22 force field,50 for
instance, the CMAP extension51 meanwhile has been argued
to provide substantially improved descriptions of peptide
structures.69 Therefore, we decided to additionally address
the questions of temperature and of force field quality in our
following in-depth reinvestigation of the issue.
Starting with the questions of how the computed conformational landscapes of cis and trans cAPB are shaped at 300 K
and to what extent they are affected by the applied peptide
force field we first consider Fig. 3. The figure shows four free
energy landscapes G(H1,H2), which summarize the results of
the four REST simulations specified in Table 1. The conformational states sampled by the cis (left) and trans (right)
cAPB peptides are characterized by the two conformational
coordinates H1 and H2. As explained in Methods, these so-called
helicity scores can distinguish helical from extended peptide
conformations in different parts of the backbone. Here, values
close to one signify helical structures and values close to minus
one extended structures (cf. Fig. 2 and its discussion in the text).
In particular, H1 is the helicity score of the peptide backbone near the covalent linkages to the chromophore (cf. Fig. 1).
Correspondingly, values H1 ≈ 1 indicate the presence of turns
at these locations. A comparison of the graphs at the right
hand side of Fig. 3 referring to trans with those at the left hand
side (cis) demonstrates that such turns are predominantly
found in the simulated trans ensembles. In contrast, the cis
ensembles are seen to exhibit also many extended structures
near the linkages (H1 ≈ −1) which are completely absent for
trans.
H2, on the other hand, measures the helicity in the core of
the peptide strand (A3-T4-C5-D6). The two distinct minima of
G(H1,H2), which are located at values H2 ≈ 1 and are marked
by arrows in the left graphs of Fig. 3 (cis), thus indicate the
occasional presence of a well-structured α-helix in the cis
ensemble. In sharp contrast the trans graphs reveal that this
ensemble is strongly dominated by extended structures in the
core of the peptide (H2 ≈ −1). In addition, the conformational
space occupied by the cis ensemble is seen to be much larger
than in the trans case. The latter two results of our reinvestigation confirm previous findings25,40 which were based on NMR
and MD, respectively.
Note here that the peptide portion of cAPB forms an
α-helix in the native environment of the thioredoxin reductase
from which its sequence was taken.70 Originally, this native
α-helical folding pattern had inspired the hope that the cis/trans
isomerization of cAPB could provide a minimal model for
force-driven helix-unfolding.26 However, the previous 500 K
MD study by Carstens et al.40 had predicted that the cis
ensemble mainly consists of various open loop structures but
did not identify any conformational substates of α-helical
character. Likewise, α-helical structures were absent in the
set of refined NMR structures.25 In contrast and independently
of the applied force field, our 300 K REST simulations
(cf. Fig. 3a and c) now confirm the existence of an α-helical
population in the cis-cAPB ensemble as had been expected26
from the thioredoxin reductase case.70 Of course, the α-helical
character of this population has been independently checked
using the DSSP classifier.64
For a most simple characterization of the differences among
the ensembles shown in Fig. 3, Table 2 lists the associated
average helicities $\bar{H}_{i,\alpha}$ (i = 1,2; α ∈ {c,t}). Comparing first for
the average helicity $\bar{H}_2$ of the peptide’s core the changes
caused by the cis/trans isomerization, the table shows a large
decrease $\Delta\bar{H}_2 \approx -0.6$ for both force fields clearly reflecting the
stretching of the peptide’s core toward extended conformations. Concerning the change of the average peptide helicity
$\bar{H}_1$ in the linkage regions, which is enforced by the cis/trans
isomerization of the azobenzene dye, the two force fields
surprisingly disagree. Whereas C22 predicts a strong increase
of $\Delta\bar{H}_1 \approx 0.8$ expressing a removal of extended and the
appearance of turn structures at the linkages, the CMAP
extension assigns nearly vanishing changes to the observable
$\bar{H}_1$. Thus solely $\bar{H}_2$ is an observable, by which one can clearly
distinguish the simulated cis and trans ensembles. To gain
deeper insights into the surprising difference between C22 and
CMAP just revealed by inspection of Table 2, one has to
reconsider the free energy landscapes G(H1,H2) shown in
Fig. 3.
A corresponding detailed analysis of the differences between
the conformational ensembles predicted by C22 and CMAP,
respectively, is presented in the ESI together with the proof
that the apparent differences are statistically significant, indeed
(cf. Fig. S12 and the associated discussion).† This detailed
analysis shows that the force fields describe the core of the
peptide as measured by the observable H2 in a very similar
fashion (both for cis and trans) and predict slight but distinct
differences solely for the peptide regions near the linkage to the
chromophore, which are monitored by H1.
Based on the above results we can start now to address the
open question, which we raised further above, to what extent
50 ns MD simulations with C22 at 500 K40 can yield valid
models for the cis and trans ensembles of cAPB in DMSO
at 300 K.
Conformational landscapes at high temperature
For a first answer compare Fig. 4, which shows the free energy
landscapes G(H1,H2) of the REST ensembles at 570 K, with
the 300 K landscapes of Fig. 3. One immediately recognizes
that raising the temperature greatly reduces the number of local
minima exhibited by G(H1,H2), i.e. reduces the number of
distinct conformational substates. In particular, the α-helical
state present at 300 K in the cis ensembles of C22 and CMAP
disappears. When analyzing the structures (data not shown)
associated to the 570 K minima of the cis landscapes one
recovers the open loop structures described by Carstens et al.40
for their 500 K cis ensembles. The inspection of Fig. 4 additionally
shows that the conformational space sampled by thermal
fluctuations increases both for cis and trans. Furthermore,
despite the smoothing of the free energy surfaces caused by the
increased temperature, the differences of the C22 (top) and
CMAP (bottom) landscapes, which were identified above by
visual inspection of the 300 K landscapes, still persist in a
weakened form at 570 K (as one can convince oneself by
repeating the analysis in the ESI for the data in Fig. 4).
Interestingly, however, while CMAP predicted for 300 K
that the cis/trans isomerization leaves the average helicity $\bar{H}_1$
of the linkage region nearly invariant, this invariance is gone
at 570 K. For this temperature CMAP now also assigns an
increase $\Delta\bar{H}_1 = 0.3$ of turn character to the linkage region,
which is nearly as large as the C22 increase $\Delta\bar{H}_1 = 0.5$ at
570 K. Because the latter value is smaller than the $\Delta\bar{H}_1 = 0.8$
Table 2 Average helicities $\bar{H}_1$ and $\bar{H}_2$ at 300 K

Observable(a)        C22(b)     CMAP(c)
$\bar{H}_{1,c}$      −0.09      0.38
$\bar{H}_{1,t}$      0.67       0.40
$\bar{H}_{2,c}$      0.10       0.03
$\bar{H}_{2,t}$      −0.56      −0.54

(a) The subscripts “c” and “t” label the cis- and trans-ensembles,
respectively, obtained by (b) the REST simulations with the C22
or (c) the CMAP force field listed in Table 1.
Fig. 4 Free energy landscapes G(H1,H2) of cis (left) and trans (right)
cAPB in DMSO at 570 K as obtained from our REST simulations
(cf. Table 1). The top row refers to the C22 and the bottom row to
CMAP. The subgraphs were extracted from the REST simulations:
(a) R/C/C22, (b) R/T/C22, (c) R/C/CMAP, and (d) R/T/CMAP.
predicted by C22 at 300 K one concludes that raising the
temperature sizably diminishes the differences between the
conformational landscapes calculated by C22 and CMAP.
Thus it seems that the CMAP extension affects the low-energy
parts of the potential energy landscape, which are sampled at
300 K, much more strongly than the higher energy parts,
which are sampled at 570 K.
Comparison with NMR data
As mentioned further above, the 500 K ensembles generated
by MD with C22, which are similar to our 570 K REST
ensembles (Fig. 4, top), were previously shown to explain the
NMR distance restraints25 quite well leading to the question
how our 300 K ensembles perform in this respect. As a
measure for the match between the computed conformational
ensembles and the observed NMR distance restraints25 we
employ the root mean square violation (RMSV) between the
NMR distance restraints and REST ensembles defined in
Methods by eqn (7) and (8).
Fig. 5 shows the RMSVs for the two force fields C22 (squares)
and CMAP (dots) for the cis (a) and trans (b) ensembles
obtained by REST at 300 K as functions of the respective
simulation time. All RMSVs are seen to level off after about
80 ns (cis) or 50 ns (trans) of REST simulations, respectively. To
demonstrate that convergence is reached after 80 ns also for the
more extended cis conformational space, we have extended the
REST simulation R/C/CMAP to 150 ns. The associated RMSV
curve (dots) in Fig. 5a is extremely flat in the time range beyond
100 ns indicating that convergence has been reached, indeed.
The most striking feature of Fig. 5 is that the RMSVs
generated with the C22 force field are much larger than those
produced with CMAP indicating that the CMAP ensembles
agree much better with the NMR data than the C22 ensembles.
Recall in this context that the conformational landscapes as
Fig. 5 RMSV(0,t) between the NMR distance restraints25 and the
corresponding REST ensembles at 300 K as a function of the simulation time t [cf. eqn (8)]; the squares mark C22 and dots CMAP results;
the solid (cis) and dashed (trans) lines connecting the respective
symbols serve as guides for the eye; (a) cis and (b) trans ensembles.
characterized by Fig. 3 showed only minor differences upon
exchange of the force field. Thus these small differences
apparently suffice to bring the CMAP ensemble much closer
to the evidence provided by NMR. According to the analysis
of the properties of the RMSV presented in the ESI these small
differences must consist in a few conformational sub-states,
which feature small proton–proton distances for proton pairs
with measured NOEs and which are present in the CMAP but
absent in the C22 ensembles (both for cis and trans). This
finding is in line with the analyses of van Gunsteren et al.68
which imply that minor modifications of conformational
ensembles may have large impacts on violations of NOE
distance restraints. Furthermore, this finding is remarkable
as it demonstrates that the CMAP extension of C22, which
was derived mainly by quantum chemistry on small isolated
model peptides,51 provides a more realistic description for
a model peptide in solution which, due to the cyclization
constraints, samples unusual local conformations.
The noted strong violation of the NMR data by the 300 K
ensembles obtained with C22 is a surprise and seems to
contradict the previous results of Carstens et al.40 who stated
that their 500 K ensembles, which were also calculated
with C22, agree very well with the NMR distance restraints
particularly for cis-cAPB while exhibiting larger violations in
the trans case (data reproduced in the ESI; see column 3 of
Tables 4 and 5). To gain an understanding of this apparent
contradiction we have calculated the RMSVs for the various
ensembles obtained at all rungs of the REST temperature
ladder. This allows us to plot the RMSVs as functions of the
simulation temperature. Note in this context that in REST
only the peptide is ‘‘hot’’ whereas the surrounding solvent
remains effectively ‘‘cool’’ due to the scaling of its Hamiltonian.
In our following analysis we thus assume that the effect of the
nonphysical solvent on the conformational landscape of the
peptide is small. This assumption is supported by recent
findings of Reichold.69
Fig. 6 shows that the RMSVs monotonically decrease with
increasing peptide temperature for each force field as well as
for cis- and trans-cAPB. As is shown in section “Temperature
dependence and other properties of the RMSV” of the ESI, a
monotonic decrease upon increasing temperature is a generic
property of the RMSV when one considers structures with a
small thermal expansion coefficient. For our cAPB peptide the
cyclic closure apparently entails a sufficiently small thermal
expansion.
The decrease of the RMSV in Fig. 6 is more pronounced for
cis than for trans and stronger for C22 than for CMAP. For
CMAP it had to be expected that the decrease is small, because
the associated RMSVs are already small at 300 K and are
bounded from below. For C22 and cis-cAPB the heating
brings the RMSV from 0.81 down to 0.29, i.e., close to the
values 0.34 predicted by CMAP at 300 K and 0.23 calculated
by us from the 500 K cis data of Carstens et al. (cf. Table S4 in
the ESI).† As a result, the high-temperature cis ensembles seem
to perform better than the low temperature ones (particularly
for the C22 description of cis-cAPB). For the combination
C22 and trans a sizable RMSV (0.45) remains at higher
temperatures, which once again matches the previous 500 K
MD result of 0.47 and is much larger than the CMAP result of
Fig. 6 RMSVs of the REST ensembles as functions of the temperature; solid lines mark cis and dashed lines trans ensembles; squares
refer to C22 and dots to CMAP.
0.13 at 300 K. Thus, the strongly constrained trans-cAPB
yields the decisive evidence for the superiority of CMAP
over C22.
The smaller RMSVs observed at higher temperatures do not
mean that these ensembles are more ‘‘realistic’’. As explained
in the ESI, at these temperatures the conformational landscapes artificially extend over much larger portions of the conformational spaces and, therefore, have higher probabilities to
cover also structures with vanishing NOE violations. Even a
small number of such structures can strongly reduce the highly
non-linear RMSV measure [cf. eqn (7) and (8), the ESI, and
ref. 68]. Because of these high-temperature artifacts the 300 K
ensembles are much more realistic. Note once again that they
cover for cis-cAPB the native-like a-helical conformation,
which disappears in a melting transition upon increase of the
simulation temperature (cf. Fig. 3 and 4 above).
As a result, our above analysis has demonstrated that the
REST simulations with the CMAP force field have provided
models for the 300 K equilibrium ensembles of cis- and trans-cAPB which comply with the NMR data very well. The C22
ensembles show larger deviations although they have a considerable overlap with the CMAP ensembles. We conclude
that a few conformational substates allowed by CMAP and
precluded by C22 cause the much better match of the CMAP
ensembles with the NMR data.
Distinguishing cAPB isomers
According to our discussion of Table 2 the isomeric states of
cAPB can be distinguished by the average helicity $\bar{H}_2$ within
the core of the peptide. Thus, measuring $\bar{H}_2(t)$ for an ensemble
of non-equilibrium simulations can provide access to the
kinetics of the photoinduced relaxation processes. In a similar
way also the time resolved UV/vis pump and IR probe
spectroscopy addressing this kinetics must be capable of
clearly distinguishing the cis and trans ensembles of cAPB.
The amide bands in the IR spectra of peptides, which
originate from the normal modes of the highly polar amide
groups making up the backbone, are known to change their
shapes and spectral locations with the structure of the peptide71–73
and the polarity of the solvent (see Schultheis et al.74 for
explanations). Thus, the amide bands of cAPB in DMSO
should be capable of distinguishing cis- and trans-cAPB.
In the dark, the trans conformation is the equilibrium state
of cAPB.26 Its IR spectrum is shown in Fig. 7a (dashed line).
Fig. 7 IR absorption of cAPB in the spectral region of the amide I
and II bands. (a) Absorption $A_c(\tilde{\nu})$ of pure cis (solid line) and $A_t(\tilde{\nu})$ of
pure trans ensembles (dashed line), respectively. (b) Absorption
changes $\Delta\tilde{A}_{c \to t}(\tilde{\nu})$ induced by cis/trans (solid line) and $\Delta\tilde{A}_{t \to c}(\tilde{\nu})$ by
trans/cis (dashed line) isomerization of the azobenzene photoswitch
elicited by illumination at λ > 400 nm and λ ∼ 370 nm, respectively.
Illuminating this conformation with light of wavelengths
λ ≈ 370 nm yields a photostationary ensemble containing
about 90% cis-cAPB. Because the IR spectrum $A_t(\tilde{\nu})$ of
the pure trans state is known, one easily constructs an IR
spectrum $A_c(\tilde{\nu})$ of pure cis-cAPB (Fig. 7a, solid line).
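The construction of the pure cis spectrum is not written out in the text. A plausible sketch, assuming that the photostationary spectrum is a linear superposition of the two pure spectra at fixed total concentration, solves A_pss = f·A_c + (1 − f)·A_t for A_c with f ≈ 0.9. The function and argument names are hypothetical.

```python
import numpy as np

def pure_cis_spectrum(a_pss, a_trans, cis_fraction=0.9):
    """Recover A_c from the photostationary spectrum A_pss and A_t.

    Assumes A_pss = f * A_c + (1 - f) * A_t with cis fraction f.
    """
    a_pss = np.asarray(a_pss, dtype=float)
    a_trans = np.asarray(a_trans, dtype=float)
    return (a_pss - (1.0 - cis_fraction) * a_trans) / cis_fraction
```

Since the cis fraction is only known approximately (about 90%), any error in it propagates directly into the reconstructed spectrum.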
The spectra of the trans- and the cis-conformations are
very similar. Nevertheless the differences between the two
spectra can be accurately determined by recording the cis/trans
difference spectra induced by illumination with light at longer
wavelength (λ ∼ 400 nm) (Fig. 7b, solid line). The observed
absorption change is of the same order as found for other
chromopeptides.34,37,75 When the trans isomer is illuminated at
370 nm one can measure the trans/cis difference spectrum
(Fig. 7b, dashed line). The reversibility of the photoreaction is
demonstrated by the fact that the two difference spectra are
(up to the sign) basically identical.
In the region of the amide I band near 1660 cm⁻¹ the
trans/cis conversion is seen to be associated with a shift of
IR intensity toward lower frequencies. Such a redshift can be
explained by amide groups becoming more strongly exposed
to the polar DMSO environment.74 In fact, according to our
MD data, in trans the backbone is more tightly folded and,
thus, less solvent accessible than in cis.
Monitoring the cis/trans relaxation by IR spectroscopy
Fig. 8 displays transient and stationary absorption spectra of
cAPB. In the transient experiment the photostationary cis
ensemble was excited by a 404 nm laser flash with a duration
of about 700 fs. Before arrival of the laser flash the time
resolved spectrum is identical to the original spectrum, thus
the difference spectrum $\Delta A_{c \to t}(\tilde{\nu}, t \to -\infty)$ vanishes. At very
late times after excitation, converted molecules will reach
the spectrum of the trans state. Hence, the time resolved
cis/trans difference spectra $\Delta A_{c \to t}(\tilde{\nu},t)$ will approach the
stationary difference spectrum $\Delta\tilde{A}_{c \to t}(\tilde{\nu})$ shown as a solid line
in Fig. 7b/8c.
Fig. 8b shows a series of time resolved cis/trans difference
spectra $\Delta A_{c \to t}(\tilde{\nu},t)$ obtained at selected time points t/ps ∈ {2, 10,
20, 100, 500, 3000} after the 404 nm laser flash inducing the
isomerization. From the picosecond to the nanosecond time
Fig. 8 IR spectra of cAPB in the amide I range given in arbitrary
units. (a) IR absorption of cis-cAPB in DMSO. (b) Transient spectra
$\Delta A_{c \to t}(\tilde{\nu},t)$ induced by cis/trans photoisomerization of the azobenzene
switch. (c) Stationary difference spectrum $\Delta\tilde{A}_{c \to t}(\tilde{\nu})$.
range the transient spectra $\Delta A_{c \to t}(\tilde{\nu},t)$ are seen to undergo
drastic changes. Immediately after the excitation of the azobenzene
unit (see $\Delta A_{c \to t}(\tilde{\nu},t)$ at t = 2 ps) a redshift of the amide I band
is observed as demonstrated by the bleach between 1650 and
1700 cm⁻¹ and the increased absorption between 1625 and
1645 cm⁻¹. It is well established that such a redshift can
be interpreted as a signature of a hot peptide exhibiting a
non-thermal distribution of vibrational excitation within the
amide I modes and among low frequency modes that are
anharmonically coupled to the amide I modes.34,76 The thermal
excess energy within the peptide thus identified by the transient
spectrum at t = 2 ps stems from the UV/vis photon initially
absorbed by the chromophore and is generated by internal
conversion (while the dye relaxes from the electronically
excited state into the ground state). The noted redshift disappears
within 5–10 ps and is seen to be absent after 20 ps implying
that cAPB has dissipated the thermal excess energy into the
DMSO environment by that time.
In the remaining time span the depicted transient spectra
$\Delta A_{c \to t}(\tilde{\nu},t)$ solely reflect the conformational relaxation of
cAPB’s peptide moiety on its way from the cis toward the
trans equilibrium ensemble. The spectra are seen to approach
the stationary target spectrum $\Delta\tilde{A}_{c \to t}(\tilde{\nu})$ in Fig. 8c. However,
they clearly do not reach this target completely within the
recorded time span of 3 ns. As a result, after the cis/trans
photoisomerization of the azobenzene switch, the relaxation of
cAPB into the trans ensemble takes much more time than
previously assumed.39
In this context it is important to note that the differences
between the stationary spectrum $\Delta\tilde{A}_{c \to t}(\tilde{\nu})$ and the time
resolved spectrum $\Delta A_{c \to t}(\tilde{\nu},t)$ at t = 3 ns are still sizable
and are much more clearly detectable than corresponding
differences previously identified for the chromophore absorption in the UV/vis spectral range.39 Thus, as expected, our IR
spectroscopy of the amide I region is actually a much more
Fig. 9 Time transients $\Delta A_{c \to t}(\tilde{\nu},t)$ showing the kinetics at the indicated
spectral positions $\tilde{\nu}$. The solid lines are the result of a multichannel
analysis comprising frequencies $\tilde{\nu}$ regularly distributed ($\Delta\tilde{\nu} \approx 3$ cm⁻¹)
over the amide I region with a sum of four exponential functions. The fit
functions are drawn for delay times t ≥ 9 ps, because our study focuses
on the slow processes of conformational relaxation within the peptide
part of cAPB.
sensitive probe to structural relaxations within the peptide
moiety of cAPB than the UV/vis spectroscopy, which employs
the changing chromophore absorption as a probe.
The structural relaxation dynamics encoded in the time
resolved transient spectra $\Delta A_{c \to t}(\tilde{\nu},t)$ for the first 3 ns after
photoisomerization can be kinetically characterized by a global
fit with a sum of exponentials, i.e. with $\sum_j a_j(\tilde{\nu}) \exp(-t/\tau_j)$.
Fig. 9 demonstrates the quality of such a fit for which four
exponential functions have been chosen by comparing the fit
with the time dependence of $\Delta A_{c \to t}(\tilde{\nu},t)$ at three selected
frequencies $\tilde{\nu}$/cm⁻¹ ∈ {1639, 1658, 1672}. For the time period
starting 10 ps after the laser flash, within which the peptides
should have dissipated most of their initial thermal excess
energy, the fits are seen to closely reproduce the experimental
data at all chosen sample frequencies. The thus determined
relaxation times $\tau_i$ (i = 1, ..., 4) will be presented and discussed
further below in connection with corresponding simulation data.
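In a global fit of this form the time constants $\tau_j$ are shared by all frequencies, while the amplitudes $a_j(\tilde{\nu})$ enter linearly. The sketch below shows only this linear step for given trial time constants; the outer nonlinear search over the $\tau_j$ is omitted, and the function name and array layout are assumptions rather than the authors' fitting code.

```python
import numpy as np

def fit_amplitudes(t, data, taus):
    """Least-squares amplitudes a_j(nu) for fixed time constants tau_j.

    t:    delay times, shape (n_times,)
    data: transient signals, shape (n_times, n_freqs)
    taus: trial time constants shared by all frequencies, shape (n_taus,)
    Returns an array of shape (n_taus, n_freqs).
    """
    t = np.asarray(t, dtype=float)
    taus = np.asarray(taus, dtype=float)
    # design matrix with columns exp(-t / tau_j)
    design = np.exp(-np.outer(t, 1.0 / taus))
    amplitudes, *_ = np.linalg.lstsq(design, np.asarray(data, dtype=float), rcond=None)
    return amplitudes
```

A full global fit would wrap this function in a minimizer that varies `taus` and scores each trial by the residual between `design @ amplitudes` and the data.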
Monitoring the cis/trans relaxation by MD simulation
To obtain a description of cAPB’s structural relaxation after
the cis/trans isomerization of the chromophore we have
carried out the two sets I/C22 and I/CMAP of non-equilibrium
simulations listed in Table 1 and described in Methods. Each
of the 50 simulations contained in one of the two sets
starts with a simulated photoisomerization39 depositing the
energy of a UV/vis photon into a cis-cAPB structure, which
was randomly chosen from the corresponding simulated cis
equilibrium ensemble. Then each simulation describes the
structural relaxation of this solvated molecule over a time
span of 20 ns. The simulated time span is larger by a factor of 7
than the one covered experimentally by us and larger by a factor
of 20 than the one of the previous MD simulations.39
The total simulation time spent for acquiring our new data
covers 2 µs.
As shown further above, the ensemble average helicity
measure $\bar{H}_2$ can distinguish the cis and trans equilibrium
ensembles for both force fields. A time-resolved version $\bar{H}_2(t)$
of this measure is obtained at each (analysis) time point t after
the simulated photoisomerization by averaging over all
50 values $H_2(t)$ contained in the respective set I/C22 or I/CMAP.
The time dependent functions $\bar{H}_2(t)$ resulting for the two force
fields can then be approximated with fit functions

$$h_2(t) = [\bar{H}_2(0) - \bar{H}_{2,t}] \sum_j a_j \exp(-t/\tau_j) + \bar{H}_{2,t} \qquad (9)$$

where $\bar{H}_{2,t}$ is the average over the respective equilibrium trans
ensemble, $\bar{H}_2(0)$ the average over the ensemble of the 50 cis
starting structures, and $\sum_j a_j = 1$.
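The fit function of eqn (9) can be written down as a short sketch. It shows that the normalization of the amplitudes fixes the value at t = 0 to the cis starting average, while the long-time limit is the trans equilibrium average. The decay times used below are the C22 values of Table 3 converted to ns; the two helicity averages and the amplitudes are hypothetical placeholders, not values from the simulations.

```python
import math

def h2_fit(t, h2_start, h2_trans, amplitudes, taus):
    """Evaluate eqn (9): normalized multi-exponential decay from the cis
    starting average h2_start toward the trans equilibrium average h2_trans."""
    if abs(sum(amplitudes) - 1.0) > 1e-9:
        raise ValueError("amplitudes must sum to 1")
    decay = sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, taus))
    return (h2_start - h2_trans) * decay + h2_trans

# Hypothetical helicity averages and amplitudes; decay times in ns (C22, Table 3).
H2_START, H2_TRANS = 0.3, -0.4
AMPLITUDES = [0.2, 0.3, 0.5]
TAUS_NS = [0.401, 1.370, 23.094]

value_at_start = h2_fit(0.0, H2_START, H2_TRANS, AMPLITUDES, TAUS_NS)
value_at_20ns = h2_fit(20.0, H2_START, H2_TRANS, AMPLITUDES, TAUS_NS)
value_at_long_times = h2_fit(1.0e4, H2_START, H2_TRANS, AMPLITUDES, TAUS_NS)
```

Because the limiting value is built into the function, no separate offset parameter has to be fitted.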
For the two employed force fields Fig. 10 compares the
simulation results $\bar{H}_2(t)$ with fits $h_2(t)$ utilizing three exponentials.
The fits are seen to reproduce the simulation data very
well for both force fields. They do not require a component
with an infinite relaxation time, implying that the fit functions
$h_2(t)$ exactly approach the respective ensemble average values
$\bar{H}_{2,t}$ determined by the REST simulations of the equilibrium
trans ensembles at 300 K. Consequently, the longest time
constants determined by the fits are the longest time constants
in the simulated non-equilibrium ensembles which, in the long
time limit, decay exponentially towards the trans equilibrium
ensembles.
According to Fig. 10 the trans equilibrium ensembles are
quite obviously not yet reached even after 20 ns. Nevertheless,
at this point of time the relaxation is about 75% complete.
In fact, the longest time constant $\tau_3$ entering the fits $h_2(t)$ is
found to be about 23 ns for both force fields, predicting that the
process should be at least 58% complete after 20 ns.
Therefore, faster initial relaxations have moved the ensembles
already a bit closer to their respective targets. Because we have
modeled both the time resolved IR spectra $\Delta A_{c\to t}(\tilde{\nu},t)$ and the
simulation data $\bar{H}_2(t)$ by exponential mixtures, one can try to
compare the thus obtained relaxation times.
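The 58% estimate follows from the slowest component alone: for non-negative amplitudes summing to one, the remaining fraction $\sum_j a_j \exp(-t/\tau_j)$ cannot exceed $\exp(-t/\tau_3)$, so $1 - \exp(-20/23) \approx 0.58$ is a lower bound for the completed fraction after 20 ns. A minimal sketch of this arithmetic follows. The amplitude split in the second call is a hypothetical choice for illustration and is not a fitted result.

```python
import math

def completed_fraction(t, amplitudes, taus):
    """Completed fraction 1 - sum_j a_j exp(-t/tau_j) of a normalized
    multi-exponential relaxation at time t."""
    remaining = sum(a * math.exp(-t / tau) for a, tau in zip(amplitudes, taus))
    return 1.0 - remaining

# Lower bound: the whole decay is attributed to the slowest component (23 ns).
lower_bound = completed_fraction(20.0, [1.0], [23.0])

# Hypothetical amplitudes: part of the decay proceeds via faster components.
with_fast_components = completed_fraction(20.0, [0.2, 0.2, 0.6], [0.4, 1.37, 23.0])
```

Any weight carried by the faster components raises the completed fraction above the lower bound, which is the sense in which the faster initial relaxations move the ensembles closer to their targets.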
Kinetics of the cis/trans relaxation
Table 3 compares the four time constants $\tau_j$ ($j = 0,\ldots,3$)
resulting from the global fit to the transient IR spectra
$\Delta A_{c\to t}(\tilde{\nu},t)$ with the three time constants determined from
the fits $h_2(t)$ to the simulation data $\bar{H}_2(t)$. We provide this
tentative comparison of kinetic constants although it rests on
the ad hoc assumption that the amide I spectra and the helicity
score $\bar{H}_2(t)$ of the central peptide portion of cAPB are
similarly sensitive to the ongoing conformational relaxation.
It would have been much more appropriate to compute the
changes of the amide I bands directly from the MD trajectories
using a reliable and sufficiently cost-effective method and to
predict the kinetics from the calculated spectra. However,
sufficiently accurate descriptions of amide bands are currently
accessible only through hybrid methods combining density
functional theory with molecular mechanics (see ref. 77 for a
review) and such DFT/MM computations are extremely
costly. In a previous analysis of the time-resolved unzipping
of a light-switchable β-hairpin peptide, which also combined
ultrafast IR spectroscopy with theoretical descriptions,38 the
computation of the amide I bands for a few snapshots of the
solvated peptide took us several months of computer time on a
Linux cluster. Such DFT/MM computations are far too costly
for the given purpose of computing temporally resolved IR
spectra of cAPB’s cis/trans relaxation in DMSO. Furthermore,
Stock et al.43 have critically discussed the unclear relation
between kinetics derived from MD conformational coordinates
and time-resolved amide I spectra by considering a cyclic
peptide related to the one studied by us. Here they cautiously
attempted to use one of the existing (and notoriously unreliable74)
empirical models for estimating amide I band shapes. The lack
of a cost effective and accurate computational method has
inspired some of us to develop a new type of polarizable force
field for amide groups that aims at the (time-resolved) computation of amide bands.74 However, currently this method is
not yet fully established and ready to use. Therefore we have
to resort to a plausible but by no means rigorously founded
comparison of kinetics derived from different observables.
The shortest relaxation time $\tau_0 = 11$ ps found experimentally
and listed in Table 3 describes the cooling kinetics of the
initially hot peptide. This time constant has no correspondence
in the simulation results for the simple reason that the observable
$\bar{H}_2$ is nearly independent of the temperature. In fact, when
calculating the averages $\bar{H}_2$ for the REST ensembles as a
function of the temperature one finds only very small changes
(Fig. S16 in the ESI documents this fact).† For the MD
description of the fast cooling one needs other observables,
e.g., the total energy39 or directly the temperature of cAPB
(see the ESI for data and discussion).

Fig. 10 Simulated relaxation dynamics induced by the cis/trans
isomerization of the cAPB chromophore using (a) the C22 and
(b) the CMAP force field. The relaxation is monitored by the average
helicity score $\bar{H}_2(t)$ determined from the data sets I/C22 and I/CMAP
comprising 50 non-equilibrium MD simulations each. The light gray
lines are fits with the functions $h_2(t)$ specified in the text. The
horizontal lines mark the averages $\bar{H}_2$ over the cis and trans ensembles
obtained by the respective REST simulations. The ESI displays in
Fig. S18 the above data additionally on a logarithmic time scale to
resolve the fast processes more clearly.

Table 3 Decay times $\tau_j$ from fits with n exponentials

Times(a)    IR(b)    C22(c)    CMAP(d)
τ0          11       –         –
τ1          137      401       49
τ2          1370     1370      981
τ3          ∞        23094     22610

(a) Times are given in ps. (b) The time resolved IR difference spectra
$\Delta A_{c\to t}(\tilde{\nu},t)$ were fitted with n = 4. (c) The average helicity $\bar{H}_2(t)$
calculated from simulation I/C22 was fitted with n = 3. (d) And that
from I/CMAP with n = 3.
According to Table 3 the next slower relaxation time
$\tau_1 = 0.14$ ns determined by IR spectroscopy differs from the
fastest time constants $\tau_1$ calculated for the relaxation of the
core helicity $\bar{H}_2(t)$. Simulation I/C22 yields $\tau_1 = 0.40$ ns and
I/CMAP $\tau_1 = 0.05$ ns, respectively. The statistical errors of the
calculated time constants $\tau_1$–$\tau_3$ are about 40%, 20%, and
10%, respectively, as one can estimate by dividing the two
sets of 50 relaxation trajectories into subsets comprising
25 trajectories each, by computing the fit functions eqn (9),
and by evaluating standard deviations. Thus, at short times the
statistical errors of the computed time scales $\tau$ are quite large.
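The subset procedure for estimating statistical errors can be sketched as follows. The sketch is deliberately simplified and should not be read as the procedure actually used: it fits a single exponential by log-linear regression instead of the three-exponential function eqn (9), and it operates on synthetic, noise-free trajectories. All function names and data are assumptions made for illustration.

```python
import math
import statistics

def fit_single_tau(times, values):
    """Log-linear least-squares fit of values ~ A * exp(-t/tau); returns tau."""
    logs = [math.log(v) for v in values]
    n = len(times)
    mean_t = sum(times) / n
    mean_l = sum(logs) / n
    slope = (sum((t - mean_t) * (l - mean_l) for t, l in zip(times, logs))
             / sum((t - mean_t) ** 2 for t in times))
    return -1.0 / slope

def subset_tau_spread(times, trajectories, n_subsets=2):
    """Split the trajectories into equal subsets, fit each subset average,
    and return the mean and the standard deviation of the fitted taus."""
    size = len(trajectories) // n_subsets
    taus = []
    for k in range(n_subsets):
        subset = trajectories[k * size:(k + 1) * size]
        average = [sum(traj[i] for traj in subset) / len(subset)
                   for i in range(len(times))]
        taus.append(fit_single_tau(times, average))
    return statistics.mean(taus), statistics.stdev(taus)

# Synthetic example: 50 trajectories, the first 25 decaying with tau = 4 ns
# and the last 25 with tau = 6 ns (hypothetical values).
times = [0.5 * k for k in range(1, 41)]
trajectories = ([[math.exp(-t / 4.0) for t in times] for _ in range(25)]
                + [[math.exp(-t / 6.0) for t in times] for _ in range(25)])
mean_tau, std_tau = subset_tau_spread(times, trajectories)
```

With only two subsets the standard deviation is itself a rough number, which fits the order-of-magnitude character of the quoted error percentages.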
As opposed to the case of the short 100 ps trajectories used
to monitor the fast processes of heat dissipation with very
good statistics (cf. the ESI), computational limitations forced
us to strongly limit the number of extended 20 ns simulations.
Thus, instead of 500 only 50 trajectories could be calculated
for each force field. Due to the multitude of conformational
substates offered by cis-cAPB (cf. Fig. 3), the set of only 50
starting structures cannot adequately represent the complex cis
ensemble perturbed by the laser flash. The α-helical substate,
for instance, will be represented in this set by only very few
samples (in fact, for C22 we counted two and for CMAP one
α-helix). Since the unfolding of such a structure is a random
experiment, the corresponding kinetics can be determined
from one or two unfolding events only with huge uncertainties.
Inspecting Fig. 10 once again one sees that the traces of
$\bar{H}_2(t)$ show sizable fluctuations (caused by the small number
of samples), which, particularly on short time scales, may
distort the measurement of decay times.
Furthermore, the limited statistics of only 50 relaxation
simulations for each force field does not allow us to generate
smooth time dependent free energy landscapes $G[H_1,H_2,t]$ by
simple counting. However, by considering each trajectory
as a representative of a class, which is normally distributed
in the helicity plane at each time point t (standard deviation
$\sigma = 0.1$), we can expand the number of data points
$[H_1(t),H_2(t)]_r$, $r = 1,\ldots,50$, delivered by the simulations
by drawing random samples from these normal distributions. Fig. 11 shows the resulting landscapes
$G[H_1,H_2,t]$ at the time points $t/\mathrm{ns} \in \{0, 0.2, 2, 20\}$. They
are compared with the equilibrium ensembles obtained by
REST for cis (top) and trans (bottom). The left column refers to
the C22 force field, the right column to CMAP. Note that Table
S10 in the ESI† lists for each of the distributions shown in
Fig. 11 the average linkage and core helicities $\bar{H}_1(t)$ and $\bar{H}_2(t)$,
respectively.
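The smoothing step described above can be sketched in a few lines. Each simulated point in the helicity plane is replaced by a number of normally distributed samples with standard deviation 0.1, and a free energy map is obtained from the binned sample frequencies. The relation G = -k_B T ln p, the bin width, the number of copies per point, the energy unit, and all names are assumptions made for this illustration; the actual definition of the landscapes is given in Methods.

```python
import math
import random
from collections import Counter

K_B_T = 0.0019872 * 300.0  # k_B T at 300 K in kcal/mol (assumed unit)

def expand_points(points, sigma=0.1, copies=200, seed=0):
    """Replace each (H1, H2) point by `copies` normally distributed samples."""
    rng = random.Random(seed)
    return [(rng.gauss(h1, sigma), rng.gauss(h2, sigma))
            for h1, h2 in points for _ in range(copies)]

def free_energy_map(samples, bin_width=0.1):
    """Bin the samples and return G = -k_B T ln p per bin, shifted so that
    the most populated bin has G = 0."""
    counts = Counter((math.floor(h1 / bin_width), math.floor(h2 / bin_width))
                     for h1, h2 in samples)
    total = sum(counts.values())
    g = {b: -K_B_T * math.log(c / total) for b, c in counts.items()}
    g_min = min(g.values())
    return {b: value - g_min for b, value in g.items()}

# Two hypothetical trajectory points at one time point t.
points = [(0.5, 0.6), (-0.3, -0.2)]
samples = expand_points(points, copies=500)
landscape = free_energy_map(samples)
```

The fixed seed only serves to make the sketch reproducible; the choice of the standard deviation controls how strongly the sparse data are smeared out.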
The landscapes $G[H_1,H_2,t]$ depicted in Fig. 11 underline
our above statement on the incomplete representation of the
cis ensemble by the 50 randomly drawn starting structures
(compare the two top rows). In addition, the inspection of the
figure strongly suggests that the strain-driven unfolding of the
α-helical substate is one of the fastest conformational transitions elicited by the chromophore’s cis/trans isomerization. This sparsely populated substate is seen to disappear
within the first 0.2 ns after the photoisomerization for both
force fields. Because the α-helical conformation is associated
with the largest value of $H_2$ found in the cis ensemble, its
disappearance will leave, despite its small population, clear
traces in the time course of $\bar{H}_2(t)$. Correspondingly, the helix
unfolding provides a major contribution to the fastest time
constant $\tau_1$ predicted by MD. The extremely limited statistics,
by which this process is sampled (one or two events), then
immediately explains the large mutual deviations of the MD
values listed in Table 3 for $\tau_1$. On the other hand, the 0.14 ns
kinetics observed experimentally may well reflect the force-driven unfolding of an α-helical substate.

Fig. 11 Temporal evolution of ensembles. For explanation see
the text.
For the next slower relaxation kinetics described by $\tau_2 \approx 1$ ns
the IR and MD results exhibit a remarkable (and in part
accidental) agreement (cf. Table 3). This decay constant is
determined with better statistics from the simulation data $\bar{H}_2(t)$
than $\tau_1 \approx 0.1$ ns. To identify the processes underlying $\tau_2$ recall
the results of Spörlein et al.39 These authors have shown that
the UV/vis spectrum of the chromophore, which is reached
after 1 ns, closely resembles the stationary reference spectrum
of the trans state. The spectral similarity indicates that the
mechanical strain exerted by the chromophore on the peptide
moiety is largely gone at that point of time. Therefore, the
decay time $\tau_2 \approx 1$ ns may reflect the slowest force driven
structural transitions serving to reduce the mechanical strain
within the cyclic azopeptide. Comparing in Fig. 11 the time-resolved free energy landscapes reached after 2 ns with the
initial landscapes one sees that all core helicities $H_2 > 0.5$
vanish within the first two nanoseconds. Within this time span
the more extended trans-chromophore thus substantially
stretches the originally quite helical peptide core.
The time-resolved IR measurements were limited to 3 ns.
Because the relaxation of the peptide moiety is far from being
complete after 3 ns, the associated spectral fits contain a fourth
time constant $\tau_3 = \infty$. In contrast, the fits $h_2(t)$ to the
simulation data $\bar{H}_2(t)$ do not require an additional offset for
$t \to \infty$ because the known limiting value $h_2(t \to \infty) = \bar{H}_{2,t}$
has been built into the fit function eqn (9). As a result
and independently of the chosen force field, the simulation
data definitely predict that the relaxation toward the trans
ensemble is complete on the computed time scale $\tau_3 \approx 23$ ns
(cf. Table 3).
According to the arguments given above, the slow 23 ns
kinetics, by which the equilibrium trans ensemble is reached,
most likely no longer serves to relieve mechanical
stress. Instead, it should be associated with a stochastic search
within the conformational space that transforms the cis-related
non-equilibrium ensemble, which is reached after a few nanoseconds, eventually into the equilibrium trans ensemble
(cf. Fig. 11). This stochastic search involves thermally driven
flips of the amide groups around the dihedral angles $\psi_i$ at the
Cα atoms of the residues i. As a result, the helicities $H_1$ in the
linkage regions and $H_2$ in the core of the peptide will approach
the trans distribution which should resemble the distributions
shown at the bottom of Fig. 11b for C22 (left) and for CMAP
(right).
The surprising similarity of the time constant $\tau_3$ determined
with the two force fields then indicates that the energy barriers,
which are associated with the mentioned flips, have similar
heights. This result corroborates our earlier hypothesis that
the CMAP extension hardly modifies the high energy regions
of the C22 potential energy landscape (see our discussion of
Fig. 3 and 4).
Conclusion
We have reinvestigated the cis/trans photoisomerization of the
monocyclic azopeptide cAPB in DMSO by combining ultrafast UV/vis pump IR probe spectroscopy with non-equilibrium
MD simulations. Because these simulations require the knowledge of cAPB’s conformational equilibria in the cis and trans
states of the APB chromophore, we have calculated these
equilibria by applying the novel REST technique.48,49
Here, we have compared two different force fields. We
found that the CMAP extension51 describes the well-known
NMR distance restraints of cis- and trans-cAPB25 much better
than the original C22 force field.50 Furthermore we have
demonstrated that the CMAP extension mainly modifies the
low-energy regions of the C22 energy landscape while leaving
the sizes of the energy barriers nearly invariant. Both force
fields predict the presence of an α-helical substate in the 300 K
ensembles of the cis conformation, whose predominance had
been one of the objectives during the design of cAPB.26 This
α-helical substate had been absent in the set of refined NMR
structures25 and in the high-temperature cis ensemble
computed earlier by a 50 ns MD simulation.40
We would like to stress however that, despite the differences
of cAPB’s conformational cis and trans ensembles identified
by us between low (300 K) and high (≈500 K) temperatures,
the key conclusions of the earlier MD study remain valid. The
study correctly explains the shortcomings of the usual NMR
structure refinement when applied to flexible peptides.
Here the refinement erroneously predicts artificially compact
structures.40,68
As shown by us, the cis and trans conformational ensembles
of cAPB at 300 K can be experimentally distinguished by the
shapes of the amide I bands in the IR spectra. In the simulation data they can be distinguished by considering the ensemble
average helicity in the core region (A3-T4-C5-D6) of the
peptide moiety. The transient IR difference spectra, which
monitor the peptide relaxation over a 3 ns time span after the
photoisomerization, reveal a cooling process with a decay time
of 11 ps which is overlooked by the helicity measure because
the latter is largely independent of temperature. In addition,
the observed time-resolved IR spectra and the simulated
helicity time series identify two further relaxation time constants $\tau_1 \approx 0.1$ ns and $\tau_2 \approx 1$ ns. The MD simulations suggest
that both time constants belong to force driven relaxations
relieving the mechanical strain that had been built up within
the cyclic azopeptide by the stretching of the chromophore
during cis/trans photoisomerization. Here, $\tau_1$ is associated
with the fast stretching of peptide conformations, which
feature very high core helicities and include the α-helical
substate mentioned above. Furthermore, $\tau_2$ represents the
time scale at which the forces exerted by the chromophore
on the peptide are finally relieved.
After the first few nanoseconds of strain reduction a non-equilibrium conformational ensemble remains. The thermal
relaxation of this ensemble toward the target equilibrium
ensemble of the trans state is outside the limited time window
of our experimental setup. The ensembles of MD simulations,
however, which have been generated for the C22 and the
CMAP force fields, respectively, and cover the time span of
20 ns, unanimously predict that this relaxation proceeds with a
decay time $\tau_3 \approx 23$ ns. It will be interesting to see whether this
prediction can be confirmed by temporally more extended
measurements.
When trying to bridge the gap between theory and experiment in the field of biomolecular simulation one has usually
been confronted with the problem that the time scales covered
by the experimental studies are too large for simulation descriptions because of computational limitations. Interestingly, the
small cAPB model compound has now been demonstrated to
provide an example for which the reverse is true: The simulation
could cover time scales inaccessible to the experimental setup
due to technical restrictions.
Acknowledgements
This work was supported by the Deutsche Forschungsgemeinschaft (Grants SFB 533/C1, SFB 749/A5/C4, Forschergruppe
526). Computer time provided by Leibniz Rechenzentrum (project
uh408) is gratefully acknowledged.
References
1 A. Liwo, M. Khalili and H. A. Scheraga, Proc. Natl. Acad. Sci.
U. S. A., 2005, 102, 2362–2367.
2 Y. Duan and P. A. Kollman, Science, 1998,
282, 740–744.
3 J. Kubelka, J. Hofrichter and W. A. Eaton, Curr. Opin. Struct.
Biol., 2004, 14, 76–88.
4 P. L. Freddolino, F. Liu, M. Gruebele and K. Schulten, Biophys.
J., 2008, 94, L75–L77.
5 C. Dobson and M. Karplus, Curr. Opin. Struct. Biol., 1999, 9,
92–101.
6 W. C. Still, A. Tempczyk, R. C. Hawley and T. Hendrickson,
J. Am. Chem. Soc., 1990, 112, 6127–6129.
7 B. Zagrovic, E. J. Sorin and V. Pande, J. Mol. Biol., 2001, 313,
151–169.
8 D. Satoh, K. Shimizu, S. Nakamura and T. Terada, FEBS Lett.,
2006, 580, 3422–3426.
9 H. Lei and Y. Duan, J. Mol. Biol., 2007, 370, 196–206.
10 R. Zhou and B. J. Berne, Proc. Natl. Acad. Sci. U. S. A., 2002, 99,
12777–12782.
11 H. Nymeyer and A. E. García, Proc. Natl. Acad. Sci. U. S. A.,
2003, 100, 13934–13939.
12 R. Geney, M. Layten, R. Gomperts, V. Hornak and C. Simmerling,
J. Chem. Theory Comput., 2006, 2, 115–127.
13 B. Egwolf and P. Tavan, J. Chem. Phys., 2003, 118,
2039–2056.
14 M. Stork and P. Tavan, J. Chem. Phys., 2007, 126,
165105.
15 M. Stork and P. Tavan, J. Chem. Phys., 2007, 126, 165106.
16 M. Levitt and R. Sharon, Proc. Natl. Acad. Sci. U. S. A., 1988, 85,
7557–7561.
17 P. Tavan, H. Carstens and G. Mathias, Protein Folding Handbook.
Part I., Wiley-VCH, Weinheim, 2005, ch. 33, pp. 1170–1195.
18 H. J. C. Berendsen, Science, 1998, 282, 642–643.
19 J. Breton, J. Martin, A. Migus, A. Antonetti and A. Orszag, Proc.
Natl. Acad. Sci. U. S. A., 1986, 83, 5121–5125.
20 K. Wüthrich, NMR of Proteins and Nucleic Acids, Wiley,
New York, 1986.
21 S. Williams, T. P. Causgrove, R. Gilmanshin, K. S. Fang,
R. H. Callender, W. H. Woodruff and R. B. Dyer, Biochemistry,
1996, 35, 691.
22 R. M. Ballew, J. Sabelko and M. Gruebele, Proc. Natl. Acad. Sci.
U. S. A., 1996, 93, 5759–5764.
23 W. A. Eaton, V. Muñoz, S. J. Hagen, G. S. Jas, L. J. Lapidus,
E. R. Henry and J. Hofrichter, Annu. Rev. Biophys. Biomol. Struct., 2000,
29, 327–359.
24 P. Hamm, J. Helbing and J. Bredenbeck, Annu. Rev. Phys. Chem.,
2008, 59, 291–317.
25 C. Renner, R. Behrendt, S. Spörlein, J. Wachtveitl and
L. Moroder, Biopolymers, 2000, 54, 489–500.
26 R. Behrendt, C. Renner, M. Schenk, F. Q. Wang, J. Wachtveitl,
D. Oesterhelt and L. Moroder, Angew. Chem., Int. Ed., 1999, 38,
2771.
27 J. R. Kumita, O. S. Smart and G. A. Woolley, Proc. Natl. Acad.
Sci. U. S. A., 2000, 97, 3803–3808.
28 L. Ulysse, J. Cubillos and J. Chmielewski, J. Am. Chem. Soc., 1995,
117, 8466–8467.
29 S.-L. Dong, M. Löweneck, T. Schrader, W. Schreier, W. Zinth,
L. Moroder and C. Renner, Chem.–Eur. J., 2006, 12, 1114–1120.
30 A. Aemissegger and D. Hilvert, Nat. Protoc., 2007, 2,
161–167.
31 C. Renner, U. Kusebauch, M. Löweneck, A. G. Milbradt and
L. Moroder, J. Pept. Res., 2005, 65, 4–14.
32 A. Aemissegger, V. Kräutler, W. F. van Gunsteren and D. Hilvert,
J. Am. Chem. Soc., 2005, 127, 2929–2936.
33 G. Mathias, B. Egwolf, M. Nonella and P. Tavan, J. Chem. Phys.,
2003, 118, 10847–10860.
34 J. Bredenbeck, J. Helbing, A. Sieg, T. Schrader, W. Zinth, C. Renner,
R. Behrendt, L. Moroder, J. Wachtveitl and P. Hamm, Proc. Natl.
Acad. Sci. U. S. A., 2003, 100, 6452–6457.
35 J. Wachtveitl, S. Spörlein, H. Satzger, B. Fonrobert, C. Renner,
R. Behrendt, D. Oesterhelt, L. Moroder and W. Zinth, Biophys. J.,
2004, 86, 2350–2362.
36 G. A. Woolley, Acc. Chem. Res., 2005, 38, 486–493.
37 J. A. Ihalainen, J. Bredenbeck, R. Pfister, J. Helbing, L. Chi and I.
H. M. van Stokkum, Proc. Natl. Acad. Sci. U. S. A., 2007, 104,
5383–5388.
38 T. E. Schrader, W. J. Schreier, T. Cordes, F. O. Koller,
G. Babitzki, R. Denschlag, C. Renner, M. Löweneck,
S.-L. Dong, L. Moroder, P. Tavan and W. Zinth, Proc. Natl.
Acad. Sci. U. S. A., 2007, 104, 15729–15734.
39 S. Spörlein, H. Carstens, H. Satzger, C. Renner, R. Behrendt,
L. Moroder, P. Tavan, W. Zinth and J. Wachtveitl, Proc. Natl.
Acad. Sci. U. S. A., 2002, 99, 7998–8002.
40 H. Carstens, C. Renner, A. G. Milbradt, L. Moroder and
P. Tavan, Biochemistry, 2005, 44, 4829–4840.
41 P. H. Nguyen, Y. Mu and G. Stock, Proteins: Struct., Funct.,
Bioinf., 2005, 60, 485–494.
42 P. H. Nguyen and G. Stock, Chem. Phys., 2006, 323, 36–44.
43 P. H. Nguyen, R. D. Gorbunov and G. Stock, Biophys. J., 2006,
91, 1224–1234.
44 M. Erdelyi, A. Karlen and A. Gogoll, Chem.–Eur. J., 2006, 12,
403–412.
45 D. Du, Y. Zhu, C.-Y. Huang and F. Gai, Proc. Natl. Acad. Sci.
U. S. A., 2004, 101, 15915–15920.
46 K. Hukushima and K. Nemoto, J. Phys. Soc. Jpn., 1996, 65,
1604–1608.
47 R. Denschlag, M. Lingenheil and P. Tavan, Chem. Phys. Lett.,
2008, 458, 244–248.
48 R. Denschlag, M. Lingenheil, P. Tavan and G. Mathias, J. Chem.
Theory Comput., 2009, 5, 2847–2857.
49 P. Liu, B. Kim, R. A. Friesner and B. J. Berne, Proc. Natl. Acad.
Sci. U. S. A., 2005, 102, 13749–13754.
50 A. D. MacKerell, D. Bashford, M. Bellott, R. L. Dunbrack,
J. D. Evanseck, M. J. Field, S. Fischer, J. Gao, H. Guo, S. Ha,
D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau,
C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom,
W. E. Reiher, B. Roux, M. Schlenkrich, J. C. Smith, R. Stote,
J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin and
M. Karplus, J. Phys. Chem. B, 1998, 102, 3586–3616.
51 A. D. MacKerell, M. Feig and C. L. Brooks, III, J. Comput.
Chem., 2004, 25, 1400–1415.
52 N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller
and E. Teller, J. Chem. Phys., 1953, 21, 1087–1092.
53 M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids,
Oxford University Press, Oxford, 1987.
54 H. Carstens, Dissertation, Fakultät für Physik, Ludwig-Maximilians-Universität München, 2004.
55 P. Bordat, J. Sacristan, D. Reith, S. Girard, A. Glattli and
F. Müller-Plathe, Chem. Phys. Lett., 2003, 374, 201–205.
56 C. Niedermeier and P. Tavan, J. Chem. Phys., 1994, 101,
734–748.
57 M. Eichinger, H. Grubmüller, H. Heller and P. Tavan, J. Comput.
Chem., 1997, 18, 1729–1749.
58 V. Kraeutler, W. F. van Gunsteren and P. H. Hünenberger,
J. Comput. Chem., 2001, 22, 501–508.
59 H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren,
A. Dinola and J. R. Haak, J. Chem. Phys., 1984, 81, 3684–3690.
60 M. Lingenheil, R. Denschlag, R. Reichold and P. Tavan, J. Chem.
Theory Comput., 2008, 4, 1293–1306.
61 R. Denschlag, M. Lingenheil and P. Tavan, Chem. Phys. Lett.,
2009, 473, 193–195.
62 M. Lingenheil, R. Denschlag and P. Tavan, Chem. Phys. Lett.,
2009, 478, 80–84.
63 X. Huang, M. Hagen, B. Kim, R. A. Friesner, R. Zhou and
B. J. Berne, J. Phys. Chem. B, 2007, 111, 5405–5410.
64 W. Kabsch and C. Sander, Biopolymers, 1983, 22, 2577–2637.
65 X. Daura, K. Gademann, H. Schäfer, B. Jaun, D. Seebach and
W. F. van Gunsteren, J. Am. Chem. Soc., 2001, 123, 2393–2404.
66 T. Schrader, A. Sieg, F. Koller, W. Schreier, Q. An, W. Zinth and
P. Gilch, Chem. Phys. Lett., 2004, 392, 358.
67 P. Hamm, R. A. Kaindl and J. Stenger, Opt. Lett., 2000, 25, 1798.
68 W. F. van Gunsteren, D. Bakowies, R. Baron, I. Chandrasekhar,
M. Christen, X. Daura, P. Gee, D. P. Geerke, A. Glättli,
P. H. Hünenberger, M. A. Kastenholz, C. Oostenbrink,
M. Schenk, D. Trzesniak, N. F. A. van der Vegt and H. B. Yu,
Angew. Chem., Int. Ed., 2006, 45, 4064–4092.
69 R. Reichold, Dissertation, Fakultät für Physik, Ludwig-Maximilians-Universität München, 2009.
70 T. N. Gustafsson, T. Sandalova, J. Lu, A. Holmgren and
G. Schneider, Acta Crystallogr., Sect. D: Biol. Crystallogr., 2007,
63, 833–843.
71 D. M. Byler and H. Susi, Biopolymers, 1986, 25, 469–487.
72 F. Siebert, Methods Enzymol., 1995, 246, 501–526.
73 A. Barth and C. Zscherp, Q. Rev. Biophys., 2002, 35, 369–430.
74 V. Schultheis, R. Reichold, B. Schropp and P. Tavan, J. Phys.
Chem. B, 2008, 112, 12217–12230.
75 J. Bredenbeck, J. Helbing, J. R. Kumita, G. A. Woolley and
P. Hamm, Proc. Natl. Acad. Sci. U. S. A., 2005, 102,
2379–2384.
76 P. Hamm, S. M. Ohline and W. Zinth, J. Chem. Phys., 1997, 106,
519.
77 M. Schmitz and P. Tavan, Modern Methods for Theoretical
Physical Chemistry of Biopolymers, Elsevier, Amsterdam, 2006,
ch. 8, pp. 157–177.
Supplementary Material for PCCP
This journal is © The Owner Societies 2010
Supporting information to the manuscript
Relaxation time prediction for a light switchable peptide by
molecular dynamics
Robert Denschlag∗, Wolfgang J. Schreier†, Benjamin Rieff∗,
Tobias E. Schrader†, Florian O. Koller†, Luis Moroder‡, Wolfgang Zinth†,
and Paul Tavan∗§
∗ Theoretische Biophysik, Department für Physik,
† Lehrstuhl für Biomolekulare Optik and Munich Center for Integrated Protein Science CIPSM,
∗† Ludwig-Maximilians-Universität, Oettingenstr. 67, 80538 München, Germany
‡ Max Planck Institut für Biochemie, Am Klopferspitz 18a, 82152 Martinsried, Germany
§ corresponding author, email: tavan@physik.uni-muenchen.de,
phone: +49-89-2180-9220, fax: +49-89-2180-9202
Convergence of REST free energy maps
To study the convergence of the REST simulations listed in table 1 we have divided for each simulation the set
of 10 replicas into two subsets each containing five replicas. 300 K data were collected for each of these subsets,
whenever one of its replicas happened to visit the 300 K temperature rung. Thus, the two replica swarms κ = 1, 2
generated two independent 300 K data sets for which the conformational landscapes Gκ (H1 , H2 ) were calculated
as described in Methods.
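The bookkeeping behind this swarm analysis can be sketched as follows. For every exchange step one records which temperature rung each replica occupies; a sample enters the data set of a swarm whenever one of the swarm's replicas sits on the 300 K rung. The data layout, the function name, and the toy example with four replicas are assumptions made for illustration and do not describe the actual analysis code.

```python
def collect_swarm_samples(rung_history, observable_history, swarms, target_rung=0):
    """Collect, per swarm, the observables of replicas visiting target_rung.

    rung_history[s][r]       -> temperature rung of replica r at step s
    observable_history[s][r] -> observable, e.g. (H1, H2), of replica r at step s
    swarms                   -> dict mapping a swarm label to a set of replica indices
    """
    samples = {label: [] for label in swarms}
    for rungs, observables in zip(rung_history, observable_history):
        for replica, rung in enumerate(rungs):
            if rung != target_rung:
                continue
            for label, members in swarms.items():
                if replica in members:
                    samples[label].append(observables[replica])
    return samples

# Toy example: 4 replicas, rung 0 stands for the 300 K rung.
rung_history = [[0, 1, 2, 3], [1, 0, 2, 3], [1, 2, 0, 3], [1, 2, 3, 0]]
observable_history = [[(step, replica) for replica in range(4)]
                      for step in range(4)]
swarms = {1: {0, 1}, 2: {2, 3}}
swarm_samples = collect_swarm_samples(rung_history, observable_history, swarms)
```

Since the replicas of the two swarms follow different trajectories through temperature space, agreement between the resulting landscapes is a meaningful convergence check.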
Already a first visual comparison of the various graphs labeled with the subscripts 1 and 2 in Figure S12 demonstrates a close similarity between the free energy landscapes Gκ(H1, H2) extracted from the two different swarms
κ = 1, 2 at 300 K. For the C22 force field (Figs. S12aκ,bκ) the match between the landscapes Gκ(H1, H2) associated
with the swarms κ is very good apart from small differences between the depths of the various local minima. For
CMAP (Figs. S12cκ,dκ) the match of the data from the two subsets is not quite as close but still good.
The modifications of the landscapes induced by the application of different force fields, which are discussed in the
paper in connection with Fig. 3, are clearly retained in the swarm landscapes. For instance, both C22 swarms predict for the cis ensemble nearly no occupancy in the region H1 > 0.6 (cf. Figs. S12a1,a2) and substantial occupation
of the region H1 < −0.6, whereas the CMAP swarms show the opposite behavior (cf. Figs. S12c1,c2). Thus, the
differences of the conformational landscapes attributed in the paper to differences between the force fields are definitely not artifacts of insufficient statistics. Furthermore, the sampling of the conformational spaces as expressed
by the complete data sets seems to be fairly exhaustive.
Figure S12: Free energy landscapes Gκ (H1 , H2 ) obtained at 300 K from two different swarms κ = 1, 2 covering
five replicas each. Swarm 1 contains the replicas with initial temperatures in the range [300 K,399 K], and swarm
2 those from [428 K,570 K]. Results of swarm κ are depicted in the graphs labeled with the subscript κ . With
the nomenclature of table 1 the graphs refer to the following simulations: (aκ ) R/C/C22, (bκ ) R/T/C22, (cκ )
R/C/CMAP, and (dκ ) R/T/CMAP.
Differences of the force fields. The C22 and CMAP force fields yield different predictions for 300 K equilibrium
ensembles of cAPB in the cis- and trans-states. Because Fig. S12 has demonstrated that all our REST simulations
(cf. table 1) yield well-converged free energy landscapes Gκ(H1, H2), the differences between the two predictions can be identified by a visual comparison between the top (C22) and bottom (CMAP) rows of Fig. 3 (or
equivalently of Fig. S12). All differences, which are discussed in detail below and are detectable in Fig. 3, are
statistically significant, indeed.
For the trans ensemble (Figs. 3b,d or Figs. S12bκ,dκ) both force fields apparently agree that the central part of
the peptide backbone is largely extended (H2 < 0). Correspondingly, the averages $\bar{H}_{2,t}$ have very similar values (cf.
table 2). Slight conformational differences are predicted by C22 and CMAP, respectively, for the peptide backbone
near the covalent linkages to the chromophore. Within the dominant conformational substate resulting from C22,
this portion of the backbone is seen to exhibit sharper turns (H1 ≈ 0.8) than in the corresponding CMAP state
(H1 ≈ 0.3). Similarly the ensemble average H¯ 1 is by about 0.3 larger for C22 than for CMAP.
With respect to the conformational coordinate H1 the cis ensembles show the opposite behavior: CMAP predicts that highly populated conformational substates are located in the region H1 > 0.6, which corresponds to
substantial turns at the linkages (cf. Fig. 3c). Thus also the average H¯ 1 = 0.38 is positive and close to the value of
0.40 found for the CMAP trans ensemble. In contrast, C22 predicts that the “turn” region H1 > 0.6 of the conformational space is essentially empty (cf. Fig. 3a). Instead the region H1 < −0.6 signifying extended structures at
the linkages is well-populated. As a result, for C22 the average value H¯ 1 is shifted to the smaller value of −0.09.
In contrast, CMAP assigns only a very small population to the “extended” region H1 < −0.6. Concerning the
helicity of the core of the peptide (as measured by H2 ) the Figs. 3a,c reveal no significant differences for the two
force fields. This visual impression is validated by the values of H¯ 2,c which are very similar indeed (cf. table 2).
As a result, the two force fields predict slightly different conformational ensembles for cis- and trans-cAPB
at 300 K with the differences being largely confined to the linkage regions within which the peptide is covalently
attached to the chromophore.
Proton distances
In the following two tables S4 and S5 the proton distances from experiment,1 earlier MD simulations,2 and our REST
simulations are listed. Table S4 contains the cis data and table S5 the trans data.
Atom 1       - Atom 2 (a)     r_exp   r_MD   r_MD   r_MD   r_MD   r_MD
                                (b)    (c)    (d)    (d)    (e)    (e)
Force field:                      -      -    C22    C22   CMAP   CMAP
Temperature:                  300 K  500 K  300 K  570 K  300 K  570 K
APB:0:H2,H4  - ALA:1:HN        4.52   2.69   2.73   2.72   2.73   2.72
APB:0:H1,H5  - ALA:1:HN        6.71   4.72   4.77   4.75   4.78   4.75
ALA:1:HA     - CYS:2:HN        3.01   2.79   2.54   2.55   2.75   2.58
ALA:1:HB*    - CYS:2:HN        4.07   2.86   2.93   2.93   3.44   3.37
ALA:1:HB*    - ALA:3:HN        5.08   4.82   4.56   4.45   4.87   4.46
CYS:2:HA     - ALA:3:HN        2.88   2.58   2.56   2.70   2.70   2.54
CYS:2:HN     - ALA:1:HN        3.11   2.67   3.18   2.81   2.63   2.44
ALA:3:HA     - THR:4:HN        2.81   2.63   2.63   2.60   2.40   2.50
ALA:3:HB*    - THR:4:HN        4.32   3.03   2.99   3.02   3.66   3.47
ALA:3:HB*    - APB:0:H1,H5     7.86   6.15  10.55   7.87   5.86   6.42
ALA:3:HB*    - APB:0:H6,H10    8.03   6.16  11.06   8.08   5.53   6.81
ALA:3:HB*    - APB:0:H7,H9     7.21   5.34   9.52   6.56   4.30   5.73
ALA:3:HN     - CYS:2:HN        3.09   2.67   2.61   2.34   2.31   2.33
THR:4:HA     - CYS:5:HN        2.77   2.62   2.58   2.48   2.57   2.47
THR:4:HB     - CYS:5:HN        3.30   2.78   2.65   2.61   3.18   3.02
THR:4:HG2*   - APB:0:H1,H5     8.02   6.80   7.64   7.52   7.78   7.72
THR:4:HG2*   - APB:0:H6,H10    7.33   6.67   6.87   7.83   7.42   7.70
THR:4:HG2*   - APB:0:H7,H9     6.79   5.90   5.63   7.34   6.23   6.74
THR:4:HN     - ALA:3:HN        3.28   2.84   2.72   2.58   2.74   2.44
THR:4:HN     - CYS:5:HN        3.20   2.70   2.76   2.65   2.28   2.45
ASP:6:HA     - GLY:7:HN        2.94   2.56   2.61   2.53   2.73   2.52
ASP:6:HB1    - THR:4:HG1       4.80   6.22   6.94   6.47   6.38   5.66
ASP:6:HB2    - THR:4:HG1       4.80   5.24   5.61   5.18   5.91   4.86
ASP:6:HB1    - GLY:7:HN        5.30   2.74   2.68   2.57   3.33   3.05
ASP:6:HB1    - PHE:8:HPHE*     7.10   5.47   5.58   6.06   5.96   5.81
ASP:6:HB2    - PHE:8:HPHE*     7.10   5.81   6.10   6.57   6.28   6.26
ASP:6:HB1    - APB:0:H7,H9     6.20   6.05   6.46   5.90   6.70   5.96
ASP:6:HB2    - APB:0:H7,H9     6.78   6.43   7.04   6.10   7.56   6.88
GLY:7:HA1    - ASP:6:HA        4.90   4.51   4.54   4.52   4.76   4.57
GLY:7:HA2    - ASP:6:HA        5.02   4.57   4.59   4.54   4.62   4.53
GLY:7:HA1    - PHE:8:HN        3.24   2.47   2.47   2.41   2.88   2.59
GLY:7:HA2    - PHE:8:HN        3.24   2.72   2.55   2.69   2.97   2.67
GLY:7:HN     - ASP:6:HN        3.00   2.67   2.56   2.51   2.64   2.35
PHE:8:HA     - APB:0:H7,H9     6.01   4.24   4.23   4.29   4.48   4.35
PHE:8:HA     - APB:0:HN        2.53   2.49   2.47   2.54   2.85   2.65
PHE:8:HB1    - APB:0:H7,H9     6.92   4.68   4.80   4.70   4.77   4.71
PHE:8:HB2    - APB:0:H7,H9     6.78   4.38   4.20   4.29   4.23   4.23
PHE:8:HB1    - APB:0:HN        3.47   2.78   2.94   2.77   2.78   2.75
PHE:8:HB2    - APB:0:HN        3.80   2.60   2.45   2.52   2.50   2.46
PHE:8:HN     - APB:0:HN        3.35   2.68   2.78   2.49   2.45   2.47
PHE:8:HPHE*  - APB:0:HN        6.54   4.79   5.21   5.07   5.49   5.14
RMSV (f)                          -   0.23   0.81   0.29   0.34   0.14
Table S4: Proton distances r for cis-cAPB. (a) Names of the involved atoms using the following nomenclature: Residue:Number:Atom(s) - Residue:Number:Atom(s). A star indicates a set of (chemically) equivalent protons. (b) Experiment [1]. (c) Carstens et al. [2]. (d) REST simulation R/C/C22. (e) REST simulation R/C/CMAP. (f) RMSV. All distances are given in Å.
Atom 1       - Atom 2 (a)     r_exp   r_MD   r_MD   r_MD   r_MD   r_MD
                                (b)    (c)    (d)    (d)    (e)    (e)
Force field:                      -      -    C22    C22   CMAP   CMAP
Temperature:                  300 K  500 K  300 K  570 K  300 K  570 K
APB:0:H2,H4  - ALA:1:HN        4.70   2.65   2.72   2.71   2.71   2.71
ALA:1:HB*    - CYS:2:HN        4.02   2.90   2.91   2.95   3.44   3.30
ALA:1:HB*    - ALA:3:HN        5.80   5.25   5.21   5.15   5.68   5.37
CYS:2:HA     - ALA:3:HN        2.64   3.11   2.99   3.14   2.60   2.70
CYS:2:HB1    - ALA:3:HN        3.19   2.58   2.53   2.49   3.07   3.08
CYS:2:HB2    - ALA:3:HN        3.42   2.74   2.60   2.65   3.16   3.17
CYS:2:HN     - ALA:1:HN        2.63   2.29   2.38   2.25   2.27   2.14
ALA:3:HB*    - CYS:2:HA        5.22   5.37   5.45   5.47   5.18   5.22
ALA:3:HB*    - CYS:2:HN        5.42   5.50   5.41   5.20   5.10   4.82
ALA:3:HB*    - THR:4:HN        3.96   2.99   2.97   2.94   3.40   3.28
ALA:3:HB*    - APB:0:H1,H5     7.02   7.31   7.93   6.32   5.07   4.81
THR:4:HA     - CYS:5:HN        2.56   2.51   2.54   2.43   2.32   2.36
THR:4:HA     - APB:0:H1,H5     6.04   4.03   3.71   3.84   4.33   4.44
THR:4:HB     - CYS:5:HN        3.09   2.88   3.18   2.73   3.73   3.15
THR:4:HB     - ASP:6:HN        4.35   6.35   7.06   5.17   4.53   4.25
THR:4:HG2*   - ASP:6:HN        5.09   5.55   6.21   5.23   3.82   4.21
THR:4:HG2*   - APB:0:H6,H10    6.42   5.14   5.04   5.10   4.96   5.37
CYS:5:HB1    - ASP:6:HN        3.28   2.64   2.37   2.47   3.39   2.93
CYS:5:HB2    - ASP:6:HN        3.63   2.92   3.01   2.72   3.41   3.22
CYS:5:HB1    - APB:0:H6,H10    6.29   4.10   4.98   4.31   5.07   4.64
CYS:5:HB2    - APB:0:H6,H10    6.06   4.39   3.70   3.95   4.71   4.41
CYS:5:HB2    - APB:0:H7,H9     6.83   5.01   4.27   4.36   5.21   4.79
ASP:6:HA     - GLY:7:HN        2.82   2.36   2.39   2.45   2.29   2.41
ASP:6:HB1    - THR:4:HG1       4.79   6.55   7.59   6.66   4.36   5.29
ASP:6:HB2    - THR:4:HG1       4.63   5.95   7.47   5.74   4.18   4.52
ASP:6:HB1    - GLY:7:HN        3.60   2.54   2.36   2.40   2.76   2.65
ASP:6:HB2    - GLY:7:HN        3.64   2.87   3.27   2.70   3.67   3.11
ASP:6:HB1    - APB:0:H6,H10    7.20   6.63   7.18   5.92   4.79   5.56
ASP:6:HB2    - APB:0:H6,H10    7.20   6.59   7.25   5.58   4.71   5.13
ASP:6:HB1    - APB:0:H7,H9     6.80   5.47   5.89   5.19   3.68   4.60
ASP:6:HB2    - APB:0:H7,H9     6.80   5.75   6.40   4.98   4.23   4.54
ASP:6:HB1    - APB:0:HN        5.62   4.98   5.53   4.64   3.22   3.96
GLY:7:HN     - ASP:6:HN        3.86   3.40   3.96   2.92   3.27   2.53
GLY:7:HA1    - ASP:6:HA        4.67   4.46   4.56   4.50   4.42   4.45
GLY:7:HA2    - ASP:6:HA        4.40   4.49   4.46   4.52   4.41   4.51
GLY:7:HA1    - PHE:8:HN        2.71   2.25   2.19   2.27   2.52   2.35
GLY:7:HA2    - PHE:8:HN        3.13   3.18   3.22   3.12   3.18   2.99
GLY:7:HA1    - APB:0:HN        4.28   3.91   3.82   3.93   4.34   4.05
GLY:7:HA2    - APB:0:HN        4.58   4.96   5.08   5.03   4.90   4.95
PHE:8:HB1    - APB:0:HN        3.64   2.75   2.59   2.70   2.50   2.76
PHE:8:HB2    - APB:0:HN        3.79   2.78   2.72   2.62   2.69   2.63
PHE:8:HN     - APB:0:HN        2.86   2.36   2.29   2.30   2.43   2.39
PHE:8:HPHE*  - APB:0:HN        6.50   5.31   5.77   5.49   5.68   5.44
RMSV (f)                          -   0.47   0.72   0.45   0.13   0.10
Table S5: Proton distances r for trans-cAPB. (a) Names of the involved atoms using the following nomenclature: Residue:Number:Atom(s) - Residue:Number:Atom(s). A star indicates a set of (chemically) equivalent protons. (b) Experiment [1]. (c) Carstens et al. [2]. (d) REST simulation R/T/C22. (e) REST simulation R/T/CMAP. (f) RMSV. All distances are given in Å.
Force field parameters of the chromophore
The following four tables S6, S7, S8, and S9 contain the force field parameters for the APB switch and for its covalent
linkage to the peptide. Fig. S13 shows the chemical structure of the APB chromophore. The mapping between the
atom names given in the figure and the atom types required for specifying the force field is given in table S6.
Figure S13: Chemical structure of the APB chromophore. The mapping between atom names and atom types is
given in table S6.
Name  Atom type  Charge     Name  Atom type  Charge
C1    CAZ         0.4028    C2    CAZ        -0.2561
C3    CAZ        -0.0145    C4    CAZ        -0.1647
C5    CAZ        -0.0145    C6    CAZ        -0.2561
N1    NAZ        -0.2439    N2    NAZ        -0.2439
C7    CAZ         0.4028    C8    CAZ        -0.1803
C9    CAZ        -0.2389    C10   CAZ         0.3665
C11   CAZ        -0.2389    C12   CAZ        -0.1803
H1    HAZ         0.1202    H2    HAZ         0.0836
H4    HAZ         0.0836    H5    HAZ         0.1202
H6    HAZ         0.1202    H7    HAZ         0.1654
H9    HAZ         0.1654    H10   HAZ         0.1202
N     NH1        -0.5293    HN    H           0.3192
C     C           0.5930    O     O          -0.5017
Table S6: Partial charges derived by DFT [3].
Type 1  Type 2  k_b     b_0
CAZ     CAZ     419.4   1.398
NAZ     CAZ     341.7   1.419
HAZ     CAZ     404.7   1.086
NAZ     NAZ     716.8   1.261
C       CAZ     250.0   1.504
NH1     CAZ     320.0   1.405
CT2     CAZ     230.0   1.526
Table S7: Atom types, force constant $k_b$ (kcal mol$^{-1}$ Å$^{-2}$), and equilibrium distance $b_0$ (Å) defining the covalent bond energy terms [3].
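As a rough illustration of how the entries of table S7 enter an energy function, the following sketch evaluates a harmonic bond term with the CAZ-CAZ parameters (first row of the table). It assumes the CHARMM convention $E = k_b (b - b_0)^2$ without a factor 1/2, which is not stated explicitly in this supplement; the function name `bond_energy` and the stretched bond length are purely illustrative.

```python
def bond_energy(b, k_b, b0):
    """Harmonic bond energy in kcal/mol.

    Assumes the CHARMM convention E = k_b * (b - b0)**2 (no factor 1/2);
    this convention is an assumption, not taken from the table caption.
    """
    return k_b * (b - b0) ** 2


# CAZ-CAZ parameters from table S7: k_b in kcal mol^-1 Å^-2, b0 in Å
K_B_CAZ_CAZ = 419.4
B0_CAZ_CAZ = 1.398

# Energy cost of stretching the bond by 0.05 Å (hypothetical displacement)
energy = bond_energy(B0_CAZ_CAZ + 0.05, K_B_CAZ_CAZ, B0_CAZ_CAZ)
print(f"{energy:.2f} kcal/mol")
```

Under this assumed convention a stretch of 0.05 Å costs about 1 kcal/mol, which gives a feeling for the stiffness of the aromatic ring bonds.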
Type 1  Type 2  Type 3  k_φ     φ_0
CAZ     CAZ     CAZ      40.0   120.0
HAZ     CAZ     CAZ      34.4   120.0
NAZ     CAZ     CAZ      60.2   120.0
NAZ     NAZ     CAZ     121.1   114.8
C       CAZ     CAZ      45.8   120.0
O       C       CAZ      80.0   121.0
NH1     C       CAZ      80.0   116.5
NH1     CAZ     CAZ      70.0   120.0
H       NH1     CAZ      35.0   114.8
C       NH1     CAZ      50.0   120.0
HA      CT2     CAZ      49.3   107.5
CT2     CAZ     CAZ      45.8   120.0
NH1     CT2     CAZ      50.0   116.3
Table S8: Atom types, force constant $k_\phi$ (kcal mol$^{-1}$ rad$^{-2}$), and equilibrium angle $\phi_0$ (deg) defining the angle energy terms [3].
Type
Type
Type
Type
kφn
n
φn
CAZ
NAZ
NAZ
NAZ
NAZ
CAZ
CAZ
CAZ
CAZ
CAZ
C
X
C
CAZ
CAZ
C
CAZ
NH1
CAZ
CT1
CAZ
CAZ
C
CAZ
NH1
NH1
H
X
NH1
CT1
CAZ
C
CAZ
NH1
CAZ
CAZ
O
CAZ
C
NAZ
NH1
NAZ
CAZ
CAZ
12.47
2.03
0.18
0.55
0.13
3.10
1.60
2.50
2.50
0.71
0.13
3.10
1.60
2.50
2.50
20.62
3.19
2
2
4
2
4
2
1
2
2
2
4
2
1
2
2
2
4
180.0
180.0
0.0
180.0
0.0
180.0
0.0
180.0
180.0
180.0
0.0
180.0
0.0
180.0
180.0
180.0
0.0
Table S9: Atom types, force constant $k_{\phi n}$ (kcal/mol), periodicity $n$, and phase shift $\phi_n$ (deg) defining the dihedral energy terms. The parameters for the dihedral CAZ-NAZ-NAZ-CAZ given at the bottom of the table are used in the MD/ISOM3 simulation [3].
Temperature dependence and other properties of the RMSV
Based on a very simple model we explain why the observable RMSV, which is defined by Eq. (8) and is used to measure the agreement between NMR proton-proton distances $d_{ij}^{\mathrm{exp}}$ and simulation data, is expected to be a monotonically decreasing function of the simulation temperature $T$ (as is apparent in Fig. 6).
Assume that a peptide is in the solid state, i.e., that the peptide atoms $i$ thermally fluctuate around fixed average positions $\langle r_i \rangle$. If the fixing is harmonic, then the standard deviation of the fluctuations increases with $\sqrt{T}$ (in the limit of small amplitudes) and the positions $r_i(t)$ are normally distributed. Therefore, also the distances $r_{ij}(t)$ between the atoms will be normally distributed

$$ p(r_{ij}|\sigma_{ij}) = \frac{1}{\sqrt{2\pi}\,\sigma_{ij}} \exp\left[ -\frac{(r_{ij} - \langle r_{ij} \rangle)^2}{2\sigma_{ij}^2} \right] \qquad (1) $$

around average distances $\langle r_{ij} \rangle$ with standard deviations $\sigma_{ij}$ increasing monotonically with $\sqrt{T}$. Then the so-called interaction distances $d_{ij}$ [4], which are defined by Eq. (7) and serve for comparisons of simulation data with NOE distance restraints $d_{ij}^{\mathrm{exp}}$, can be estimated through
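
$$ d_{ij} \equiv \left[ \left\langle \frac{1}{r_{ij}^6} \right\rangle \right]^{-1/6} \approx \left[ \int_{r_{\min}}^{\infty} \frac{p(r_{ij}|\sigma_{ij})}{r_{ij}^6} \, dr_{ij} \right]^{-1/6} \qquad (2) $$

where the minimal distance $r_{\min}$ models a hard-sphere exclusion applicable to close atoms. Because the widths $\sigma_{ij}$ of the distance distributions $p(r_{ij})$ are functions of $T$, also the interaction distances $d_{ij}$ depend on $T$.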
Figure S14: Distance distributions $p(r_{ij}|\sigma_{ij})$ of two hydrogen atoms in a putative rigid model structure for three different temperatures $T_\kappa$. Low temperature $T_1$: solid; intermediate temperature $T_2$: dashed; high temperature $T_3$: dotted. Also indicated through the vertical bars are the associated NMR interaction distances $d_{ij,\kappa}$.
As a specific example, consider two hydrogen atoms for which a NOE signal has been measurable. Then the experimental distance $d_{ij}^{\mathrm{exp}}$ of these atoms will be not much larger than about 5 Å, because NOE signals of more distant atoms become very weak. Now suppose that the average distance $\langle r_{ij} \rangle$ in the simulated (rigid) structure is 8 Å and that the contact distance $r_{\min}$ is 1.5 Å. Consider furthermore three different temperatures $T_\kappa$, $\kappa = 1, 2, 3$, as measured by the three different widths $\sigma_{ij,\kappa} = 1, 2, 3$ Å, and assume that the average distances $\langle r_{ij} \rangle$ are independent of temperature as is approximately the case for solids with a small thermal expansion coefficient. Then the corresponding interaction distances $d_{ij,\kappa}$ resulting from the three different distance distributions shown in Fig. S14 are 7.51, 4.66, and 3.29 Å, respectively. Thus, the interaction distances $d_{ij,\kappa}$ monotonically decrease with increasing temperatures $T_\kappa$ although the average structure is invariant ($\langle r_{ij} \rangle = 8$ Å).
Next suppose that the measured distance $d_{ij}^{\mathrm{exp}}$ is 4 Å. Then the contributions $\max[0, d_{ij,\kappa} - d_{ij}^{\mathrm{exp}}]^2$ of the three interaction distances $d_{ij,\kappa}$ to the RMSV are 12.3 Å$^2$, 0.4 Å$^2$, and 0, respectively (cf. the definition of the RMSV in Eq. (8)). Thus, although the average structures in the three simulations are identical, the low temperature simulation correctly signifies a large deviation for this particular distance whereas the high temperature simulation signifies no violation at all. High temperature simulations like those in Ref. [2] can therefore give the incorrect impression of a good match with NMR data. This artifact is avoided by choosing the experimental temperature in the simulations.
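The arithmetic of this example can be retraced with a few lines of Python. The sketch below only evaluates the individual terms $\max[0, d_{ij,\kappa} - d_{ij}^{\mathrm{exp}}]^2$ for the three interaction distances quoted above; the function name `violation_contribution` is illustrative, and the remaining steps of Eq. (8), which is given in the main paper, are not reproduced here.

```python
def violation_contribution(d_sim, d_exp):
    """Squared violation max(0, d_sim - d_exp)**2 in Å^2.

    A simulated interaction distance only counts as a violation
    if it exceeds the experimental distance.
    """
    return max(0.0, d_sim - d_exp) ** 2


d_exp = 4.0                   # assumed measured distance (Å)
d_sim = [7.51, 4.66, 3.29]    # interaction distances at T1 < T2 < T3 (Å)

contributions = [violation_contribution(d, d_exp) for d in d_sim]
for d, c in zip(d_sim, contributions):
    print(f"d = {d:.2f} Å -> contribution {c:.1f} Å^2")
```

The three printed contributions reproduce the values 12.3 Å$^2$, 0.4 Å$^2$, and 0 given in the text.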
In summary, the simple model of a harmonically fixed and, thus, rigid peptide structure clearly explains why one should expect a monotonically decreasing RMSV if one simulates the system at increasing temperatures. Note in this context that the experimental distances $d_{ij}^{\mathrm{exp}}$ should have the same temperature dependence as the interaction distances $d_{ij}$ derived from a simulation, implying that one will measure smaller values with increasing $T$ (as long as the structure remains rigid). If one wants to use the RMSV as an absolute measure for judging the quality of a model structure, one has to make sure that the thermal fluctuations in the experimental and simulated systems are of equal size.
The above example has also shown that the interaction distances $d_{ij}$ decrease if more small values $r_{ij}$ are contained in the ensemble of sampled distances. For a multimodal distance distribution $p(r_{ij})$ featuring many substates, which is the generic case for flexible peptides, this property implies that the interaction distance $d_{ij}$ is dominated by the substates exhibiting small distances. If one finds, e.g., nine times the value $r_{ij} = 8$ Å and once the value $r_{ij} = 2$ Å, then Eq. (7) predicts an interaction distance $d_{ij} \approx 3$ Å.
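The dominance of short distances in the $r^{-6}$ average can be checked directly. The following minimal sketch assumes that Eq. (7) is the plain ensemble average $d_{ij} = \langle r_{ij}^{-6} \rangle^{-1/6}$ over sampled distances, as written on the left-hand side of Eq. (2); the function name `interaction_distance` is illustrative.

```python
def interaction_distance(distances):
    """Return <r^-6>^(-1/6) for a list of sampled distances (Å)."""
    mean_inv6 = sum(r ** -6 for r in distances) / len(distances)
    return mean_inv6 ** (-1.0 / 6.0)


# Nine samples at 8 Å and a single sample at 2 Å, as in the example above
samples = [8.0] * 9 + [2.0]

d = interaction_distance(samples)
arithmetic_mean = sum(samples) / len(samples)
print(f"interaction distance: {d:.2f} Å, arithmetic mean: {arithmetic_mean:.2f} Å")
```

Although the arithmetic mean of the samples is 7.4 Å, the interaction distance comes out just below 3 Å, because the single short distance carries almost the whole $r^{-6}$ average.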
Figure S15: Histograms of distance distributions $p(r_{ij}|\sigma_{ij})$ for hydrogen atoms observed in REST simulations of trans-cAPB at two different temperatures.
Finally we want to demonstrate that the simple model actually applies to cAPB. In Fig. S15 we provide examples for the temperature dependencies of two randomly selected proton-proton distance distributions, for which NOEs were actually observed in trans-cAPB (see table S5 for the corresponding interaction distances $d_{ij}$). The two histograms both show broadenings of the respective distributions upon heating and nearly invariant locations, around which they are centered. In particular, both distributions feature an increasing number of small distances upon heating, which explains why the associated interaction distances decrease from 4.98 to 4.31 Å for the top histogram and from 7.47 to 5.74 Å for the bottom histogram with increasing $T$. Because in the former case the observed distance is 6.29 Å and, thus, larger than the MD interaction distances at both temperatures, the contributions to the RMSV vanish in both cases (cf. Eq. 8). In the latter case, however, the observed distance is small, measuring only 4.63 Å, and, hence, the contribution to the RMSV decreases with $T$. Because relatively small distances are frequent among the NMR data, the monotonic decrease of the total RMSV with increasing $T$ is readily understood.
Note here that the bimodal distribution (top) reflects the existence of at least two conformations in the simulated ensemble. As mentioned above, the NOE interaction distance is 6.29 Å and, thus, lies right at the location of the first maximum of the 300 K distance distribution shown in the top graph of Fig. S15, indicating again that NOE distances overlook substates with large proton-proton distances.
Temperature dependence of average helicity scores
Whereas the shapes of the amide I bands in the spectra of peptides are highly sensitive to the temperature, the ensemble average helicity $\bar{H}_2$ of cis-cAPB is nearly independent of temperature. This is demonstrated for the CMAP force field by Fig. S16, which shows the variation of $\bar{H}_2$ within the generalized REST ensemble as a function of the temperature.
Figure S16: The ensemble average helicity $\bar{H}_2$ of cis-cAPB as a function of temperature within the REST simulation R/C/CMAP.
Fast cooling processes monitored by MD
Because the observable $\bar{H}_2(t)$ is insensitive to the temperature and senses the structural alteration of the chromophore, which is caused by its cis/trans isomerization, only with a delay of about 100 ps, the faster relaxation processes are overlooked by $\bar{H}_2(t)$. In contrast, the peptide's temperature directly maps the initial deposition of heat into the peptide and its subsequent dissipation into the surrounding solvent. For an ensemble of 500 short (100 ps) simulations of the cAPB photoisomerization in DMSO, where the peptide was described by the C22 force field, we have monitored the ensemble average time course of cAPB's temperature $\bar{T}(t)$.
Figure S17: Temporal evolution $\bar{T}(t)$ of cAPB's average temperature extracted from an ensemble of 500 short simulations of the cis/trans photoisomerization.
Fig. S17 shows these data together with a fit using a sum of two exponential functions. This fit yields a fast decay time of 0.4 ps, corresponding to the immediate ballistic dissipation of energy during the chromophore's isomerization, and a slower cooling process occurring on a time scale of 17 ps. The former time constant roughly agrees with the 0.3 ps kinetics extracted from cAPB's total energy calculated in earlier MD simulations and with the 0.2 ps time constant determined by ultrafast pump-probe spectroscopy in the UV/vis region for the chromophore isomerization [5]. The latter time constant agrees well with the 11 ps kinetics observed in the time resolved amide I spectra of cAPB (see table 3) and with earlier measurements of such cooling kinetics by optical pump-probe spectroscopy [6], which determined a cooling time of 15 ps for hot azobenzene in ethanol.
Slow relaxation processes monitored by MD
Fig. S11 in the paper shows the time resolved landscapes $G[H_1, H_2, t]$ at the time points $t/\mathrm{ns} \in \{0, 0.2, 2, 20\}$. For a simple numerical representation of these distributions we additionally provide in table S10 the average linkage and core helicities $\bar{H}_1(t)$ and $\bar{H}_2(t)$, respectively. The table demonstrates once again that only $\bar{H}_2(t)$ is suited to distinguish cis and trans.
             C22 (a)            CMAP (b)
time/ns      H̄_1     H̄_2        H̄_1     H̄_2
−∞          −0.09    0.10       0.38    0.03
0.2          0.13   −0.02       0.28   −0.12
2.0          0.30   −0.26       0.43   −0.22
20           0.47   −0.44       0.31   −0.40
∞            0.67   −0.56       0.40   −0.54
Table S10: Temporal evolution of average helicities $\bar{H}_i(t)$, $i = 1, 2$. (a) From simulation I/C22 for |t| < ∞, R/C/C22 for t = −∞, and R/T/C22 for t = ∞. (b) From I/CMAP for |t| < ∞, R/C/CMAP for t = −∞, and R/T/CMAP for t = ∞.
Relaxation plotted on a logarithmic time scale
The simulated kinetics of the cis/trans relaxation of cAPB as monitored by the average helicity score $\bar{H}_2(t)$ in the core of the peptide has been presented on a linear time axis in Fig. S10. This linear plot clearly reveals the slow ($\tau_3 = 23$ ns) exponential decay but does not resolve the fast processes associated with $\tau_1$ and $\tau_2$. Therefore we plot in Fig. S18 the same data once again on a logarithmic time scale.
Figure S18: Data on the simulated cis/trans relaxation of cAPB from Fig. S10 presented on a logarithmic time scale to clearly resolve the fast events.
For the C22 force field, the average core helicity $\bar{H}_2(t)$ is essentially invariant during the first 150 ps, although the isomerization is finished after 0.3 ps, and cooling as well as important conformational changes occur within this time in the vicinity of the chromophore [5]. This observation demonstrates that it takes a certain amount of time until the stretching of the chromophore propagates to the core of the peptide. After about 150 ps the average helicity score $\bar{H}_2(t)$ of the peptide core suddenly drops, indicating the force-driven unfolding of the one α-helix in the cis-ensemble.
For CMAP the results are qualitatively similar, but the first unfolding of one of the two α-helices in the cis-ensemble occurs already after 10 ps. A second sharp drop after about 150 ps indicates the sudden breaking of the hydrogen bonds stabilizing the second α-helix in the ensemble. The fact that the fast processes mainly involve individual unfolding events explains why the very fast time constants $\tau_1$ determined by the multi-exponential fits bear large statistical uncertainties.
References
[1] C. Renner, R. Behrendt, S. Spörlein, J. Wachtveitl and L. Moroder, Biopolymers, 2000, 54, 489–500.
[2] H. Carstens, C. Renner, A. G. Milbradt, L. Moroder and P. Tavan, Biochemistry, 2005, 44, 4829–4840.
[3] H. Carstens, Dissertation, Fakultät für Physik, Ludwig-Maximilians-Universität München, 2004.
[4] X. Daura, K. Gademann, H. Schafer, B. Jaun, D. Seebach and W. F. van Gunsteren, J. Am. Chem. Soc., 2001,
123, 2393–2404.
[5] S. Spörlein, H. Carstens, H. Satzger, C. Renner, R. Behrendt, L. Moroder, P. Tavan, W. Zinth and J. Wachtveitl,
Proc. Natl. Acad. Sci. USA, 2002, 99, 7998–8002.
[6] T. Naegele, R. Hoche, W. Zinth and J. Wachtveitl, Chem. Phys. Lett., 1997, 272, 489–495.
6 Summary and Outlook
Not only the simulations of the light-switchable azopeptides, which were introduced in detail in the introduction, but also other simulation projects in our group, such as the description of the chromophore conformer ensembles in bacteriorhodopsin [102] or the investigation of how the polarity of the environment influences the stability of helix 1 in the cellular prion protein [55], raised the question of how the usually scarce computing resources can be used as efficiently as possible. At the beginning of my doctoral work I was therefore given the task of investigating the highly popular RE method and, where appropriate, implementing it.
It quickly became apparent that the RE method in its original form is only of very limited use for our group. The reason is the large number of temperature rungs that are required to span a sufficiently wide temperature range when the solvent is treated explicitly. Accordingly, an RE simulation usually comprises dozens of simulations running in parallel. Since, however, in most cases only the data at the target temperature $T_0$ are of interest, it is quite questionable whether RE really samples more efficiently than conventional simulations [85].
It can be regarded as a fortunate coincidence that, just when my first investigations of RE were under way, Liu and co-authors published the so-called Replica Exchange with Solute Tempering (REST) method [101]. Because this RE method, with its solute tempering concept described in more detail in the introduction, drastically reduces the number of temperature rungs, my supervisor, Paul Tavan, and I considered the problem mentioned above to be solved.
After I had implemented the REST method in the form of shell scripts, I applied it to the two azopeptides presented in the introduction of this thesis. For the cyclic azopeptide cAPB, the equilibrium ensembles obtained in this way formed the basis of the publication reprinted in Chapter 5. In the REST treatment of the β-hairpin peptide, the simulation appeared to have converged after only a few nanoseconds, which initially filled me with enthusiasm. A critical inspection of the data showed, however, that the RE method can feign an apparent convergence to a naive user. This insight was the starting point for the investigations that led to the article reprinted in Chapter 2.
To examine the convergence behavior of the β-hairpin simulation more closely, I constructed a three-state Markov model of a β-hairpin peptide whose three states represent the folded state, a transition state, and the unfolded state of a hairpin peptide. With this model system I carried out RE simulations and studied their convergence with respect to the free energy difference between the folded and the unfolded state. The model system was constructed such that this free energy difference vanishes exactly at the target temperature $T_0$. Transferred to a real β-hairpin peptide, this means that in the ensemble at $T_0 = 300$ K about 50% of the hairpin peptides are folded, which is quite a realistic assumption [11].
In addition to the apparent convergence, which was termed pseudo-convergence in the cited publication, the RE simulations revealed a serious weakness of the RE method: contrary to the expectation that raising the temperature accelerates the folding-unfolding dynamics of the model system, exactly the opposite occurred. Although a temperature increase accelerated the unfolding process, it simultaneously slowed down the folding process to such an extent that the average time for one unfolding and subsequent refolding grew with increasing temperature. This surprising behavior is caused by the fact that the barrier between the unfolded state and the transition state is of entropic nature. In fact, this behavior had already been observed in experimental measurements of the folding dynamics of β-hairpin peptides [110], which is a good indication that the three-state model employed is realistic. Because folding times in the microsecond range had been measured for the β-hairpin peptide that I was to investigate [16], I abandoned the plan of determining the structural equilibrium ensemble of this peptide by MD simulations as computationally intractable.
If, in contrast, enthalpic barriers are responsible for slow sampling, strategies based on raising the temperature, such as RE, can increase the sampling speed [111]. The sampling efficiency of the RE method then depends not only on the properties of the simulated system but also on how certain parameters inherent to the method are chosen. These parameters include, for example, the target temperature $T_0$ and the maximum temperature $T_{N-1}$. Whereas $T_0 = 300$ K is usually chosen by default as the target temperature, the question of how to choose $T_{N-1}$ cannot be answered in general. Often the choice of $T_{N-1}$ is dictated by the number of temperature rungs and thus by the number of replicas that can be simulated simultaneously with the available computing resources. Once a maximum temperature $T_{N-1}$ has been chosen, the question arises how the individual temperatures $T_k$ of a temperature ladder should be selected. One possible criterion for the quality of a temperature ladder is the time that a replica needs on average to travel from $T_0$ to $T_{N-1}$ and back. This time is called the round-trip time $\tau$ and should be minimized by an optimal temperature ladder.
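For orientation, a common starting point for choosing the rungs $T_k$ is a geometric ladder with a constant ratio between neighboring temperatures, which yields approximately uniform exchange acceptance if the heat capacity is constant. The sketch below is such a generic construction and not the optimized formula of the publications discussed in this chapter; the function name `geometric_ladder` and the rung count of 8 are hypothetical, and the end points 300 K and 570 K are merely taken from the REST simulations of Chapter 5.

```python
def geometric_ladder(t_min, t_max, n_rungs):
    """Temperatures T_k = t_min * (t_max / t_min) ** (k / (n_rungs - 1)).

    Neighboring rungs have a constant ratio T_{k+1} / T_k.
    """
    if n_rungs < 2:
        raise ValueError("need at least two rungs")
    ratio = (t_max / t_min) ** (1.0 / (n_rungs - 1))
    return [t_min * ratio ** k for k in range(n_rungs)]


# Hypothetical ladder with 8 rungs between T_0 = 300 K and T_{N-1} = 570 K
ladder = geometric_ladder(300.0, 570.0, 8)
print([round(t, 1) for t in ladder])
```

Given such a ladder, the remaining question discussed below is how many rungs, and hence which acceptance probability, minimize the round-trip time.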
What puzzled me was that the literature contained contradictory statements on how to construct a temperature ladder optimized for the round-trip time. Predescu and co-authors found that an optimal temperature ladder goes along with a mean acceptance probability of about 40% between replicas neighboring in temperature [112]. In contrast, Nadler and Hansmann reported a mean acceptance probability of roughly 20%. These authors were, moreover, the first to give a simple formula with which the individual rungs $T_k$ of an optimal temperature ladder can be determined as soon as $T_{\min} = T_0$ and $T_{\max} = T_{N-1}$ as well as the heat capacity $C$ of the simulation system are known [113].
To clarify which mean acceptance probability actually leads to the shortest round-trip time, I studied the issue using a simple model system of harmonic oscillators. Like the three-state model of the β-hairpin peptide, such a system of oscillators requires very little computing power and, in addition, has a heat capacity that is independent of the temperature of the system. The latter property was important because Nadler and Hansmann had assumed a constant heat capacity in their derivation, a condition that is usually fulfilled to a good approximation for simulation systems with explicit solvent [113].
In contrast to the theoretical results of Nadler and Hansmann, my RE calculations for the oscillator system showed that, depending on the system size, mean acceptance probabilities between 40% and 45% lead to minimal round-trip times. This resulted in a formula for determining the temperature rungs $T_k$ that differs from that of Nadler and Hansmann. This formula is the core of the first publication reprinted in Chapter 3, which in addition gave, for the first time, a further prescription for constructing an optimized ST temperature ladder. The latter prescription shows that such an ST ladder requires a number of temperature rungs that is smaller by a factor of $1/\sqrt{2}$ than that of a corresponding RE ladder.
Although the results of my calculations withstood both the scrutiny of my colleague Martin Lingenheil and the final examination by my supervisor Paul Tavan, and moreover agreed in essence with the results of Predescu and co-authors, there remained the suspicion that we might not yet have understood everything down to the last detail. This suspicion was based on the fact that the calculations of Nadler and Hansmann rested on reasonable assumptions and showed no other recognizable errors. Indeed, the detailed analysis that followed revealed that the exchange scheme underlying the exchange process is responsible for the different results.
Typically, the exchange of temperatures is attempted alternately for all "even" temperature pairs ..., $(T_{2i}, T_{2i+1})$, $(T_{2i+2}, T_{2i+3})$, ... and all "odd" temperature pairs ..., $(T_{2i+1}, T_{2i+2})$, $(T_{2i+3}, T_{2i+4})$, ... [112, 114, 115]. Our RE simulations were also based on this scheme. Because this exchange scheme had not yet been given a name, we called it DEO, which stands for Deterministic Even-Odd. Predescu and co-authors also used the DEO scheme, whereas the work of Nadler and Hansmann implicitly assumes that even and odd pairs are exchanged in random rather than alternating order. We named this exchange scheme, which is a slight modification of DEO, SEO (Stochastic Even-Odd).
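The difference between the two schemes can be made concrete with a toy simulation. The sketch below moves replicas on a ladder of `n_rungs` rungs and accepts every attempted swap with the same fixed probability `p_acc`; this constant acceptance probability replaces the real Metropolis criterion and is a simplifying assumption, as are the function name `mean_round_trip` and all default parameter values. It is meant as an illustration of the bookkeeping, not as the code used for the publications.

```python
import random


def mean_round_trip(scheme, n_rungs=8, p_acc=0.4, n_steps=20000, seed=1):
    """Mean number of exchange steps per round trip (bottom -> top -> bottom).

    scheme: "DEO" alternates even and odd pairs deterministically,
            "SEO" picks even or odd pairs at random in every step.
    Every attempted swap is accepted with probability p_acc (toy assumption).
    """
    if scheme not in ("DEO", "SEO"):
        raise ValueError("scheme must be 'DEO' or 'SEO'")
    rng = random.Random(seed)
    replica_at = list(range(n_rungs))   # replica_at[rung] = replica index
    heading_up = [True] * n_rungs       # replica still has to reach the top
    started = [False] * n_rungs         # replica has visited the bottom rung
    started[replica_at[0]] = True
    round_trips = 0
    for step in range(n_steps):
        parity = step % 2 if scheme == "DEO" else rng.randrange(2)
        for low in range(parity, n_rungs - 1, 2):
            if rng.random() < p_acc:
                replica_at[low], replica_at[low + 1] = (
                    replica_at[low + 1], replica_at[low])
        top, bottom = replica_at[-1], replica_at[0]
        if started[top] and heading_up[top]:
            heading_up[top] = False     # reached the top, now heading down
        if not started[bottom]:
            started[bottom] = True      # first visit to the bottom rung
        elif not heading_up[bottom]:
            round_trips += 1            # completed bottom -> top -> bottom
            heading_up[bottom] = True
    # total replica time divided by the number of completed round trips
    return n_steps * n_rungs / max(round_trips, 1)


for name in ("DEO", "SEO"):
    print(name, round(mean_round_trip(name), 1))
```

In this toy model a replica that has just been swapped under DEO is offered a swap in the same direction in the next step, so its walk is persistent, whereas under SEO the direction of the next attempt is random.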
In the second publication reprinted in Chapter 3, different exchange schemes were therefore examined with respect to their influence on the round-trip times. It turned out that the DEO exchange scheme leads to by far the shortest round-trip times. Although the DEO scheme, unlike the SEO scheme, does not generate a simple but a modified random walk of the replicas on the temperature ladder, the mean round-trip times are nevertheless proportional to the square of the number $N$ of temperature rungs for the DEO scheme as well. Because $N_{\mathrm{ST}} = N_{\mathrm{RE}}/\sqrt{2}$, this quadratic relation means that the ST method yields round-trip times that are shorter by a factor of 1/2 than those of RE, as long as both methods use the same exchange scheme.
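Written out, and assuming that the proportionality constant between $\tau$ and $N^2$ is the same for both methods (same exchange scheme), the factor 1/2 follows in one line; the symbols $\tau_{\mathrm{ST}}$ and $\tau_{\mathrm{RE}}$ for the two round-trip times are introduced here only for this illustration:

```latex
\frac{\tau_{\mathrm{ST}}}{\tau_{\mathrm{RE}}}
  = \left(\frac{N_{\mathrm{ST}}}{N_{\mathrm{RE}}}\right)^{2}
  = \left(\frac{1}{\sqrt{2}}\right)^{2}
  = \frac{1}{2}
```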
The latter insight led me to transfer the solute tempering concept, which Liu and co-authors had originally applied only to RE [101], to the ST method as well. The resulting ST variant, called SST, was presented in the publication of Chapter 4 and applied to a concrete MD system. Among other things, SST was found to provide higher sampling speeds than REST. To my knowledge, the SST simulations were the first application of the DEO exchange scheme to an ST method. The fact that in the past the difference between the round-trip times of RE and corresponding ST simulations was never particularly pronounced can presumably also be attributed to RE simulations having been run with the DEO exchange scheme, whereas ST simulations used the SEO scheme, which is more natural for this sampling technique but inferior.
Although the SST method provides higher sampling speeds than REST, the REST method was used in Chapter 5 to determine the structural equilibrium ensemble of the cAPB model peptide, because the SST method had not yet been worked out when the REST simulations were started. My predecessor, Heiko Carstens, had already found that conventional MD at room temperature is unsuitable for determining the structural equilibrium ensembles of cAPB by analyzing the frequency of flips of the φ/ψ dihedral angles [3]. The observed flips were so rare that he switched to carrying out the equilibrium simulations at 500 K.
Although the equilibrium ensembles generated at 500 K for cis- and trans-cAPB agreed very well with the proton-proton distances from NMR measurements, one partial result of the publication in Chapter 5 was that this agreement by no means guarantees realistic equilibrium ensembles. The REST simulations presented there were carried out for two different force fields (CHARMM22 and CHARMM22/CMAP) and yielded, in addition to the structural equilibrium ensembles at room temperature, unphysical high-temperature ensembles as well. For instance, the structural high-temperature ensembles obtained from the REST simulations at 570 K with the CHARMM/CMAP force field deviated strongly from the ensembles obtained at room temperature, although the structural data for both 570 K and 300 K were consistent with the experimental NMR data. Interestingly, the structural ensembles computed with REST using the CHARMM force field were far less consistent with the NMR data, which indicates that extending the CHARMM force field by the CMAP correction map does indeed yield a more realistic force field.
The main focus of the publication reprinted in Chapter 5, however, was the question of the time scales on which the relaxation of the cAPB peptide triggered by light-induced cis-trans isomerization proceeds. Determining the structural cis and trans equilibrium ensembles by REST served primarily to pin down the start and end points of the light-induced dynamics precisely, so that the progress of these dynamics could be tracked. For each of the two force fields already used in the REST simulations, 50 starting structures were randomly selected from the corresponding cis equilibrium ensemble and used for nonequilibrium MD simulations, in which the relaxation toward the corresponding trans ensemble was triggered at the beginning of each simulation by the cis-trans isomerization of the integrated dye.
The results of these simulations were compared with the relaxation times that the Zinth group determined from IR pump-probe measurements. For the shortest relaxation times determined by IR, 11 ps and 137 ps, no corresponding times could be extracted from either the CHARMM or the CHARMM/CMAP nonequilibrium simulations. The main reason is probably that the observable $\bar{H}^2$ used here is not suited for studying very fast dynamics, since $\bar{H}^2$ changes only very little during the first 100 ps and is therefore not particularly sensitive in this initial period. This is also reflected in the rather large statistical errors of the short-time dynamics determined by MD. In fact, I had introduced the $\bar{H}^2$ observable in order to determine the slowest relaxation times, which so far are unknown experimentally as well, and not the short decay times, as my predecessor had already done [3, 7].
For the longest experimentally measured decay time of about 1.4 ns, the MD simulations then indeed yielded surprisingly good agreement with the experimental data for both force fields (CHARMM and CHARMM/CMAP). Because the IR pump-probe measurements covered, for technical reasons, only a time window of 3 ns, they could not be used to determine longer relaxation times. On the other hand, the difference spectrum observed after 3 ns made it clear that the relaxation process is not yet complete at that point.
To analyze the subsequent processes, I therefore extended the nonequilibrium simulations to 20 ns and found a relaxation time of about 23 ns, independent of the force field used. This relaxation time could be unambiguously identified as the slowest decay time, which answers the question of the duration of the cis-trans relaxation dynamics from a theoretical point of view. Nevertheless, it remains an exciting question whether the computed long-time dynamics can also be confirmed by experiment. Unfortunately, the supply of cAPB has been used up in the meantime, so that answering this question depends on whether a sufficient amount of cAPB will be synthesized again in the future.
Independently of this, the cAPB peptide will certainly remain of interest as a model peptide in MD simulations. For example, the implicit solvent model currently under development can be tested on this model peptide by comparing free energy landscapes obtained with the implicit solvent model on the one hand and with explicit solvent, as in Chapter 5, on the other. It should be noted that using an implicit solvent model by no means renders the solute tempering concept meaningless, because this concept can also be applied to subregions of the molecule under study [102].
In combination with RE (REST) and ST (SST), the solute tempering concept presumably has, besides the mere saving of temperature rungs, a further advantage not yet mentioned in the literature: we suspect that REST and SST may be used together with a so-called Berendsen thermostat [116] without violating the validity of canonical ensemble statistics. In this context one should know that the Berendsen thermostat is used very frequently in RE simulations although it does not generate a true canonical ensemble [117] and therefore, strictly speaking, must not be employed in RE or ST simulations [118, 119]. First preliminary MD studies on a small octapeptide showed that, as long as only the solvent and not the solute is coupled to the thermostat, the energy contribution entering the exchange criterion is canonically distributed. We had suspected this for some time [118], and it was one reason why the cAPB peptide was not coupled to the Berendsen thermostat in the simulations presented in Chapter 5. Sebastian Bauer has meanwhile taken up this topic, and well-founded results can be expected from him in the near future.
Another project, which I started at the end of my doctoral studies on the initiative of Gerald Mathias, aims at developing a new exchange scheme that is superior to the exchange schemes known so far from Chapter 3. Although here, too, I no longer found the time to bring the project to a successful conclusion, first preliminary results are quite promising. Accordingly, Gerald Mathias is continuing this project, and results can be expected in the near future here as well.
As I mentioned above, under certain conditions one can compute temperature ladders optimized for round-trip time as soon as the heat capacity and the minimum and maximum temperatures $T_0$ and $T_{N-1}$ are specified. Which maximum temperature $T_{N-1}$ leads to optimal results in an RE or ST simulation, however, cannot be stated so far. In fact, it will probably not be possible to answer this question in general in the future either. However, the insights from Chapters 2 and 3 might at least lead to rules for determining an optimal temperature $T_{N-1}$ for well-defined limiting cases. First ideas on this already exist, but their quality still has to be checked by suitable simulations.
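As a simple point of reference for what "specifying $T_0$ and $T_{N-1}$" means in practice, the sketch below builds a geometric temperature ladder. This is a common textbook choice that gives roughly uniform exchange acceptance under the assumption of a temperature-independent heat capacity; it is not claimed to be the round-trip optimized ladder of Chapter 3. The function name and the choice of eight rungs are hypothetical; the end temperatures 300 K and 570 K are those quoted above for the REST simulations.

```python
def geometric_ladder(t_min, t_max, n_rungs):
    """Return n_rungs temperatures from t_min to t_max with a constant ratio.

    Assumption: a constant ratio T[k+1] / T[k] is adequate, which is only
    justified for an approximately temperature-independent heat capacity.
    """
    if n_rungs < 2:
        raise ValueError("a ladder needs at least two rungs")
    if not 0.0 < t_min < t_max:
        raise ValueError("require 0 < t_min < t_max")
    ratio = (t_max / t_min) ** (1.0 / (n_rungs - 1))
    return [t_min * ratio ** k for k in range(n_rungs)]


# Illustrative call: eight rungs between 300 K and 570 K.
ladder = geometric_ladder(300.0, 570.0, 8)
```

The open question raised in the text, namely which $T_{N-1}$ to choose, is exactly the `t_max` argument that this sketch has to take as given.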
Unfortunately, even the best exchange algorithm combined with an optimal temperature ladder does not guarantee short round-trip times. Rather, the round-trip time in a real RE simulation depends substantially on how strongly the potential energies of the individual system states (conformations) differ from one another. The exchange process preferentially moves replicas that are in a conformation of low potential energy to low temperatures, whereas replicas with high potential energies are transported to high temperatures. As long as replicas remain in their conformation and there are large energetic differences between the conformations of the individual replicas, the replicas will need a comparatively long time for a round trip, and the formula given in Chapter 3 for determining the round-trip time is then no longer valid.¹ This problem typically occurs at phase transitions from folded to unfolded system states, where the energy jump can be more or less pronounced depending on the RE variant.
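The energetic sorting described above can be read off the standard Metropolis criterion for a temperature replica exchange. The following minimal sketch assumes plain temperature RE (in REST or SST only part of the energy would enter), energies in kcal/mol, and a hypothetical function name.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exchange_probability(e_cold, e_hot, t_cold, t_hot):
    """Metropolis probability for swapping two replicas between neighboring rungs.

    e_cold: potential energy of the replica currently at t_cold
    e_hot:  potential energy of the replica currently at t_hot
    """
    beta_cold = 1.0 / (K_B * t_cold)
    beta_hot = 1.0 / (K_B * t_hot)
    delta = (beta_cold - beta_hot) * (e_cold - e_hot)
    if delta >= 0.0:
        # The cold replica has the higher energy: the swap is always accepted.
        return 1.0
    # The cold replica has the lower energy: the swap is exponentially suppressed.
    return math.exp(delta)
```

A replica trapped in a low-energy conformation at the cold rung is therefore rarely moved upward, and the suppression grows exponentially with the energy gap between the two conformations, which is why a large folded/unfolded energy jump slows down the round trips.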
Thus, if one wants to use, for example, REST or SST for sampling conformational space, it is advisable to first estimate by initial test calculations whether the solute tempering concept alleviates or even aggravates this problem, because at $T_{N-1}$ the scaled force field gives rise to smaller or possibly larger energy differences between folded and unfolded states than the unscaled force field. Indeed, the literature contains a study by Huang and co-authors in which the solute tempering concept presumably caused the poor mixing of the replicas in temperature space observed there, owing to the mechanism described here [120].
On the other hand, the mixing of the replicas that is hampered at a phase transition can be improved by various approaches. One such approach is due to Trebst and co-authors [121] and relies on the fact that the round-trip time can be shortened considerably if the number of temperature rungs in the vicinity of the critical temperature is increased markedly. Another approach, presented by Kamberaj and van der Vaart, consists in modifying the energy function by means of the Wang-Landau algorithm [122] such that all replicas have approximately the same mean energy. The energy jump in the region of the critical temperature thereby vanishes, which leads to a random walk of the replicas in temperature space. For a system with relatively few degrees of freedom this approach has already been tested successfully [123]. Nevertheless, it remains to be shown whether the latter approach is also practicable for complex systems: on the one hand, determining a sufficiently "flat" energy function will involve considerable effort; on the other hand, modifying the energy function changes the occupation probabilities of the individual states, which in the extreme case can mean that essential states no longer occur in such a simulation, so that even reweighting the data cannot yield accurate results.
¹ In fact, that chapter gives a formula for the inverse round-trip time, the so-called round-trip rate.
Bibliography
[1] A. L. Lehninger. Grundkurs Biochemie. Walter de Gruyter, Berlin, 1985.
[2] L. Stryer. Biochemie. Spektrum Akademischer Verlag, Heidelberg, 1991.
[3] H. Carstens. Konformationsdynamik lichtschaltbarer Peptide: Molekulardynamiksimulationen und datengetriebene Modellbildung. Dissertation, Fakultät für Physik, Ludwig-Maximilians-Universität München, 2004.
[4] S. Spörlein, H. Carstens, H. Satzger, C. Renner, R. Behrendt, L. Moroder, P. Tavan, W. Zinth,
and J. Wachtveitl. Ultrafast spectroscopy reveals subnanosecond peptide conformational dynamics and validates molecular dynamics simulation. Proc. Natl. Acad. Sci. USA, 99:7998–
8002, 2002.
[5] H. Carstens, C. Renner, A. G. Milbradt, L. Moroder, and P. Tavan. Multiple loop conformations of peptides predicted by molecular dynamics simulations are compatible with NMR.
Biochemistry, 44:4829–4840, 2005.
[6] A. R. Fersht and V. Daggett. Protein folding and unfolding at atomistic resolution. Cell,
108:573–582, 2002.
[7] P. Tavan, H. Carstens, and G. Mathias. Molecular dynamics simulations of proteins and
peptides: Problems, achievements, and perspectives. In Johannes Buchner and Thomas
Kiefhaber, editors, Protein Folding Handbook. Part I., chapter 33, pages 1170–1195. Wiley-VCH, 2005.
[8] W. F. van Gunsteren and J. Dolenc. Biomolecular simulations: Historical picture and future
perspectives. Biochem. Soc. Trans., 36:11–15, 2008.
[9] C. Renner, R. Behrendt, S. Spörlein, J. Wachtveitl, and L. Moroder. Photomodulation of
conformational states. I. Mono- and bicyclic peptides with (4-amino)phenylazobenzoic acid
as backbone constituent. Biopolymers, 54:489–500, 2000.
[10] R. Denschlag, W. J. Schreier, B. Rieff, T. E. Schrader, O. Koller, L. Moroder, W. Zinth, and
P. Tavan. Relaxation time prediction for a light switchable peptide by molecular dynamics.
Phys. Chem. Chem. Phys., 12:6204–6218, 2010.
[11] S.-L. Dong, M. Löweneck, T. E. Schrader, W. J. Schreier, W. Zinth, L. Moroder, and C. Renner. A photocontrolled beta-hairpin peptide. Chem.-Eur. J., 12:1114–1120, 2006.
[12] M. Karplus and J. A. McCammon. Molecular dynamics simulations of biomolecules. Nature Struct. Biol., 9:646–652, 2002.
[13] G. Mathias, B. Egwolf, M. Nonella, and P. Tavan. A fast multipole method combined with
a reaction field for long-range electrostatics in molecular dynamics simulations: The effects
of truncation on the properties of water. J. Chem. Phys., 118:10847–10860, 2003.
[14] E. Lindahl, B. Hess, and D. van der Spoel. Gromacs 3.0: A package for molecular simulation
and trajectory analysis. J. Mol. Mod., 7:306–317, 2001.
[15] R. Denschlag, M. Lingenheil, and P. Tavan. Efficiency reduction and pseudo-convergence
in replica exchange sampling of protein folding-unfolding equilibria. Chem. Phys. Lett.,
458:244–248, 2008.
[16] T. E. Schrader, W. J. Schreier, T. Cordes, F. O. Koller, G. Babitzki, R. Denschlag, C. Renner,
M. Löweneck, S.-L. Dong, L. Moroder, P. Tavan, and W. Zinth. Light-triggered β-hairpin
folding and unfolding. Proc. Natl. Acad. Sci. USA, 104:15729–15734, 2007.
[17] P. L. Freddolino, F. Liu, M. Gruebele, and K. Schulten. Ten-microsecond molecular dynamics of fast-folding ww domain. Biophys. J., 94:L75–L77, 2008.
[18] K. Hukushima and K. Nemoto. Exchange Monte Carlo method and application to spin glass
simulations. J. Phys. Soc. Jpn., 65:1604–1608, 1996.
[19] U. H. E. Hansmann. Free energy landscape and folding mechanism of a β-hairpin in explicit
water: A replica exchange molecular dynamics study. Chem. Phys. Lett., 281:140–150,
1997.
[20] Y. Sugita and Y. Okamoto. Replica-exchange molecular dynamics method for protein folding. Chem. Phys. Lett., 314:141–151, 1999.
[21] C. M. Dobson. Protein folding and misfolding. Nature, 426:884–890, 2003.
[22] J. L. Sohl, S. S. Jaswal, and D. A. Agard. Unfolded conformations of α-lytic protease are
more stable than its native state. Nature, 395:817–819, 1998.
[23] Z. Wang, J. Mottonen, and E. J. Goldsmith. Kinetically controlled folding of the serpin
plasminogen activator inhibitor 1. Biochemistry, 35:16443–16448, 1996.
[24] C. B. Anfinsen, E. Haber, M. Sela, and F. H. White Jr. The kinetics of formation of native
ribonuclease during oxidation of the reduced polypeptide chain. Proc. Natl. Acad. Sci. USA,
47:1309–1314, 1961.
[25] C. J. Epstein, R. F. Goldberger, and C. B. Anfinsen. The genetic control of tertiary protein
structure: Studies with model systems. Cold Spring Harbor Symp. Quant. Biol., 28:439–
449, 1963.
[26] C. B. Anfinsen. Principles that govern the folding of protein chains. Science, 181:223–230,
1973.
[27] C. M. Dobson. Experimental investigation of protein folding and misfolding. Methods,
34:4–14, 2004.
[28] J. P. Taylor, J. Hardy, and K. H. Fischbeck. Toxic proteins in neurodegenerative disease.
Science, 296:1991–1995, 2002.
[29] F. U. Hartl and M. Hayer-Hartl. Molecular chaperones in the cytosol: from nascent chain to
folded protein. Science, 295:1852–1858, 2002.
[30] L. Pauling, R. B. Corey, and H. R. Branson. The structure of proteins: Two hydrogen-bonded
helical configurations of the polypeptide chain. Proc. Natl. Acad. Sci. USA, 37:235–240,
1951.
[31] L. Pauling and R. B. Corey. The pleated sheet, a new layer configuration of polypeptide
chains. Proc. Natl. Acad. Sci. USA, 37:251–256, 1951.
[32] W. A. Eaton, V. Munoz, P. A. Thompson, E. R. Henry, and J. Hofrichter. Kinetics and dynamics of loops, α-helices, β-hairpins, and fast-folding proteins. Acc. Chem. Res., 31:745–
753, 1998.
[33] C. D. Snow, N. Nguyen, S. Pande, and M. Gruebele. Absolute comparison of simulated and
experimental protein-folding dynamics. Nature, 420:102–106, 2002.
[34] W. Y. Yang and M. Gruebele. Folding at the speed limits. Nature, 423:193–197, 2003.
[35] U. Mayor, N. R. Guydosh, C. M. Johnson, J. G. Grossmann, S. Sato, G. S. Jas, S. M. V.
Freund, D. O. V. Alonso, V. Daggett, and A. R. Fersht. The complete folding pathway of a
protein from nanoseconds to microseconds. Nature, 421:863–867, 2003.
[36] L. L. Qiu, S. A. Pabit, A. E. Roitberg, and S. J. Hagen. Smaller and faster: The 20-residue
trp-cage protein folds in 4 µs. J. Am. Chem. Soc., 124:12952–12953, 2002.
[37] J. A. McCammon. Protein dynamics. Rep. Prog. Phys., 47:1–46, 1984.
[38] W. A. Eaton, V. Munoz, S. J. Hagen, G. S. Jas, L. J. Lapidus, E. R. Henry, and J. Hofrichter. Fast kinetics and mechanisms in protein folding. Annu. Rev. Biophys. Biomol. Struct.,
29:327–359, 2000.
[39] S. Geibel, J. H. Kaplan, E. Bamberg, and T. Friedrich. Conformational dynamics of
the Na+/K+-ATPase probed by voltage clamp fluorometry. Proc. Natl. Acad. Sci. USA,
100:964–969, 2003.
[40] D. W. Green, V. M. Ingram, and M. F. Perutz. The structure of haemoglobin IV. Sign
determination by the isomorphous replacement method. Proc. Roy. Soc. A, 225:287–307,
1954.
[41] J. C. Kendrew, G. Bodo, H. M. Dintzis, R. G. Parrish, H. Wyckoff, and D. C. Phillips. A
three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Nature,
181:662–666, 1958.
[42] B. Nölting. Mechanism of protein folding. Proteins, 41:288–298, 2000.
[43] R. Rowan III, J. A. McCammon, and B. D. Sykes. A study of the distances obtained from
nuclear magnetic resonance nuclear Overhauser effect and relaxation time measurements
in organic structure determination. Distances involving internally rotating methyl groups.
Application to cis- and trans-crotonaldehyde. J. Am. Chem. Soc., 96:4773–4780, 1974.
[44] K. Wüthrich. NMR of Proteins and Nucleic Acids. Wiley, New York, 1986.
[45] C. K. Woodward. Hydrogen exchange rates and protein folding. Current Opinion in Struct.
Biol., 4:112–116, 1994.
[46] A. K. Bhuyan and J. B. Udgaonkar. Real-time NMR measurements of protein folding and
hydrogen exchange dynamics. Current Science, 10:942–952, 1999.
[47] S. Weiss. Fluorescence spectroscopy of single biomolecules. Science, 283:1676–1683,
1999.
[48] L. Nilsson and B. Halle. Molecular origin of time-dependent fluorescence shifts in proteins.
Proc. Natl. Acad. Sci. USA, 102:13867–13872, 2005.
[49] F. Schotte, M. Lim, T. A. Jackson, A.V. Smirnov, J. Soman, J. S. Olson, G. N. Phillips,
M. Wulff, and P. A. Anfinrud. Watching a protein as it functions with 150-ps time-resolved
X-ray crystallography. Science, 300:1944–1947, 2003.
[50] J. Bredenbeck, J. Helbing, J. R. Kumita, G. A. Woolley, and P. Hamm. α-helix formation in
a photoswitchable peptide tracked from picoseconds to microseconds by time-resolved IR
spectroscopy. Proc. Natl. Acad. Sci. USA, 102:2379–2384, 2005.
[51] E. Chen, M. J. Wood, A. L. Fink, and D. S. Kliger. Time-resolved Circular Dichroism studies
of protein folding intermediates of cytochrome c. Biochemistry, 37:5589–5598, 1998.
[52] O. Bieri, J. Wirz, B. Hellrung, M. Schutkowski, M. Drewello, and T. Kiefhaber. The speed
limit for protein folding measured by triplet-triplet energy transfer. Proc. Natl. Acad. Sci.
USA, 96:9597–9601, 1999.
[53] M. Stork, A. Giese, H. A. Kretzschmar, and P. Tavan. MD simulations indicate a possible
role of parallel α-helices in seeded aggregation of poly-Gln. Biophys. J., 88:2442–2451,
2005.
[54] T. Hirschberger, M. Stork, B. Schropp, K. F. Winkelhofer, J. Tatelt, and P. Tavan. Structural
instability of the prion protein upon M205S/R mutations revealed by molecular dynamics
simulations. Biophys. J., 90:3908–3918, 2006.
[55] M. Lingenheil, R. Denschlag, and P. Tavan. Highly polar environments catalyze the unfolding of PrPC helix 1. Europ. Biophys. J., 2010. DOI: 10.1007/s00249-009-0570-6.
[56] S. B. Ozkan, G. A. Wu, J. D. Chodera, and K. A. Dill. Protein folding by zipping and
assembly. Proc. Natl. Acad. Sci. USA, 104:11987–11992, 2007.
[57] K. A. Dill, S. B. Ozkan, T. R. Weikl, J. D. Chodera, and V. A. Voelz. The protein folding
problem: when will it be solved? Current Opinion in Struct. Biol., 17:1–5, 2007.
[58] O. M. Becker, A. D. MacKerell, B. Roux, and M. Watanabe. Computational Biochemistry
and Biophysics. Marcel Dekker Inc., New York, 2001.
[59] J. A. McCammon, B. R. Gelin, and M. Karplus. Dynamics of folded proteins. Nature,
267:585–590, 1977.
[60] M. Levitt and S. Lifson. Refinement of protein conformations using a macromolecular energy minimization procedure. J. Mol. Biol., 46:269–279, 1969.
[61] B. R. Brooks, R. E. Bruccoleri, B. D. Olafson, D. J. States, S. Swaminathan, and M. Karplus.
CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem., 4:187–217, 1983.
[62] A. D. MacKerell, D. Bashford, M. Bellott, R. L. Dunbrack, J. D. Evanseck, M. J. Field,
S. Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K.
Lau, C. Mattos, S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher, B. Roux,
M. Schlenkrich, J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera,
D. Yin, and M. Karplus. All-atom empirical potential for molecular modeling and dynamics
studies of proteins. J. Phys. Chem. B, 102:3586–3616, 1998.
[63] D. A. Pearlman, D. A. Case, J. W. Caldwell, W. S. Ross, T. E. Cheatham, S. Debolt, D. Ferguson, G. Seibel, and P. Kollman. AMBER, a package of computer programs. Comput. Phys. Commun., 91:1–41, 1995.
[64] W. R. P. Scott, P. H. Hünenberger, I. G. Tironi, A. E. Mark, S. R. Billeter, J. Fennen, A. E.
Torda, T. Huber, P. Krüger, and W. F. van Gunsteren. The GROMOS biomolecular simulation program package. J. Phys. Chem., 103:3596–3607, 1996.
[65] W. L. Jorgensen, D. S. Maxwell, and J. Tirado-Rives. Development and testing of the OPLS
all-atom force field on conformational energetics and properties of organic liquids. J. Am.
Chem. Soc., 118:11225–11236, 1996.
[66] L. Verlet. Computer "experiments" on classical fluids. I. Thermodynamical properties of
Lennard-Jones molecules. Phys. Rev., 159:98–103, 1967.
[67] M. P. Allen and D. J. Tildesley. Computer Simulation of Liquids. Oxford University Press,
Oxford, 1987.
[68] N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller. Equation
of state calculation by fast computing machines. J. Chem. Phys., 21:1087–1092, 1953.
[69] M. Yoneya, H. J. C. Berendsen, and K. Hirasawa. A noniterative matrix method for constraint
molecular-dynamics simulations. Molecular Simulation, 13:395–405, 1994.
[70] A. D. MacKerell, M. Feig, and C. L. Brooks, III. Extending the treatment of backbone energetics in protein force fields: Limitations of gas-phase quantum mechanics in reproducing
protein conformational distributions in molecular dynamics simulations. J. Comput. Chem.,
25:1400–1415, 2004.
[71] A. Kitao, K. Yonekura, S. Maki-Yonekura, F.A. Samatey, K. Imada, K. Namba, and N. Go.
Switch interactions control energy frustration and multiple flagellar filament structures.
Proc. Natl. Acad. Sci. USA, 103:4894–4899, 2006.
[72] P. L. Freddolino, A. S. Arkhipov, S. B. Larson, A. McPherson, and K. Schulten. Molecular
dynamics simulations of the complete satellite tobacco mosaic virus. Structure, 14:437–449,
2006.
[73] P. Maragakis, K. Lindorff-Larsen, M. P. Eastwood, R. O. Dror, J. L. Klepeis, I. T. Arkin,
M. O. Jensen, H. Xu, N. Trbovic, R. A. Friesner, A. G. Palmer, and D. E. Shaw. Microsecond molecular dynamics simulation shows effect of slow loop dynamics on backbone
amide order parameters of proteins. J. Phys. Chem. B, 112:6155–6158, 2008.
[74] W. C. Still, A. Tempczyk, R. C. Hawley, and T. Hendrickson. Semianalytical treatment of
solvation for molecular mechanics and dynamics. J. Am. Chem. Soc., 112:6127–6129, 1990.
[75] V. A. Voelz, G. R. Bowman, K. Beauchamp, and V. S. Pande. Molecular simulation of ab
initio protein folding for a millisecond folder NTL9(1-39). J. Am. Chem. Soc., 132:1526–
1528, 2010.
[76] H. Nymeyer and A. E. García. Simulation of the folding equilibrium of α-helical
peptides: A comparison of the generalized born approximation with explicit solvent. Proc.
Natl. Acad. Sci. USA, 100:13934–13939, 2003.
[77] R. Zhou and B. J. Berne. Can a continuum model reproduce the free energy landscape of a
β-hairpin folding in water? Proc. Natl. Acad. Sci. USA, 99:12777–12782, 2002.
[78] R. Geney, M. Layten, R. Gomperts, V. Hornak, and C. Simmerling. Investigation of salt
bridge stability in a generalized born solvent model. J. Chem. Theory Comput., 2:115–127,
2006.
[79] H. Lei and Y. Duan. Two-stage folding of HP-35 from ab initio simulations. J. Mol. Biol.,
370:196–206, 2007.
[80] J. D. Jackson. Classical Electrodynamics. John Wiley and Sons, New York, second edition,
1975.
[81] M. Stork and P. Tavan. Electrostatics of proteins in dielectric solvent continua: I. Newton’s
third law marries qE forces. J. Chem. Phys., 126:165105, 2007.
[82] B. Egwolf and P. Tavan. Continuum description of solvent dielectrics in molecular-dynamics
simulations of proteins. J. Chem. Phys., 118:2039–2056, 2003.
[83] M. Stork and P. Tavan. Electrostatics of proteins in dielectric solvent continua: I. Newton’s
third law marries qE forces. J. Chem. Phys., 126:165105, 2007.
[84] M. Stork and P. Tavan. Electrostatics of proteins in dielectric solvent continua: II. First
applications in molecular dynamics simulations. J. Chem. Phys., 126:165106, 2007.
[85] D. M. Zuckerman and E. Lyman. A second look at canonical sampling of biomolecules
using replica exchange simulation. J. Chem. Theory Comput., 2:1200–1202, 2006.
[86] Y. Okamoto. Generalized-ensemble algorithms: Enhanced sampling techniques for Monte
Carlo and molecular dynamics simulations. J. Mol. Graph. Model., 22:425–439, 2004.
[87] S. Kumar, D. Bouzida, R. H. Swendsen, P. A. Kollman, and J. M. Rosenberg. The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method.
J. Comput. Chem., 13:1011–1021, 1992.
[88] G. M. Torrie and J. P. Valleau. Nonphysical sampling distributions in Monte Carlo free-energy estimation: umbrella sampling. J. Comput. Phys., 23:187–199, 1977.
[89] G. H. Paine and H. A. Scheraga. Prediction of the native conformation of a polypeptide
by a statistical-mechanical procedure. I. Backbone structure of enkephalin. Biopolymers,
24:1391–1436, 1985.
[90] M. Mezei. Adaptive umbrella sampling: Self-consistent determination of the non-Boltzmann bias. J. Comput. Phys., 68:237–248, 1987.
[91] C. Bartels and M. Karplus. Multidimensional adaptive umbrella sampling: Applications to
main chain and side chain peptide conformations. J. Comput. Chem., 18:1450–1462, 1997.
[92] B. A. Berg and T. Neuhaus. Multicanonical algorithms for first order phase transitions.
Phys. Lett. B, 267:249–253, 1991.
[93] B. A. Berg and T. Neuhaus. Multicanonical ensemble: A new approach to simulate first-order phase transitions. Phys. Rev. Lett., 68:9–12, 1992.
[94] H. Fukunishi, O. Watanabe, and S. Takada. On the Hamiltonian replica exchange method
for efficient sampling of biomolecular systems: Application to protein structure prediction.
J. Chem. Phys., 116:9058–9067, 2002.
[95] R. Affentranger and I. Tavernelli. A novel Hamiltonian replica exchange MD protocol to
enhance protein conformational space sampling. J. Chem. Theory Comput., 2:217–228,
2006.
[96] S. Kannan and M. Zacharias. Folding of trp-cage mini protein using temperature and biasing
potential replica exchange molecular dynamics simulations. Int. J. Mol. Sci., 10:1121–1137,
2009.
[97] H. S. Hansen and P. H. Hünenberger. Using the local elevation method to construct optimized umbrella sampling potentials: calculation of the relative free energies and interconversion barriers of glucopyranose ring conformers in water. J. Comput. Chem., 31:1–23,
2010.
[98] R. E. Bruccoleri and M. Karplus. Conformational sampling using high-temperature molecular dynamics. Biopolymers, 29:1847–1862, 1990.
[99] A. P. Lyubartsev, A. A. Martinovski, S. V. Shevkunov, and P. N. Vorontsov-Velyaminov.
New approach to Monte Carlo calculation of the free energy: Method of expanded ensembles. J. Chem. Phys., 96:1776–1783, 1992.
[100] E. Marinari and G. Parisi. Simulated tempering: A new Monte Carlo scheme. Europhys.
Lett., 19:451–458, 1992.
[101] P. Liu, B. Kim, R. A. Friesner, and B. J. Berne. Replica exchange with solute tempering:
A method for sampling biological systems in explicit water. Proc. Natl. Acad. Sci. USA,
102:13749–13754, 2005.
[102] G. Babitzki, R. Denschlag, and P. Tavan. Polarization effects stabilize bacteriorhodopsin’s
chromophore binding pocket: A molecular dynamics study. J. Phys. Chem., 113:10483–
10495, 2009.
[103] R. Denschlag, M. Lingenheil, P. Tavan, and G. Mathias. Simulated solute tempering.
J. Chem. Theory Comput., 5:2847–2857, 2009.
[104] M. Kastner. Monte Carlo methods in statistical physics: Mathematical foundations and
strategies. Commun. Nonlinear Sci. Numer. Simul., 15:1589–1602, 2010.
[105] S. Park and V. S. Pande. Choosing weights for simulated tempering. Phys. Rev. E, 76:016703, 2007.
[106] D. A. Kofke. On the acceptance probability of replica-exchange Monte Carlo trials. J. Chem.
Phys., 117:6911–6914, 2002.
[107] K. A. Dill and S. Bromberg. Molecular driving forces: statistical thermodynamics in chemistry and biology. Garland Science, Taylor and Francis Group, New York, 2003.
[108] D. J. Earl and M. W. Deem. Parallel tempering: Theory, applications, and new perspectives.
Phys. Chem. Chem. Phys., 7:3910–3922, 2005.
[109] R. Reichold. Rechnergestützte Beschreibung der Struktur und Dynamik von Peptiden und
ihren Bausteinen. Dissertation, Fakultät für Physik, Ludwig-Maximilians-Universität München, 2009.
[110] D. Du, Y. Zhu, C.-Y. Huang, and F. Gai. Understanding the key factors that control the rate
of β-hairpin folding. Proc. Natl. Acad. Sci. USA, 101:15915–15920, 2004.
[111] A. Mitsutake, Y. Sugita, and Y. Okamoto. Generalized-ensemble algorithms for molecular
simulations of biopolymers. Biopolymers, 60:96–123, 2001.
[112] C. Predescu, M. Predescu, and C. V. Ciobanu. On the efficiency of exchange in parallel
tempering Monte Carlo simulations. J. Phys. Chem. B, 109:4189–4196, 2005.
[113] W. Nadler and U. H. E. Hansmann. Optimized explicit-solvent replica exchange molecular
dynamics from scratch. J. Phys. Chem. B, 112:10386–10387, 2008.
[114] T. Okabe, M. Kawata, Y. Okamoto, and M. Mikami. Replica-exchange Monte Carlo method
for the isobaric-isothermal ensemble. Chem. Phys. Lett., 335:435–439, 2001.
[115] M. J. Abraham and J. E. Gready. Ensuring mixing efficiency of replica-exchange molecular
dynamics simulations. J. Chem. Theory Comput., 4:1119–1128, 2008.
[116] H. J. C. Berendsen, J. P. M. Postma, W. F. van Gunsteren, A. Dinola, and J. R. Haak. Molecular dynamics with coupling to an external bath. J. Chem. Phys., 81:3684–3690, 1984.
[117] Tetsuya Morishita. Fluctuation formulas in molecular-dynamics simulations with the weak
coupling heat bath. J. Chem. Phys., 113:2976–2982, 2000.
[118] M. Lingenheil, R. Denschlag, and P. Tavan. The "hot-solvent/cold-solute" problem revisited.
J. Chem. Theory Comput., 4:1293–1306, 2008.
[119] E. Rosta, N.-V. Buchete, and G. Hummer. Thermostat artifacts in replica exchange Molecular Dynamics simulations. J. Chem. Theory Comput., 5:1393–1399, 2009.
[120] X. Huang, M. Hagen, B. Kim, R. A. Friesner, R. Zhou, and B. J. Berne. Replica exchange
with solute tempering: Efficiency in large scale systems. J. Phys. Chem. B, 111:5405–5410,
2007.
[121] S. Trebst, M. Troyer, and U. H. E. Hansmann. Optimized parallel tempering simulations of
proteins. J. Chem. Phys., 124:174903, 2006.
[122] F. Wang and D. P. Landau. Determining the density of states for classical models: A random
walk algorithm to produce a flat histogram. Phys. Rev. E, 64:056101, 2001.
[123] H. Kamberaj and A. van der Vaart. An optimized replica exchange molecular dynamics
method. J. Chem. Phys., 130:074906, 2009.
Acknowledgments
My foremost thanks for the success of this work go to my supervisor and doctoral advisor, Prof. Paul Tavan, who gave me the opportunity to pursue a doctorate despite my five years of working in a bank and the abstinence from physics that came with it. Thank you also for your support and for training me in scientific writing and presenting, for the trust you placed in me, and for the resulting opportunity to work independently.
Great thanks also go to my colleague, friend, and "sparring partner" Dr. Martin Lingenheil for the great and fruitful collaboration. I will very much miss our discussions, which were not infrequently impossible to overhear, although one or another office neighbor will probably remember them less wistfully. Accordingly, I thank all colleagues based in the C wing for their high "noise tolerance".
Unfortunately, my collaboration with Dr. Gerald Mathias was far too short. Thank you for your valuable advice and suggestions and, not least, for your support in publishing. The work of our system administrator Sebastian Bauer was and is invaluable; without his commitment the computer network would certainly not work half as well. In this context, thanks also go to Dr. Rudolf Reichold and Dr. Bernhard Schropp.
Beyond the research group, my special thanks go to Prof. Zinth and his two co-workers Dr. Tobias Schrader and Dr. Wolfgang Schreier for the uncomplicated and, I believe, successful collaboration. At this point I would also like to thank the always friendly and helpful secretaries, Ms. Michaelis and Ms. Widmann-Diermeier, as well as Akademischer Rat Dr. Karl-Heinz Mantel. In addition, I extend a thank-you to all members of the BMO for the great atmosphere. I will miss the swim in the Eisbach that came around every summer and the casual football kickabouts in the Englischer Garten. Last but not least, great thanks are due to my family and my circle of friends for the social support that is all too easily taken for granted and that seems to me indispensable for the success of the dissertation project.
Curriculum Vitae
Name: Robert Denschlag
Date of birth: 19 May 1972
Place of birth: Worms
since 2009: Teacher at the Albert-Schäffle-Schule in Nürtingen
2005 – 2009: Doctoral student in the Tavan research group (LMU München)
2000 – 2004: Application IT specialist at Dresdner Bank
1999: Diplom in physics
1995: Vordiplom in mathematics
1994: Vordiplom in physics
1992 – 1999: Studies in physics and mathematics at TH Karlsruhe
1991 – 1992: Basic military service
1991: Abitur
1989 – 1991: Attended the Rudi-Stephan Gymnasium Worms
1982 – 1989: Attended the Karmeliter Realschule Worms
1978 – 1982: Attended the Kerschensteiner Grundschule Worms