DeerAnalysis2013 User Manual

G. Jeschke
ETH Zürich
Wolfgang-Pauli-Str. 10, 8093 Zürich, Switzerland
[email protected]
http://www.epr.ethz.ch/software/index

February 13, 2013

1 Purpose of the program

The program DeerAnalysis2013 can extract distance distributions from dead-time free pulse ELDOR data (constant-time and variable-time four-pulse DEER) [3, 4]. Furthermore, it can be used for direct comparison of primary data of similar samples [5]. For a series of related samples, an average distance distribution can be computed, taking into account the signal-to-noise ratio of the individual data sets. With some caution [6], the program may also be applied to the analysis of dead-time free double-quantum coherence EPR experiments [7]. It should not be used for data from experiments that have a significant dead time,

$$ t_d > 2 r^3 \ \mathrm{ns\,nm^{-3}}, \qquad (1) $$

where $r$ is the shortest distance in the distribution.

Except for fits of two user-defined models for three-spin systems (Section 10.5.1), the program assumes isolated spin pairs. If more than two spins are coupled (e.g. spin-labeled protein oligomers), distance distributions are only approximate, with artifact contributions at both short and long distances [1]. A future version of DeerAnalysis will allow for an approximate correction of these contributions.

If you use DeerAnalysis in your research, please cite [2]:

• G. Jeschke, V. Chechik, P. Ionita, A. Godt, H. Zimmermann, J. Banham, C. R. Timmel, D. Hilger, H. Jung, Appl. Magn. Reson. 2006, 30, 473–498.

The underlying mathematical problem is (moderately) ill-posed, i.e., the quality of the analyzed data is crucial. Pre-processing tools are implemented to correct for experimental imperfections (phase errors, displacements of the time origin of the modulation) and to separate the intramolecular distances of interest from the intermolecular background contribution.
Furthermore, the program provides several independent approaches for extracting the distance distribution, which helps to get a feeling for the reliability of the distribution. Characterization of the distance distribution in terms of its mean value $\langle r \rangle$ and width (standard deviation $\sigma_r$) is usually reliable [5] and is therefore a standard output. The performance of the different approaches for data analysis depends on the type of distance distribution (narrow or broad peaks or both) and was discussed in some detail in Ref. [5].

DeerAnalysis2013 features a module for validation of distance distributions obtained by Tikhonov regularization. With this module, a systematic error analysis can be performed that may consider experimental noise, uncertainties in background correction for a given dimensionality of the background, and uncertainties in the dimensionality of the background. This analysis can provide error bars for the points in the distance distribution.

DeerAnalysis2013 is based on experience with the earlier programs DeerFit, DeerTrafo, and in particular DeerAnalysis2004, DeerAnalysis2006, DeerAnalysis2008, DeerAnalysis2009, DeerAnalysis2010, and DeerAnalysis2011, as well as with model-specific fitting of data [8, 10, 11]. It supersedes the earlier programs with respect to reliability and functionality.

At the present time, DeerAnalysis2013 is released only as source code that can be run within MATLAB, not as a stand-alone application.

2 Changes with respect to DeerAnalysis2011

DeerAnalysis2013 is an upgrade of DeerAnalysis2011, which fixes a few glitches, can retain modulation depth information in user model fitting, can reload its own output files, and, last but not least, has a feature for suppression of ghost contributions to distance distributions that can arise in multi-spin systems [12].
3 Changes with respect to DeerAnalysis2009 and DeerAnalysis2010

DeerAnalysis2011 is a minor upgrade of DeerAnalysis2009 and DeerAnalysis2010, which fixes a few glitches, improves the Validation tool, and introduces Rice distribution models (courtesy M. Spitzbarth, S. Domingo Köhler, M. Drescher) [13]. Distance distributions now have a color-coded background to indicate reliability in different distance ranges (see Section 10.2). The patched version from March 2012 allows modulation-depth-scaled comparison of DEER traces with different lengths and time resolutions, independent of the sequence in which they were loaded, and allows switching off automatic modulation depth scaling between simulated and experimental form factors in the display.

4 Changes with respect to DeerAnalysis2008

DeerAnalysis2009 is a minor upgrade of DeerAnalysis2008, which fixes a few bugs and can be run on Mac (Mac executable for Tikhonov regularization courtesy Glenn Millhauser). The following list gives an overview of the changes:

• automatic phase correction including offset of the imaginary part
• irregular cutoff behavior and occasional failures in data loading fixed
• user-defined models for three-spin systems with equilateral geometry
• improved automatic determination of the corner of the L curve
• Mac executables of the Tikhonov regularization modules

There is one Microsoft Windows Vista® bug that cannot be fixed. Sometimes the Tikhonov regularization executable crashes for unknown reasons. In this case the same computation has to be performed once again. There is usually no crash on the second call. The bug seems to be fixed in Windows 7.

5 Changes with respect to DeerAnalysis2006

DeerAnalysis2008 is a major upgrade of DeerAnalysis2006, which fixes a few bugs and introduces a number of improvements in data analysis and interpretation.
The following list gives an overview of the most important changes:

• no limitations on minimum time increment and maximum number of data points in experimental data
• resolution of 1 ns (by interpolation) for zero-time setting
• display option for error estimates with Tikhonov regularization, including error estimates obtained by systematic analysis with respect to uncertainties in input data
• estimates for distance constraints are provided
• data reduction with use of the information from all data points for enhancing computation speed
• enhanced display update during L curve computation and progress bar window with estimate of computation time remaining; long computations can be interrupted
• possibility for semi-manual background correction (input of modulation depth and decay time constant)
• new model-based fit for random-coil conformations (unfolded proteins)
• user-defined models can have up to eight fit parameters
• test data sets for training are provided, and theoretical distance distributions are displayed for such data sets
• Linux executables of the Tikhonov regularization modules

Note that the error estimates for Tikhonov regularization that can be displayed without using the new validation tool are not the true errors of the distance distribution. They are nevertheless included, as they provide some hint of problems in data analysis. To obtain better estimates of the error, please use the new validation tool.

To compute dipolar spectra with higher resolution, you may edit the file zero_filling.dat. This file is an ASCII file that contains a single integer number n. Data are zero-filled to n times the length of the original time-domain data before Fourier transformation. The default value is 4. Note that zero-filling leads to a purely cosmetic sinc interpolation of data points in the spectrum. No new information is obtained, but the data look better. Purists may want to use a value of 1.
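The statement that zero-filling adds no new information can be checked numerically. The following is a minimal Python sketch, not code from DeerAnalysis: the function names, the naive DFT, and the synthetic trace are all assumptions made for illustration. It shows that every n-th point of the zero-filled spectrum coincides with a point of the spectrum computed without zero-filling, so the extra points are only interpolated.

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform, O(N^2); adequate for a short demo trace."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def zero_fill(x, factor):
    """Append zeros so that the trace is `factor` times its original length."""
    return list(x) + [0.0] * ((factor - 1) * len(x))

# Hypothetical trace: a damped cosine standing in for a dipolar modulation.
trace = [math.exp(-0.1 * j) * math.cos(0.8 * j) for j in range(16)]
factor = 4  # default value of n quoted in the text

plain = dft(trace)
filled = dft(zero_fill(trace, factor))

# Largest difference between the plain spectrum and every factor-th
# point of the zero-filled spectrum.
max_diff = max(abs(filled[factor * k] - plain[k]) for k in range(len(plain)))
print(len(plain), len(filled), max_diff)
```

With a factor of 1 the two spectra are identical, which corresponds to the setting suggested for purists.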
6 Changes with respect to DeerAnalysis2004

The user interface of the new version DeerAnalysis2006 was written from scratch, while many of the computational subroutines are well-tested subroutines of DeerAnalysis2004. The choice of analysis techniques was narrowed down to the ones that we and others found most reliable. An optimum regularization parameter for Tikhonov regularization is now predicted from the L curve, as suggested by Freed's group [15], while the stabilizing constraint of a purely positive distance distribution is maintained as in our previous approach. Recent results of Tsvetkov's group on the effects of finite microwave pulse amplitude [16, 17] were also taken into account. An optional excitation bandwidth correction is now included that will be described in detail elsewhere. The following list gives an overview of the most important changes:

• two data sets can be directly compared on screen (dual display)
• reasonably fast excitation bandwidth correction
• easy work with experimental background functions
• significance check for minor peaks in the distance distribution
• computation and display of L curves in Tikhonov regularization
• the total number n of coupled spins is now displayed, not n − 1
• results are no longer saved automatically
• user-defined models with up to six parameters can be fitted

7 Installation

DeerAnalysis2013 requires Matlab 7.6 (R2008a) or higher and was tested in Windows, Linux, and Mac environments (Mac test by Glenn Millhauser). Most tests were performed under Matlab 2008a on Windows. The design of the user interface may not be optimal for Linux or Mac. The Windows package can be installed by unpacking the ZIP file DeerAnalysis2013.zip into a directory of your choice. Linux installation is the same. Note that with Linux or Mac versions, you have to assign execution privilege to the binary files for Tikhonov regularization.
For the Linux version these files are ftikreg_r_old.out and ftikreg_r_new.out, while for the Mac version the files are ftikreg_r_old.maci and ftikreg_r_new.out. If you have an earlier Matlab version, please use DeerAnalysis2006. (Unfortunately, once a user interface has been edited in Matlab 2008a, there is no way to go back. The MathWorks did not warn about this problem in their release information for Matlab 2008.)

Please note that the DeerAnalysis directory must not be write protected, as the program uses this directory for file exchange with the external program FTIKREG. (Tip: It may be useful to add the path to DeerAnalysis to your default Matlab path in the Matlab startup script startup.m by addpath('c:/Program Files/DeerAnalysis2013'), or to include this directory in the Matlab path using Set path... in the Matlab File menu.)

8 The user interface

To run the program, start Matlab, change to the directory where it is installed (e.g., by cd('c:/Program Files/DeerAnalysis2013')), and call it by typing DeerAnalysis at the Matlab prompt. The graphical user interface shown in Fig. 1 opens, initially without a loaded data set.

Figure 1: Graphical user interface of DeerAnalysis 2013.

The user interface has been programmed with the following ideas in mind:

• no unnecessary complexity
• no hidden functionality (no menus)
• default behavior should give reasonable results for most data
• experienced users can easily override default behavior.

Default behavior is to read Elexsys (Xepr) data files, assume that the last three quarters of the data can be used for the background fit, adjust the phase automatically, and correct for exponential background decay (homogeneous spatial distribution of nanoobjects).
Initially, no points are cut off at the end of the data set, the distance distribution is obtained by approximate Pake transformation (APT) [6], and the mean distance $\langle r \rangle$ and standard deviation $\sigma_r$ are obtained by moment analysis within the range from 1.5 to 8 nm, using distance-domain smoothing with a filter width of 0.2 nm. A suggestion for cutting off noisy or distorted data points at the end of the data set is made. All this happens automatically once you load a data set via the Load button.

Different models for the background can be selected in the Background models panel (center of bottom half of Fig. 1), as described in more detail below. Similarly, Tikhonov regularization or fitting of the data by a model distance distribution can be selected in the Distance analysis panel. As these approaches are time-consuming, fitting is not started automatically but only after clicking on the corresponding Fit button.

Adjustable parameters can be edited directly (the most common errors, such as non-digit input or values out of range, are corrected automatically) or incremented or decremented by the + and - buttons, respectively. Several parameters can be adjusted or reset by automatic procedures (described below). This is done with the ! buttons.

Display in each of the plot windows can be toggled. In the plot below the Original data panel, display of the imaginary part (magenta trace) can be switched on or off by clicking on the imaginary checkbox. If two data sets have been loaded, the real part of the previous set (data set B) can be displayed as a blue trace by clicking on the dual display checkbox. This automatically suppresses display of the imaginary part of the active data set (data set A) and changes the imaginary checkbox into a mod. depth scaling checkbox. Dual display and modulation depth scaling also affect the other two plots, where results corresponding to data set B are also displayed as blue traces.
The plot below the Dipolar evolution panel can be toggled between time-domain display and display of the dipolar spectrum by checking the corresponding radiobuttons. Finally, the plot below the Distance distribution panel can alternate between display of the distance distribution and the L curve, after a Tikhonov regularization with L curve computation has been performed.

There is no Help function, but the controls are provided with short explanations that show up when you move the cursor over them.

Figure 2: Pulse sequence and positions of the observer and pump frequency with respect to the nitroxide spectrum and to the microwave mode for the four-pulse DEER experiment. (Panels: pulse sequence with observer and pump channels; nitroxide spectrum; microwave mode. The pump and observer frequencies are offset by 65-70 MHz.)

9 Pre-processing

9.1 Loading data

Data input and output is initiated by buttons in the Data sets panel. By default the program expects Bruker Elexsys data (binary format). It recognizes automatically whether the data are complex (quadrature detection) or real (single-channel detection, discouraged). If the data set is one-dimensional, it is interpreted as output of a (classical) constant-time DEER experiment [3], see also Fig. 2. If the data set is two-dimensional with exactly two traces, it is interpreted as a variable-time DEER experiment [4], with the first trace being the reference trace and the second trace being the recoupled trace. For any other size of experimental data, program response is undefined. If you unintentionally load a data set of some other experiment, it is advisable to close the program and restart it.

Mainly as a support for ESP 380 machines, the program has the capability to read data in WIN-EPR binary format (select by radio button in the Formats column of the Data sets panel). As the binary number format of the ESP 380 is somewhat obscure, this mode requires that the data are first read into WIN-EPR on a PC and saved again from WIN-EPR.
This mode is less well tested than the Elexsys mode and completely untested for two-dimensional data.

Alternatively, you can convert ESP 380 data to ASCII data (also possible in WIN-EPR with the command sequence 1D processing/Parameters/List data file.../Save). From an ASCII file, only one-dimensional data can be read. If there are any header lines before the numerical data, they must start with a percent character (%). By default, the program expects the time axis (in nanoseconds) in the first column, the real part of the data in the second column, and the imaginary part (if present) in the third column. These assignments can be adapted in the edit fields below the ASCII radio button. For ASCII data exported from WIN-EPR, the proper settings are 2, 3, and 4 instead of 1, 2, and 3. The first six lines (header lines) have to be deleted or commented out by a % character. The program automatically recognizes if there is no imaginary part.

After successfully loading data, the Status panel shows a short characterization of the data set (const-time/variable-time DEER, complex/real, number of data points). The filename is included in the title of the DeerAnalysis main window and is also shown in line A: of the Data sets panel.

From version 2013 on, data saved by DeerAnalysis can be reloaded when the DeerAnalysis radiobutton in the Data sets panel is activated. By default this activates the Locked checkbox in the same panel. Any of the DeerAnalysis output files (e.g. ..._bckg.dat) can be selected; data from all files are loaded. The same feature allows for loading output data from the DEER window of MMM, provided that a comparison of experimental data with a simulated distance distribution was performed in MMM. In Locked mode any automatic processing is switched off, and data and results are displayed in the form found in the saved files. The Locked checkbox can be deactivated to allow for data processing.
Note, however, that the full information from the primary data set is not available after such reloading. Only the real part of the primary data, starting at the zero time determined by DeerAnalysis or by the user, is saved and can be reloaded. In other words, phase correction and zero time determination should not be changed in such data sets.

9.2 Determining zero time

The time origin of the dipolar evolution function corresponds to $\tau_1 = \tau_2$ (see Fig. 2). Because pulse lengths are finite, the relation between this equation and actual delays in the pulse sequence may not be trivial. We therefore suggest determination of the time origin (zero time) from experimental data with a good signal-to-noise ratio (SNR) for the pulse lengths and $\tau_1$ delay that you actually use. To obtain a precise value, a standard sample with a short distance should be used. If you later measure on the same spectrometer with the same pulse lengths and $\tau_1$, you can use the same value. Knowing this value is important for data with poor SNR, where automatic determination is likely to fail.

Automatic determination of zero time $t_0$ is based on the expectation that the real part of the signal should be symmetric about the time origin. For the proper choice of $t_0$, the first moment of the signal in a range symmetric about $t_0$ should thus be zero. In a first step, the program approximates zero time by the time $t_{\max}$ at which the real part is maximum. Then the first moment is determined in a window $t_x \pm t_{\max}/2$, where $t_x$ is shifted through the whole data set. The optimum value of $t_0$ is the time $t_x$ where the first moment is minimum. This procedure is performed with a time resolution of 1 ns, obtained by interpolation of the experimental data. Such enhanced time resolution improves results for very short distances, where it may be important that the true zero time may fall in between two experimental data points. Zero time is influenced by pulse lengths [16, 17].
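The first-moment search can be sketched in a few lines of Python. This is an illustrative reading of the procedure, not the DeerAnalysis code, and it rests on several assumptions: the time axis is given in integer nanoseconds and starts at 0, interpolation to 1 ns is linear, "minimum" is taken as the smallest absolute value of the first moment, and the synthetic trace and all function names are hypothetical.

```python
import math

def interpolate_to_1ns(times_ns, values):
    """Linearly interpolate a trace sampled at integer ns times onto a 1 ns grid."""
    fine = []
    for i in range(len(times_ns) - 1):
        t_a, t_b = times_ns[i], times_ns[i + 1]
        v_a, v_b = values[i], values[i + 1]
        for t in range(t_a, t_b):
            fine.append(v_a + (v_b - v_a) * (t - t_a) / (t_b - t_a))
    fine.append(values[-1])
    return fine

def estimate_zero_time(times_ns, real_part):
    """Return the window center (in ns) with the smallest absolute first moment."""
    fine = interpolate_to_1ns(times_ns, real_part)
    # Step 1: rough zero time from the maximum of the real part.
    i_max = max(range(len(fine)), key=fine.__getitem__)
    half = max(i_max // 2, 1)  # half-width t_max / 2 of the moment window
    # Step 2: shift the symmetric window through the data set.
    best_center, best_moment = None, float("inf")
    for center in range(half, len(fine) - half):
        moment = sum(j * fine[center + j] for j in range(-half, half + 1))
        if abs(moment) < best_moment:
            best_center, best_moment = center, abs(moment)
    return times_ns[0] + best_center

# Hypothetical trace with 8 ns steps, symmetric about a true zero time of 100 ns,
# which falls between two sampled points (96 ns and 104 ns).
times = list(range(0, 2001, 8))
signal = [math.exp(-abs(t - 100) / 1500.0)
          * (0.6 + 0.4 * math.exp(-((t - 100) / 300.0) ** 2))
          for t in times]
t0 = estimate_zero_time(times, signal)
print(t0)
```

In this noise-free example the symmetry criterion locates a zero time between two sampled points, which is the situation the 1 ns interpolation is meant to handle.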
This algorithm should work well for good SNR and distances up to ≈ 5 nm. If it fails under such conditions, $\tau_1$ is too short (the expected symmetry of the data is spoiled by interference between adjacent microwave pulses). The algorithm may fail for very long distances, where data close to the maximum are rather flat. For such long distances, small mis-settings have only a minor influence on the distance distribution.

You may correct the automatically determined zero time by the + and - buttons right and left of the value or by direct input of a new value in the edit field (fit by eye). A wrong choice may be easier to detect when you switch the Dipolar evolution plot to frequency domain (spectrum).

9.3 Phase correction

In a properly adjusted DEER experiment, the signal should be entirely in the real part of the data set. If receiver offsets are canceled by [(+x)-(-x)] phase cycling of the first pulse, as we strongly suggest, the imaginary part is zero. It is therefore tempting to acquire and process only the real part. We discourage this. For very weak signals, as you occasionally encounter with membrane proteins, it is difficult to adjust signal phase exactly during setup. Consequently, part of the signal will be in the imaginary part. Furthermore, depending on the stability of your spectrometer, there may be small phase drifts during the experiment. It is better to correct for these drifts than to ignore them. Finally, unexpected artifact signals are likely to manifest in the imaginary part (see Fig. 3). If the imaginary part after phase correction strongly deviates from zero at early times, it is advisable to acquire data with a longer $\tau_1$ value (see Fig. 2).

Automatic phase correction can be based on the expectation that the imaginary part should be zero at sufficiently long times.
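One way to turn this expectation into a criterion is to choose the phase that minimizes the root mean square of the imaginary part over the late portion of the trace. The Python sketch below is an illustration under stated assumptions, not the routine used by the program: it uses the last three quarters of the trace, a closed-form minimum instead of a numerical search, and resolves the remaining 180° ambiguity by requiring a positive real part. The function name and the synthetic data are hypothetical.

```python
import cmath
import math

def auto_phase(trace, start_fraction=0.25):
    """Phase (degrees) that minimizes the RMS imaginary part of the late data.

    With V = a + ib rotated by phi, the imaginary part is a*sin(phi) + b*cos(phi).
    Its sum of squares is const + C*cos(2*phi) + S*sin(2*phi), with
    C = (S_bb - S_aa)/2 and S = S_ab, which is minimal at 2*phi = atan2(S, C) + pi.
    """
    tail = trace[int(len(trace) * start_fraction):]
    s_aa = sum(v.real * v.real for v in tail)
    s_bb = sum(v.imag * v.imag for v in tail)
    s_ab = sum(v.real * v.imag for v in tail)
    phi = 0.5 * (math.atan2(s_ab, 0.5 * (s_bb - s_aa)) + math.pi)
    # phi and phi + 180 deg give the same RMS; keep the one with positive real part.
    if sum((v * cmath.exp(1j * phi)).real for v in tail) < 0:
        phi += math.pi
    while phi > math.pi:
        phi -= 2.0 * math.pi
    return math.degrees(phi)

# Hypothetical real-valued decay, detected with a phase error of 25 degrees.
ideal = [0.7 + 0.3 * math.exp(-i / 20.0) for i in range(200)]
measured = [s * cmath.exp(-1j * math.radians(25.0)) for s in ideal]

phase = auto_phase(measured)
corrected = [v * cmath.exp(1j * math.radians(phase)) for v in measured]
residual_imag = max(abs(v.imag) for v in corrected)
print(phase, residual_imag)
```

A constant offset in the imaginary part, as discussed later in this section, is not handled by this sketch.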
By default, the program determines the corresponding phase correction directly after loading complex data by minimizing the root mean square deviation of the imaginary part for the last three quarters of the data (part between the blue and orange cursors). The phase correction in degrees is displayed in the Original data panel. You may correct the phase manually by using the + and - buttons right and left of the value or by direct input of a new value in the edit field. Phase is automatically restricted to the range (−180°, +180°). If you did not phase cycle and do have a receiver offset, you may aim to flatten the imaginary part and put all modulation into the real part. Note, however, that in this case you are likely to have a receiver offset in the real part, too. This will be detrimental to data analysis.

Figure 3: Imaginary-part artifact at early times (see red arrow) due to mw pulse interference. Interpulse delay $\tau_1$ should be long enough for the artifact to have almost completely decayed at t = 0 (green vertical line). (Example data set ..\examples\CT_deer_broad; horizontal axis t in µs.)

Automatic phase correction can be reactivated by the ! button left of the value. It will always relate to the part of the data between the blue and orange cursors. If you move any of these cursors, the result may differ from the result that you got directly after loading. Automatic phase correction after loading can be deactivated by unselecting the check box Autophase in the Data sets panel.

In some measurements of weak samples with some spectrometers, we found that automatic phase correction did not work and that, after a manual phase correction that brought all modulation into the real part, a flat non-zero imaginary part remained. Apparently the problem comes from overload of the video amplifier during the pulse. Small residual constant offsets of the imaginary part may also result if the phase drifted significantly during the measurement.
Such data are less reliable than data with zero offset in the imaginary part, in particular with respect to background correction, as there may also be an offset in the real part. However, if the hardware problem cannot be fixed, it may still be warranted to process the data and interpret them (with caution), in particular because for weak samples the background may be flat, so that an offset in the real part does not pose a problem.

Since DeerAnalysis2009, the program automatically fits an offset in the imaginary part if the checkbox Offset corr. (below the Load button) is activated. The fitting happens on file load (if the checkbox Reset is also activated) and on clicking the Phase(°) ! button in the Original data panel. The fit criterion is flatness of the imaginary part in the last 87.5% of the data points. The first 12.5% are excluded to allow for early-time artifacts as shown in Fig. 3. Alternatively, you can use manual phase correction to flatten the imaginary part by eye.

The offset of the imaginary part is displayed as a magenta dotted line. If you obtain large offsets, or visible offsets even for strong samples, please fix your spectrometer.

9.4 Cutting data

For several reasons, you may want to exclude points at the end of your data set from analysis. First, some people prefer to acquire data up to delays t where the pump pulse starts to interfere with the last observer pulse or even overlaps with it. In this case, the last data points are spoiled. Second, if at maximum t the signal has decayed to a very small value (say 0.1 times maximum intensity), the dipolar evolution function after background correction will be rather noisy, as correction involves division by the background decay. Third, the noise level in variable-time DEER data increases with t even before background correction. It may be wise to cut the data at a time where noise is still tolerable.
By default no data points are cut off at the end, but a suggestion for cutoff is displayed as an orange vertical cursor in the Dipolar evolution plot (see Fig. 4). This suggestion is derived from the difference D between the experimental dipolar evolution function and its fit by the APT result. The mean square deviation $M_k$ of eleven consecutive points $D_{k-5} \ldots D_{k+5}$ around the kth data point is computed for all indices k. The minimum of M is a measure for the noise level. An acceptable noise level of $6 \min(M)$ is assumed. The program then searches for a range of consecutive points at the end of the data set that all fulfill the condition $M_k > 6 \min(M)$. If such a range of points exists, the program suggests cutting it off. Otherwise the orange cutoff cursor is set to the end (right border) of the trace.

The suggestion can be accepted by clicking on the ! button of the Cutoff controls in the Original data panel. Note that this may in turn improve the fit, thus leading to a smaller value of $\min(M)$ and a new cutoff suggestion. Therefore, it is advisable to click on the ! button several times to iteratively approach the optimum cutoff. Furthermore, the cutoff suggestion depends on correct settings of other parameters (zero time, phase). For instance, for the variable-time DEER data set shown in Fig. 4 the zero time must be zero, while the program automatically determines 96 ns. If this is not corrected, no good fit is obtained and unnecessarily many data points are cut off. Note also that this data set with relatively poor SNR was intentionally selected for explanation of data cutoff. For many data sets, no cutoff at all may be required, and the program immediately sets the cutoff cursor to the right border.

Generally, cutting off a significant amount of data will suppress noise but will also cause a suppression of long distances by background correction. Proper background correction may become more difficult.
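The cutoff criterion described in this section can be sketched as follows. This is a minimal Python illustration, not the program's implementation; the function name, the handling of the first and last five points (shortened windows), and the synthetic residual are assumptions.

```python
def suggest_cutoff(residual, half_window=5, factor=6.0):
    """Suggest how many points to keep, given the residual D between data and fit.

    M_k is the mean square of the residual in an eleven-point window around k.
    Points at the end of the trace with M_k > factor * min(M) are cut off.
    Assumption: near the edges the window is simply shortened.
    """
    n = len(residual)
    msd = []
    for k in range(n):
        lo = max(0, k - half_window)
        hi = min(n, k + half_window + 1)
        window = residual[lo:hi]
        msd.append(sum(d * d for d in window) / len(window))
    threshold = factor * min(msd)
    keep = n
    # Walk back from the end while the noise criterion is violated.
    while keep > 0 and msd[keep - 1] > threshold:
        keep -= 1
    return keep  # equals n if no trailing range exceeds the threshold

# Hypothetical residual: 100 quiet points followed by 30 points with 5x larger noise.
quiet = [0.01 if i % 2 == 0 else -0.01 for i in range(100)]
noisy = [0.05 if i % 2 == 0 else -0.05 for i in range(30)]
keep_points = suggest_cutoff(quiet + noisy)
keep_all = suggest_cutoff(quiet)
print(keep_points, keep_all)
```

Because the residual changes when the fit is repeated on the truncated data, the suggestion would have to be recomputed after each cut, which is the reason for clicking the ! button several times.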
9.5 Background correction

In most cases, EPR distance measurements are performed to elucidate the structure of a nanoscopic object. Only distances within this object are of interest. The contribution of distances to neighboring objects should be suppressed. If you think about a biradical or a doubly labelled protein molecule, you want to measure the intramolecular distance and suppress contributions from intermolecular distances. Such a separation of the signal $V(t) = \{1 - [1 - \Delta D(t)]\} B(t)$ into a dipolar evolution function $D(t)$ for the nanoobject itself and a background decay $B(t)$ due to neighboring objects requires a criterion for distinguishing the two contributions. Furthermore, the functional form of the background decay has to be known. This functional form is related to the spatial distribution of the nanoobjects.

Figure 4: Cutting off the noisy part at the end of variable-time DEER data. (a) Dipolar evolution plot for the whole data set. The orange cursor shows the suggested cutoff time. (b) Dipolar evolution function (black) and fit (red) by a distance distribution obtained with APT after cutting the data at the suggested time. (Example data set ..\examples\VT_deer_8nm; horizontal axes t in µs.)

A separation can only be successful if distances within the object are typically shorter than distances to neighboring objects. The wanted contribution is then confined to the earlier part of the time-domain data, while later parts are dominated by the background decay. The decay can only be fitted properly if the maximum time t in the pulse sequence (Fig. 2) is significantly longer than the time at which the dipolar modulation has decayed. A more detailed discussion can be found in Ref. [5].

Separation into the two contributions is simple and reliable if the distance distribution is dominated by distances shorter than 4 nm.
In protein samples, it becomes challenging for distances between 4 and 6 nm, and nearly impossible for distances longer than 6 nm, unless protons around the spin labels can be strongly diluted by deuteration [4]. Note that one can still get a quite reliable estimate of a distance of closest approach if separation fails. However, the width and shape of the distance distribution should not be discussed in such a situation.

In simple cases (short distances and homogeneous distribution of the nanoobjects in three dimensions), separation depends only weakly on the choice of parameters. Default behavior of the program should then be sufficiently good. By default, an exponential background decay corresponding to a homogeneous three-dimensional distribution is fitted to the last three quarters of the data. The fit parameter is the decay time constant, which is proportional to the concentration of nanoobjects. With proper calibration, such fits can be used to determine local concentrations (see Section 9.6).

Generally, the background is shown as a red line in the Original data plot. A continuous line is plotted in the range where the background was fitted (between the blue and orange cursors); a dotted line is plotted where the fit was extrapolated. The r.m.s. value of the background fit is displayed in the Background model panel.

Figure 5: Manifestation of different background fits in the dipolar spectrum (example data set CT_DEER_5nm). (a) Part of the background is attributed to the biradical. (b) Good separation of intra- and intermolecular contributions, as obtained with automatic correction (! button). (c) Part of the biradical contribution is attributed to the background. (Horizontal axes: f in MHz, from -1 to 1.)

For distances of ≈ 4 nm and longer, the choice of the time range for background fitting may decide whether you obtain artifacts in the distance distribution at long distances.
Unlike the other problems in determining a distance distribution, this problem is most severe for narrow distributions of distances. In this case the modulation decays more slowly and thus interferes more strongly with the background fit.

Our automatic determination of the optimum fit range is based on the assumption that the longest detectable distance exceeds the largest distance within the nanoobject. If this condition is met, the distance distribution after correct background correction is zero at the maximum detectable distance. This can be checked by approximate Pake transformation (APT, see below). APT is sufficiently fast to be applied at all possible choices of the starting time for the background fit. For any selected background model, this search for the optimum starting time can be initiated by clicking on the blue ! button in the Original data panel. Depending on the length of your data set and the speed of your computer, this optimization can take up to a few minutes.

The starting time for background fitting can also be adjusted manually with the blue + and - buttons or by direct input into the edit field. The consequences can best be judged when switching the bottom left plot to frequency domain. For a narrow distance distribution, the black trace should look like a Pake pattern. Deviations are best seen at zero frequency. There should be neither a positive spike nor an obvious hole in the center of the Pake pattern (see Fig. 5).

Background correction can be switched off completely by selecting the No correction radiobutton in the Background model panel. In this mode the input data are interpreted as a dipolar evolution function that is already separated from the background. The mode is intended for compatibility with external pre-processing programs, for polynomial fitting of single-label data to derive an experimental background function (see below), or for fitting by a user model that explicitly contains the background contribution.
User models consisting of a single Gaussian peak with 3D homogeneous background (Gaussian hom) or of two Gaussian peaks with 3D homogeneous background (Two Gaussians hom) are already included in DeerAnalysis2006. However, we strongly discourage fitting background and distance distribution simultaneously, as such fits are very likely to end up in local minima of the error hypersurface. Whenever a separation of the background contribution from the contribution of the nanoobject can be performed with some confidence, it should be done before analysis of the distance distribution.

For long distances, the intramolecular contribution may be significant throughout the whole time range. In such cases, any fit of a background function to the data is biased. If independent information on expected modulation depth and concentration (density) is available, e.g. from other double mutants of the same protein, it may be better to input these estimates directly instead of fitting. This can be done by selecting the option Form factor based fit. This option is displayed red when the program estimates from the preliminary distance distribution that fitting would be more reliable and is displayed green if the program expects problems with background fitting.

It is also possible to systematically vary modulation depth and density and search for the values that lead to the best fit of the form factor. The rationale behind this algorithm is that the background contribution cannot be fitted by a distribution of distances with an upper limit. Incomplete background correction or overcorrection will thus lead to a deterioration of the fit of the form factor. This algorithm works well for data with very high signal-to-noise ratio, but is easily misled by noise. It should thus be used with caution. The search for optimum modulation depth and density can be started by selecting the option Form factor based fit and then clicking on the ! button in the Form factor based fit subpanel.
If there are uncertainties about the parameters of background correction, the result of Tikhonov regularization should be checked by using the Validation module.

In the following we briefly discuss the possible choices for the spatial distribution of nanoobjects. They can be selected by checking the corresponding radiobutton in the Background model panel.

9.5.1 Homogeneous

This model is strongly suggested for all cases where you do not have experimental background functions from singly labelled molecules. The general background function in this model is

$$B(t) = \exp\left(-k\, t^{d/3}\right) \qquad (2)$$

where k quantifies the density of the spins and d is the dimensionality of the homogeneous distribution. Unless there is a confinement on length scales below 10 nm, the distribution is homogeneous in d = 3 dimensions. This case applies to most solutions. Membrane proteins in a liposome may be confined to d = 2 dimensions. If possible, such confinement should be established by control measurements on singly labelled proteins, for which d = 2 is expected to give a better fit than d = 3. For labels attached to a stretched polymer chain, d = 1 may be appropriate. Note also that a choice of d = 6 corresponds to a Gaussian background decay, as has been observed with the single-frequency SIFTER experiment [9]. The dimension is not necessarily an integer. If experimental data of a singly labelled sample can be fitted well with a fractal dimension, it is advisable to use the same fractal dimension for background correction of the corresponding doubly labelled sample.

When the Fit dimensionality checkbox is selected, both k and d are fitted. This mode is suggested only for determining the fractal dimension of purely homogeneous (singly labelled) samples. In this case the Bckg. control in the Original data panel should be set to zero (green and blue cursors coincide), as the early decay of the data is most sensitive to the parameter d.
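The homogeneous background function of eqn (2) is simple enough to evaluate by hand. The following Python sketch is only an illustration and is not part of the MATLAB program: the function name homogeneous_background and the use of |t| for negative times are assumptions made here, and k is in whatever units make k·t^(d/3) dimensionless.

```python
import math


def homogeneous_background(t, k, d=3.0):
    """Evaluate B(t) = exp(-k * |t|**(d/3)) for every time in t (eqn 2).

    t : iterable of times
    k : density parameter
    d : dimensionality (3 for most solutions, 2 for membranes,
        6 gives a Gaussian decay); need not be an integer.
    Taking |t| for negative times is an assumption of this sketch.
    """
    return [math.exp(-k * abs(ti) ** (d / 3.0)) for ti in t]


# Hypothetical example values, not taken from any data set.
t_axis = [0.0, 0.5, 1.0, 2.0]
b3 = homogeneous_background(t_axis, k=0.2, d=3)  # simple exponential
b6 = homogeneous_background(t_axis, k=0.2, d=6)  # Gaussian decay
```

For d = 3 the exponent is linear in t, and for d = 6 it is quadratic, which is why d = 6 reproduces a Gaussian background decay.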
9.5.2 Polynomial

Short distances are underrepresented in the intermolecular distance distribution if the spin labels are attached to nanoobjects that cannot penetrate each other. As a result, the intermolecular contribution decays more slowly at early times than would be expected for a homogeneous distribution. If singly labelled objects are available, the intermolecular part can be measured separately and an experimental background function can be derived. Directly using the noisy experimental data set of the singly labelled sample would introduce significant statistical errors. It is therefore prudent to use a smooth fit function for that purpose. Almost any intermolecular decay can be reproduced by fitting a polynomial to the logarithm of the original data. DeerAnalysis2006 allows for polynomials with an order of up to 15, but note that the lowest order should be selected that still gives a good fit (flat trace in the Dipolar evolution plot). Polynomial fits are mainly implemented for deriving and afterwards saving experimental background functions from singly labelled samples, not for direct background correction.

9.5.3 Experimental

Once experimental background functions have been derived from singly labelled samples, they can be used for correcting the background in corresponding doubly labelled samples. In this mode, the relative magnitudes of the polynomial coefficients are kept fixed. The background model is given by

$$B(t) = \exp\left(-k \sum_{n=0}^{o} c_n t^n\right) \qquad (3)$$

where k is the density (concentration) parameter, o the order of the polynomial, and the c_n are the polynomial coefficients determined previously on the singly labelled samples. The only fit parameter is k. In principle, background data should be individually measured for both label positions in a doubly labelled sample, as the suppression of short distances depends on how deep the label is buried in the nanoobject.
The weighted sum of both background functions is a better approximation for the actual background in the doubly labelled sample than each individual background function. Several background polynomials can be added using the Add button in the Background model panel. A weighting factor can be specified in a dialog box that opens after clicking on this button. Note that the different labeling efficiencies at the two positions are already accounted for with weighting factor 1.0 if both singly labelled samples were measured with the same protein concentration.

9.6 Determining local concentrations

The parameters of the background fit are related to the number of coupled spins within the nanoobject (modulation depth after background correction) and to the density of nanoobjects (parameter k). For calculation of the number of spins and of absolute densities, the modulation depth parameter λ has to be known. It depends strongly on the excitation position, length, and flip angle of the pump pulse and weakly on line broadening in the nitroxide spectrum and the shape of the resonator mode. Reliable quantification therefore requires a calibration with known samples and proper adjustment of the flip angle of the pump pulse (see Section 15). The calibration should be repeated if the resonator or the length of the pump pulse is changed. Protonated and deuterated nitroxide spin labels also require separate calibrations. Determination of the number of coupled spins is more reliable when based on Tikhonov regularization or a fit of the data by a model distribution and is therefore discussed later on (Section 12.3).

For a 3D homogeneous distribution of objects, the density is proportional to the local concentration. The term local refers to the length scale of the DEER experiment, which extends to approximately 10 to 20 nm for the background.
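Because the fitted density is proportional to the local concentration, a single reference sample of known concentration fixes the proportionality factor. The following Python sketch shows only this arithmetic. The function names and the numbers in the example are hypothetical, and the sketch is not the program's own implementation.

```python
def calibration_factor(known_concentration, fitted_density):
    """Concentration per unit of fitted density k, from a reference sample."""
    if fitted_density <= 0:
        raise ValueError("fitted density must be positive")
    return known_concentration / fitted_density


def local_concentration(fitted_density, factor):
    """Convert a fitted density k into a concentration (same units as the reference)."""
    return factor * fitted_density


# Hypothetical numbers: a 2.5 mM reference that gave a fitted density of 0.05.
factor = calibration_factor(2.5, 0.05)
unknown_mM = local_concentration(0.02, factor)  # about 1 mM
```

The factor is only valid for the same resonator, pump pulse settings, and background model that were used for the reference measurement.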
Measurements of local concentrations can be calibrated with a solution of an appropriate spin label (e.g., protonated or deuterated TEMPOL) in toluene. An example data set from our own calibration (CT DEER tempol 2500uM) is provided. This data set was acquired with a 2 mM TEMPOL solution in toluene, which corresponds to a concentration of 2.5 mM at 80 K, as toluene shrinks to approximately 80% of its room temperature volume when freeze-quenched in liquid nitrogen.

To calibrate 3D background fitting for determination of concentrations, select Homogeneous as the background model, set dimensions to 3, and load a data set for a sample with known concentration. Adjust zero time and phase, if necessary. Now input the concentration (in the units you prefer) into the edit field Density. The color of the density value then changes to green. When you now load other experimental data sets that have been measured with the same resonator and experimental settings and use the same background model, you can directly read off concentrations from the edit field Density. Note that the program loses the calibration on restart.

9.7 Long-pass filtering

The major artifact contribution to DEER time-domain signals is usually nuclear modulation due to matrix protons. At X-band frequencies, such proton modulation corresponds to a distance of approximately 1.5 nm. By restricting the distance range for analysis to (1.75, 8) nm, contributions by nuclear modulation can be suppressed. However, as computation of the distance distribution is an ill-posed problem, an out-of-range artifact may still influence the result within the range of interest. Very strong proton modulations, as they are sometimes encountered for membrane proteins in liposomes or detergent micelles, should thus be eliminated by filtering. This can be achieved by completely eliminating contributions above a certain maximum frequency, which roughly corresponds to suppressing distances below a certain minimum distance.
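As a rough orientation for this correspondence between frequency and distance, one can use the dipolar frequency ν_dd = ω_dd/2π with the constant 52.04 MHz nm³ from eqn (6). The worked numbers below assume that the cut-off is placed at ν_dd(r) itself. Whether the program uses exactly this mapping is not stated here, so the values are only a guide.

```latex
\nu_{dd}(r) = \frac{52.04\ \mathrm{MHz\,nm^{3}}}{r^{3}}, \qquad
\nu_{dd}(1.5\ \mathrm{nm}) \approx 15.4\ \mathrm{MHz}, \quad
\nu_{dd}(1.6\ \mathrm{nm}) \approx 12.7\ \mathrm{MHz}, \quad
\nu_{dd}(1.75\ \mathrm{nm}) \approx 9.7\ \mathrm{MHz}
```

Under this assumption, a minimum distance of 1.75 nm corresponds to discarding frequencies above roughly 10 MHz, whereas the 1.5 nm proton modulation sits near 15 MHz.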
Complete suppression of all contributions above a cut-off frequency was described in Ref. [4]. For broad distance distributions with contributions both below and above 1.75 nm, complete suppression may introduce an artificial hole at t = 0 into the time-domain data and may thus replace the nuclear modulation artifact with a suppression artifact. To avoid this, filtering in DeerAnalysis2006 is performed by fitting third-order polynomials to the real and imaginary parts of the frequency-domain data between the cut-off frequency and the Nyquist frequency. The frequency-domain data in this range are then replaced by the polynomial. This suppresses the sharp nuclear modulation peak as well as high-frequency noise, while keeping the high-frequency contributions of broad distance distributions intact.

Filtering is enabled by selecting the Long pass filter checkbox in the Dipolar evolution panel. The cut-off distance (lower limit, default 1.6 nm) can be changed in the edit field to the right of this checkbox. When working with broad distributions of short distances, the default value is often a good compromise between residual proton modulation and partial suppression of short distances.

10 Extracting distance distributions

10.1 General remarks

The computation of a distance distribution P(r) from a dipolar evolution function V(t) is an ill-posed problem. For such problems, small variations in the input data (e.g., noise) can cause large variations in the output data. In other words, significantly different distance distributions may correspond to very similar dipolar evolution functions. Data analysis therefore depends strongly on striking a good compromise between improving resolution and decreasing the influence of experimental noise. First and foremost, data should be acquired with the best possible SNR. Reproducing results for a given sample is usually a good idea. Second, ill-posedness must be taken into account in data analysis.
There are several ways of doing this, which all have one feature in common: one tries to find a resolution in distance domain at which a good fit of the experimental data is obtained without introducing strong noise artifacts into the distance distribution.

10.2 Reliability of distance distributions

The reliability of distance distributions depends strongly on the maximum dipolar evolution time. A rule of thumb was derived by reanalyzing data that were simulated from known distance distributions. At a maximum dipolar evolution time tmax = 2 µs the shape of the distance distribution is reliable up to a distance of about 3 nm (reliable distribution limit). The mean distance ⟨r⟩ and width σr are reliable up to a distance of 4 nm (reliable width limit), whereas the mean distance, but not the width, is reliable between 4 and 5 nm (reliable mean distance limit). Beyond 5 nm no reliable mean value can be determined, although the presence of a long distance (distinct from background) can be recognized up to 6 nm (distance recognition limit). All these limits scale with the cube root of tmax.

DeerAnalysis2011 displays reliability ranges by color-coding the background of the distance distribution plot. The pale green range corresponds to a reliable shape of the distribution, the pale yellow range to reliable mean distance and width, the pale orange range to reliable mean distance (but not width), and the red range to recognition of a long-distance contribution that cannot be quantified. The computed distribution is displayed only up to the distance recognition limit. This new feature is intended to caution users against overinterpretation of data. For preparing figures of distance distributions for papers it may be required to suppress these background colors. This can be achieved by deactivating the checkbox Guidance.

Figure 6: Color coding for reliability ranges (distance distribution plotted against r in nm). Pale green: Shape of distance distribution is reliable.
Pale yellow: Mean distance and width are reliable. Pale orange: Mean distance is reliable. Pale red: Long-range distance contributions may be detectable, but cannot be quantified. The example data were cut off at a maximum dipolar evolution time of 10 µs. In this particular case the shape of the distribution can safely be interpreted.

10.3 Approximate Pake Transformation (APT)

A very fast algorithm relies on an approximate integral transformation to dipolar frequency domain, subsequent correction of cross-talk artifacts, and mapping to distance domain (APT) [6]. Ill-posedness is moderated by proper discretization in dipolar frequency domain. If SNR is too small, the distance distribution may still be influenced by strong noise artifacts. A better compromise between reliability of the distribution and resolution can then be achieved by distance-domain smoothing, i.e., by giving up resolution in favor of a smoother distribution. As APT is very fast, it can also be used to generate starting values for fit procedures.

The disadvantage of APT with respect to other techniques is that it cannot incorporate the constraint P(r) > 0 (for all r). This disadvantage is significant, as the constraint strongly stabilizes the solution. For this reason, two other approaches for data analysis are incorporated into DeerAnalysis2006.

10.4 Tikhonov regularization

Other approaches rely on computation of a simulated time-domain signal S(t) from a given distance distribution P(r) by

$$S(t) = K(t, r)\, P(r) \qquad (4)$$

where K is the kernel function. For the DEER experiment with ideal pulses, the kernel function is known analytically,

$$K(t, r) = \int_0^1 \cos\left[\left(3x^2 - 1\right) \omega_{dd}\, t\right] \mathrm{d}x \qquad (5)$$

with

$$\omega_{dd}(r) = \frac{2\pi \cdot 52.04\ \mathrm{MHz\,nm^{3}}}{r^{3}} \qquad (6)$$

The case of non-ideal pulses is discussed in Section 10.6. The most elegant response to ill-posedness is Tikhonov regularization.
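The ideal-pulse kernel of eqns (5) and (6) can be evaluated numerically with a few lines of code. The Python sketch below is an illustration only, not the MATLAB implementation of the program: the function names, the midpoint quadrature, and the number of quadrature points are choices made here. Times are in µs and distances in nm, so that 2π · MHz · µs is a phase in radians.

```python
import math

DIPOLAR_CONST_MHZ_NM3 = 52.04  # constant from eqn (6)


def omega_dd(r_nm):
    """Angular dipolar frequency in rad/us for a distance r in nm (eqn 6)."""
    return 2.0 * math.pi * DIPOLAR_CONST_MHZ_NM3 / r_nm ** 3


def kernel(t_us, r_nm, n=2000):
    """K(t, r) of eqn (5), integrated over x in [0, 1] with the midpoint rule."""
    w = omega_dd(r_nm) * t_us
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += math.cos((3.0 * x * x - 1.0) * w)
    return h * total


def simulate(t_axis, r_axis, p):
    """Discretized eqn (4): S(t_i) = sum_j K(t_i, r_j) * P(r_j)."""
    return [sum(kernel(t, r) * pj for r, pj in zip(r_axis, p))
            for t in t_axis]


# Hypothetical two-point distribution, normalized to a sum of 1.
s = simulate([0.0, 0.1], [3.0, 4.0], [0.5, 0.5])
```

At t = 0 every kernel value is 1, so the simulated signal starts at the sum of P(r). A fixed midpoint rule is slow for long traces and is used here only because it keeps the sketch free of external libraries.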
Tikhonov regularization quantifies the compromise between smoothness (artifact suppression) and resolution of the distance distribution by a regularization parameter α. The optimum distance distribution P(r) is found by minimizing the objective function

$$G_\alpha(P) = \left\| S(t) - D(t) \right\|^2 + \alpha \left\| \frac{\mathrm{d}^2}{\mathrm{d}r^2} P(r) \right\|^2 \qquad (7)$$

for a given α. The first term on the right-hand side of eqn (7) is the mean square deviation between the simulated and experimental dipolar evolution function, while the second term is the regularization-parameter weighted square norm of the second derivative of P(r), which is a measure for the smoothness of P(r). The larger α, the fewer noise artifacts are introduced. However, a larger α also causes a stronger broadening of peaks in the distance distribution. Therefore, small α are required for samples with well defined distances (narrow peaks) and large α for very broad distributions, which otherwise disintegrate into many narrow peaks.

Unfortunately, the correct width of the peaks is often not known in advance. There are different ways of mathematically defining an optimum regularization parameter. The past version DeerAnalysis2004 used the self-consistency criterion [19, 20]. However, determination of an optimum α is itself influenced by noise [5], and the self-consistency criterion appears to be more sensitive to noise distortions than the L curve criterion [15]. The L curve is a plot of log η(α) versus log ρ(α), where

$$\rho(\alpha) = \left\| S(t) - D(t) \right\|_\alpha^2 \qquad (8)$$

quantifies the mean square deviation and

$$\eta(\alpha) = \left\| \frac{\mathrm{d}^2}{\mathrm{d}r^2} P(r) \right\|_\alpha^2 \qquad (9)$$

the smoothness. For well behaved data (good signal-to-noise ratio, relatively narrow peaks in the distribution), this plot is L-shaped as is illustrated in Fig. 7a.
In the range of small regularization parameters α (left of the corner, undersmoothing) the slope is steep and negative, as increasing α and thus the smoothing strongly decreases the norm of the second derivative of P(r) without strongly affecting the mean square deviation. In contrast, right of the corner (oversmoothing) the mean square deviation increases strongly with increasing α, as the simulation is no longer a good fit of the data. At the same time, η decreases only gradually, as noise-related spikes in P(r) are already smoothed out. If the SNR is worse and the peaks in the distance distribution are broader, the corner of the L curve is somewhat less pronounced (Fig. 7b).

Figure 7: Tikhonov L curves (log η versus log ρ). The red data points correspond to the optimum regularization parameter. The insets show the distance distribution obtained with this parameter (plotted against r in nm). a) Data set dOTP 5nm, α = 1. b) Data set CT DEER broad, α = 100.

The computationally most efficient implementation of the L curve criterion does not allow for additionally introducing the constraint P(r) > 0. As this constraint strongly stabilizes the solution, DeerAnalysis2006 relies on the Fortran program FTIKREG, written by J. Weese and distributed by the Materials Research Center Freiburg, which allows this constraint to be imposed. The L curve criterion is then implemented by computing Tikhonov regularization for a pre-defined set of regularization parameters

$$\vec{\alpha} = (0.001,\ 0.01,\ 0.1,\ 1,\ 10,\ 100,\ 1000,\ 10000,\ 100000) \qquad (10)$$

Our experience suggests that this set is sufficient for all cases of practical interest. If required, Tikhonov regularization can also be performed for intermediate values or values that are smaller or larger than the limits of this set.
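The bookkeeping around an L curve can be sketched in a few lines. The Python code below is an illustration under stated assumptions and does not reproduce FTIKREG or the program's own code: it assumes that P(r) is sampled on an equidistant grid with spacing dr, approximates the second derivative by finite differences, and picks as the corner the point closest to the lower left corner of the log-log plot, which is one simple criterion. All function names are hypothetical.

```python
import math

# Pre-defined set of regularization parameters, eqn (10).
ALPHAS = [10.0 ** e for e in range(-3, 6)]


def misfit(sim, data):
    """rho: mean square deviation between simulated and experimental trace."""
    return sum((s - d) ** 2 for s, d in zip(sim, data)) / len(data)


def roughness(p, dr):
    """eta: squared norm of the second finite difference of P(r)."""
    return sum(((p[i - 1] - 2.0 * p[i] + p[i + 1]) / dr ** 2) ** 2
               for i in range(1, len(p) - 1))


def corner_index(rho, eta):
    """Index of the point nearest the lower left corner in log-log space.

    The lower left corner is (min log rho, min log eta) over all points.
    """
    x = [math.log10(v) for v in rho]
    y = [math.log10(v) for v in eta]
    x0, y0 = min(x), min(y)
    dist = [math.hypot(xi - x0, yi - y0) for xi, yi in zip(x, y)]
    return dist.index(min(dist))


# Hypothetical L curve with five points, ordered by increasing alpha.
rho_demo = [1.0e-4, 1.1e-4, 1.2e-4, 1.0e-3, 1.0e-2]
eta_demo = [1.0e3, 1.0e1, 1.0e-1, 5.0e-2, 2.0e-2]
best = corner_index(rho_demo, eta_demo)
```

In this made-up example the third point is selected, because ρ has barely increased there while η has already dropped by four orders of magnitude.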
After an L curve has been computed, the distance distribution and simulated dipolar evolution function can be inspected for all values of α, which is helpful in cases where this curve does not exhibit such a clear corner as in Fig. 7. In such cases, automatic recognition of the corner may fail.

Tikhonov regularization is performed by clicking on the corresponding Fit button in the Distance analysis panel. By default, L curve computation is disabled, as it is time consuming. The regularization parameter (default: 1) can be changed in the corresponding edit field in the Distance distribution panel. The distance range for Tikhonov regularization is determined by the blue and magenta start and end values in the Distance distribution panel, which can also be edited.

Computation of the L curve can be requested by selecting the Compute L curve checkbox in the Distance analysis panel and subsequently clicking on the Fit button. After an L curve computation has been started in this way, a progress bar appears as soon as the first Tikhonov regularization has been performed. At that point an estimate of the remaining computation time is displayed. If necessary, the computation can be interrupted by closing the progress bar. To do so, click the × button in the upper right corner of the progress bar. After the next Tikhonov regularization is completed, a window appears that allows you to interrupt or continue the computation.

After such a computation, the L curve is automatically displayed instead of the Distance distribution plot, with the automatically derived selection of the corner highlighted in red and the corresponding regularization parameter shown in the Reg. par. control. The selection of the corner can be shifted with the + and - buttons of the Reg. par. controls. Such changes update the fit in the Dipolar evolution panel and the r.m.s. value in the Distance analysis panel.
The distance distributions for different regularization parameters can be inspected in the same way after unselecting the L curve checkbox in the Distance distribution panel.

Automatic L curve corner recognition selects the regularization parameter that has the shortest distance from the lower left corner in a log-log plot of the square norm of the second derivative of P(r) vs. the mean square deviation of the simulated from the experimental form factor (new algorithm in DeerAnalysis2009). The lower left corner is defined by the minimum square norm of the second derivative of P(r) and the minimum mean square deviation among all regularization parameters for which Tikhonov regularization was performed.

Note that L curves can be misshaped if there are problems with background correction or if data are very noisy. In such cases the automatically determined choice of the regularization parameter may not be optimum. If you have any information on the expected width of the distribution (or of the narrowest features in the distribution), it is usually best to select the regularization parameter manually. The optimum choice is the largest value that does not yet cause undue broadening of expected narrow features.

After a Tikhonov regularization has been performed, the Validation button becomes accessible. An error analysis with respect to noise and uncertainties in background correction can now be performed (see Section 11).

10.5 User models

Generally, the solution of an ill-posed problem can be stabilized by introducing additional constraints. A distance distribution P(r) that conforms to a simple model with only a few parameters, for example a distribution consisting of one or two Gaussian peaks, is strongly constrained. Fitting of the data by a model distribution can thus improve reliability of the analysis. Furthermore, by comparing the parameters for a series of related samples, trends can be easily recognized.
This approach is offered in DeerAnalysis2006 by an interface for fitting pre-processed data by user-defined models for the distance distribution P(r). Model functions with one and two Gaussian peaks are already implemented. The model library can be extended by the user as described below.

In applying this approach one should be aware that a model can impose constraints that do not apply to the true distance distribution and may thus suppress information contained in the original data. For instance, the example data set dOTP 5nm can be fitted relatively well by a distance distribution consisting of a single Gaussian peak, but this imposes a symmetry on the peak that is not a feature of the true distribution. The true distribution decays more steeply towards long distances than towards short distances, as seen in the inset in Fig. 7a (and the reason for this asymmetry is well understood).

It is thus advisable to perform a model-independent analysis by Tikhonov regularization first. From a set of distance distributions for the same class of samples, it is then often possible to derive a model function that does not impose undue constraints but does make use of additional information on the sample that comes from other characterization techniques.

10.5.1 Fitting with existing models

When DeerAnalysis2009 starts, the program checks the subdirectory models for existing Matlab scripts (extension .m). The current distribution contains the scripts

• Gaussian.m Single Gaussian peak with mean distance ⟨r⟩ and standard deviation σ(r).
• Gaussian_hom.m Single Gaussian peak with mean distance ⟨r⟩ and standard deviation σ(r) and homogeneous 3D background. To be used with Background model No correction and only with great care.
• random_coil.m Random-coil model for a polymer chain or an unfolded protein or an unfolded domain of a protein. N is the number of amino acid residues between the two labels including both labeled residues.
ν is the scaling exponent (0.602 for good solvents, expected for soluble proteins in water, 0.5 for θ-solvent, less than 0.5 for poor solvents).
• Rice3d.m Single Rice peak [13] with mean distance ⟨ν⟩ and standard deviation σ in three dimensions. Note that for the Gaussian limit of the Rice distribution, the standard deviation σ is smaller by a factor of √2 than the value of σ in the Gaussian distribution as implemented in DeerAnalysis.
• Sphere_Surface.m Homogeneous distribution of spin labels on the surface of a sphere. The sphere diameter ds has a Gaussian distribution with standard deviation σ(ds).
• Triangle_DGauss.m Assumes a three-spin system (equilateral triangle) with double Gaussian distribution of the center-vertex distance (two Gaussian peaks). Useful for homotrimers with two distinct conformations or significantly non-Gaussian distance distributions. Pair- and three-spin contributions to the form factor are considered, based on [1] with a correction in the fraction of two-spin contributions. The total modulation depth ∆ (Delta) and the number of Monte Carlo trials (nmc) are fixed parameters; ∆ should be set to the modulation depth obtained with background fitting.
• Triangle_Gauss.m Assumes a three-spin system (equilateral triangle) with Gaussian distribution of the distance between the center and the vertices. Useful for homotrimers. Pair- and three-spin contributions to the form factor are considered, based on [1] with a correction in the fraction of two-spin contributions. The total modulation depth ∆ (Delta) and the number of Monte Carlo trials (nmc) are fixed parameters; ∆ should be set to the modulation depth obtained with background fitting.
• Two_Gaussians.m Two Gaussian peaks with mean distances ⟨r1⟩ and ⟨r2⟩ and standard deviations σ(r1) and σ(r2). The population of the first peak (integral) is p1, that of the second peak 1 − p1.
• Two_Gaussians_hom.m Two Gaussian peaks with mean distances ⟨r1⟩ and ⟨r2⟩ and standard deviations σ(r1) and σ(r2). The population of the first peak (integral) is p1, that of the second peak 1 − p1. The concentration for a homogeneous 3D background is c. To be used with Background model No correction and only with utmost care (do this only if you are desperate and interpret results with great caution).
• Two_Rice3d.m Two Rice peaks [13] with mean distances ⟨ν1⟩, ⟨ν2⟩ and standard deviations σ1 and σ2. The population of the first peak (integral) is p1, that of the second peak 1 − p1. Note that for the Gaussian limit of the Rice distribution, the standard deviation σ is smaller by a factor of √2 than the value of σ in the Gaussian distribution as implemented in DeerAnalysis.
• WLC_rigid.m Worm-like chain model [11] for a semi-rigid polymer (or DNA) with label-to-label distance L and persistence length Lp.
• WLC_rigid_Gauss.m Worm-like chain model convoluted with a Gaussian distribution with standard deviation σ(r) that accounts for the conformational distribution of the label.

These models, and any models implemented by the user, are included in the model fit popupmenu of the Distance analysis panel. On selecting an entry of this menu, the parameter definitions, default values and limits of the corresponding model are read and the parameter controls in the model fit subpanel are updated. A model can have up to eight parameters. If it has fewer, superfluous parameter controls are disabled.

Models with Gaussian peaks that also included homogeneous background, which were available in DeerAnalysis2006, have been discontinued. Their fitting behaviour was found to be too unstable.

Before fitting, select the model fit radiobutton in the Data analysis panel. The Distance distribution plot now shows the APT result as a black dotted narrow line and the distance distribution corresponding to the current model and parameter values as a red dotted bold line.
The Dipolar evolution plot displays the experimental data (black line) and the data simulated with the current model (red dotted line). You may now edit the starting values of the fit parameters in the model fit subpanel until you obtain a reasonable agreement between experimental and simulated data. Of course, this step can be skipped and fitting can be started immediately, but by first improving your starting values you decrease the probability of getting stuck in a local minimum of the error hypersurface.

Before fitting you can also decide whether you want to fit all parameters (default behavior) or whether you want to keep some parameters fixed at their starting values. To fix a parameter, unselect the corresponding checkbox.

Fitting is started by clicking on the Fit button in the model fit subpanel. During fitting, the Status panel displays the current r.m.s. value. Note that fitting can be rather slow if the excitation bandwidth correction (see Section 10.6) is switched on. After the fit is completed, the parameter values are updated, the Distance distribution plot shows the fitted distance distribution as a black bold line, and the Dipolar evolution plot displays the experimental data (black line) and the fit (red line).

Model fitting considers the distance distribution in the range between 1 and 10 nm. For data sets extending to times longer than 4 µs, an upper limit of 10 nm may be too short if the homogeneous background is also fitted. As mentioned earlier, we strongly suggest removing the background contribution before fitting.

10.5.2 Implementing a new model

The interface between DeerAnalysis2006 and the model scripts was designed to allow for writing model scripts without knowledge of the inner workings of the main program. A model script has two input variables, a vector of distances r0 at which values of the distance distribution have to be computed and a vector of parameters par.
The only output parameter is the distance distribution, which is a vector of the same length as r0. Note that the integral of the distance distribution can be arbitrary, as DeerAnalysis2006 internally renormalizes the distribution to an integral of 0.01 for simulations and later computes the number of coupled spins from the modulation depth of the experimental data. This means that no amplitude parameter is needed. Only if the distribution consists of more than one contribution (for instance two Gaussian peaks) does a parameter for the relative amplitude of an additional component with respect to the first component have to be defined.

Consequently, a Gaussian distribution is defined by only two parameters, the mean distance ⟨r⟩ and the width (standard deviation) σr. A distribution consisting of two Gaussian peaks thus has the parameters ⟨r(1)⟩, σr(1), the relative contribution of the first peak p(1), and ⟨r(2)⟩, σr(2). It is convenient to define the relative contributions so that they relate to the integral of the peaks (number of spins) and that p(1) + p(2) = 1. The model script Two_Gaussians.m is written this way.

A model script needs to declare its parameters to DeerAnalysis2006 and provide default values as well as lower and upper limits for them. This is done in a comment section. As an example consider the script Gaussian.m:

    function distr=Gaussian(r0,par),
    %
    % Model library of DeerAnalysis2011: Gaussian
    %
    % single Gaussian peak with mean distance <r> and width (standard
    % deviation) s(r)
    % (c) G. Jeschke, 2006
    %
    % PARAMETERS
    % name    symbol default lower bound upper bound
    % par(1)  <r>    3.5     1.5         10
    % par(2)  s(r)   0.5     0.05        5

    gauss0=(r0-par(1)*ones(size(r0)))/par(2);
    distr=exp(-gauss0.^2);

The first line is the function declaration, which is the same for all user models except for the function name (here Gaussian). The following lines, which start with the % character, are all comment lines, as far as Matlab is concerned.
However, when the model is selected, DeerAnalysis2006 scans these comment lines in the source file for parameter declarations. A parameter declaration line begins with the % character, followed by at least one space and the parameter name. Valid parameter names are par(1), par(2), par(3), par(4), par(5), and par(6). Only as many parameters have to be declared as are needed for the model (here 2). The parameter name is followed by at least one space and then by the parameter symbol. The symbol consists of at least one non-space character. It identifies the parameter control in the model fit subpanel. A symbol of up to five non-space characters can always be displayed; longer symbols are displayed completely only if some of the characters are narrow. The symbol is followed by at least one space and then the default value of this parameter. The default value is displayed in the edit field of this parameter and is the starting value for the fit if the user does not make any input before clicking on the Fit button. A good set of starting values provides a distribution that is mainly confined between 1.5 and 8 nm and that clearly exhibits all relevant features of the model. The default value is followed by at least one space and the lower limit. No input smaller than this value is accepted by the edit field. Likewise, the value is used as a lower boundary in parameter fitting. The lower limit is followed by at least one space and the upper limit, which is treated analogously. Note that definition of the default values and limits is mandatory. Program response is undefined if the parameter line is incomplete.

Extended models can return both the distance distribution and the background-corrected DEER trace, including the constant offset part, to DeerAnalysis. This is required when accounting for multi-spin effects. Declaration of such functions requires the keyword #extended# in the comment section.
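The declaration rules above can be illustrated with a two-peak model. The following is a minimal sketch and not the Two_Gaussians.m script of the model library: the function name My_Two_Gaussians, the symbols, and all default values and limits are assumptions chosen for illustration. It uses the same width convention as the Gaussian.m listing and divides each peak by its width, so that par(3) relates to the integral of the first peak.

```matlab
function distr=My_Two_Gaussians(r0,par)
%
% Hypothetical user model (illustration only): two Gaussian peaks
%
% defaults and limits are assumed values, not taken from the model library
%
% PARAMETERS
% name    symbol default lower bound upper bound
% par(1)  <r1>   3.0     1.5         10
% par(2)  s(r1)  0.3     0.05        5
% par(3)  p1     0.5     0           1
% par(4)  <r2>   4.5     1.5         10
% par(5)  s(r2)  0.3     0.05        5

% reduced coordinates, same convention as in Gaussian.m
g1=(r0-par(1))/par(2);
g2=(r0-par(4))/par(5);
% dividing by the width makes the integral of each peak independent of its
% width, so that par(3) and 1-par(3) are the relative peak integrals
distr=par(3)/par(2)*exp(-g1.^2)+(1-par(3))/par(5)*exp(-g2.^2);
```

No overall amplitude parameter appears, in line with the renormalization described earlier; only the relative contribution of the first peak is declared.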
In both extended and standard user functions it is possible to limit the number of parameters that are fitted by default by using the keyword #enable# in the comment section. In multi-spin fits it may be necessary to switch off automatic modulation depth scaling between experimental and simulated traces in the display of DeerAnalysis. A checkbox in the Model fit panel is provided for that purpose.

    function [deer,distr]=Triangle_Gauss(r0,t0,par)
    %
    % Model library of DeerAnalysis2008: triangle with Gaussian distribution of
    % vertex positions
    %
    % single Gaussian peak with mean distance <r> and width (standard
    % deviation) s(r)
    % (c) G. Jeschke, 2009
    %
    % #extended# denotes a model that provides both distribution and deer trace
    % #enable# 2 only the first two parameters are fitted by default
    %
    % PARAMETERS
    % name    symbol default lower bound upper bound
    % par(1)  <rv>   2.5     0.5         10      mean distance from C3 axis
    % par(2)  s(v)   0.5     0.02        5       std. dev. of vertex position
    % par(3)  Delta  1       0.1         1       total modulation depth
    % par(4)  nmc    5000    1000        100000  number of Monte Carlo trials

10.6 Accounting for limited excitation bandwidth

Analysis of DEER distance measurements is usually based on analytical expressions, such as eqn (5), that assume ideal pulses. Past versions of our analysis programs accounted for this by suggesting a lower limit of 1.75 nm for the reliability of the distribution. Maryasov and Tsvetkov [16] first suggested using corrected expressions to get more reliable results for short distances. Their approach considered the full Hamiltonian during the pulse, except for the pseudosecular contribution of the dipole-dipole coupling. They still assumed that the observed spins are not excited by the pump pulse and the pumped spins are not excited by the observer pulse. With these remaining assumptions, which are, however, not very well fulfilled, they could still obtain analytical expressions for the three-pulse DEER experiment.
Based on these expressions, the effect of finite pulse lengths on determining distance distributions was assessed in a later contribution by Milov et al. [17].

To relax the remaining assumptions and extend the approach to four-pulse DEER, we examined the dependence of the modulation depth $\lambda$ on the dipolar frequency $\omega_{dd}$ for typical lengths of the observer and pump pulses. Numerical density matrix computations of the full pulse sequence were performed for this purpose. Details will be published elsewhere. The dependence of $\lambda$ on $\omega_{dd}$ can be approximated quite well by a Gaussian function

$$\lambda(\omega_{dd}) = \exp\left(-\frac{\omega_{dd}^2}{\Delta\omega^2}\right), \qquad (11)$$

where $\Delta\omega$ is an effective excitation bandwidth with respect to dipolar frequencies. For a four-pulse DEER experiment with a pulse length of 24 ns for all pump and observer pulses and for an experiment with a 12 ns pump pulse and 32 ns observer pulses, we find the same excitation bandwidth of 16 MHz. For a four-pulse DEER experiment with a pulse length of 24 ns for all pump and observer pulses the excitation bandwidth is 12 MHz. The expression in eqn (11) can be used as a correction of the kernel function, eqn (5):

$$K(t, r; \Delta\omega) = \int_0^1 \exp\left(-\frac{\omega_{dd}^2}{\Delta\omega^2}\right) \cos\left[\left(3x^2 - 1\right)\omega_{dd}\, t\right] \mathrm{d}x, \qquad (12)$$

so that effects of finite pulse lengths can be accounted for without much additional computational effort if the kernel is computed during fitting anyway, as DeerAnalysis2006 does during Tikhonov regularization.

Figure 8: Excitation bandwidth correction. Blue distance distributions were obtained without, black ones with correction. a) Tikhonov regularization with optimum regularization parameter α = 1. b) Fit by a single Gaussian peak.
However, simulations of the dipolar evolution function from a distance distribution, as they are required in model fits or at the end of Tikhonov regularization, can be performed with a pre-computed kernel for the expression given by eqn (5), while the kernel must be computed "on the fly" for the expression given by eqn (12). This is because the latter expression depends on an additional variable parameter $\Delta\omega$ and, furthermore, does not allow for scaling. In the former expression, a scaling of the t axis by a factor $x$ can be compensated by scaling of the distance axis by a factor $x^{1/3}$. Without bandwidth correction, DeerAnalysis2006 uses fast computations with a pre-computed ideal kernel. Bandwidth correction therefore slows down simulations and model fits considerably and is not selected as default behavior of the program. It can be activated by selecting the Exci. bandwidth checkbox in the Dipolar evolution panel.

The effect of excitation bandwidth correction is illustrated in Fig. 8 for data set deer_bi_oligo_n8_50K from the calibdepth subdirectory. Data were cut off at 1504 ns to improve the background fit. Without correction (blue distributions) distances below 1.75 nm are strongly suppressed. With correction they are recovered. The r.m.s. deviation improves from 0.000320 without correction to 0.000286 with correction in Tikhonov regularization and from 0.000396 without correction to 0.000335 with correction for a Gaussian fit. An improvement in the r.m.s. value may not always be found. The mean distance obtained with Tikhonov regularization changes from 1.97 to 1.85 nm. For a slightly longer flexible biradical (data set deer_bi_oligo_n10_50K), the correction is somewhat smaller, as the mean distance changes from 2.07 to 1.98 nm (data not shown). Note also that the Gaussian fits do not reproduce the true shape of the distribution very well in this case.
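The attenuation factor of eqn (11) and the corrected kernel of eqn (12) can be evaluated numerically as in the following minimal sketch. It is not the code used by the program. Several points are assumptions: the dipolar frequency is taken as 52.04 MHz nm³ / r³ (the usual value for two electron spins with g close to 2, which is not stated in this section), the ideal kernel is taken to be the integral of eqn (12) without the exponential factor, the bandwidth is treated as a frequency in MHz, and all variable names are hypothetical.

```matlab
% Sketch of eqns (11) and (12); all names are hypothetical.
nu_dd = @(r) 52.04./r.^3;       % assumed dipolar frequency in MHz, r in nm
x = linspace(0,1,2001);         % integration variable of the kernel

% assumed ideal kernel: integral over x of cos[(3x^2-1)*w_dd*t], t in microseconds
K_ideal = @(t,r) trapz(x,cos((3*x.^2-1)*2*pi*nu_dd(r)*t));

% attenuation factor of eqn (11); bandwidth bw in MHz (assumed to be a frequency)
lambda = @(r,bw) exp(-(nu_dd(r)/bw).^2);

% corrected kernel of eqn (12); as written there, the factor does not depend on x
K_corr = @(t,r,bw) lambda(r,bw)*K_ideal(t,r);

bw = 16;                        % MHz, one of the values quoted in the text
for r = [1.5 1.75 2.5 4]
    fprintf('r = %4.2f nm: attenuation factor %5.3f\n',r,lambda(r,bw));
end
```

Because the attenuation factor depends on r and on the bandwidth, the corrected kernel cannot be obtained from a single pre-computed table by rescaling the axes, which is the reason given above for the slower computation.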
10.7 Ghost suppression

If more than two spin labels are present in a nanoobject, sum and difference combinations appear in the DEER signal [22]. This leads to ghost peaks in distance distributions that can be suppressed by power scaling of the form factor with an exponent $1/(N-1)$, where $N$ is the number of spins per nanoobject, for instance the number of protomers in a symmetric protein homooligomer [12]. Such ghost suppression can be performed by activating the checkbox ghost suppression in the Form factor panel and entering the number of spins in the edit field to the right of this checkbox. It is advisable to perform data processing both with and without such suppression and to compare the distance distributions obtained. For a detailed discussion of performance and limitations, see [12].

10.8 Test data sets

DeerAnalysis2008 has a subdirectory simulated with Matlab scripts for generating test data with artificial noise and artificial background. The directory also contains a number of such test data sets: Gaussian distributions, distributions with two Gaussian peaks, a boxcar distribution, and a sawtooth distribution. These data sets are intended for test and training purposes. Test data sets are loaded by selecting the ASCII subpanel of the Data sets panel. The default column assignment (Time 1, Real 2, Imaginary 3) applies. These data sets can be processed in the same way as experimental data sets. In the Distance distribution plot of the main window and of the validation tool the distance distribution used for generating these data sets is displayed in cyan. Precomputed test data sets are a series of Gaussian distributions with a fixed ratio of ten between mean distance and standard deviation, a boxcar from 2 to 4 nm, a sawtooth from 2 to 4 nm, Gaussian distributions at 3 and 4 nm with two-dimensional background, and a few distributions consisting of two Gaussian peaks.
Further test data sets can be computed by using the Matlab scripts create_test_data.m (single Gaussian, 3D background), create_test_data_2D.m (single Gaussian, 2D background), create_test_data_nb.m (two Gaussians), create_test_data_special.m (boxcar), create_test_data_special_2.m (sawtooth), and create_test_data_special_3.m.

11 Error analysis (validation) of distance distributions

For an ill-posed problem the relation between noise in the input data and uncertainty of the output data is difficult to predict. Furthermore, background deconvolution is not exact for experimental data, which also introduces an error in the form factor. Often this error due to imperfect background correction even dominates the error in the distance distribution. While error propagation in Tikhonov regularization cannot be predicted analytically, there is an obvious numerical approach for such a prediction. Assume that the uncertainties of background correction can be modelled by variation of the starting time for the background fitting within certain bounds and of the dimensionality of the spatial distribution within certain bounds. Background correction can now be performed for a sufficiently large number $n_\mathrm{trials}$ of parameter sets within these bounds and the form factors obtained can be subjected to Tikhonov regularization. This provides $n_\mathrm{trials}$ distance distributions that can be statistically analyzed. Thus, a lower and upper limit, a mean value, and a standard deviation are obtained for each point in the distance distribution. Alternatively, uncertainty of background parameters can be given in terms of lower and upper bounds for the density (proportional to concentration), the modulation depth, and the background dimensionality. This approach was followed until DeerAnalysis2010 and is still available for backward compatibility. This old approach does not consider that the uncertainties in the background parameters may be correlated. Thus it tends to overestimate the error.
It is more advisable to vary the starting time for the background fit and, if necessary, background dimensionality. Still, some combinations of dimensionality and starting time may result in poor fits of the form factor. A poor fit can be defined as a fit whose root mean square deviation (r.m.s.d.) from the experimental data exceeds the r.m.s.d. of the best fit by a factor $L_\mathrm{prune}$ (prune level). DeerAnalysis2011 suggests a prune level of 1.15, but users can define this value according to their own experience. Parameter combinations that lead to fits with r.m.s.d. > prune level · min(r.m.s.d.) are excluded from statistical analysis. The influence of noise on the distance distribution can be estimated in a similar way as the influence of uncertainties in background parameters. In this case noise is artificially enhanced by adding pseudorandom numbers so that the noise level is increased by a certain factor $L_\mathrm{noise}$. Errors are probably overestimated when taking $L_\mathrm{noise} = 2$. A value $L_\mathrm{noise} = 1.5$ is suggested. The tests for noise and background influence can be combined to obtain a total error estimate.

Figure 9: Screenshot of the validation window.

The validation tool is started by clicking on the Validation button in the Tikhonov regularization subpanel of the Data analysis panel. This button is inactive as long as no Tikhonov regularization has been performed. Validation uses the regularization parameter that is selected in the main window when the button is clicked. This regularization parameter is displayed in the title line of the validation window (Figure 9). The user can now select which parameters have to be treated as uncertain (default: Background starting time), what lower and upper bounds are estimated for these parameters, and how many values within these bounds should be tested. The trial values are distributed uniformly between the bounds.
Each allowed value of each variable parameter is combined with each allowed value of each other parameter. One should be aware that this may lead to large numbers of trials and that for each trial a Tikhonov regularization has to be performed. Computation times may be substantial. The Rough grid and Fine grid buttons make suggestions for the bounds and number of trials that are based on the result of the background correction performed in the main window. These suggestions should only be used if there is no independent information on the bounds. After starting the computation by clicking on the Compute button, a progress bar will appear as soon as the first Tikhonov regularization has been performed. At that point an estimate of the remaining computation time is displayed. If necessary, the computation can be interrupted by closing the progress bar with the × button in its upper right corner. After the next Tikhonov regularization is completed, a window appears that allows the user to interrupt or continue the computation. After all trials are computed, the Distance distribution plot displays the distance distribution with the best r.m.s.d. as a bold green line, grey error bars that indicate the full variation of the probability of a given distance over all trials, a lower error estimate corresponding to the mean value of the probability minus two times its standard deviation, and an upper error estimate corresponding to the mean value plus two times the standard deviation (red dotted lines). The mean distance is indicated by a vertical cyan dotted line. For test data sets, the distance distribution used in simulating the data is displayed as a cyan solid line. By using the - and + buttons to the left and right of the data set number display, all computed distance distributions (and the corresponding background functions and fits of the form factor) can be inspected in turn.
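The trial grid, the pruning criterion, and the pointwise statistics described above can be sketched as follows. This is an illustration under stated assumptions and not the code of the validation tool: the bounds, the numbers of trial values, and all variable names are hypothetical, and the distance distributions and r.m.s.d. values are synthetic stand-ins for the results of background correction and Tikhonov regularization.

```matlab
% Sketch of the validation statistics; all names and data are hypothetical.
rng(1);                                   % reproducible pseudorandom numbers
t_start = linspace(200,800,11);           % trial background starting times (ns), assumed bounds
bckg_dim = linspace(2.8,3.2,3);           % trial background dimensionalities, assumed bounds
[T,D] = ndgrid(t_start,bckg_dim);         % each value combined with each other value
ntrials = numel(T);

r = linspace(1.5,8,131);                  % distance axis in nm
P = zeros(ntrials,numel(r));              % one distance distribution per trial
rmsd = zeros(ntrials,1);                  % r.m.s.d. of the form factor fit per trial
for k = 1:ntrials
    P(k,:) = exp(-((r-3.5-0.05*randn)/0.4).^2);   % stand-in for a Tikhonov result
    rmsd(k) = 0.0030*(1+0.3*rand);                % stand-in for the fit quality
end

prune_level = 1.15;                       % value suggested in the text
keep = rmsd <= prune_level*min(rmsd);     % poor fits are excluded
Pk = P(keep,:);

P_mean = mean(Pk,1);                      % mean value at each distance
P_std = std(Pk,0,1);                      % standard deviation at each distance
P_min = min(Pk,[],1);                     % full variation over the retained trials
P_max = max(Pk,[],1);
P_low = P_mean-2*P_std;                   % lower error estimate
P_up = P_mean+2*P_std;                    % upper error estimate
[~,best] = min(rmsd);                     % trial with the best r.m.s.d.
fprintf('%d of %d trials retained\n',nnz(keep),ntrials);
```

The number of trials is the product of the numbers of values per parameter (here 11 × 3), which is why fine grids in several parameters quickly become expensive.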
The Short only button selects the data set with the smallest contribution of long distances. The Compact button selects the set with the narrowest distribution. The ? button selects the data set with the largest r.m.s.d. of the fit from the experimental form factor and the ! button the data set with the lowest r.m.s.d. Poor fits can now be excluded from statistical analysis by selecting the prune level and clicking on the Prune button. Note that this is irreversible. The excluded data sets are lost and can only be recovered by repeating the computation. Separate Matlab figures of all plots (for copying, saving or printing) are obtained with the Copy button. The validation tool can be left either with the Cancel button, leaving the original state of the main window intact, or with the Close button, transferring the mean distance distribution and the error bounds to the main window. A report of the validation is stored automatically in the latter case. After returning to the main window, the mean distance distribution from all trials is displayed. If the error est. option in the Distance distribution panel is selected, the lower and upper error bounds (two times standard deviation) are displayed as grey error bars.

12 Post-processing

In many cases, one wants to quantify the distance distribution in terms of a few numbers, i.e., mean distance and width of the whole distribution or of individual peaks. For oligomers of membrane proteins and self-assembled supramolecular systems, it may also be of interest to derive the number of spins within an individual nanoobject. All these values can be obtained by post-processing.

12.1 Moment analysis and peak picking

Analysis of a number of simulated and experimental DEER data sets suggested that the first moment (mean distance) and second moment (variance, square of the standard deviation) of the distance distribution are stable parameters.
In other words, these values are only very slightly influenced by noise-induced artificial splittings in the distance distribution. This applies in particular to the results of those techniques that incorporate the constraint $P(r) > 0$ (Tikhonov regularization and model fitting). Moment analysis of the distance distribution in the range of interest (default: 1.5 to 8 nm) is therefore performed automatically. The mean distance ($\langle r\rangle$) and standard deviation (s(r)) are displayed in the Distance analysis panel. To exclude obvious artifacts at the short or long end of the distance range (due to nuclear modulations or errors in background correction), you may change the range for analysis using the + and - buttons for the blue and magenta cursor in the Distance distribution panel or direct input into the corresponding edit fields. This option can also be used for extending the distance range if very long distances have been measured or for selecting only a single peak in a multimodal distance distribution and determining its mean distance and width. When the Expand checkbox is selected, the distance distribution is displayed only between the cursors.

12.2 Checking for the relevance of small peaks

With Tikhonov regularization, one sometimes observes small peaks in the distance distribution that may be related to noise, to errors in background correction, or to genuine small contributions to the distance distribution. It is instructive to check the contribution of such peaks to the simulated dipolar evolution function or dipolar spectrum. To suppress such peaks, move the blue and magenta cursors so that they include them (see Fig. 10) and click on the green Suppress button. The distance distribution without these peaks is shown as a green curve and the corresponding fit of the experimental data is displayed in the Dipolar evolution plot, also as a green curve. In the case illustrated in Fig. 10, the small peaks are obviously artifacts.
The original (red) fit has a slightly better r.m.s. value, but is not perfect (see first minimum of the oscillation). The green fit is better at the first minimum but worse at the second maximum. In this case, the small peaks should thus be disregarded in interpretation.

Figure 10: Suppressing small peaks in data set deer_bi_36_50K from the calibdepth subdirectory. a) Distance distribution obtained by Tikhonov regularization (black) and after suppressing the peaks between the blue and magenta cursor by clicking on the green Suppress button (green). b) Experimental dipolar evolution function (black), fit by Tikhonov regularization (red), and fit after suppressing the two small peaks between the blue and magenta cursor.

12.3 Number of coupled spins

The number of spins within a nanoobject can be derived from the (calibrated) modulation depth if decay due to spins in other nanoobjects can be neglected, as was shown early on by the Novosibirsk group [21]. The same applies to the modulation depth in the dipolar evolution function after appropriate correction of the background decay [5]. The total modulation depth is given by

$$\Delta = 1 - \exp\left[\lambda\left(\langle n\rangle - 1\right)\right], \qquad (13)$$

where $\langle n\rangle$ is the average number of spins in the observed nanoobjects. DeerAnalysis2006 therefore retains information on the modulation depth in the dipolar evolution function. Quantification requires knowledge of the modulation depth parameter $\lambda$, which depends strongly on the excitation position, length, and flip angle of the pump pulse and weakly on line broadening in the nitroxide spectrum and the shape of the resonator mode. Reliable quantification therefore requires a calibration with known samples and proper adjustment of the flip angle of the pump pulse (see Section 15). Spectra from our own series of calibration samples (six biradicals and one triradical) are provided in the folder calibdepth.
They correspond to 12 ns π pump pulses irradiated at the maximum of the nitroxide spectrum (see Fig. 2) using a Bruker 3 mm split-ring resonator. Note that not all example spectra in other folders were measured under the same conditions. To calibrate modulation depths for your own applications, you should measure at least one genuine biradical with close to 100% degree of spin labeling under your measurement conditions. Analyze the data for this biradical, preferably with Tikhonov regularization, and change the number of spins in the corresponding edit field of the Distance analysis panel to 2. The number is then displayed in green instead of red. If another data set, measured under the same conditions, is loaded and processed, the displayed number of spins should correspond to the true average number $\langle n\rangle$ of spins in the nanoobject. Note that this calibration is lost on restarting DeerAnalysis and that it is unreliable when using excitation bandwidth correction. Also consult Section 9.6.

12.4 Comparing data sets (dual display)

To compare two data sets of the same sample or of similar samples, first load one of the data sets and process it as usual. To keep the same processing parameters for the second data set, you may then want to uncheck the Reset checkbox below the Load button in the Data sets panel. After loading the second data set, its file name is shown in line A: of the Data sets panel. This is the active data set. The file name of the previous data set is shown in line B:. The original data and processing results can now be compared by selecting the Dual display checkbox in the Original data panel. Traces corresponding to the previous data set are now shown in blue in all plots. In the Dipolar evolution plot, only experimental data, but no fits, are shown for the previous data set.
If the two data sets differ considerably in their modulation depth, but have similar distance distributions, the samples may just differ in the extent of spin labeling or the measurement conditions (flip angles, resonator, pulse lengths) may have been slightly different. To check for this, use modulation depth scaling [5] by selecting the mod. depth scaling checkbox in the Original data panel. Differences in the distance distribution are noise-related if the original data are not significantly different after such modulation depth scaling.

13 Output

13.1 Saving data

Unlike its predecessor program DeerAnalysis2004, the new version DeerAnalysis2006 does not automatically save results, except as an option during processing of series of data sets (see Section 14). If the user attempts to close the program after time-consuming fits (Tikhonov regularization, model fits) without saving results, a reminder is shown. The whole set of data, including background correction, experimental and fitted dipolar evolution function and spectrum, distance distribution, processing parameters, results of moment analysis, fitted parameters, and L curve (if available), is saved under the same basis file name, but in different ASCII files. After clicking on the Save button, the user is asked for the file name. The last extension and, if present, a suffix _res are removed to derive the basis name basname (this is useful for overwriting old results by selecting their name in the displayed file list). The following files are then saved:

Figure 11: Dual display for comparison of two data sets.
All data sets are from the subdirectory examples\series. Left column: Original data. Middle column: Dipolar evolution functions after background correction. Right column: Distance distributions. a-c) Comparison of data sets series2 (active set A, black traces) and series1 (set B) without modulation depth scaling. d-f) Comparison of data sets series2 (active set A, black traces) and series1 (set B) with modulation depth scaling. g-i) Comparison of data sets series8 (active set A, black traces) and series1 (set B) with modulation depth scaling.

• basname_res.txt: a summary of the program settings and the results
• basname_bckg.dat: the phase-corrected original data and background fit
  1st column: time axis (in µs), 2nd column: real part of original data, 3rd column: background fit, 4th column: imaginary part of original data (if present)
• basname_fit.dat: the dipolar evolution function and its fit
  1st column: time axis (in µs), 2nd column: dipolar evolution function after background correction, 3rd column: fit of the dipolar evolution function
• basname_spc.dat: the dipolar spectrum and its fit
  1st column: frequency axis (in MHz), 2nd column: experimental dipolar spectrum, 3rd column: fit of the dipolar spectrum
• basname_distr.dat: the distance distribution
  1st column: distance axis (in nm), 2nd column: distance distribution P(r)
• basname_Lcurve.dat: the L curve of Tikhonov regularization (only if computed)
  1st column: log(ρ), 2nd column: log(η)

The results file basname_res.txt records all relevant program settings, the mean distance, the width of the distance distribution, the third moment, and, for Tikhonov regularization, the regularization parameter. For model fits, the values of all fit parameters are also saved here.

13.2 Copying or printing individual plots

The three current plots of DeerAnalysis2006 can be copied into individual Matlab figures by clicking on the Copy button in the Data sets panel.
Using the figure menu, the plots can then be rescaled, edited, annotated, printed, exported in different graphics formats, or copied to the Windows clipboard (item Copy figure in the Edit menu). Matlab has a good help system that explains these possibilities.

14 Processing a series of similar data sets

A global analysis of several data sets is useful when measurements on the same sample have been reproduced or when samples have been prepared under slightly different conditions and one wants to check whether structural changes have occurred (see also Section 12.4). The first case requires computation of an average distance distribution that takes into account the signal-to-noise ratio of the individual data sets. In the second case the comparison should be performed for modulation-depth normalized primary data rather than for distance distributions, as it is difficult to estimate what degree of change in the distance distribution is significant [5]. For both tasks a text file listname.txt has to be prepared that contains a list of file names (without extension) of all the data sets that are to be processed together (for an example, see the file series.txt in the subdirectory example\series). List processing starts with analysis of a pilot data set, which should ideally be the data set with the best signal-to-noise ratio. This data set should also be the first set in the list, as the first data set is used as a reference for modulation depth scaling. After loading the pilot data set, it is processed as usual. Series processing is then initiated by the Series button in the Data sets panel. Progress is reported in the Status panel and in line A: of the Data sets panel. Plots are also updated (with a slight delay) during series processing. The program returns after the last data set has been processed. This data set is now the active data set.
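The average that takes the quality of the individual data sets into account can be illustrated by the following minimal sketch. It assumes, as stated for series processing in this section, that each data set is weighted inversely proportional to the mean square deviation of the fit of its dipolar evolution function. The distributions, the deviations, and all variable names are hypothetical, and the sketch is not the code of the program.

```matlab
% Sketch of a weighted average over a series; all names and numbers are hypothetical.
r = linspace(1.5,8,131);                  % common distance axis in nm
P = [exp(-((r-3.0)/0.4).^2); ...
     exp(-((r-3.1)/0.4).^2); ...
     exp(-((r-2.9)/0.5).^2)];             % one distance distribution per row
msd = [1.0e-5; 4.0e-5; 2.0e-5];           % mean square deviation of each fit

w = 1./msd;                               % weight inversely proportional to the deviation
w = w/sum(w);                             % normalize the weights to unit sum
P_avg = w.'*P;                            % weighted average distribution (row vector)
disp(w.');
```

The data set with the smallest deviation dominates the average, while noisy data sets still contribute in proportion to their reliability.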
The average distance distribution and average dipolar evolution function after series processing as well as average results of moment analysis are not displayed on screen, but are saved automatically. These files have the following formats:

• listname_res.txt: a summary of the program settings and the results for the average of all data sets
• listname_mean.dat: the mean dipolar evolution function
  1st column: time axis (in µs), 2nd column: mean dipolar evolution function after background correction
• listname_cmp.dat: modulation-depth normalized primary data (without background correction)
  1st column: time axis (in µs), n remaining columns: primary data (real part) for data sets 1 ... n
• listname_diff.dat: n × n matrix quantifying the difference between data sets
  large values in element (k, j) indicate that data sets k and j differ significantly

Primary data sets and distance distributions are averaged with a weighting factor that is inversely proportional to the mean square deviation of the fit of the dipolar evolution function. This corresponds to a maximum likelihood estimate of the average. By default, results for the individual data sets are not saved automatically. Automatic saving can be initiated by selecting the Autosave checkbox below the Series button. Note that even with this checkbox selected, automatic saving takes place only during series processing, not when processing individual data sets via the Load button.

15 Hints for Data Acquisition

Conversion of a dipolar evolution function as measured by a magnetic resonance experiment to a distance distribution is an ill-posed mathematical problem [6]. This means that even small deviations from the theoretical function (noise, phase problems, an intensity offset) can cause significant distortions in the distance distribution. Thus, it is of utmost importance to acquire experimental data with the best quality possible within a reasonable measurement time.
The choice of a number of experimental parameters has been discussed earlier [14]. From our own experience we suggest performing measurements in the following way. A temperature of 80 K is a good compromise for most samples, but sensitivity is often somewhat better at 50 K. For critical samples such as membrane proteins, cooling to 50 K is often worth the effort. Unless the sample really has a strong signal, one should plan for measuring two samples in 24 hours, one during the day and one overnight. Spectrometers tend to be stable enough over a period of several hours and the quality of the distance distribution tends to be limited by the signal-to-noise ratio except for synthetic model compounds with very narrow distance distributions. The observer and pump frequencies should be stable within 1 MHz during the measurement time, and this should be checked. It is good practice to acquire data with quadrature detection and to adjust the detector phase properly at the beginning. That way instability of the spectrometer can be recognized by the appearance of a significant imaginary part of the signal. Note that a small phase drift (corrections up to 20° for a measurement extending over several hours) is no cause for alarm. For four-pulse DEER on pairs of nitroxides at X-band frequencies we suggest that the pump pulse has a length of 12 ns. This can be achieved with a Bruker 3 mm split-ring resonator. We also suggest that all the observer pulses have the same length of 32 ns. These conditions cannot be met at all spectrometers and with all probeheads. Using a length of 32 ns for all pulses, or a length of 16 ns for the π/2 pulses and a length of 32 ns for the π pulses, also provides good results. If your pump π pulse has the same length as the π observer pulses, you may want to set the observer frequency to the center of the resonator mode and the pump frequency into the flank. Note, however, that the opposite setting as suggested by Fig.
2 allows for a shorter pump pulse and hence a larger modulation depth.

The power of the pump pulse should be adjusted for optimum flip angle (optimum echo inversion) using an inversion recovery sequence

$\pi_{\mathrm{pump}} - T - (\pi/2)_{\mathrm{obs}} - \tau - \pi_{\mathrm{obs}} - \tau - \mathrm{echo}$.

This has to be done with coinciding pump and observer frequencies, at the position in the microwave mode where the pump pulse is applied. After this step, the pump frequency should no longer be changed. If this procedure is not followed, modulation depths are ill-defined and should not be compared between samples. The step is also an absolute requirement if concentrations are to be determined.

We suggest that the pump pulse is applied at the maximum of the nitroxide spectrum, which maximizes modulation depth. This minimizes artifacts due to nuclear modulations, phase noise, and spectrometer imperfections. The observer pulses are then applied at the low-field local maximum, which corresponds to increasing the observer frequency (spectrometer frequency) by approximately 65 MHz. You may measure the field difference $\Delta B_0$ between the two maxima (in gauss) and multiply it by 2.8 MHz/G to obtain the exact frequency difference for your particular nitroxide.

A phase cycle (+x) − (−x) should be applied to the first observer pulse to eliminate offsets in the detector channels. If this phase cycling is omitted, any phase correction of the primary data will not be exact, and hence background correction by DeerAnalysis will not be exact either. Furthermore, modulation depth information is then not reliable. In principle, the problems could be solved by introducing the offset as an additional parameter in background correction, but we strongly discourage such an approach, as it complicates separation of the dipolar evolution function from the background, which may be difficult anyway for long distances.
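The conversion from the measured field difference between the two spectral maxima to the observer frequency offset, as described above, is a single multiplication. The short Python sketch below spells it out. The function name `observer_offset_mhz` and the example field difference are hypothetical, and the sketch assumes that the field difference is given in gauss and the result is wanted in MHz, so that the factor 2.8 from the text applies.

```python
# Conversion factor quoted in the text, valid for g close to 2.
# Assumption: field difference in gauss, frequency offset in MHz.
MHZ_PER_GAUSS = 2.8


def observer_offset_mhz(delta_b0_gauss: float) -> float:
    """Frequency difference between observer and pump positions, in MHz."""
    return MHZ_PER_GAUSS * delta_b0_gauss


if __name__ == "__main__":
    # Hypothetical separation of the two maxima, read off a field-swept spectrum.
    delta_b0 = 23.2  # gauss
    print(f"observer offset: {observer_offset_mhz(delta_b0):.2f} MHz")
```

For a field difference of about 23 G this gives a value close to the 65 MHz quoted as a typical offset.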
For the interpulse delays in the four-pulse DEER experiment

$\pi/2(\nu_{\mathrm{obs}}) - \tau_1 - \pi(\nu_{\mathrm{obs}}) - t' - \pi(\nu_{\mathrm{pump}}) - (\tau_1 + \tau_2 - t') - \pi(\nu_{\mathrm{obs}}) - \tau_2 - \mathrm{echo}$

we suggest $\tau_1$ = 200 ns for protonated solvents/matrices and $\tau_1$ = 400 ns for deuterated solvents/matrices (at X band). To suppress proton modulations, it is advantageous to perform the experiment at eight different values of $\tau_1$, spaced by 8 ns and starting with the values given above. The signals of the eight experiments are added.

In variable-time DEER [4], we suggest initial values $\tau_{2,0}$ = 300 ns for protonated and $\tau_{2,0}$ = 500 ns for deuterated samples. In constant-time DEER, $\tau_2$ = 800 ns is usually convenient for setup (adjustment of the detector phase). For the actual measurement, the choice of $\tau_2$ depends on transverse relaxation, signal strength, and the longest distances that have to be measured. It is difficult to give general suggestions, but the problem has been discussed in some detail in Ref. [4].

The integration gate should match the width of the observer echo, which is similar to the width of the longest observer pulses. The gate should be centered at the echo maximum.

If you can save data in Xepr (Elexsys) format, DeerAnalysis can import the binary data directly. For ESP 380 data, we suggest importing them into WIN-EPR and saving them in the binary WIN-EPR format (this step converts the coding of binary float numbers to a format readable by MATLAB). If you use another data acquisition system, you should save your data in an ASCII representation.

Acknowledgment

We thank Freiburger Materialforschungszentrum for the FTIKREG Tikhonov regularization code by J. Weese, G. Milhauser for helping with implementation of a Mac version, and M. Spitzbarth, S. Domingo Köhler, and M. Drescher for implementing the Rice distribution model.

References

[1] G. Jeschke, M. Sajid, M. Schulte, A. Godt, Phys. Chem. Chem. Phys. 11 (2009) 6580–6591.
[2] G. Jeschke, V. Chechik, P. Ionita, A. Godt, H. Zimmermann, J. Banham, C. R. Timmel, D. Hilger, H.
Jung, Appl. Magn. Reson. 30 (2006) 473–498.
[3] M. Pannier, S. Veit, A. Godt, G. Jeschke, H. W. Spiess, J. Magn. Reson. 142 (2000) 331–340.
[4] G. Jeschke, A. Bender, H. Paulsen, H. Zimmermann, A. Godt, J. Magn. Reson. 169 (2004) 1–12.
[5] G. Jeschke, G. Panek, A. Godt, A. Bender, H. Paulsen, Appl. Magn. Reson. 26 (2004) 223–244.
[6] G. Jeschke, A. Koch, U. Jonas, A. Godt, J. Magn. Reson. 155 (2001) 72–82.
[7] P. P. Borbat, J. H. Freed, Chem. Phys. Lett. 313 (1999) 145–154.
[8] G. Jeschke, A. Godt, ChemPhysChem 4 (2003) 1328–1334.
[9] G. Jeschke, M. Pannier, A. Godt, H. W. Spiess, Chem. Phys. Lett. 331 (2000) 243–252.
[10] D. Hinderberger, H. W. Spiess, G. Jeschke, J. Phys. Chem. 108 (2004) 3698–3704.
[11] A. Godt, M. Schulte, H. Zimmermann, G. Jeschke, Angew. Chem. Int. Ed. 45 (2006) 7560–7564.
[12] T. von Hagens, Y. Polyhach, M. Sajid, A. Godt, G. Jeschke, Phys. Chem. Chem. Phys., under revision.
[13] S. Domingo Köhler, M. Spitzbarth, K. Diederichs, T. E. Exner, M. Drescher, J. Magn. Reson. 208 (2011) 167–170.
[14] G. Jeschke, ChemPhysChem 3 (2002) 927–932.
[15] Y. W. Chiang, P. P. Borbat, J. H. Freed, J. Magn. Reson. 172 (2005) 279–295.
[16] A. G. Maryasov, Y. D. Tsvetkov, Appl. Magn. Reson. 18 (2000) 583–605.
[17] A. D. Milov, B. D. Naumov, Y. D. Tsvetkov, Appl. Magn. Reson. 26 (2004) 587–599.
[18] A. N. Tikhonov, V. Y. Arsenin, Solutions of Ill-Posed Problems (Wiley, New York, 1977).
[19] J. Honerkamp, J. Weese, Cont. Mech. Therm. 2 (1990) 17.
[20] J. Weese, Comput. Phys. Commun. 69 (1992) 99–111.
[21] A. D. Milov, A. B. Ponomarev, Y. D. Tsvetkov, Chem. Phys. Lett. 110 (1984) 67–72.
[22] G. Jeschke, M. Sajid, M. Schulte, A. Godt, Phys. Chem. Chem. Phys. 11 (2009) 6580–6591.
