DNA - GATTACAGATTACA
DNA - GATTACAGATTACAGATTACA
Endonuclease PvuII bound to palindromic DNA recognition site CAGCTG (1PVI)
CAP - Catabolite gene Activating Protein (1BER)
GCN4 - leucine zipper transcription factor bound to palindromic DNA recognition site ATGAC(G)TCAT (1YSA)
TBP - TATA box Binding Protein (1C9B)
 

° 

Recipes - Answer complex questions

This section gives step-by-step instructions for some of the more complicated tasks. Look here if the movies and macros available from www.yasara.com/repository do not cover your problem.

° 

Run molecular dynamics simulations

Running molecular dynamics simulations is one of YASARA's main applications. A lot of scientific insight can be gained from the analysis of MD trajectories, as long as one keeps in mind that MD simulations tend to look more realistic than they really are. They are also an essential tool for molecular modeling: during a simulation, you can freely interact with your protein and pull atoms around.

YASARA provides two different kinds of force fields: the NOVA force field which has been optimized for energy minimizations in vacuo, and the (Y)AMBER/YASARA force fields which are intended for realistic molecular dynamics simulations in aqueous solution.

° 

Preparing the topology

Before a simulation can be run, appropriate force field parameters have to be assigned to all the residues in the soup. While this is straightforward for well-known ones like the 20 standard amino acids or common ions, it is far from trivial for the infinitely many other molecules that nobody has explicitly parameterized yet.

PDB files in particular often contain ligands and cofactors at rather low resolution and without hydrogen atoms, and it is very challenging for a computer program to automatically analyze the molecule, assign the right bond orders and subsequently add the missing hydrogen atoms. Further complications arise from the fact that bond orders and protonation states depend on the pH you want to simulate.

A lot of effort has been spent to make YASARA smart enough to do all this automatically, and in the majority of cases it is enough to follow these simple steps:

  • Click 'Options > Default pH' to set the pH you want to use during your simulation.
  • Click 'File > Load > PDB file' to load your protein and assign bond orders according to the chosen pH.
  • Click 'Edit > Clean > All' to add the missing hydrogen atoms and correct any problems.
  • Look at non-standard residues (e.g. press <F2>, then <F6>) while showing the bond orders to verify that YASARA has made the expected choices.

The problem of assigning bond orders and adding missing hydrogen atoms is ambiguous: in many cases multiple correct solutions exist. If YASARA does not make the intended assignments, there are several options:

  • Provide a hint for the favored solution by explicitly adding hydrogen atoms or setting bond orders to critical atoms.
  • Extend YASARA's chemical knowledge using the simple SMILES-based assignment of pH-dependent bond orders and protonation states in the GROUP_DATA section of the file yasara.def (present in YASARA Dynamics+). The file itself explains in detail how this works. Please forward your modifications to us, so that they can be included in the next update.
  • Add a complete topology entry for your residue to the TOPOLOGY_DATA section in the file yasara.def. This option is used for all the standard residues but should be avoided for the others, because it introduces a dependency on residue and atom names - and these are essentially random choices left to the crystallographers for all but the standard residues.

° 

Preparing the force field

You can only run a simulation if the force field 'knows' how to treat each residue in the soup. For AMBER-style force fields (including YAMBER), YASARA provides a fully automatic approach to parameter assignment called AutoSMILES, while the NOVA force field may require a bit of manual intervention.

The (Y)AMBER/YASARA force fields are aimed at realistic molecular dynamics simulations in aqueous solution, while NOVA has been developed for quick simulated annealing minimizations in vacuo.

To check if your structure requires special attention, click on Simulation > Force field and select the force field you want to use, then click Edit > Clean > All. YASARA now tries to prepare your structure for simulation, by adding missing atoms (hydrogens, side-chains, terminal oxygens) and deleting those that are not needed (atoms with alternate locations). If you get an error message, look at the indicated residue and try to fix the problem. Here are some hints:

  • If YASARA complains about an incomplete backbone and the indicated residue is a terminal one, just delete it.

  • If your protein contains bound metal ions, delete all bonds between the ions and the protein. Molecular dynamics force fields like (Y)AMBER treat the interaction with metal ions purely electrostatically.

Click Simulation > Define simulation cell > Set automatically > OK to add a simulation cell. For molecular modeling purposes, you can also define a small cell that contains only a fraction of your protein, e.g. the active site. Then click Simulation > Simulator > Initialize to assign the force field parameters. At this point, you will get an error message if further attention is needed.

When changing a force field manually, keep in mind that installing a YASARA update may overwrite the force field files with newer versions, so make a backup of your changes. You can also create your own force field by saving the definition file under a different name and adding an entry for the new force field at the end of the file yasara/yasara.def.

° 

Deriving new (Y)AMBER force field parameters

Starting with version 6, YASARA can derive (Y)AMBER force field parameters for unknown molecules fully automatically, which makes it possible to simulate 98% of the structures in the PDB at the touch of a button. Manual intervention is only needed for exotic metal ions. The approach behind the AutoSMILES algorithm can be summarized as follows:

  • Assignment of pH-dependent fractional bond orders and protonation patterns, typing of ring systems by a graph-theoretic approach.
  • Identification of known molecules (from the force field definition file or the AMBER Parameter Database) using SMILES strings. If no hit is found, proceed to step 3.
  • Calculation of semi-empirical AM1 Mulliken point charges[3]. This step involves a geometry optimization with the COSMO solvation model[4] and avoids fatal rearrangements sometimes found when optimizing highly charged molecules like ATP4- in vacuo.
  • Assignment of AM1BCC atom- and bond types.
  • Application of the 'AM1 Bond Charge Correction' to improve the AM1 charges and make them better represent the electrostatic potential around the molecule - just like RESP charges.
  • Further improvement of the AM1BCC charges using the known ideal RESP charges of similar molecule fragments, identified via SMILES strings.
  • Assignment of GAFF (General AMBER Force Field) atom types and remaining force field parameters.
  • In the end, the newly created parameters are cached for instantaneous availability next time.

Since all this is done automatically, only one step is required in practice: Press <F12> to run the simulation. YASARA also prints a detailed force field parameter assignment report to the console, which you can examine to see what happened.

Figure: The steps of the AutoSMILES force field parameter assignment procedure.

If a residue contains more atoms than YASARA's QM module can handle, YASARA will try to split it up into smaller pieces that can easily be parameterized independently. While this works well for lipids, where each hydrophobic tail is usually parameterized separately, you may have to give YASARA a hint for other large molecules by following these steps:
  • Identify an aliphatic carbon that is as far away as possible from polar atoms, and - if removed - would split your residue in two parts in the soup (i.e. all atoms with lower numbers precede the carbon, all atoms with higher numbers follow it in the soup).
  • Mark the carbon, then right-click to activate the context menu and click Split > Atom.
  • In the bottom sequence selector, your residue should now have been split in exactly two parts. Now you can run the simulation.

If you nevertheless want to define force field parameters manually, this implies adding at least a residue topology in AMBER PREP format to one of the force field definition files (*.fof) in the yasara/fof subdirectory. The best location is probably gafftopo.fof, since this file is included in all force fields.

Try to follow these steps for AMBER-style force fields:

  • Look at the repository of AMBER force field parameters at http://pharmacy.man.ac.uk/amber/. Maybe your molecule has already been parameterized. Also try to Google it. Note that most of the AMBER Parameter Database is already included in YASARA by default.

  • Read the introduction to parameter fitting on page 287 of the AMBER 7 manual. (Downloadable from http://amber.scripps.edu).

  • Exit YASARA and open the force field definition file in a text editor. You can find the force fields at yasara/fof/ForceFieldName.fof, e.g. yasara/fof/yamber2.fof for the YamberII force field.

  • Go to the end of the file and add a topology entry for your residue. The format of these topology entries is also described on the AMBER website. YASARA does not use the bond length, angle and dihedral data, so these columns can be set to zero.

The information needed for every atom is the sequential number (column 1), the atom name in the PDB file (column 2), the force field atom type (column 3) and the point charge on the atom (last column). Just keep the header (including the three DUMM atoms), and replace the name of the compound (line 1) and the three-letter code of your residue (line 3, 'AGS' in the example below). The second line must stay empty.

If your ligand contains planar groups (around resonance or double bonds), you must also add improper dihedral entries (see IMPROPER statements below).

The LOOP statement used by AMBER is not needed and is ignored.


N-Acetyl-D-glucosamine-6-sulfate ( 1' O and no 4' OH-group, Gaussian98, RESP)

AGS  INT    1
CORR OMIT DU   BEG
  0.000000
    1 DUMM   DU   M    0  -1  -2     0.0000    0.0000    0.0000  0.000
    2 DUMM   DU   M    1   0  -1     1.0000    0.0000    0.0000  0.000
    3 DUMM   DU   M    2   1   0     1.0000   90.0000    0.0000  0.000
    4 C1     AC   M    0   0   0     0.0000    0.0000    0.0000  0.124693
    5 C2     CT   M    0   0   0     0.0000    0.0000    0.0000  0.032432
    6 C3     CT   M    0   0   0     0.0000    0.0000    0.0000  0.107401
    7 C4     CT   M    0   0   0     0.0000    0.0000    0.0000  0.044439
    8 C5     CT   M    0   0   0     0.0000    0.0000    0.0000  0.098935
    9 C6     CT   M    0   0   0     0.0000    0.0000    0.0000  0.031706
   10 N      N    M    0   0   0     0.0000    0.0000    0.0000 -0.466264
   11 O1     OG   M    0   0   0     0.0000    0.0000    0.0000 -0.430845
   12 O3     OH   M    0   0   0     0.0000    0.0000    0.0000 -0.658535
   13 O5     OS   M    0   0   0     0.0000    0.0000    0.0000 -0.386715
   14 O6     OS   M    0   0   0     0.0000    0.0000    0.0000 -0.399527
   15 C2N    C    M    0   0   0     0.0000    0.0000    0.0000  0.749408
   16 O2N    O    M    0   0   0     0.0000    0.0000    0.0000 -0.632137
   17 CME    CT   M    0   0   0     0.0000    0.0000    0.0000 -0.448969
   18 HME    HC   M    0   0   0     0.0000    0.0000    0.0000  0.119617
   19 HME    HC   M    0   0   0     0.0000    0.0000    0.0000  0.119617
   20 HME    HC   M    0   0   0     0.0000    0.0000    0.0000  0.119617
   21 S      SO   M    0   0   0     0.0000    0.0000    0.0000  1.131685
   22 O1S    O2   M    0   0   0     0.0000    0.0000    0.0000 -0.603269
   23 O2S    O2   M    0   0   0     0.0000    0.0000    0.0000 -0.603269
   24 O3S    O2   M    0   0   0     0.0000    0.0000    0.0000 -0.603269
   25 H1     HC   M    0   0   0     0.0000    0.0000    0.0000  0.160956
   26 H2     HC   M    0   0   0     0.0000    0.0000    0.0000  0.127396
   27 H3     HC   M    0   0   0     0.0000    0.0000    0.0000  0.137259
   28 H4     HC   M    0   0   0     0.0000    0.0000    0.0000  0.139063
   29 H5     HC   M    0   0   0     0.0000    0.0000    0.0000  0.086498
   30 H6     HC   M    0   0   0     0.0000    0.0000    0.0000  0.085794
   31 H6     HC   M    0   0   0     0.0000    0.0000    0.0000  0.085794
   32 HN     H    M    0   0   0     0.0000    0.0000    0.0000  0.296182
   33 HO3    HO   M    0   0   0     0.0000    0.0000    0.0000  0.434306

LOOP
O5 C1

IMPROPER
 C2N  C2   N    HN
 CME  N    C2N  O2N

DONE

  • Update the force field by starting YASARA with the command line option -upd:
    
    yasara -upd
    
    

YASARA will then recompile all force fields and tell you if something went wrong. If you do not get an error message, restart YASARA and try to initialize the simulation again. If YASARA still complains, consider the following points:

  • Numbering of chemically equivalent hydrogens. The AMBER force fields do not follow the PDB convention that hydrogens bound to the same atom are numbered in the first column. YASARA corrects this problem for standard residues, but cannot guess the right answer for 'your' new residues. To avoid problems, do not additionally number hydrogens that are bound to the same atom. In the example above, atoms 30 and 31 are both bound to C6 and are both named H6, not H61/H62 or 1H6/2H6.

  • If you created a topology for an unusual amino acid by copying from a standard residue, you also have to delete the numbers of chemically equivalent hydrogens as just described above.
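
Before recompiling, it can help to check a hand-written topology entry for obvious mistakes, for example by summing the point charges and comparing the total with the intended net charge of the residue. The following Python sketch does this under the column layout described above (atom number first, atom name second, atom type third, charge last). The function names are illustrative and not part of YASARA, and the check that an atom line has exactly 11 fields is an assumption taken from the example entry shown above.

```python
def parse_prep_atoms(text):
    """Return (name, atom_type, charge) for every non-dummy atom line.

    Assumes the layout of the example entry: an integer atom number,
    the atom name, the atom type, and the point charge in the last column.
    """
    atoms = []
    for line in text.splitlines():
        fields = line.split()
        # Atom lines in the example have 11 fields and start with a number.
        if len(fields) != 11 or not fields[0].isdigit():
            continue
        name, atom_type, charge = fields[1], fields[2], float(fields[-1])
        if atom_type == "DU":  # skip the three dummy atoms of the header
            continue
        atoms.append((name, atom_type, charge))
    return atoms


def net_charge(atoms):
    """Sum the point charges of all parsed atoms."""
    return sum(charge for _, _, charge in atoms)
```

Calling net_charge(parse_prep_atoms(entry_text)) on the text of a topology entry gives its total charge. A total that is far from an integer usually points to a typo in the charge column.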

References:

[1] Fast, efficient generation of high-quality atomic charges. AM1-BCC model: II. Parameterization and validation. Jakalian A, Jack DB and Bayly CI (2002) J Comput Chem 23, 1623-1641

[2] Development and testing of a general Amber force field. Wang J, Wolf RM, Caldwell JW, Kollman PA and Case DA (2004) J Comput Chem 25, 1157-1174

[3] MOPAC: A semiempirical molecular orbital program. Stewart JJP (1990) J Comput Aided Mol Des 4, 1-103

[4] Conductor-like screening model for real solvents: a new approach to the quantitative calculation of solvation phenomena. Klamt A (1995) J Phys Chem 99, 2224-2235

° 

Deriving new NOVA force field parameters

NOVA uses molecular trees to assign force field parameters. Most of the time, it is enough to follow this procedure:

  • Load a PDB file of the structure with the unknown ligand(s). If the PDB file does not contain CONECT records (no chemical bonds defined, the ligand appears as a point cloud in ball&stick display) let YASARA find the bonds (Edit > Find bonds in). In any case clean the structure (Edit > Clean), and save it in the yasara/fof directory.

  • Edit the file yasara/fof/nova.fof, search for the text 'OTHER MOLECULES' and insert the name of your PDB file below. If your PDB file also contains a protein, you should place a question mark '?' in front (this will tell YASARA to look only at new features in your structure and ignore the rest). This structure will now be used to learn equilibrium bond lengths and angles, so it must have an accurate covalent geometry.

Example:


;
;SOME OTHER MOLECULES
;====================
?MyStructureWithProtein.pdb
MyLigandAlone.pdb
smallmol.pdb
heparin.pdb

  • Update the force field by starting YASARA with the command line option -upd:
    
    yasara -upd
    
    
    YASARA will then recompile all force fields and tell you if something went wrong. If you do not get an error message, restart YASARA and try to initialize the simulation again.

  • If you still get the error message 'NOVA force field tree not found' and your molecule contains fused ring systems, simply use one of the other force fields which automatically detect and handle planarity.

  • You can always check that the bond types were assigned correctly by loading the PDB file and clicking on the atoms: the bond types are displayed in the lower left HUD.

  • If your residue contains unusual chemical groups that do not look like anything you find in proteins, DNA and sugars, the NOVA force field may not contain point charge definitions for the group. This will result in polar atoms without a charge. You can verify that by clicking on the polar atoms after initializing the simulation. The simulation HUD on the right lists all point charges on this atom. If there is more than one charge on an atom, you can cycle through them with <Home> and <End>. If you are really missing a charge and doing more than a quick&dirty modeling task, you must add an entry to the CHARGE_POSITION_DATA section in yasara/fof/nova.fof and recompile the force field.

  • The same principle holds for planar groups. If a group is not recognized as planar, a corresponding entry has to be added to the PLANE_CONFORMATION_PLANES section to avoid an out-of-plane distortion.

As NOVA has been optimized for proteins only, it is generally advised that you use the (Y)AMBER/YASARA force fields when working with small molecules.

° 

Running a simulation

As soon as the force field can cope with your structure, you are ready to run a simulation. Step-by-step instructions can be found in the help movie 'Accurate Simulations in Water'. Described below is the quickest way of running a simulation.

  • Create a project directory 'MyDir'.

  • Store either a PDB file (MyStructure.pdb) or a YASARA scene file (MyStructure.sce) of your structure there. Scene files must contain a simulation cell, and should therefore be used if you want to place the cell yourself.

  • Click Options > Macro & Movie > Set target and choose MyDir/MyStructure.pdb as the target.

  • Click Options > Macro & Movie > Play and choose your favorite molecular dynamics macro, e.g. md_run.mcr. If you do not like the default parameters, open yasara/mcr/md_run.mcr with a text editor, make the required changes and save it under a different name.

  • Wait and see how the simulation goes. The standard MD macros will take care of 'everything': Cleaning your structure, creating a simulation cell, filling it with water, placing counter ions, predicting pKa values and assigning protonation states, calibration, saving snapshots at regular intervals etc.

  • By default, YASARA updates the screen after each simulation step. While this approach maximizes interactivity, it also slows down the simulation. To speed things up, click Simulation > Time step and increase the number at 'Update the display every...'. Note that this makes YASARA respond more slowly. Alternatively, you can run a simulation in console mode.

  • If you stopped a simulation and want to continue later, just play the md_run.mcr a second time.

To give you a reference point for the speed to expect: A simulation of Crambin (1CRN) using the default macro md_run.mcr with ~10000 atoms proceeds by 1 picosecond per minute or 1.44 nanoseconds per day on an AMD Opteron 1.8 GHz.

Figure: Crambin, ready for simulation in a water box
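
The speed quoted above can be used to estimate how long a planned simulation will take. The Python sketch below only reproduces the unit conversion (1 picosecond per minute equals 1.44 nanoseconds per day). The function names are illustrative, and the speed on your own hardware has to be measured.

```python
MINUTES_PER_DAY = 24 * 60


def ns_per_day(ps_per_minute):
    """Convert a simulation speed from picoseconds/minute to nanoseconds/day."""
    return ps_per_minute * MINUTES_PER_DAY / 1000.0


def days_needed(target_ns, ps_per_minute):
    """Estimate the wall-clock days needed to simulate target_ns nanoseconds."""
    return target_ns / ns_per_day(ps_per_minute)
```

For example, at the reference speed of 1 ps per minute, days_needed(10, 1.0) shows that a 10 ns trajectory takes about a week.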

A note on simulating proteins with a net charge: the Particle Mesh Ewald algorithm used to calculate long range electrostatic interactions requires that the net charge of the simulation cell is zero, just like the charge of macroscopic objects in the real world. The macro md_run.mcr described here achieves this goal by adding counter ions to the system using the Neutralization Experiment, e.g. if the protein's net charge is negative due to excess Asp and Glu residues, YASARA will add Na+ ions to compensate.
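
The bookkeeping behind this neutralization can be illustrated with a few lines of Python. This is a simplified sketch, not the Neutralization Experiment itself: it assumes monovalent ions only, and the use of Cl- for a positive net charge is an assumption made here by symmetry with the Na+ example above.

```python
def counter_ions(net_charge):
    """Return (ion, count) needed to bring an integer net charge to zero.

    Assumes monovalent counter ions: Na+ for a negative net charge,
    Cl- for a positive one (the latter is an assumption of this sketch).
    """
    if net_charge < 0:
        return ("Na+", -net_charge)
    if net_charge > 0:
        return ("Cl-", net_charge)
    return (None, 0)
```

A protein with a net charge of -3 would thus need three Na+ ions, so that the total charge of the simulation cell becomes zero as required by Particle Mesh Ewald.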

° 

Analyzing a trajectory

How you analyze a simulation trajectory depends strongly on the question you want to answer. A standard analysis can be run as follows:

  • Click Options > Macro & Movie > Set target and choose MyDir/MyStructure (no extension!) as the target.

  • Click Options > Macro & Movie > Play and choose an analysis macro, e.g. md_analyze.mcr.

This macro will create a self-explanatory table MyDir/MyStructure_Analysis.tab which you can easily import in your favorite data visualization program (Excel, OpenOffice, XMGrace etc.). If you want more than the standard indicators (force field energies and RMSD from the starting structure), open yasara/mcr/md_analyze.mcr in a text editor and create your own version. Note that the energies in the table depend on the force field and cutoff that are chosen in the macro, so you may have to adapt it for your own simulations.

Here are a few hints for the analysis:

  • If you obtain an unexpected result, check how reproducible it is by running the simulation a second time with a different random number seed.

  • Care has been taken that molecular dynamics trajectories are entirely reproducible, i.e. if you run the same md_run.mcr a second time, using the same YASARA version on the same computer and assigning the same number of processors, you will obtain exactly the same trajectory again. Note however that reproducibility may be lost when comparing different CPUs, operating systems or YASARA versions: AMD and Intel CPUs support different instruction sets for high performance calculations, which yield slightly different results. Today's operating systems also provide different mathematics libraries, which cause additional deviations. Finally, changes in the YASARA source code may also have a small influence on the results. In short: force field energies calculated for the same scene on different computers are likely to differ in the least significant digits, which are of no relevance anyway. The same is true for the calculated forces, but there it has a serious impact on simulations: in a chaotic system, small initial differences grow exponentially with each simulation step, leading to very different trajectories and energies, just as if the simulation had been started with a different random number seed.

  • If you compare simulations run at different pH values and obtain quite different results, compare the covalent connectivity first: use the CompareAtom command to find those residues where YASARA assigned differing protonation states during the neutralization experiment, and investigate the region around these residues for an explanation of the structural deviations. Also check the significance of the result by running the simulation a second time with a different random number seed.
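
The exponential growth of tiny numerical differences mentioned above can be demonstrated with a toy system. The Python sketch below uses the logistic map, a textbook chaotic system, as a stand-in. It is not a molecular dynamics simulation, and the starting value and perturbation are arbitrary choices made for illustration.

```python
def logistic_trajectory(x0, steps, r=4.0):
    """Iterate the logistic map x -> r*x*(1-x) and return all visited values."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs


def divergence(x0, delta, steps):
    """Absolute difference between two trajectories started delta apart."""
    a = logistic_trajectory(x0, steps)
    b = logistic_trajectory(x0 + delta, steps)
    return [abs(p - q) for p, q in zip(a, b)]
```

Starting two trajectories only 1e-12 apart, e.g. with divergence(0.3, 1e-12, 100), the difference reaches the size of the values themselves within a few dozen steps. This is why a deviation in the least significant digits of a force is enough to produce a completely different trajectory.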

° 

Playing back a trajectory

  • Click Options > Macro & Movie > Set target and choose MyDir/MyStructure (no extension!) as the target.

  • Click Options > Macro & Movie > Play and choose a play-back macro, e.g. md_play.mcr.

Playing back a simulation requires that the force field can be initialized. So you may have to create your own version of md_play.mcr if you use a special force field or want to change some of the parameters (e.g. the play-back speed).

° 

Refine a homology model

To refine a homology model with YASARA, you need:

  • YASARA Dynamics.
  • The homology model, built for example with WHAT IF in the Twinset.

There are two tested ways of improving a homology model, i.e. reducing the RMSD between the model and the target.

° 

The quick method: in vacuo energy minimization with the NOVA force field

This method is described in the article Increasing the precision of comparative models with YASARA NOVA - a self-parameterizing force field, Proteins 47, 393-402:

  • Click File > Load > PDB file to load the homology model.
  • Click Simulation > Force field and select the NOVA force field with a 10.24 A force cutoff.
  • Click Options > Choose experiment > Energy minimization and wait until the procedure is completed.

If your homology model contains unusual ligands, you may have to add NOVA force field parameters.

° 

The slow method: explicit solvent molecular dynamics simulation with the YAMBER force field

This method is described in the article Making optimal use of empirical energy functions: force field parameterization in crystal space, Proteins 57, 678-683:

  • Create a project directory 'MyDir'.

  • Store a PDB file of your model there (MyStructure.pdb).

  • Click Options > Macro & Movie > Set target and choose MyDir/MyStructure.pdb as the target.

  • Click Options > Macro & Movie > Play and choose md_refine.mcr.

  • Wait until the 500 ps simulation is finished.

As described in the publication cited above, this method does not work for all models: some move in the wrong direction. These cases can be detected by looking at the structural quality (force field energy, Ramachandran plot, packing). YASARA will therefore analyze the snapshots at the end and save the results as a table, so that the lowest energy structure can be easily identified. If you have the Twinset installed, this table will also contain the above mentioned WHAT IF quality indicators.

° 

Solve an NMR structure

To solve an NMR structure with YASARA, follow these steps:

  • Create a directory for your project.
  • Save the sequence there, using the filename 'sequence.fasta' or 'sequence.pdb'.
  • Save the file with XPLOR restraints using the filename 'restraints.tbl'.
  • Click Options > Macro & Movie > Set target and choose the project directory.
  • Click Options > Macro & Movie > Play macro and choose the macro 'nmr_solve'.
  • Some time later, the file 'ensemble.pdb' contains your ensemble, and 'result.log' an analysis.

The entire refinement procedure has been implemented using the Yanaconda macro language, so that you can easily adapt everything to your own needs in a very flexible way.

The following sections describe the various stages needed to solve an NMR structure.

° 

Setting the parameters

In addition to the straightforward way of solving a structure, you can adapt the protocol at every step to your particular needs.

Before any work can be done, you need to set the default parameters, which is handled by the macro 'nmr_setdefaults'.

Obvious choices are:
  • The filenames for sequence, restraints, structure ensemble and analysis results.
  • The number of structures you want in the ensemble.

Not so obvious choices are:
  • The pH at which the NMR spectrum was recorded: this information will be used in the final explicit solvent refinement step to assign protonation states of amino acid side-chains.
  • The restraining function: YASARA supports the same functions as XPLOR; the usual default is the 'SoftSquare' function.
  • The restraining parameters: These globally affect all restraints and define the distance averaging as well as the overall scaling factors. Scaling factors are usually larger than the XPLOR equivalents due to internal force calculation differences. YASARA uses two different parameter sets: 'defaultpar' for the final refinement stage and analysis, and 'strongpar' for crude refinement with stronger forces.
  • The option to correct cis-peptide bonds before prolines: Contrary to the other 19 amino acids, prolines have a reasonable chance of allowing a preceding cis-peptide bond. If the 'correctcispro' flag is set to 'yes', these cis-peptide bonds will always be corrected, and any cis-peptide bonds that really occur in the structure will thus be missed. It is therefore a good idea to also generate an ensemble with this flag set to 'no', and check if the lowest energy members all share a certain cis-proline.
  • The list of cysteines that are bridged: If you already know which pairs of cysteines are bridged, store their numbers in 'cysbridgelist'. Alternatively, set 'cysbridgelist' to 'Auto' and let YASARA automatically link cysteines close in space.

° 

Folding the structure

The first step is to fold the protein from the stretched-out conformation. This is done by the macro 'nmr_fold' using the NMRFolding experiment; the process is described in more detail in the documentation of that experiment.

At this stage, speed is more important than accuracy, and the structures generated this way are not realistic proteins yet. But they have helices at the right spot and the peptide chain running in the right direction to quickly arrive at the correct solution during the following molecular dynamics refinement.

The number of structures generated in this stage is specified by the 'structures' parameter in 'nmr_setdefaults' and is equal to the final number of ensemble members.

If you are running Linux, you can of course also skip this step and use other programs like Concoord to fold the structures.

° 

Refining the structure in vacuo

The second step is to convert the roughly folded decoys to realistic proteins with correct hydrogen bonding patterns. This is done by the macro 'nmr_refinevacuo', which runs molecular dynamics simulations in vacuo using the NOVA force field. It performs several refinement cycles, some of which are done without non-bonded interactions, so that atoms can pass through each other and kinetic traps like knots in the peptide chain can be resolved. For each ensemble member, the structure with the lowest restraint violation energy is kept and passed on to the next step.

° 

Refining the structure in explicit solvent

The third step is to improve the structural quality of the final ensemble members (like Ramachandran plot or 3D packing interactions) by using YASARA's most accurate simulation techniques: explicit solvent, electrostatics without cutoff and force fields optimized for protein structure refinement. This is done by the macro 'nmr_refinewater' in several cycles. The structure with the lowest restraint violation energy then becomes the final ensemble member.

° 

Analyzing the ensemble

The fourth and final step is to calculate a total energy for each ensemble member, which is now not limited to restraint violations, but includes force field and solvation energies. The results are saved as 'result.log'. If you own the Twinset, they also include the three most important WHAT_CHECKs: RAMCHK, BBCCHK and QUACHK.

The ensemble members are then sorted with respect to this energy, so that the first structure in the ensemble is the best, then superposed on their secondary structure elements and saved together in one PDB file, by default 'ensemble.pdb'.
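
The ranking step amounts to a sort by total energy. The short Python sketch below illustrates this outside of YASARA. The member names and energy values in the usage note are made up, and treating the lowest energy as the best member is an assumption of this sketch. The actual procedure is the one implemented in the macro 'nmr_analyze'.

```python
def rank_ensemble(energies):
    """Return the ensemble member names sorted by total energy, lowest first.

    energies: dict mapping a member name to its total energy
    (restraint violation + force field + solvation energy).
    """
    return sorted(energies, key=energies.get)
```

With the hypothetical input {"model1": -120.5, "model2": -150.0, "model3": -99.1}, the function returns model2 first, followed by model1 and model3.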

All the details can be found in the macro 'nmr_analyze'.

° 

Handling special cases like Cys-bridges, oligomers and metalloproteins

If you are trying to solve an unusual protein structure, here are some hints:

  • Treating cysteine bridges: The initial folding step is always done with protonated SG atoms and thus without cysteine bridges. If you know from some other experiment which cysteines are bridged, store the residue numbers in 'cysbridgelist' (see nmr_setdefaults). Alternatively, YASARA can automatically link cysteines that get close enough during the refinement in vacuo.

  • Treating metal binding sites with ions: Put the distance restraints involving ions into a second restraint file and run the initial folding step without any ions present. Then add the ions, place them at the center of the protein and then continue with step 2 and both restraint files.

  • Special residue numbering: If the first residue in the sequence does not have the number '1' in the restraint file, the easiest solution is to renumber the residues using the RenumberRes command, right after the linear peptide chain has been built in nmr_fold by the BuildMol command.

° 

Avoiding problems with hydrogen nomenclature

There are currently three common schemes for naming equivalent hydrogens bound to the same heavy atom: PDB, IUPAC/PDB3 and XPLOR. From an objective point of view, the PDB scheme is the smartest one: numbering hydrogens in the first column not only avoids a problematic misalignment of longer names, but also increases the information content: if hydrogen names differ only by the number in the first column, they are known to be bound to the same heavy atom. In July 2007, the PDB changed the naming scheme in all files to PDB V3, which is mostly like IUPAC and thus inherits its consistency problems. Trouble with hydrogen nomenclature is typically a major source of lost time in NMR structure determination and analysis.

Since user-friendliness is a primary goal of YASARA, we searched hard for a solution. In the end, it turned out that many hydrogen-related problems in computational chemistry can be solved by learning from nature: if quantum chemistry itself can hardly distinguish these hydrogens and MD force fields thus assign identical charges, why force different names upon them? Consequently, YASARA simply removes the numbers from equivalent hydrogens.

While this approach is consistent within YASARA, other programs heavily depend on a certain hydrogen nomenclature. To ensure optimal interoperability, YASARA therefore lets you choose a certain atom naming scheme when saving a molecule in PDB or other formats.

Here are a few answers to common questions concerning this approach:

Q: My high quality NMR spectrum allows me to distinguish the chemical shifts of two methylene hydrogens. How do I assign the stereo-specific restraints?

A: Number the hydrogens any way you want in the XPLOR formatted restraint input file and use a floating assignment to let YASARA resolve the ambiguity. As a rule of thumb: if a side-chain rotamer depends on whether a floating assignment or an exact stereo-specific one is used, then the structure is underdetermined anyway. If you need to use stereo-specific assignments, check the nomenclature translation tables below and beware of 'quantum mechanical' tunneling during high temperature simulations, which can lead to deviations from naming conventions.

Q: I am analyzing Protein/DNA interactions and looking at double hydrogen bonds between Asn/Gln side-chains and DNA bases. How do I select the IUPAC HD22/HE22 hydrogen which is on the same side as the OD1/OE1 oxygen?

A: Use the 'with minimum distance' selection operator, here shown for residue 'i':

ListAtom HD2 Res (i) with minimum distance from OD1 Res (i)

Q: I need to visually debug the hydrogen naming mess of another program, how shall I do that in YASARA if the hydrogen numbers are removed? I really want the hydrogen numbers back.

A: Well, then continue reading.

  • When YASARA loads a PDB file, it sorts the equivalent hydrogens bound to the same heavy atom by their name in ascending order, then the hydrogen number is removed. The information about the original hydrogen numbering is thus implicitly retained by the rank order of the hydrogens in the soup.

  • One exception applies to the above procedure: if the hydrogens are part of a methylene or amide group and XPLOR atom names are used (HB1 instead of 1HB etc.), then the sort order is reversed. The reason is that XPLOR uses a reversed nomenclature for methylene and amide hydrogens when compared to the official PDB standard.

  • Unless the PDB file is loaded with corrections disabled ('Correct=No'), YASARA then swaps the hydrogens in methylene, amide and guanidine groups so that the official PDB conventions are met.

  • When you click on a hydrogen atom, its name is displayed in the HUD, together with its rank order in the group of equivalent hydrogens, e.g. HB (1 of 2). The hydrogen with the lowest atom number in the soup is ranked first.

  • This leads to the following translation tables for the various nomenclatures, in all of which YASARA adopts the official PDB numbering:

Methylene hydrogens:
YASARA        PDB   IUPAC/PDB3   XPLOR
HX (1 of 2)   1HX   HX2          HX2
HX (2 of 2)   2HX   HX3          HX1

Amide hydrogens (Asn/Gln):
YASARA         PDB    IUPAC/PDB3   XPLOR
HD2 (1 of 2)   1HD2   HD22         HD22
HD2 (2 of 2)   2HD2   HD21         HD21
HE2 (1 of 2)   1HE2   HE22         HE22
HE2 (2 of 2)   2HE2   HE21         HE21

All other hydrogens:
YASARA          PDB   IUPAC/PDB3   XPLOR
HX (1 of 2,3)   1HX   HX1          HX1
HX (2 of 2,3)   2HX   HX2          HX2
HX (3 of 3)     3HX   HX3          HX3

In short: the rank order displayed by YASARA is normally the same as the hydrogen number in the original PDB file. In methylene groups however, YASARA's rank order is one lower than the IUPAC/PDB3 nomenclature and flipped with respect to the XPLOR nomenclature. In amide groups, YASARA's rank order is flipped with respect to IUPAC/PDB3 and XPLOR nomenclatures.

  • When saving a PDB file or NMR restraints for use with other programs, YASARA provides a Format parameter that lets you select a specific hydrogen nomenclature.
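The translation tables above can be expressed as a small lookup. The following Python sketch is only an illustration of the tables, not part of YASARA: the dictionary NUMBER_BY_RANK, the function hydrogen_name and the group labels 'methylene', 'amide' and 'other' are hypothetical names chosen for this example.

```python
# Hydrogen number used by each nomenclature for a given YASARA rank order.
# The numbers are copied from the translation tables in the text; all names
# in this sketch are hypothetical and do not exist in YASARA.
NUMBER_BY_RANK = {
    "methylene": {
        1: {"PDB": 1, "IUPAC": 2, "XPLOR": 2},
        2: {"PDB": 2, "IUPAC": 3, "XPLOR": 1},
    },
    "amide": {
        1: {"PDB": 1, "IUPAC": 2, "XPLOR": 2},
        2: {"PDB": 2, "IUPAC": 1, "XPLOR": 1},
    },
    "other": {
        1: {"PDB": 1, "IUPAC": 1, "XPLOR": 1},
        2: {"PDB": 2, "IUPAC": 2, "XPLOR": 2},
        3: {"PDB": 3, "IUPAC": 3, "XPLOR": 3},
    },
}


def hydrogen_name(base, rank, group, scheme):
    """Return the numbered hydrogen name for a YASARA name and rank order.

    base   -- YASARA atom name without number, e.g. 'HB' or 'HD2'
    rank   -- rank order shown in the HUD, e.g. 1 for 'HB (1 of 2)'
    group  -- 'methylene', 'amide' or 'other'
    scheme -- 'PDB', 'IUPAC' (IUPAC/PDB3) or 'XPLOR'
    """
    number = NUMBER_BY_RANK[group][rank][scheme]
    if scheme == "PDB":
        # The PDB scheme puts the number in the first column.
        return f"{number}{base}"
    return f"{base}{number}"


# Example: the first methylene hydrogen on CB in all three schemes.
names = [hydrogen_name("HB", 1, "methylene", s) for s in ("PDB", "IUPAC", "XPLOR")]
```

For 'HB (1 of 2)' in a methylene group this yields 1HB, HB2 and HB2, which matches the first row of the methylene table.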

° 

Making floating assignments

To activate floating assignments for all hydrogen atoms, set floating= 'Element H' in nmr_setdefaults.mcr.

The technical details of floating assignments

When you assign two resonances (e.g. 1.5ppm and 1.6ppm) of which you know that they are HG1# and HG2# of a certain valine residue, but you do not know whether HG1# belongs to 1.5 or 1.6 ppm (and the same for HG2#), you can make a 'floating assignment' and leave the choice to YASARA. During the structure determination process, YASARA will then automatically pick the assignment that minimizes the restraint violations.

Since such an uncertainty in the assignment translates to an uncertainty in the atom positions, YASARA borrows the classical uncertainty indicator from X-ray crystallography - the B-factor - to handle floating assignments.

The procedure is as follows:

  • The default B-factor of atoms (e.g. a peptide chain built with BuildMol) is 0. If an assignment involves atoms with B-factor 0, it is assumed to be certain. Before you start a simulation with distance restraints or calculate violation energies, you tell YASARA which atoms to consider for floating assignments by setting their B-factors to 25. This is done automatically in the macro nmr_solve.mcr using the command 'BFactorAtom (floating),25'. If all atoms in an assignment have a B-factor > 0, this shows YASARA that some uncertainty is involved.

  • YASARA then analyzes all atoms with a B-factor > 0 (normally 25, see above) to find atoms or atom groups whose assignments could potentially be swapped to improve the fit to the restraints. This requires a) that the residue contains a second, chemically equivalent atom (group), b) that there is at least one restraint assigned to the atom (group), and c) that there are no other restraints assigned to a subset of the atom group (which would be a bug in the restraint file). Typical examples are the two hydrogens of methylene groups (CBeta of many amino acids etc.) and the two methyl groups of valine and leucine. The procedure does not rely on a priori knowledge about certain residues and will thus also work with unusual amino acids. YASARA sets the B-factors of all atoms identified as being part of floating assignments to 50.

  • During a simulation, or before calculating violation energies, YASARA analyzes the floating assignments to see if the fit can be improved by swapping the assignments (the percentage of assignments analyzed each simulation step can be influenced with the 'FloatGroups' parameter of the RestrainPar command). If the violation energy could be reduced by swapping the assignment, the B-factors of the involved atoms are set to 75.
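The swap decision described in the procedure above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: the violation energy is taken to be the squared excess over an upper distance bound, which is an assumption of this example and not necessarily the energy function YASARA uses, and the names violation and pick_assignment are hypothetical. Only the B-factor values 50 and 75 are taken from the text.

```python
def violation(distance, upper_bound):
    """Assumed penalty: squared excess of a distance over its upper bound."""
    return max(0.0, distance - upper_bound) ** 2


def pick_assignment(restraints):
    """Decide whether two equivalent atom groups A and B should be swapped.

    Each restraint is a tuple (resonance, dist_from_a, dist_from_b, upper_bound):
    resonance is 1 or 2, the two distances are measured from group A and
    group B to the restraint partner. In the direct assignment resonance 1
    belongs to A and resonance 2 to B; in the swapped one it is the reverse.
    Returns the chosen assignment and the B-factor that encodes its status.
    """
    direct = 0.0
    swapped = 0.0
    for resonance, dist_a, dist_b, upper in restraints:
        if resonance == 1:
            direct += violation(dist_a, upper)
            swapped += violation(dist_b, upper)
        else:
            direct += violation(dist_b, upper)
            swapped += violation(dist_a, upper)
    if swapped < direct:
        return "swapped", 75  # part of a swapped assignment
    return "direct", 50       # checked, but not swapped


# Hypothetical example: both restraints are only satisfied after swapping.
example = [(1, 6.0, 3.0, 4.0), (2, 2.5, 5.5, 4.0)]
result = pick_assignment(example)
```

In this made-up example the direct assignment violates both upper bounds while the swapped one violates none, so the sketch reports a swap and the B-factor 75.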

Using the B-factor to encode the floating assignment status has a number of advantages:

B-factor Color Meaning
0 Blue Assignment is certain, this atom is not considered for floating assignments (the default)
25 Magenta This atom is allowed to be part of floating assignments (set by you)
50 Red This atom is permanently checked for floating assignments (set by YASARA)
75 Yellow This atom is part of a swapped assignment (set by YASARA)

  • The floating assignment status is also saved in PDB files (the B-factor is the number on the far right side).

  • YASARA's selection language can be used to restrict floating assignments to certain atom groups:


# Activate floating assignments...
# ...for all hydrogens
BFactorAtom Element H, 25
# ...for the methyl groups of Leu 18:
BFactorAtom HD? Res Leu 18, 25
# ...for all methylene groups:
BFactorAtom Element H with bond to Element C and with 1 bond angle to Element H, 25
# ...for all methyl groups:
BFactorAtom Element H with bond to Element C and with 2 bond angles to Element H, 25


# List atoms that are allowed to be part of floating assignments:
ListAtom BFactor>0
# List atoms that are permanently checked for floating assignments:
ListAtom BFactor>25
# List atoms that are part of swapped assignments:
ListAtom BFactor>50
# Save a list of atoms that are part of swapped assignments:
LogAs swapped.lst,append=No, ListAtom BFactor=75

Since the output of the above commands may be inconvenient to parse when YASARA is coupled to an automated assignment program like ARIA, the ListFloat command provides a compressed output:


# List all floating assignments in object 3gb1
ListFloatObj 3gb1
# List only the swapped floating assignments in object 3gb1
ListFloatObj 3gb1,Type=swapped
# Save the swapped assignments to disk
LogAs swapped.tbl,append=No, ListFloatObj 3gb1,Type=swapped

To pass information about many different floating assignments back to YASARA, simply collect them in a text file, e.g. 'floating.txt':


HG? Residue 65 Segment "   A"
HB? Residue 73 Segment "   B"

Read this file in the YASARA NMR macro, and activate floating assignments for the listed atom groups by setting their B-factors to 25:


for group in file floating.txt
  BFactorAtom (group),25

The similarity to XPLOR syntax can be maximized by replacing 'Residue' with 'resid' and 'Segment' with 'segid'.
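If the file is produced or consumed by an external assignment program, the keyword replacement mentioned above is easy to script. The following Python sketch only performs the two substitutions named in the text; the function name to_xplor_like is hypothetical, and the result is not claimed to be a complete XPLOR selection.

```python
import re


def to_xplor_like(selection):
    """Replace 'Residue' with 'resid' and 'Segment' with 'segid'."""
    selection = re.sub(r"\bResidue\b", "resid", selection)
    return re.sub(r"\bSegment\b", "segid", selection)


# The two example lines from 'floating.txt' above.
groups = ['HG? Residue 65 Segment "   A"', 'HB? Residue 73 Segment "   B"']
converted = [to_xplor_like(group) for group in groups]
```

The first line becomes HG? resid 65 segid "   A"; the padding inside the quoted segment name is left untouched.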

° 

Create your own YASARA Movies

YASARA Movies (available from www.yasara.org/movies) are multimedia presentations containing molecular animations you could not create with standard office software.

° 

Movies are written in Yanaconda

Because movies usually feature very complex animations, they cannot be clicked together with the mouse, but are written in any text editor using the Yanaconda macro language. This allows you to quickly copy and paste blocks from different movies to create a scaffold for your own productions.

° 

Movies are stored in the yasara/mov directory

Go to the yasara/mov directory and create a subdirectory with the name of your movie. In this subdirectory, you must then store all the data needed by your movie, including the Yanaconda macro (named like your movie, but with a .mcr extension).

As a start, open the macro yasara/mov/mdintro/mdintro.mcr, an introduction to molecular dynamics simulations. If you do not have this movie, download it from www.yasara.org/movies.

Change the header to match your movie. The chapter numbers in the TOPIC and TITLE fields define where your movie appears when you click on Help > Play help movie.

Example:

# YASARA MOVIE
# TOPIC:    6. NMR Spectroscopy
# TITLE:    6.4. KING - Killing Inaccurate NOEs by Guessing
# REQUIRES: Dynamics
# AUTHOR:   MyName
# LICENSE:  GPL

° 

Movies start with typical commands

Use the following commands at the beginning:

CD (MacroDir)
Change the working directory to the place of this macro, so that you do not have to specify paths when loading your files
Menu Off
Hide the menus, as a presentation with YASARA menus would look strange.
Console Off
Hide the console as well, which would otherwise display the commands run by the macro.
FullScreen On
Switch to fullscreen mode (helpful when connected to a video projector)
PointerStyle Large
Choose a large mouse pointer, so that even people in the last row can follow your presentation

° 

Movies can wait for a specified time or until you press a button

Probably the most crucial command in a movie is Wait, which waits for a given condition and updates the screen while waiting.

Normally, YASARA redraws the screen after each command. This is convenient when working interactively, but it is not what you want in a movie, because often more than one command is needed to do something. If the intermediate steps were all shown on screen, this would be very annoying. Consequently YASARA stops updating the screen as soon as the console is switched off.

It is then your duty to tell YASARA that the macro has reached a point where it can pause to update the screen. This is achieved with the Wait command:

Wait 1
Wait for one screen update, then continue
Wait 60
Wait for 60 screen updates (= 1 second if FramesPerSec is 60)
Wait LeftButton
Wait for a click on the left mouse button
Wait RightButton
Wait for a click on the right mouse button
Wait ContinueButton
Display a continue button and wait for a click

While the macro is waiting, all automatic movements or rotations continue normally:


Console off
LoadPDB 1crn
# Start an automatic rotation in steps of 2 degrees about the Y-axis
AutoRotateObj 1crn,Y=2
# Wait for 180 screen updates (=360/2, one full rotation of 1crn)
Wait 180
# And stop after one full rotation
AutoRotateObj 1crn,Y=0

Alternatively, this macro would do the same:


Console off
LoadPDB 1crn
# Run 180 single rotation steps in a loop
for i=1 to 180
  RotateObj 1crn,Y=2
  Wait 1

Nevertheless, the first macro is the preferred one, because it allows YASARA to speed up the rotation on slow computers: if the graphics card is not capable of displaying 180 rotation steps fast enough, YASARA may decide to show only 90 and rotate 1crn in steps of four instead of two degrees.
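The numbers passed to Wait follow from simple arithmetic, sketched here in Python. The function names are hypothetical; the default of 60 updates per second is only the FramesPerSec value used as an example in the text above.

```python
import math


def updates_for_rotation(total_degrees, step_degrees):
    """Screen updates needed to rotate by total_degrees in fixed steps."""
    return math.ceil(total_degrees / step_degrees)


def updates_for_seconds(seconds, frames_per_sec=60):
    """Screen updates that correspond to a duration in seconds."""
    return round(seconds * frames_per_sec)


full_turn = updates_for_rotation(360, 2)       # steps of 2 degrees
full_turn_fast = updates_for_rotation(360, 4)  # steps of 4 degrees
one_second = updates_for_seconds(1)
```

A full turn in steps of two degrees needs 180 updates and in steps of four degrees only 90, which are the two cases discussed above.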

Waiting for a pressed button is helpful for tutorials and talks. However, you may want two different versions of your movie: one which you can use during your talk, and one which actually displays what you would say on screen, so that people who missed your talk can download the movie instead. The second one would then also use a 'ContinueButton', so that the audience can e.g. rotate a protein structure before continuing.

This can be achieved with the 'Help' flag. If a movie is run via Help > Play help movie, the variable 'Help' is set to 1. You can therefore start a movie with the following code:


if Help
  # If we are showing a help movie, wait for clicks on the ContinueButton
  button='ContinueButton'
else
  # Otherwise just wait for a click on the left mouse button
  button='LeftButton'

And then use

Wait (button)

to wait for the button you selected above.

If you want, add your text:

if Help
  ShowMessage "Welcome everyone to my first YASARA movie. Today, I will tell you about.."

° 

Labels allow you to jump back and forth between movie sections

Every new section in your movie should start with a label, followed by the Clear command.


#
# SLIDE 1: Title Page
#
TitlePage:
Clear
LoadPDB 1crn

The label makes sure that you can jump to this point using any of these methods:

  • The forward and backward buttons in the top menu line.

  • The Go commands you can find at Options > Macro & Movie.

  • If the menus are switched off, look at the Macro & Movie option in the context menu (RightClick on the background).

  • Pass the name of the label to the PlayMacro command.

The Clear command sets YASARA to a defined state, which is important because the user can jump to a label at any time while a macro is executed. In case you do not want to clear everything (e.g. some logos you loaded as image 1 and which you want to display during the entire presentation), use your own Clear command:


TitlePage:
# MyClear: Delete all objects and all but the first image
DelObj All
DelImage !1

If you want to temporarily skip a section during development, just comment it out by placing triple quotes """ at its beginning and end.

° 

Work with YASARA and your text editor in parallel

To quickly develop a movie, follow these steps:

  • Write the initial part in your text editor (e.g. loading a structure or image), save it. You can use the StopMacro command to stop the macro at a certain point.

  • Run the macro in YASARA: click Options > Macro & Movie > Play macro, browse to the directory of your movie and select the .mcr file, or simply press <Ctrl>+<M> to rerun the last macro. At this point it is important that you run your movie as a normal macro, otherwise YASARA will restore the current scene after the movie has ended, which is usually not what you want while debugging.

  • Move the protein you loaded to the right spot, get its current position and orientation and use these numbers in the macro with Pos and Ori or AutoMoveTo and AutoRotateTo to animate the protein. After some time, when you are used to YASARA's coordinate system, you will know the right numbers just by looking at the screen.

  • Iterate these steps.

° 

Movies can be imported from OpenOffice or PowerPoint

When preparing movies for a scientific presentation that may last several hours, writing everything from scratch in Yanaconda takes too long. An efficient solution is to prepare the slides in OpenOffice Impress or Microsoft Powerpoint, to convert them to a YASARA movie and to finally add animations on those slides that require them.

The procedure is as follows:

  • Create the slides in OpenOffice Impress and save them in the standard format of your OpenOffice version (e.g. *.sxi). To import a PowerPoint presentation (e.g. in *.ppt format), first open it in OpenOffice Impress, correct the little details that OpenOffice's import filter got wrong and save it also in OpenOffice standard format.

  • Start YASARA and click Options > Macro & Movie > Import movie, select the OpenOffice *.sxi file, then choose a name for your movie and the animation to change the slides. In addition to the default AlphaBlend, the following animation types are supported for entering the screen: fromLeft, fromRight, fromTop, fromBottom, Circle, Jump, BlendLeft, BlendRight, BlendTop, BlendBottom, Pop. Leaving the screen can be done with either toLeft, toRight, toTop, toBottom, Circle, Drop, BlendLeft, BlendRight, BlendTop, BlendBottom or Pop.

  • YASARA then creates a new movie directory in yasara/mov/YourMovieName, runs OpenOffice to convert the slides to platform independent PNG images, closes OpenOffice, and creates a Yanaconda macro to show the slides with the requested transition animations. Due to a bug in the Windows version of OpenOffice, YASARA may hang right after the OpenOffice window has been closed. In this case, press <Ctrl>+<Alt>+<Del> to bring up the task manager and manually kill the process named 'soffice.exe'. Then the conversion can be completed.

  • Click Help > Play help movie and select your movie in the Multimedia section 6.99.

  • Finally, open the macro yasara/mov/YourMovieName/YourMovieName.mcr in a text editor and add molecular animations where you intended. You can also give each slide a clear name, so that you can easily jump to a certain part of your presentation when required by a question from the audience.

° 

Bitmap images can be displayed and animated

YASARA can load bitmap images in PNG or (worse, because uncompressed) BMP format. These bitmaps can be displayed as (transparent) back- and foreground graphics or attached as textures to objects.

In practice, you use the LoadPNG and LoadBMP commands to load the image, then show, move or animate it. Objects with attached images need to be created separately.

Example to show five images in a row:


for i=1 to 5
  LoadPNG MyImage(i)
  ShowImage MyImage(i)
  Wait (Button)
  DelImage MyImage(i)

° 

Text can be displayed using 3D letters

If you have only a small amount of text, you can display and animate it directly by creating a text object and printing to it using your favorite font:


MakeTextObj Title,Width=800,Height=80
Font Name=Baikal,Height=40%,Color=White,Spacing=2,Depth=100,DepthCol=ff8000
PosText 50%,50%,Justify=Center
Print "Molecular Dynamics Simulations\n"
Print "Watching Nature@Work"
PosObj Title,0,180,670

° 

The look of the movie depends on the aspect ratio of the YASARA window

YASARA automatically scales the scene such that movies always look the same, independent of the actual window size, as long as the aspect ratio of the view area does not change. E.g. if you create your movie with a view area of 1024x768 pixels, it will look exactly the same when played back in a view area of 640x480 pixels.

The 'view area' is the part of the YASARA window where you can see molecules. If you disable the menus, the view area is identical to the YASARA window; otherwise it is 48 pixels shorter (the top and bottom menu bars are each 24 pixels high).

Example: You want to create a movie that will be played back on a cheap video beamer with 640x480 pixels resolution, but you do not want to make your window that small while creating the movie. In addition you still want to use the menus, but of course you do not want to show the menus with the beamer.

So the final movie resolution will be 640x480 pixels, that's an aspect ratio of 640/480 = 1.333. Now you need a window size that gives you the same aspect ratio for the view area. A typical choice for the view area is 1024x768 pixels (1024/768 = 1.333). As we also want to show the menu bars which require 48 pixels, the final window size is 1024x(768+48) = 1024x816. Press <Space> to bring up the console and type:

ScreenSize 1024,816

Now create your movie, and to play it back on the beamer, add the following commands:

ScreenSize 640,480
Menu off

Two little exceptions from these rules:

  • The head-up displays are not rescaled. E.g. the location of the message created with ShowMessage depends on the window size.

  • The rules above apply only as long as the view area is <=1280x960 pixels. Above 1280x960 pixels YASARA starts to show you more of the scene and does not scale it up any more.
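The window size arithmetic from the example above can be checked with a few lines of Python. This is a sketch with hypothetical function names; the only value taken from the text is the menu bar height of 24 pixels, and the 1280x960 limit mentioned above is not modeled.

```python
MENU_BAR_HEIGHT = 24  # pixels per menu bar (top and bottom), from the text


def window_size(view_width, view_height, menus=True):
    """Window size needed to obtain a given view area."""
    extra = 2 * MENU_BAR_HEIGHT if menus else 0
    return view_width, view_height + extra


def same_aspect_ratio(size_a, size_b):
    """Compare two (width, height) pairs without rounding errors."""
    return size_a[0] * size_b[1] == size_a[1] * size_b[0]


authoring = window_size(1024, 768, menus=True)   # while creating the movie
playback = window_size(640, 480, menus=False)    # on the beamer, menus off
matches = same_aspect_ratio((1024, 768), (640, 480))
```

With menus shown, a 1024x768 view area needs a 1024x816 window, and its aspect ratio equals that of the 640x480 playback window, so the movie looks the same in both.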

° 

Hints for using movies in important presentations

When you base your talk at a scientific conference or job interview on a YASARA movie, consider the following hints to make everything run smoothly:

  • Use Labels for each section to make sure that you can easily jump there via the context menu in reply to questions from the audience. Choose obvious names, not 'SlideX'.

  • Keep in mind that jumping to a label does not influence the scene or simulation parameters. Even if the first command after a label is Clear, the force field will still be the same as before the jump. Hence if a certain slide does not specify a force field but runs a simulation, it will use whichever force field was selected before the jump. In short, make sure to set ForceField, Cutoff, Interactions, Boundary, Longrange, TimeStep, SimSteps and TempCtrl correctly in each slide that shows a simulation.

  • When running a simulation, use 'Sim Pause,in=...' to make sure that it does not continue forever. Otherwise, if your talk takes longer than expected, you will impress your audience with a real-time simulation you have never seen before yourself - including potential surprises.