Before any work can be done, you need to set the default parameters, which is handled by the macro
'nmr_setdefaults'. Obvious choices are:
- The filenames for sequence, restraints, structure ensemble and analysis results.
- The number of structures you want in the ensemble.
Not so obvious choices are:
- The pH at which the NMR spectrum was recorded: this information will be used in
the final explicit solvent refinement step to assign protonation states of amino
acid side-chains.
- The restraining function: YASARA supports the same functions as XPLOR,
described in detail here; the usual default is the 'SoftSquare' function.
- The restraining parameters: These globally affect all restraints and define
the distance averaging as well as the overall scaling factors.
Scaling factors are usually larger than the XPLOR equivalents due to internal
force calculation differences. YASARA uses two different parameter sets:
'defaultpar' for the final refinement stage and analysis, and 'strongpar'
for crude refinement with stronger forces.
- The option to correct cis-peptide bonds before prolines: Contrary to the
other 19 amino acids, prolines have a reasonable chance of allowing a preceding
cis-peptide bond. If the 'correctcispro' flag is set to 'yes', these cis-peptide
bonds will always be corrected, and any cis-peptide bonds that really occur in
the structure will thus be missed. It is therefore a good idea to also generate
an ensemble with this flag set to 'no', and check if the lowest energy members
all share a certain cis-proline.
- The list of cysteines that are bridged: If you already know which pairs of
cysteines are bridged, store their numbers in 'cysbridgelist'. Alternatively,
set 'cysbridgelist' to 'Auto' and let YASARA automatically link cysteines
close in space.
At this stage, speed is more important than accuracy, and the structures generated this way are not yet realistic proteins. However, they have helices in the right places and the peptide chain running in the right direction, so that the following molecular dynamics refinement can quickly arrive at the correct solution.
The number of structures generated in this stage is specified by the
'structures' parameter in 'nmr_setdefaults' and is equal to the final number of ensemble members.
If you are running Linux, you can of course also skip this step and use other programs like Concoord to fold the structures.
The second step is to convert the roughly folded decoys to realistic proteins with correct hydrogen bonding patterns. This is done by the macro 'nmr_refinevacuo', which runs molecular dynamics simulations in vacuo using the NOVA force field. It performs several refinement cycles, some of which are done without non-bonded interactions, so that atoms can pass through each other and kinetic traps like knots in the peptide chain can be resolved. For each ensemble member, the structure with the lowest restraint violation energy is kept and passed on to the next step.
The ensemble members are then sorted with respect to this energy, so that the first structure in the ensemble is the best, then superposed on their secondary structure elements and saved together in one PDB file, by default 'ensemble.pdb'. All the details can be found in the macro 'nmr_analyze'.
If you are trying to solve an unusual protein structure, here are some hints:
- Treating cysteine bridges: The initial folding step is always done
with protonated SG atoms and thus without cysteine bridges. If you know from some other experiment
which cysteines are bridged, store the residue numbers in 'cysbridgelist' (see nmr_setdefaults).
Alternatively, YASARA can automatically link cysteines that get close enough during
the refinement in vacuo.
- Treating metal binding sites with ions: Put the distance restraints involving
ions into a second restraint file and run the initial folding step without any ions
present. Then add the ions, place them at the center of the protein, and continue
with step 2 using both restraint files.
- Special residue numbering: If the first residue in the sequence does not
have the number '1' in the restraint file, the easiest solution is to renumber
the residues using the RenumberRes command, right after the linear peptide
chain has been built in nmr_fold by the BuildMol command.
There are currently three common schemes for naming equivalent hydrogens bound to the same heavy atom: PDB, IUPAC/PDB3 and XPLOR. From an objective point of view, the PDB scheme is the smartest one: numbering hydrogens in the first column not only avoids a problematic misalignment of longer names, but also increases the information content: if hydrogen names differ only by the number in the first column, they are known to be bound to the same heavy atom. In July 2007, the PDB changed the naming scheme in all files to PDB V3, which is mostly like IUPAC and thus inherits its consistency problems. Trouble with hydrogen nomenclature is typically a major source of lost time in NMR structure determination and analysis.
Since user-friendliness is a primary goal of YASARA, we searched hard for a solution. In the end, it turned out that many hydrogen-related problems in computational chemistry can be magically solved by learning from nature: if quantum chemistry itself can hardly distinguish these hydrogens and MD force fields thus assign identical charges, why force different names upon them? Consequently, YASARA simply removes the numbers from equivalent hydrogens.
While this approach is consistent within YASARA, other programs depend heavily on a certain hydrogen nomenclature. To ensure optimal interoperability, YASARA therefore lets you choose a certain atom naming scheme when saving a molecule in PDB or other formats. Here are a few answers to common questions concerning this approach:
Q: My high quality NMR spectrum allows me to distinguish the chemical shifts of two methylene hydrogens. How do I assign the stereo-specific restraints?
A: Number the hydrogens any way you want in the XPLOR formatted restraint input file and use a
floating assignment to let YASARA resolve the ambiguity. As a rule of thumb: if a side-chain rotamer depends on whether a floating assignment or an exact stereo-specific one is used,
then the structure is underdetermined anyway. If you need to use stereo-specific assignments,
check the nomenclature translation tables below and beware of
'quantum mechanical' tunneling during high temperature simulations, which can lead to deviations from naming conventions.
Q: I am analyzing Protein/DNA interactions and looking at double hydrogen bonds between Asn/Gln side-chains and DNA bases. How do I select the IUPAC HD22/HE22 hydrogen which is on the same side as the OD1/OE1 oxygen?
A: Use the 'with minimum distance' selection operator,
here shown for residue 'i':
ListAtom HD2 Res (i) with minimum distance from OD1 Res (i)
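What the 'with minimum distance' operator does can be illustrated outside YASARA with a few lines of Python. This is only a sketch of the idea: the function name `closest_hydrogen`, the labels `H_a`/`H_b` and the coordinates are made up for illustration and are not taken from a real structure or from YASARA itself.

```python
import math

def closest_hydrogen(hydrogens, oxygen):
    """Return the label of the hydrogen nearest to the oxygen.

    hydrogens: dict mapping a label to an (x, y, z) tuple in Angstrom.
    oxygen: (x, y, z) tuple in Angstrom.
    """
    return min(hydrogens, key=lambda label: math.dist(hydrogens[label], oxygen))

# Hypothetical coordinates of the two equivalent amide hydrogens and OD1
od1 = (0.0, 1.2, 0.0)
amide_hydrogens = {"H_a": (1.9, -0.6, 0.0), "H_b": (-0.9, -1.7, 0.0)}
print(closest_hydrogen(amide_hydrogens, od1))
```

Selecting by geometry instead of by name is what makes the approach independent of the hydrogen nomenclature in the input file.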
Q: I need to visually debug the hydrogen naming mess of another program. How can I do that in YASARA if the hydrogen numbers are removed? I really want the hydrogen numbers back.
A: Well, then continue reading.
- When YASARA loads a PDB file, it sorts the equivalent hydrogens bound to the
same heavy atom by their name in ascending order, then the hydrogen number is removed.
The information about the original hydrogen numbering is thus implicitly retained
by the rank order of the hydrogens in the soup.
- One exception applies to the above procedure: if the hydrogens are part of
a methylene or amide group and XPLOR atom names are used (HB1 instead of 1HB etc.),
then the sort order is reversed. The reason is that XPLOR uses a reversed nomenclature
for methylene and amide hydrogens when compared to the official PDB standard.
- Unless the PDB file is loaded with corrections disabled ('Correct=No'),
YASARA then swaps the hydrogens in methylene, amide and guanidine groups so that the
official PDB conventions are met.
- When you click on a hydrogen atom, its name is displayed in the HUD, together
with its rank order in the group of equivalent hydrogens, e.g. HB (1 of 2). The
hydrogen with the lowest atom number in the soup is ranked first.
- This leads to the following translation tables for the various nomenclatures,
in all of which YASARA adopts the official PDB numbering:
Methylene hydrogens:
| YASARA | PDB | IUPAC/PDB3 | XPLOR |
| HX (1 of 2) | 1HX | HX2 | HX2 |
| HX (2 of 2) | 2HX | HX3 | HX1 |
Amide hydrogens (Asn/Gln):
| YASARA | PDB | IUPAC/PDB3 | XPLOR |
| HD2 (1 of 2) | 1HD2 | HD22 | HD22 |
| HD2 (2 of 2) | 2HD2 | HD21 | HD21 |
| HE2 (1 of 2) | 1HE2 | HE22 | HE22 |
| HE2 (2 of 2) | 2HE2 | HE21 | HE21 |
All other hydrogens:
| YASARA | PDB | IUPAC/PDB3 | XPLOR |
| HX (1 of 2,3) | 1HX | HX1 | HX1 |
| HX (2 of 2,3) | 2HX | HX2 | HX2 |
| HX (3 of 3) | 3HX | HX3 | HX3 |
In short: the rank order displayed by YASARA is normally the same as the hydrogen number in the original PDB file. In methylene groups however,
YASARA's rank order is one lower than the IUPAC/PDB3 nomenclature and flipped with respect to the XPLOR nomenclature. In amide groups,
YASARA's rank order is flipped with respect to IUPAC/PDB3 and XPLOR nomenclatures.
- When saving a PDB file or NMR restraints for use with other programs,
YASARA provides a Format parameter that lets you select a specific
hydrogen nomenclature.
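For scripts that post-process YASARA output, the translation tables can be written down as a small lookup. The following Python sketch only encodes the tables as given in the text; the names `TABLES` and `translate` and the group keys 'methylene', 'amide' and 'other' are assumptions made for this example and are not part of YASARA.

```python
# YASARA rank order -> hydrogen number in each naming scheme,
# copied from the three translation tables in the text.
TABLES = {
    "methylene": {"PDB": {1: 1, 2: 2}, "IUPAC/PDB3": {1: 2, 2: 3}, "XPLOR": {1: 2, 2: 1}},
    "amide": {"PDB": {1: 1, 2: 2}, "IUPAC/PDB3": {1: 2, 2: 1}, "XPLOR": {1: 2, 2: 1}},
    "other": {
        "PDB": {1: 1, 2: 2, 3: 3},
        "IUPAC/PDB3": {1: 1, 2: 2, 3: 3},
        "XPLOR": {1: 1, 2: 2, 3: 3},
    },
}

def translate(base, rank, group, scheme):
    """Build the scheme-specific name from the YASARA name and rank order.

    base:   YASARA atom name without number, e.g. 'HB' or 'HD2'
    rank:   rank order shown by YASARA, e.g. 1 for 'HB (1 of 2)'
    group:  'methylene', 'amide' or 'other' (hypothetical keys)
    scheme: 'PDB', 'IUPAC/PDB3' or 'XPLOR'
    """
    number = TABLES[group][scheme][rank]
    # The PDB scheme puts the number in front, the others append it
    return f"{number}{base}" if scheme == "PDB" else f"{base}{number}"

print(translate("HB", 1, "methylene", "IUPAC/PDB3"))
print(translate("HD2", 1, "amide", "XPLOR"))
```

Keeping the three tables in one dictionary makes the two special cases (methylene and amide groups) visible at a glance.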
To activate floating assignments for all hydrogen atoms, set floating='Element H' in nmr_setdefaults.mcr.
The technical details of floating assignments
When you assign two resonances (e.g. 1.5 ppm and 1.6 ppm) that you know belong to HG1# and HG2# of a certain valine residue,
but you do not know whether HG1# belongs to 1.5 or 1.6 ppm (and the same for HG2#),
you can make a 'floating assignment' and leave the choice to YASARA. During the structure determination process,
YASARA will then automatically pick the assignment that minimizes the restraint violations.
Since such an uncertainty in the assignment translates to an uncertainty in the atom positions,
YASARA borrows the classical uncertainty indicator from X-ray crystallography -
the B-factor - to handle floating assignments. The procedure is as follows:
- The default B-factor of atoms (e.g. a peptide chain built with BuildMol) is 0.
If an assignment involves atoms with B-factor 0, it is assumed to be certain.
Before you start a simulation with distance restraints
or calculate violation energies, you tell YASARA which atoms to
consider for floating assignments by setting their B-factors to 25.
This is done automatically in the macro nmr_solve.mcr using the command
'BFactorAtom (floating),25'.
If all atoms in an assignment have a B-factor > 0, this shows YASARA that some uncertainty is involved.
- YASARA then analyzes all atoms with a B-factor > 0 (normally 25, see above) to find atoms or
atom groups whose assignments could potentially be swapped to improve the fit to the restraints.
This requires a) that the residue contains a second, chemically equivalent atom (group),
b) that there is at least one restraint assigned to the atom (group), and c) that
there are no other restraints assigned to a subset of the atom group (which would
be a bug in the restraint file).
Typical examples are the two hydrogens of methylene groups (CBeta of many amino acids etc.)
and the two methyl groups of valine and leucine. The procedure does not rely
on a priori knowledge about certain residues and will thus also work with unusual amino acids.
YASARA sets the B-factors of all atoms identified as being part of floating assignments to 50.
- During a simulation, or before calculating violation energies, YASARA analyzes the
floating assignments to see if the fit can be improved by swapping the assignments
(the percentage of assignments analyzed each simulation step can be influenced with the
'FloatGroups' parameter of the RestrainPar command).
If the violation energy could be reduced by swapping the assignment, the B-factors of the involved atoms are set to 75.
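The swap test in the last step can be sketched in Python. This is a minimal illustration under stated assumptions, not YASARA's implementation: the violation energy is a plain squared upper-bound violation rather than one of the actual restraining functions, the atom labels 'H1'/'H2' and the coordinates are hypothetical, and only the B-factor values 50 and 75 are taken from the text.

```python
import math

def violation_energy(positions, restraints, swapped=False):
    """Sum of squared upper-bound violations (simplified stand-in,
    not YASARA's actual restraining function)."""
    swap = {"H1": "H2", "H2": "H1"}
    energy = 0.0
    for label, partner_pos, upper in restraints:
        # With swapped=True, each restraint is applied to the other atom
        atom = swap[label] if swapped else label
        distance = math.dist(positions[atom], partner_pos)
        energy += max(0.0, distance - upper) ** 2
    return energy

def check_floating(positions, restraints):
    """Return (b_factor, energy): 75 if swapping lowers the energy, else 50."""
    e_direct = violation_energy(positions, restraints)
    e_swapped = violation_energy(positions, restraints, swapped=True)
    if e_swapped < e_direct:
        return 75, e_swapped
    return 50, e_direct

# Hypothetical methylene hydrogens with one restraint each:
# (assigned atom, position of the restraint partner, upper distance bound)
positions = {"H1": (0.0, 0.0, 0.0), "H2": (1.8, 0.0, 0.0)}
restraints = [("H1", (5.0, 0.0, 0.0), 3.5), ("H2", (-3.0, 0.0, 0.0), 3.5)]
b_factor, energy = check_floating(positions, restraints)
print(b_factor, energy)
```

In this made-up example both restraints are violated with the original assignment and satisfied after the swap, so the pair would be flagged with B-factor 75.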
Using the B-factor to encode the floating assignment status has a number of advantages:
| B-factor | Color | Meaning |
| 0 | Blue | Assignment is certain, this atom is not considered for floating assignments (the default) |
| 25 | Magenta | This atom is allowed to be part of floating assignments (set by you) |
| 50 | Red | This atom is permanently checked for floating assignments (set by YASARA) |
| 75 | Yellow | This atom is part of a swapped assignment (set by YASARA) |
- The floating assignment status is also saved in PDB files (the B-factor is
the number on the far right side).
- YASARA's selection language can be used to restrict floating assignments to certain
atom groups:
# Activate floating assignments...
# ...for all hydrogens
BFactorAtom Element H, 25
# ...for the methyl groups of Leu 18:
BFactorAtom HD? Res Leu 18, 25
# ...for all methylene groups:
BFactorAtom Element H with bond to Element C and with 1 bond angle to Element H, 25
# ...for all methyl groups:
BFactorAtom Element H with bond to Element C and with 2 bond angles to Element H, 25
# List atoms that are allowed to be part of floating assignments:
ListAtom BFactor>0
# List atoms that are permanently checked for floating assignments:
ListAtom BFactor>25
# List atoms that are part of swapped assignments:
ListAtom BFactor>50
# Save a list of atoms that are part of swapped assignments:
LogAs swapped.lst,append=No, ListAtom BFactor=75
Since the output of the above commands may be inconvenient to parse when YASARA is coupled to an automated assignment program like ARIA,
the ListFloat command provides a compressed output:
# List all floating assignments in object 3gb1
ListFloatObj 3gb1
# List only the swapped floating assignments in object 3gb1
ListFloatObj 3gb1,Type=swapped
# Save the swapped assignments to disk
LogAs swapped.tbl,append=No, ListFloatObj 3gb1,Type=swapped
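Because the floating assignment status is stored in the B-factor column of saved PDB files, an external program can also recover it without YASARA. The Python sketch below assumes standard fixed-column PDB ATOM/HETATM records (B-factor in columns 61-66); the names `floating_status` and `make_atom_line` are hypothetical, and the demo records are generated in the script rather than taken from a real file.

```python
# Meaning of the B-factor values, as listed in the table in the text
STATUS = {0: "certain", 25: "allowed", 50: "checked", 75: "swapped"}

def floating_status(pdb_text):
    """Return (atom name, residue name, residue number, status) per atom."""
    result = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            name = line[12:16].strip()      # columns 13-16
            resname = line[17:20].strip()   # columns 18-20
            resnum = int(line[22:26])       # columns 23-26
            bfactor = float(line[60:66])    # columns 61-66
            status = STATUS.get(round(bfactor), "unknown")
            result.append((name, resname, resnum, status))
    return result

def make_atom_line(serial, name, resname, chain, resnum, bfactor):
    """Build a fixed-column ATOM record for the demo (dummy coordinates)."""
    x = y = z = 0.0
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resnum:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{bfactor:6.2f}")

demo = "\n".join([
    make_atom_line(1, "CA", "VAL", "A", 65, 0.0),
    make_atom_line(2, "HB", "VAL", "A", 65, 75.0),
])
for entry in floating_status(demo):
    print(entry)
```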
To pass information about many different floating assignments back to YASARA,
simply collect them in a text file, e.g. 'floating.txt':
HG? Residue 65 Segment " A"
HB? Residue 73 Segment " B"
Read this file in the YASARA NMR macro, and activate floating assignments for the listed atom groups by setting their B-factors to
25:
for group in file floating.txt
  BFactorAtom (group),25
The similarity to XPLOR syntax can be maximized by replacing
'Residue' with 'resid' and 'Segment' with 'segid'.
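An external assignment program could generate the content of 'floating.txt' along the following lines. This Python sketch only reproduces the line format shown above; the function names and the `xplor_keywords` switch are assumptions made for this example, and the segment string is passed through exactly as supplied.

```python
def selection_line(atoms, residue, segment, xplor_keywords=False):
    """Format one atom group selection, e.g. HG? Residue 65 Segment " A"."""
    if xplor_keywords:
        res_kw, seg_kw = "resid", "segid"
    else:
        res_kw, seg_kw = "Residue", "Segment"
    return f'{atoms} {res_kw} {residue} {seg_kw} "{segment}"'

def floating_file_text(groups, xplor_keywords=False):
    """Join the selections of all groups, one per line."""
    lines = [selection_line(a, r, s, xplor_keywords) for a, r, s in groups]
    return "\n".join(lines) + "\n"

# The two groups from the example above
groups = [("HG?", 65, " A"), ("HB?", 73, " B")]
print(floating_file_text(groups), end="")
print(floating_file_text(groups, xplor_keywords=True), end="")
```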