
Avoiding problems with hydrogen nomenclature

There are currently three common schemes for naming equivalent hydrogens bound to the same heavy atom: PDB, IUPAC/PDB3 and XPLOR. From an objective point of view, the PDB scheme is the smartest one: numbering hydrogens in the first column not only avoids a problematic misalignment of longer names, but also increases the information content: if hydrogen names differ only by the number in the first column, they are known to be bound to the same heavy atom. In July 2007, the PDB changed the naming scheme in all files to PDB V3, which is mostly like IUPAC and thus inherits its consistency problems. Trouble with hydrogen nomenclature is typically a major source of lost time in NMR structure determination and analysis.

Since user-friendliness is a primary goal of YASARA, we searched hard for a solution. In the end, it turned out that many hydrogen-related problems in computational chemistry can be magically solved by learning from nature: if quantum chemistry itself can hardly distinguish these hydrogens and MD force fields thus assign identical charges, why force different names upon them? Consequently, YASARA simply removes the numbers from equivalent hydrogens.

While this approach is consistent within YASARA, other programs depend heavily on a specific hydrogen nomenclature. To ensure optimal interoperability, YASARA therefore lets you choose an atom naming scheme when saving a molecule in PDB or other formats.

Here are a few answers to common questions concerning this approach:

Q: My high quality NMR spectrum allows me to distinguish the chemical shifts of two methylene hydrogens. How do I assign the stereo-specific restraints?

A: Number the hydrogens any way you want in the XPLOR formatted restraint input file and use a floating assignment to let YASARA resolve the ambiguity. As a rule of thumb: if a side-chain rotamer depends on whether a floating assignment or an exact stereo-specific one is used, then the structure is underdetermined anyway. If you need to use stereo-specific assignments, check the nomenclature translation tables below and beware of 'quantum mechanical' tunneling during high temperature simulations, which can lead to deviations from naming conventions.

Q: I am analyzing Protein/DNA interactions and looking at double hydrogen bonds between Asn/Gln side-chains and DNA bases. How do I select the IUPAC HD22/HE22 hydrogen which is on the same side as the OD1/OE1 oxygen?

A: Use the 'with minimum distance' selection operator, here shown for residue 'i':

ListAtom HD2 Res (i) with minimum distance from OD1 Res (i)
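
To make the idea behind this selection concrete, here is a minimal sketch in plain Python (not YASARA macro code): given the coordinates of the two equivalent amide hydrogens and of the oxygen, it picks the hydrogen closest to the oxygen. The function name `nearest_atom`, the labels and the coordinates are hypothetical and chosen only for illustration.

```python
import math

def nearest_atom(candidates, target):
    """Return the label of the candidate atom closest to the target position.

    candidates: dict mapping a label to an (x, y, z) tuple
    target:     (x, y, z) tuple, e.g. the position of OD1
    """
    return min(candidates, key=lambda label: math.dist(candidates[label], target))

# Hypothetical coordinates in Angstrom, for illustration only
od1 = (0.0, 0.0, 0.0)
hd2_hydrogens = {
    "HD2 (1 of 2)": (2.4, 0.3, 0.0),
    "HD2 (2 of 2)": (3.2, 1.1, 0.0),
}
print(nearest_atom(hd2_hydrogens, od1))
```

Selecting by distance avoids relying on any particular hydrogen numbering, which is the point of the answer above.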

Q: I need to visually debug the hydrogen naming mess of another program. How can I do that in YASARA if the hydrogen numbers are removed? I really want the hydrogen numbers back.

A: Well, then continue reading.

  • When YASARA loads a PDB file, it sorts the equivalent hydrogens bound to the same heavy atom by their name in ascending order, then the hydrogen number is removed. The information about the original hydrogen numbering is thus implicitly retained by the rank order of the hydrogens in the soup.

  • One exception applies to the above procedure: if the hydrogens are part of a methylene or amide group and XPLOR atom names are used (HB1 instead of 1HB etc.), then the sort order is reversed. The reason is that XPLOR uses a reversed nomenclature for methylene and amide hydrogens when compared to the official PDB standard.

  • Unless the PDB file is loaded with corrections disabled ('Correct=No'), YASARA then swaps the hydrogens in methylene, amide and guanidine groups so that the official PDB conventions are met.

  • When you click on a hydrogen atom, its name is displayed in the HUD, together with its rank order in the group of equivalent hydrogens, e.g. HB (1 of 2). The hydrogen with the lowest atom number in the soup is ranked first.

  • This leads to the following translation tables for the various nomenclatures, in all of which YASARA adopts the official PDB numbering:

Methylene hydrogens:
YASARA         PDB    IUPAC/PDB3    XPLOR
HX (1 of 2)    1HX    HX2           HX2
HX (2 of 2)    2HX    HX3           HX1

Amide hydrogens (Asn/Gln):
YASARA          PDB     IUPAC/PDB3    XPLOR
HD2 (1 of 2)    1HD2    HD22          HD22
HD2 (2 of 2)    2HD2    HD21          HD21
HE2 (1 of 2)    1HE2    HE22          HE22
HE2 (2 of 2)    2HE2    HE21          HE21

All other hydrogens:
YASARA           PDB    IUPAC/PDB3    XPLOR
HX (1 of 2,3)    1HX    HX1           HX1
HX (2 of 2,3)    2HX    HX2           HX2
HX (3 of 3)      3HX    HX3           HX3

In short: the rank order displayed by YASARA is normally the same as the hydrogen number in the original PDB file. In methylene groups, however, YASARA's rank order is one lower than the IUPAC/PDB3 number and flipped with respect to the XPLOR nomenclature. In amide groups, YASARA's rank order is flipped with respect to both the IUPAC/PDB3 and XPLOR nomenclatures.

  • When saving a PDB file or NMR restraints for use with other programs, YASARA provides a Format parameter that lets you select a specific hydrogen nomenclature.
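
The translation tables above can also be written as a small lookup routine, which may help when converting names in your own scripts. The following is a minimal sketch in plain Python, not part of YASARA: the function name `translate_hydrogen` and the group labels `"methylene"`, `"amide"` and `"other"` are assumptions made for this example, and the caller must know which group the hydrogens belong to.

```python
def translate_hydrogen(base, rank, group="other"):
    """Translate a YASARA hydrogen (base name plus rank) into the three schemes.

    base:  hydrogen name without number, e.g. "HB" or "HD2"
    rank:  rank order shown by YASARA, e.g. 1 for "HB (1 of 2)"
    group: "methylene", "amide" or "other" (hypothetical labels)
    """
    if group not in ("methylene", "amide", "other"):
        raise ValueError("unknown group: %s" % group)
    if group in ("methylene", "amide") and rank not in (1, 2):
        raise ValueError("methylene and amide groups have exactly two hydrogens")
    if rank not in (1, 2, 3):
        raise ValueError("rank must be 1, 2 or 3")

    # The PDB scheme puts the number in front and always matches the rank
    pdb = f"{rank}{base}"
    if group == "methylene":
        iupac = f"{base}{rank + 1}"   # IUPAC/PDB3 number is one higher than the rank
        xplor = f"{base}{3 - rank}"   # XPLOR numbering is flipped
    elif group == "amide":
        iupac = xplor = f"{base}{3 - rank}"   # both are flipped
    else:
        iupac = xplor = f"{base}{rank}"       # numbers match the rank
    return {"PDB": pdb, "IUPAC/PDB3": iupac, "XPLOR": xplor}

print(translate_hydrogen("HB", 1, "methylene"))
print(translate_hydrogen("HD2", 2, "amide"))
```

The routine only reproduces the three tables; it does not detect the group type from a structure.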