Open Babel Error in reading PDB file

Sheng-Hung Wang
The error messages are shown below:
----------------------------------------------
WARNING: Problems reading a PDB file
Problems reading a HETATM or ATOM record.
According to the PDB specification,
columns 77-78 should contain the element symbol of an atom.
----------------------------------------------

Please note that columns 77-78 in the PDB format are optional. In fact, columns 77-78 are usually missing for the hetero atoms (ligands) in the PDB database. It would be more applicable to read columns 13-14 for atom name.

Shawn


------------------------------------------------------------------------------
Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
trial. Simplify your report design, integration and deployment - and focus on
what you do best, core application coding. Discover what's new with
Crystal Reports now.  http://p.sf.net/sfu/bobj-july
_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Re: Open Babel Error in reading PDB file

Andrew Dalke
On Nov 20, 2009, at 12:14 AM, Sheng-Hung Wang wrote:

> The error messages are showing below:
> ----------------------------------------------
> WARNING: Problems reading a PDB file
> Problems reading a HETATM or ATOM record.
> According to the PDB specification,
> columns 77-78 should contain the element symbol of an atom.
> ----------------------------------------------
>
> Please note that the columns 77-78 in PDB format are optional.  
> Actually, columns 77-78 are usually missed for the hetero atoms  
> (ligands) in PDB database. It would be more applicable to read  
> columns 13-14 for atom name.

This requires some history. Columns 77-78 are not optional according
to the current spec:

http://www.wwpdb.org/documentation/format32/sect9.html#ATOM
    The element symbol is always present on each ATOM record; charge is optional.

The change occurred with PDB format 2.0, which was in the mid-1990s.  
Before then the ATOM record ended at column 70 and those additional  
columns were ignored.

However, older records used the final columns to store line number  
and record number information. This was important for punch cards, in  
case you dropped your deck on the floor and needed to reorder them.

Here for example is a line from 1BRD with a REVDAT of 15-JAN-93.

ATOM     54  N   PRO     8      20.397 -15.569 -13.739  1.00 20.00      1BRD 136

This also included a version number, so the '1BRDA' here

REMARK  10 THE DATA WAS COLLECTED ON 2-DIMENSIONAL CRYSTALS AND HENCE   1BRD  87
REMARK  10 THE C-AXIS REPEAT DOES NOT CORRESPOND TO A REAL REPEAT, BUT  1BRDA  2
REMARK  10 INSTEAD REFERS TO THE SAMPLING THAT IS USED TO DESCRIBE THE  1BRDA  3

means those lines came from the change made on 15-JUL-91. (Don't ask
me why the line numbers restart for each version number, that is,
going from line 87 to line 2. I have no idea.)

So some older PDB files follow a spec where that field should be
ignored, while newer (less than about 15 years old) files follow a
spec which *requires* an element symbol in those columns.
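
To make the two conventions concrete, here is a minimal Python sketch that looks only at columns 77-78 of a record and reports what it finds there. The helper name `tail_kind` and the short `ELEMENTS` set are assumptions made for illustration; this is not how Open Babel does it. The heuristic can also be fooled, for example by a legacy version letter that happens to be an element symbol.

```python
# Hypothetical helper for illustration; this is not Open Babel's code.
# Deliberately small subset of element symbols; a real reader needs the full table.
ELEMENTS = {"H", "C", "N", "O", "P", "S", "F", "CL", "BR", "I",
            "NA", "MG", "K", "CA", "FE", "ZN", "SE", "HG"}


def tail_kind(record):
    """Classify columns 77-78 (1-based) of an ATOM or HETATM record."""
    field = record[76:78].strip().upper()
    if not field:
        return "missing"    # short line, or blank columns
    if field in ELEMENTS:
        return "element"    # looks like a format 2.0+ element symbol
    return "legacy"         # e.g. part of an old line number


# The 1BRD record quoted above, split in two only to keep the source line short.
legacy = ("ATOM     54  N   PRO     8      20.397 -15.569 -13.739"
          "  1.00 20.00      1BRD 136")
# Same coordinates, with the element right-justified in columns 77-78.
modern = legacy[:66] + " " * 10 + " N"

for name, record in [("legacy", legacy), ("modern", modern), ("short", legacy[:66])]:
    print(name, tail_kind(record))
```

A reader that distinguishes these three cases can decide whether to trust the field, ignore it, or fall back to the atom name.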

On top of that, most programs which generate PDB files don't follow  
the PDB specification, and instead approximate it, for good reasons.  
That's okay because most PDB readers don't require strict adherence  
to the spec.


The OpenBabel code wants the element symbol to be present, which
means it generates warnings on some PDB files written in the
pre-2.0 format. Those files do exist. I have several here.

But OB only generates a warning for them, which seems acceptable.  
(Though it would be nice to also know the line number of the problem.)



What does your file have in those columns, and why is OB's behavior a  
problem for your code?



Also, your suggestion "It would be more applicable to read columns  
13-14 for atom name." has problems. It isn't that simple. Here is the  
comment from the 2.1 specification draft I have handy, which goes  
into much better detail than the current 3.x specification online.

> * Columns 77 - 78 contain the atom's element symbol (as given in the
> periodic table), right-justified. This is especially needed because in
> some cases it has not been possible to follow the convention that
> columns 13 - 14 of the atom name contain the element symbol. The most
> common cases are:
>
>      - In large het groups it sometimes is not possible to follow the
>      convention of having the first two characters be the chemical
>      symbol and still use atom names that are meaningful to users. A
>      example is nicotinamide adenine dinucleotide, atom names begin
>      with an A or N, depending on which portion of the molecule they
>      appear in, e.g., AC6 or NC6, AN1 or NN1.
>
>      - Hydrogen naming sometimes conflicts with IUPAC conventions. For
>      example, a hydrogen named HG11 in columns 13 - 16 is
>      differentiated from a mercury atom by the element symbol in
>      columns 77 - 78. Columns 13 - 16 present a unique name for each
>      atom.
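
The ambiguity described in the quoted passage can be shown with a small Python sketch. The function `naive_element` is a hypothetical illustration of the "just read columns 13-14" idea, not code from any PDB reader. It takes the 4-character atom name field (columns 13-16) and uses only its first two characters.

```python
# Hypothetical illustration, not Open Babel's code.
def naive_element(atom_name_field):
    """Guess the element from the first two characters of the 4-character
    atom name field (columns 13-16), i.e. from columns 13-14 only."""
    return atom_name_field[:2].strip().capitalize()


# (atom name field, what the atom actually is)
cases = [
    (" CA ", "C"),   # alpha carbon: element symbol right-justified in 13-14
    ("CA  ", "Ca"),  # calcium ion
    ("HG11", "H"),   # hydrogen whose name fills all four columns
]

for field, actual in cases:
    guess = naive_element(field)
    status = "ok" if guess == actual else "WRONG"
    print(f"{field!r}: guessed {guess}, actually {actual} ({status})")
```

The first two cases work only because the writer followed the justification convention. The third is the hydrogen versus mercury conflict from the specification, and het group names such as AC6 or NC6 would mislead the same approach.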


Cheers,

                                Andrew
                                [hidden email]



Re: Open Babel Error in reading PDB file

Geoffrey Hutchison

On Nov 20, 2009, at 1:00 PM, Andrew Dalke wrote:

> But OB only generates a warning for them, which seems acceptable.  
> (Though it would be nice to also know the line number of the problem.)

I think we can arrange line numbers in the warning. :-)

> Also, your suggestion "It would be more applicable to read columns  
> 13-14 for atom name." has problems. It isn't that simple. Here is the  
> comment from the 2.1 specification draft I have handy, which goes  
> into much better detail than the current 3.x specification online.

Indeed, Open Babel will use columns 13-14 when necessary, but it follows the 2.1 specification in preferring the unambiguous data in columns 77-78 when present. The code to work out the element from columns 13-14 is a large set of special cases.
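
As a rough sketch of that order of preference, the Python below takes the element from columns 77-78 when they hold a known symbol and otherwise falls back to columns 13-14, reporting the line number in the warning. Everything here is an assumption for illustration: the function `read_elements`, the short `ELEMENTS` set, and the one-line fallback stand in for what is, in Open Babel, a much larger set of special cases.

```python
# Hypothetical sketch; Open Babel's real fallback handles many more special cases.
# Deliberately small subset of element symbols, for illustration only.
ELEMENTS = {"H", "C", "N", "O", "P", "S", "F", "CL", "BR", "I",
            "NA", "MG", "K", "CA", "FE", "ZN", "SE", "HG"}


def read_elements(lines):
    """Yield (line_number, element, warning) for each ATOM/HETATM record.

    element is '' when nothing plausible is found; warning is None when
    columns 77-78 already held a known element symbol.
    """
    for lineno, line in enumerate(lines, start=1):
        if not line.startswith(("ATOM", "HETATM")):
            continue
        symbol = line[76:78].strip().upper()
        if symbol in ELEMENTS:
            yield lineno, symbol.capitalize(), None
            continue
        # Simplified fallback: letters found in columns 13-14 of the atom name.
        guess = "".join(ch for ch in line[12:14] if ch.isalpha()).upper()
        if guess not in ELEMENTS:
            guess = ""
        warning = (f"line {lineno}: columns 77-78 hold {line[76:78]!r}, "
                   f"not an element symbol; using {guess.capitalize()!r} "
                   "from columns 13-14")
        yield lineno, guess.capitalize(), warning


records = [
    "REMARK  10 EXAMPLE INPUT",
    "ATOM     54  N   PRO     8      20.397 -15.569 -13.739"
    "  1.00 20.00      1BRD 136",
]
for lineno, element, warning in read_elements(records):
    print(lineno, element, warning or "ok")
```

Counting lines from the start of the input, rather than counting atoms, lets the warning point at the exact record in the file.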

Thanks to Andrew for a very complete description of the issue and OB behavior.

-Geoff