Converting PDB to SMILES and matching atom orders


Converting PDB to SMILES and matching atom orders

Sam Tonddast-Navaei
I am trying to read a small molecule from a PDB file and match its atom numbers to the same molecule in an SDF file. I have tried both SMARTS pattern matching and OBIsomorphismMapper; both work for 70% of cases. However, there are cases where OpenBabel simply cannot derive the right SMILES from the PDB file alone (e.g. it adds extra double bonds or misses some). For example, for PDB ID 1KF6 and ligand name 1PE, the SMILES generated from the PDB is OCCO/C=C/OCCO/C=C/O/C=C/O, while the correct one should be OCCOCCOCCOCCOCCO. I would appreciate any advice on this.

Thanks,
Sam

------------------------------------------------------------------------------

_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Re: Converting PDB to SMILES and matching atom orders

Stefano Forli
Sam,
I wouldn't trust the PDB for either bond order or connectivity.
In OB the order is inferred from the distances, which are dependent on the quality of the
structure and the software used for the model refinement.

For example, I found that in structures generated by a version of Refmac, the oxygen double
bond length in DMSO is wrong, so OB assigns an sp3 hybridization to it.
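
To illustrate why distance-based perception is fragile, here is a deliberately simplified toy. It is not Open Babel's actual algorithm; the function name and the nearest-typical-length rule are assumptions made for illustration only. It picks the carbon-carbon bond order whose textbook length is closest to the observed distance, so a C-C bond that refinement has shortened by about 0.1 angstrom flips from single to double.

```python
# Typical carbon-carbon bond lengths in angstroms (textbook values).
CC_SINGLE = 1.54
CC_DOUBLE = 1.34


def guess_cc_bond_order(distance: float) -> int:
    """Toy rule: return the order whose typical length is nearest to `distance`.

    Hypothetical helper for illustration; real perception code also looks at
    angles, valence and neighbouring atoms.
    """
    if abs(distance - CC_SINGLE) <= abs(distance - CC_DOUBLE):
        return 1
    return 2


if __name__ == "__main__":
    # A well-refined single bond, then progressively compressed ones.
    for observed in (1.52, 1.45, 1.42, 1.36):
        print(f"{observed:.2f} A -> order {guess_cc_bond_order(observed)}")
```

With this rule the cutoff sits halfway between the two typical lengths, which is why modest coordinate errors are enough to change the assigned order.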

If you have to do that in a high-throughput fashion, I would use the ligand ID to download
the actual SMILES from the ligand repository at the PDB and use that for your analysis.
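
A minimal sketch of that lookup, using only the Python standard library. The endpoint URL and the JSON field names (`rcsb_chem_comp_descriptor`, `smiles_stereo`, `smiles`) are assumptions about the RCSB PDB Data API and are not given in this thread; check them against the current RCSB documentation before relying on them. The helper names are hypothetical.

```python
import json
from typing import Optional
from urllib.request import urlopen

# ASSUMPTION: chemical-component endpoint of the RCSB PDB Data API.
CHEMCOMP_URL = "https://data.rcsb.org/rest/v1/core/chemcomp/{ligand_id}"


def chemcomp_url(ligand_id: str) -> str:
    """Build the lookup URL for a ligand ID such as '1PE'."""
    return CHEMCOMP_URL.format(ligand_id=ligand_id.strip().upper())


def extract_smiles(record: dict) -> Optional[str]:
    """Pull a SMILES string out of a decoded chemcomp record.

    ASSUMPTION: the descriptor block and its keys are named as below.
    Prefers the stereo SMILES and falls back to the plain one.
    """
    descriptor = record.get("rcsb_chem_comp_descriptor") or {}
    return descriptor.get("smiles_stereo") or descriptor.get("smiles")


def fetch_ligand_smiles(ligand_id: str, timeout: float = 30.0) -> Optional[str]:
    """Download the record for `ligand_id` and return its SMILES, if present."""
    with urlopen(chemcomp_url(ligand_id), timeout=timeout) as response:
        return extract_smiles(json.load(response))
```

Calling `fetch_ligand_smiles("1PE")` needs network access; the parsing is kept in a separate function so it can be checked offline against a saved record.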

Best,

S


--

  Stefano Forli, PhD

  Assistant Professor of ISCB
  Molecular Graphics Laboratory

  Dept. of Integrative Structural
  and Computational Biology, MB-112A
  The Scripps Research Institute
  10550  North Torrey Pines Road
  La Jolla,  CA 92037-1000,  USA.

     tel: +1 (858)784-2055
     fax: +1 (858)784-2860
     email: [hidden email]
     http://www.scripps.edu/~forli/


Re: Converting PDB to SMILES and matching atom orders

Sam Tonddast-Navaei
Dear Stefano,

Thanks for the reply; I suspected the same. However, I need the PDB 3D coordinates mapped onto the original SMILES, which is why I was trying to read the PDB. Are there any methods for mapping the PDB coordinates of a molecule onto its original SMILES?

Thanks,
Sam 


Re: Converting PDB to SMILES and matching atom orders

David Hall
In reply to this post by Sam Tonddast-Navaei
It is a bit of a hack, but for certain applications where I know it is likely fine, I use networkx to do subgraph isomorphism while considering only the elements and not looking at the order of the bonds. I can trivially draw a molecule where this does the wrong thing, but the majority of the time, it does the right thing.

For the system you are discussing, here's an example using the attached script:

$ ./isomorph.py 1PE_model.sdf 1kf6_1pe.pdb 
[{0: 0,
  1: 1,
  2: 2,
  3: 3,
  4: 4,
  5: 5,
  6: 6,
  7: 7,
  8: 8,
  9: 9,
  10: 10,
  11: 11,
  12: 12,
  13: 13,
  14: 14,
  15: 15},
 {0: 15,
  1: 13,
  2: 14,
  3: 12,
  4: 11,
  5: 10,
  6: 9,
  7: 8,
  8: 7,
  9: 6,
  10: 5,
  11: 4,
  12: 3,
  13: 1,
  14: 2,
  15: 0}]

Since 1PE is symmetric, it outputs two mappings: the forward ordering (1st -> 1st, 2nd -> 2nd, etc.) and the reverse ordering (1st -> 16th, 2nd -> 14th, 3rd -> 15th, etc.).

Hopefully the attached script gives an idea of how to incorporate this idea into your code.
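
The attached isomorph.py is not reproduced in this archive. As a rough stand-in, the sketch below shows the same idea (match atoms by element and connectivity only, ignoring bond order) with a small backtracking search in pure Python instead of networkx. All function names are hypothetical, and the "PDB" atom order is a made-up permutation of the 1PE chain, not the real 1KF6 numbering.

```python
from typing import Dict, List, Sequence, Set, Tuple

Bond = Tuple[int, int]


def adjacency(n_atoms: int, bonds: Sequence[Bond]) -> List[Set[int]]:
    """Neighbour sets per atom; bond orders are deliberately ignored."""
    adj: List[Set[int]] = [set() for _ in range(n_atoms)]
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return adj


def element_isomorphisms(elems_a: Sequence[str], bonds_a: Sequence[Bond],
                         elems_b: Sequence[str], bonds_b: Sequence[Bond]
                         ) -> List[Dict[int, int]]:
    """Return every atom mapping a -> b that preserves elements and bonds."""
    if sorted(elems_a) != sorted(elems_b):
        return []
    adj_a = adjacency(len(elems_a), bonds_a)
    adj_b = adjacency(len(elems_b), bonds_b)
    results: List[Dict[int, int]] = []
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(i: int) -> None:
        if i == len(elems_a):
            results.append(dict(mapping))
            return
        for j in range(len(elems_b)):
            if j in used or elems_a[i] != elems_b[j]:
                continue
            if len(adj_a[i]) != len(adj_b[j]):
                continue
            # i-k bonded in A must hold exactly when their images are bonded in B.
            if all((k in adj_a[i]) == (mapping[k] in adj_b[j]) for k in mapping):
                mapping[i] = j
                used.add(j)
                extend(i + 1)
                del mapping[i]
                used.discard(j)

    extend(0)
    return results


def reorder_coords(coords: Sequence[tuple], mapping: Dict[int, int]) -> List[tuple]:
    """Put PDB coordinates into reference (SDF/SMILES) atom order."""
    return [coords[mapping[i]] for i in range(len(mapping))]


# Reference: heavy atoms of 1PE in SMILES order, bonded as a linear chain.
ref_elems = list("OCCOCCOCCOCCOCCO")
ref_bonds = [(i, i + 1) for i in range(len(ref_elems) - 1)]

# Made-up "PDB" numbering: the same molecule with its atoms permuted.
perm = [(5 * i + 3) % 16 for i in range(16)]
pdb_elems = [""] * 16
for ref_idx, pdb_idx in enumerate(perm):
    pdb_elems[pdb_idx] = ref_elems[ref_idx]
pdb_bonds = [(perm[a], perm[b]) for a, b in ref_bonds]

mappings = element_isomorphisms(ref_elems, ref_bonds, pdb_elems, pdb_bonds)

if __name__ == "__main__":
    for m in mappings:
        print(m)
```

Because the chain is symmetric, the search returns two mappings, and either one can be passed to `reorder_coords` to carry the PDB coordinates over to the reference atom order. As noted above, ignoring bond order can give wrong matches for some molecules, so this is only suitable where that risk is acceptable.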



isomorph.py (1K) Download Attachment

Re: Converting PDB to SMILES and matching atom orders

Sam Tonddast-Navaei
David,

 Thank you very much for the reply, very smart hack!

Sam
