[Open Babel] More User docs


drc-2
Fingerprints

You can see the available fingerprints by typing the following command:

PROMPT> babel -F
FP2 -- Indexes linear fragments up to 7 atoms.
FP3 -- SMARTS patterns specified in the file patterns.txt

At present there are two types of fingerprints. FP2, the default, indexes linear fragments of up to 7 atoms.
The other fingerprint type, FP3, uses a series of SMARTS queries that are stored in /usr/local/share/openbabel/patterns.txt (you can add your own queries to this file).
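
To make "linear fragments" concrete, here is a toy Python sketch of a path-based fingerprint. It is not Open Babel's implementation: the atom/bond representation, the function names, the use of CRC32 as the hash and the 1024-bit length are all assumptions made for illustration, and bond orders, rings and aromaticity are ignored.

```python
import zlib

def linear_fragments(atoms, bonds, max_atoms=7):
    """Return the set of linear fragments (simple paths) of 1..max_atoms atoms.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) index pairs; bond orders are ignored in this toy.
    """
    neighbours = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        neighbours[i].add(j)
        neighbours[j].add(i)

    fragments = set()

    def extend(path):
        labels = [atoms[k] for k in path]
        # A path read forwards or backwards is the same fragment,
        # so store only the lexicographically smaller spelling.
        fragments.add(min("-".join(labels), "-".join(reversed(labels))))
        if len(path) == max_atoms:
            return
        for nxt in neighbours[path[-1]]:
            if nxt not in path:  # never visit an atom twice in one path
                extend(path + [nxt])

    for start in range(len(atoms)):
        extend([start])
    return fragments

def path_fingerprint(atoms, bonds, nbits=1024):
    """Hash each fragment to one bit position; return the set of bits set."""
    return {zlib.crc32(f.encode()) % nbits
            for f in linear_fragments(atoms, bonds)}

# Heavy atoms of ethanol, C-C-O (hypothetical hand-built input)
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
print(sorted(linear_fragments(*ethanol)))
print(sorted(path_fingerprint(*ethanol)))
```

Two molecules that share many fragments end up with many bits in common, which is what the similarity comparison relies on.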
These fingerprints can be used in a variety of ways. The command

PROMPT> babel 'mymols.sdf' -ofpt

>MOL_00000067
>MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
>MOL_00000105   Tanimoto from MOL_00000067 = 0.833333
>MOL_00000296   Tanimoto from MOL_00000067 = 0.425926
>MOL_00000320   Tanimoto from MOL_00000067 = 0.534884
>MOL_00000328   Tanimoto from MOL_00000067 = 0.511111
>MOL_00000338   Tanimoto from MOL_00000067 = 0.522727
>MOL_00000354   Tanimoto from MOL_00000067 = 0.534884
>MOL_00000378   Tanimoto from MOL_00000067 = 0.489362
>MOL_00000391   Tanimoto from MOL_00000067 = 0.489362
10 molecules converted

will give you the Tanimoto coefficient between the first molecule in mymols.sdf and each of the subsequent ones. You don’t have to have all the structures in the same file or format.
So the following command gives you the Tanimoto coefficient between a SMILES string in mysmiles.smi and all the molecules in mymols.sdf:
 
PROMPT> babel 'mysmiles.smi' 'mymols.sdf' -ofpt

>MOL_00000067   Tanimoto from first mol = 0.0888889
>MOL_00000083   Tanimoto from first mol = 0.0869565
>MOL_00000105   Tanimoto from first mol = 0.0888889
>MOL_00000296   Tanimoto from first mol = 0.0714286
>MOL_00000320   Tanimoto from first mol = 0.0888889
>MOL_00000328   Tanimoto from first mol = 0.0851064
>MOL_00000338   Tanimoto from first mol = 0.0869565
>MOL_00000354   Tanimoto from first mol = 0.0888889
>MOL_00000378   Tanimoto from first mol = 0.0816327
>MOL_00000391   Tanimoto from first mol = 0.0816327
11 molecules converted
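
For reference, the Tanimoto coefficient of two bit fingerprints is conventionally the number of bits set in both divided by the number of bits set in either. A minimal Python sketch of that definition follows; the bit positions are made up, and representing a fingerprint as a set of "on" positions is just a convenience here, not how babel stores them.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    if not union:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(union)

# Made-up fingerprints: 3 bits in common, 6 bits set in either
fp_query = {1, 5, 9, 12}
fp_other = {1, 5, 9, 40, 41}
print(tanimoto(fp_query, fp_other))  # 3 / 6 = 0.5
```

A value of 1 means the two fingerprints are identical, and values near 0 (as in the SMILES comparison above) mean they share very few bits.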

If you wanted to know the similarity between only the substituted bromobenzenes in mymols.sdf, you might combine options like this:

babel 'mymols.sdf' -ofpt -s 'c1ccccc1Br'
>MOL_00000067
>MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
>MOL_00000105   Tanimoto from MOL_00000067 = 0.833333

You can change the fingerprint type with the -xf option:

babel 'mymols.sdf' -ofpt -xfFP3

================================

A quick question: is it possible to use the fingerprints to select the 5 most similar molecules, or all molecules with a similarity > 0.8,
or do I need to build an index first?

Thanks

Chris



-------------------------------------------------------
SF.Net email is sponsored by:
Tame your development challenges with Apache's Geronimo App Server. Download
it for free - -and be entered to win a 42" plasma tv or your very own
Sony(tm)PSP.  Click here to play: http://sourceforge.net/geronimo.php
_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Re: [Open Babel] More User docs

Chris Morley-3
[hidden email] wrote:

> A quick question: is it possible to use the fingerprints to select the 5 most similar molecules, or all molecules with a similarity > 0.8,
> or do I need to build an index first?
>
At present you do have to build the index. To follow your
lead and give plenty of examples, you make the index by:

   babel mymols.sdf -ofs

This builds mymols.fs with the default fingerprint,
unfolded. To use it to find the top 5 matches to a
molecule in target.sdf:

   babel mymols.fs results.sdf -Starget.sdf -at5

or to get the matches with Tanimoto>0.6 to 1,2-dicyanobenzene:

   babel mymols.fs results.sdf -sN#Cc1ccccc1C#N -at0.6

FingerprintFormat was originally intended for debugging,
but was extended as demonstrated in the original post. It
could be extended further to do the searching without an
index, which might be OK for small data sets. But the
output would have to be only to the console and the
command line options would start to get confusing - they
would be output options like -xt5 rather than input
options -at5 when using the FastsearchFormat. I'm inclined
not to extend FingerprintFormat in this way.
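
In the meantime, for small data sets, the console output of -ofpt shown in the original post can be post-processed outside babel. Below is a minimal Python sketch, assuming the output has been captured as text in exactly the format shown there; the function names are made up for illustration and are not part of Open Babel.

```python
import re

# Matches lines such as
# ">MOL_00000083   Tanimoto from MOL_00000067 = 0.810811"
LINE = re.compile(r"^>(\S+)\s+Tanimoto from .+ = ([0-9.]+)\s*$")

def parse_similarities(text):
    """Return (title, tanimoto) pairs; lines without a score are skipped."""
    pairs = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m:
            pairs.append((m.group(1), float(m.group(2))))
    return pairs

def top_n(pairs, n):
    """The n most similar molecules, best first."""
    return sorted(pairs, key=lambda p: p[1], reverse=True)[:n]

def above(pairs, cutoff):
    """All molecules with similarity strictly greater than cutoff."""
    return [p for p in pairs if p[1] > cutoff]

# Abridged from the output in the original post
SAMPLE = """\
>MOL_00000067
>MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
>MOL_00000105   Tanimoto from MOL_00000067 = 0.833333
>MOL_00000296   Tanimoto from MOL_00000067 = 0.425926
>MOL_00000320   Tanimoto from MOL_00000067 = 0.534884
"""

pairs = parse_similarities(SAMPLE)
print(top_n(pairs, 2))
print(above(pairs, 0.8))
```

This only yields the titles and scores, not the structures, and it recomputes every fingerprint on each run, so the index remains the better route for anything large.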

Chris


