state of OB features

Andrew Dalke
Hi all,

  I've been asked by a company to evaluate different toolkits for them to use in-house. Geoff thought it might be better to ask on the list than ask him directly and privately, so I'm doing that.

  I've gone through their internal requirements, and now I'm seeing how they match up to OB. Some of my conclusions are likely wrong or based on an older version of OB, so I'm hoping people here can correct me.

  I'll apologize in advance that I won't be that accessible over the next couple of days, and likely won't be able to respond until Saturday or so. But I felt it was best to get it out now rather than wait.

  =====

  As with all cheminformatics toolkits, OB has ways to get access to the atoms and bonds of a molecule. It supports molecular editing, so that atoms and bonds can be added, deleted, or modified as desired. Atoms, bonds, and molecules may have additional user-defined data associated with them.

  OB supports coordinates as part of the molecule (meaning that deleting the atom deletes the associated coordinates), and it supports multiple conformer structures.

  OB follows the Daylight approach where it has a standard chemistry model and all input structures are reperceived based on that model. There is no way to disable that option.

  OB is the foremost program for structure format support and interconversion.

  SD file support is complete for both v2000 and v3000, for reading and writing. What I'm not sure about is the level of support for v3000. Is it mostly support for the chemistry that is in v2000 but expressed differently in v3000, plus support for more than 999 atoms?

  SMILES support is good, although it doesn't have support for stereochemistry around double bonds. Excepting this lack, canonicalization is also good and widely used.

  OB does have PDB file support. I can't tell how good the chemistry perception is. For example, can it detect that a C-C bond is a double or triple bond instead of a single (eg, by looking at the bond length, or by understanding the residue names)?

  While OB does have a nearly uniform reader API (ie, I can point it to an SD file, SMILES file, etc and get molecules), and built-in gzip support, I do have to specify the format type manually. That is, there's no support for guessing the format based, for example, on the extension.

  OpenBabel has SMARTS support, but I can't tell how complete it is. I know it doesn't support double bond stereochemistry, but I think it's otherwise complete, including recursive SMARTS. Is there anything missing?

  OB also supports using a molecule as the query rather than a SMARTS.

  Once the match is made, it's easy to get access to the matched atoms and bonds, and match them up to the corresponding query atoms and bonds.

  The topic I know the least about is reactions. OB supports reaction SMILES and SMARTS, as well as RXN files. I don't have a good idea for how good that support is, and it's not something I used much, although my client does.

  In addition to the support for the query languages/formats, I can't tell how to use the reactions. How would I do a unimolecular reaction (eg, convert all of the carbons in CCCN to OOON)? How would I use a reaction for library generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it even possible? I looked but didn't find it.

  OB does support some fingerprints. There's a linear hash fingerprint similar to Daylight's and two feature fingerprint implementations, although only one is suggested. There's no MACCS key implementation. There is no support for large/sparse fingerprints, and the only implemented comparison method is the Tanimoto similarity.
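
  For readers unfamiliar with the measure, the Tanimoto similarity mentioned above can be sketched in plain Python. This is only an illustration on sets of on-bit indices, not Open Babel's implementation; the function name `tanimoto` and the example bit positions are made up for the sketch.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    if union == 0:
        # Convention chosen for this sketch: two empty fingerprints score 0.0.
        return 0.0
    return len(a & b) / union


# Hypothetical fingerprints: the positions of the bits that are set.
fp1 = {1, 5, 9, 42}
fp2 = {1, 9, 42, 77, 100}
print(tanimoto(fp1, fp2))  # 3 shared bits / 6 distinct bits = 0.5
```

  Other comparison methods (Dice, cosine, and so on) would differ only in the final ratio.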

 OB does not do depiction. For that case people should turn to other libraries, such as OASA.

 There's no MCS or scaffold identification code in OB. There is a descriptor framework system, support for different forcefields and minimization, and InChI support. There's no nomenclature support.

  OB is cross-platform (here meaning "Windows and Linux"), with access to the library from C++, Python, .Net and Java. The documentation is incomplete and sketchy, but because OB is used by a large number of people, there is support both through the mailing list and by doing a web search for others who have used the code.

  I have a metric for testing usability, and that's the number of lines of code needed to count the total number of atoms of all of the records in an input file, using one toolkit vs. pybel. OpenBabel suffers because of the overhead of creating an OBConversion.
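
  To make the metric concrete, here is roughly what the atom-counting task looks like through pybel. It is a sketch, assuming pybel's `readfile(format, filename)` iterator and the `atoms` list on each molecule; the helper names `count_atoms` and `main` are invented for the example, and the count reflects whatever atoms the toolkit stores for each record.

```python
import sys


def count_atoms(molecules):
    """Total number of atoms over an iterable of pybel-style molecules."""
    return sum(len(mol.atoms) for mol in molecules)


def main(argv):
    if len(argv) != 3:
        return "usage: count_atoms.py FORMAT FILENAME"
    # Open Babel 3.x ships pybel inside the openbabel package;
    # the 2.x series installs it as a top-level module.
    try:
        from openbabel import pybel
    except ImportError:
        import pybel
    # readfile() yields one molecule per record in the input file.
    return count_atoms(pybel.readfile(argv[1], argv[2]))


if __name__ == "__main__":
    print(main(sys.argv))
```

  The toolkit-specific part is the single `readfile` call; everything else is ordinary Python.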

  I have another metric for comparing error handling, which is to read an SD file with records containing errors (format errors and chemistry errors) and seeing if I can find the number of records which failed to be read in and the reason for the failure. I haven't figured out how to do that with OB.

  ====

Thanks in advance!

                                Andrew
                                [hidden email]



------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
lucky parental unit.  See the prize list and enter to win:
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Re: state of OB features

Noel O'Boyle
Administrator
On 8 June 2010 06:14, Andrew Dalke <[hidden email]> wrote:

I don't know the answers to most of these things, but those I do I've answered:

>  SMILES support is good, although it doesn't have support for stereochemistry around double bonds. Excepting this lack, canonicalization is also good and widely used.

It does have support for stereochemistry around double bonds.
Stereochemistry support is much improved for other formats in the
development code though.

>  OB does have PDB file support. I can't tell how good the chemistry perception is. For example, can it detect that a C-C bond is a double or triple bond instead of a single (eg, by looking at the bond length, or by understanding the residue names)?
>
>  While OB does have a nearly uniform reader API (ie, I can point it to an SD file, SMILES file, etc and get molecules), and built-in gzip support, I do have to specify the format type manually. That is, there's no support for guessing the format based, for example, on the extension.

In most cases, the format type is the extension, but I suppose what
you say is correct.

>  OpenBabel has SMARTS support, but I can't tell how complete it is. I know it doesn't support double bond stereochemistry, but I think it's otherwise complete, including recursive SMARTS. Is there anything missing?
>
>  OB also supports using a molecule as the query rather than a SMARTS.

Hmm...not sure about this. Does it?

>  Once the match is made, it's easy to get access to the matched atoms and bonds, and match them up to the corresponding query atoms and bonds.
>
>  The topic I know the least about is reactions. OB supports reaction SMILES and SMARTS, as well as RXN files. I don't have a good idea for how good that support is, and it's not something I used much, although my client does.
>
>  In addition to the support for the query languages/formats, I can't tell how to use the reactions. How would I do a unimolecular reaction (eg, convert all of the carbons in CCCN to OOON)? How would I use a reaction for library generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it even possible? I looked but didn't find it.

Perhaps OBChemTsfm does something here?
http://openbabel.org/api/2.2.0/classOpenBabel_1_1OBChemTsfm.shtml.

>  OB does support some fingerprints. There's a linear hash fingerprint similar to Daylight's and two feature fingerprint implementations, although only one is suggested. There's no MACCS key implementation. There is no support for large/sparse fingerprints, and the only implemented comparison method is the Tanimoto similarity.

MACCS key is in there. There is also support for user-defined fingerprints.

>  OB does not do depiction. For that case people should turn to other libraries, such as OASA.

OB can do depiction, at least in the development version.

>  There's no MCS or scaffold identification code in OB. There is a descriptor framework system, support for different forcefields and minimization, and InChI support. There's no nomenclature support.

>  OB is cross-platform (here meaning "Windows and Linux"), with access to the library from C++, Python, .Net and Java. The documentation is incomplete and sketchy, but because OB is used by a large number of people, there is support both through the mailing list and by doing a web search for others who have used the code.

Also MacOSX and Ruby. Also Cygwin, MinGW. Works with all of G++, MSVC,
Intel Compiler. Also, support is available from a number of
independent consultants (as far as I am aware).

>  I have a metric for testing usability, and that's the number of lines of code needed to count the total number of atoms of all of the records in an input file, using one toolkit vs. pybel. OpenBabel suffers because of the overhead of creating an OBConversion.

Don't forget Pybel is part of OpenBabel.

>  I have another metric for comparing error handling, which is to read an SD file with records containing errors (format errors and chemistry errors) and seeing if I can find the number of records which failed to be read in and the reason for the failure. I haven't figured out how to do that with OB.

One other feature you haven't mentioned is that it has a plugin
architecture for fingerprints, formats, operations, charge models and
so forth (it's the same architecture in each case). This means that
internally a company could create a single .cpp file and compile it in
as a format or operation or whatever. This can be easily called from
babel and can do anything under the sun.

Re: state of OB features

Chris Morley-3
In reply to this post by Andrew Dalke
Some extra info on bits of OB I know about.

On 08/06/2010 06:14, Andrew Dalke wrote:

>    SD file support is complete for the v2000 and v3000, both for reading and writing. What I'm not sure about is the level of support for v3000. It's mostly support for the chemistry which is in v2000 but expressed differently in v3000, and for support for more than 999 atoms?
I think this is right for v3000, for which the support is fairly
basic. v2000 has more but is not complete, e.g. no S groups.
>
>    SMILES support is good, although it doesn't have support for stereochemistry around double bonds. Excepting this lack, canonicalization is also good and widely used.
>
>    OB does have PDB file support. I can't tell how good the chemistry perception is. For example, can it detect that a C-C bond is a double or triple bond instead of a single (eg, by looking at the bond length, or by understanding the residue names)?
>
>    While OB does have a nearly uniform reader API (ie, I can point it to an SD file, SMILES file, etc and get molecules), and built-in gzip support, I do have to specify the format type manually. That is, there's no support for guessing the format based, for example, on the extension.
There is automatic recognition of which of several computational
chemistry programs a .out or .log file came from.
>
>    OpenBabel has SMARTS support, but I can't tell how complete it is. I know it doesn't support double bond stereochemistry, but I think it's otherwise complete, including recursive SMARTS. Is there anything missing?
>
>    OB also supports using a molecule as the query rather than a SMARTS.
This is true for fastsearch (indexed by fingerprints) but not, as far
as I know, for ordinary SMARTS, without an explicit conversion to SMILES.
>
>    Once the match is made, it's easy to get access to the matched atoms and bonds, and match them up to the corresponding query atoms and bonds.
>
>    The topic I know the least about is reactions. OB supports reaction SMILES and SMARTS, as well as RXN files. I don't have a good idea for how good that support is, and it's not something I used much, although my client does.
Reactions are also supported in CML.
>
>    In addition to the support for the query languages/formats, I can't tell how to use the reactions. How would I do a unimolecular reaction (eg, convert all of the carbons in CCCN to OOON)? How would I use a reaction for library generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it even possible? I looked but didn't find it.
>
>    OB does support some fingerprints. There's a linear hash fingerprint similar to Daylight's and two feature fingerprint implementations, although only one is suggested. There's no MACCS key implementation. There is no support for large/sparse fingerprints, and the only implemented comparison method is the Tanimoto similarity.
MACCS key is supported (using a datafile from RDKit).
>
>   OB does not do depiction. For that case people should turn to other libraries, such as OASA.
It is beginning to. It has 2D coordinate generation/layout. The next
version, v2.3.0, will have SVG depiction of single and multiple molecules.


Re: state of OB features

Geoffrey Hutchison
In reply to this post by Andrew Dalke
Chris and Noel have made comments already, and I generally agree with them. I have only added comments where I felt it was needed.

>  SMILES support is good, although it doesn't have support for stereochemistry around double bonds. Excepting this lack, canonicalization is also good and widely used.

As Noel said, we do have support for stereochemistry around double bonds in SMILES. Stereochemistry is much improved thanks to Noel and Tim Vandermeersch in the soon-to-be-released v2.3. (SMARTS support for double-bond stereo is another matter.)

>  OB does have PDB file support. I can't tell how good the chemistry perception is. For example, can it detect that a C-C bond is a double or triple bond instead of a single (eg, by looking at the bond length, or by understanding the residue names)?

Yes. On any file format like PDB or XYZ which does not support bond types, perception is run to determine connectivity and bond order. For PDB, this is also done first via residue names. Bond perception can also be turned off via the command-line or programmatically (e.g., some users run MD simulations and have their own topology file).

>  While OB does have a nearly uniform reader API (ie, I can point it to an SD file, SMILES file, etc and get molecules), and built-in gzip support, I do have to specify the format type manually. That is, there's no support for guessing the format based, for example, on the extension.

Less than 1% of the time do I have to specify a format type manually. Formats can be guessed from file extensions, and for some file types (e.g., quantum packages that like the .out, .log, or .dat extensions), OB will attempt to guess the format from contents.

Certainly common extensions like .pdb, .mol, .mol2, .sdf, .sdf.gz, .pdb.gz, are all recognized.
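
As an illustration of the extension-based guessing described above, here is a minimal plain-Python sketch. It is not how OB implements the lookup; the function name `guess_format` and the small `KNOWN` table are assumptions made for the example.

```python
import os

# Hypothetical table for this sketch; OB's real list of formats is far longer.
KNOWN = {"pdb", "mol", "mol2", "sdf", "smi", "xyz"}


def guess_format(filename):
    """Return (format, gzipped); format is None when the extension is unknown."""
    root, ext = os.path.splitext(filename.lower())
    gzipped = ext == ".gz"
    if gzipped:
        # Look through the compression suffix: "x.sdf.gz" -> "x.sdf" -> ".sdf"
        root, ext = os.path.splitext(root)
    fmt = ext.lstrip(".")
    return (fmt if fmt in KNOWN else None, gzipped)


for name in ("ligands.sdf.gz", "1abc.pdb", "job.out"):
    print(name, guess_format(name))
```

The `job.out` case comes back as unknown in this sketch, which is the situation where guessing from the file contents has to take over.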

>  OB also supports using a molecule as the query rather than a SMARTS.

Well, you can output a SMILES from a molecule and use that as a SMARTS. That's a unit test, so we can guarantee that always works. As Chris said, there's also the fastsearch format.

>  In addition to the support for the query languages/formats, I can't tell how to use the reactions. How would I do a unimolecular reaction (eg, convert all of the carbons in CCCN to OOON)? How would I use a reaction for library generation (eg, convert CCC to first OCCN, then COCN, and lastly CCON)? Is it even possible? I looked but didn't find it.

It's not currently exposed to users, but the OBChemTsfm class is used to handle pH-dependent protonation. It can handle this task too. The syntax is basically reaction SMILES.

> OB does not do depiction. For that case people should turn to other libraries, such as OASA.

As Chris said, there is depiction in v2.3. It's evidently solid enough that Craig uses it for a service on eMolecules.com.

>  OB is cross-platform (here meaning "Windows and Linux"), with access to the library from C++, Python, .Net and Java. The documentation is incomplete and sketchy, but because OB is used by a large number of people, there is support both through the mailing list and by doing a web search for others who have used the code.

We're always open to feedback about areas of documentation needing clarification. Telling us it's sketchy and/or incomplete doesn't help much. Pointers to areas needing improvement will be met with applause (and fixes).

>  I have a metric for testing usability, and that's the number of lines of code needed to count the total number of atoms of all of the records in an input file, using one toolkit vs. pybel. OpenBabel suffers because of the overhead of creating an OBConversion.

As Noel mentioned, we *are* pybel. So I think we win that comparison. Yes, the C++ interface is slightly more verbose, but that's also true of C++ versus Python in general.

>  I have another metric for comparing error handling, which is to read an SD file with records containing errors (format errors and chemistry errors) and seeing if I can find the number of records which failed to be read in and the reason for the failure. I haven't figured out how to do that with OB.

We keep an audit log. From the command-line you get a summary:

[ghutchis@Iridium]: babel tpy-Ru.sdf tpy.mol2
1 molecule converted
1 info messages 23 audit log messages

You can programmatically interrogate the error log to get the warnings, severity level, etc. The audit level is intended to cover any code which may change chemical interpretation (e.g., Kekulization, adding implicit hydrogens, bond perception, etc.).
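
A rough sketch of interrogating the log from Python follows. It assumes the global `obErrorLog` handler and its per-level count methods are available through the bindings; the helper names `log_summary` and `main` are invented here. It reports totals for a whole run, not a per-record reason for each failure.

```python
import sys


def log_summary(handler):
    """Collect per-severity message counts from an OBMessageHandler-like object."""
    return {
        "error": handler.GetErrorMessageCount(),
        "warning": handler.GetWarningMessageCount(),
        "info": handler.GetInfoMessageCount(),
        "audit": handler.GetAuditMessageCount(),
    }


def main(argv):
    if len(argv) != 3:
        return "usage: log_summary.py FORMAT FILENAME"
    # Import layout differs between the 3.x and 2.x series.
    try:
        from openbabel import openbabel as ob, pybel
    except ImportError:
        import openbabel as ob
        import pybel
    # Reading every record is what fills the log.
    records = sum(1 for _ in pybel.readfile(argv[1], argv[2]))
    return records, log_summary(ob.obErrorLog)


if __name__ == "__main__":
    print(main(sys.argv))
```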

Hope that helps,
-Geoff

Re: state of OB features

Craig James-2
On 6/8/10 5:31 AM, Geoffrey Hutchison wrote:
> As Chris said, there is depiction in v2.3. It's evidently solid enough that Craig uses it for a service on eMolecules.com.

http://depict.emolecules.com

Craig

Re: state of OB features

Andrew Dalke
In reply to this post by Noel O'Boyle
On Jun 8, 2010, at 2:43 AM, Noel O'Boyle wrote:
>>  OB also supports using a molecule as the query rather than a SMARTS.
>
> Hmm...not sure about this. Does it?

I thought it did, but you and others disagree. I'll look again.

> MACCS key is in there. There is also support for user-defined fingerprints.

Huh. That was added last year with 2.2.1 and I missed it. Thanks!

On the topic of documentation quality (which Geoff asked about), how does one find this out?

Go to http://openbabel.org/wiki/Develop and there's mention of FP2, FP3 and FP4 but not MACCS fingerprints.

Go to http://openbabel.org/wiki/Developer:API and search 2.2.0 documentation for MACCS and it isn't there. There's no link to the 2.2.1 documentation and for some reason there's a link to the 2.1.x beta API docs, which would be very out of date now, and pointless, yes?

(BTW, what's the reason that OB prefers these point releases, like
  - 2009-07-31 Open Babel 2.2.3 Released
  - 2009-07-10 Open Babel 2.2.2 Released
  - 2009-02-03 Open Babel 2.2.1 Released
  - 2008-07-04 Open Babel 2.2.0 Released
? There have been some major changes across those releases, which would seem to warrant 2.3, 2.4, and higher.)



>>  OB does not do depiction. For that case people should turn to other libraries, such as OASA.
>
> OB can do depiction, at least in the development version.

One of the difficulties I have in making my report is that all of the libraries have things in development, available for the next release. Is there a roadmap/timeline for that? The one at http://openbabel.org/wiki/Roadmap seems out of date since the last edit was Dec 2006.


At the last Python conference, Mark Shuttleworth's keynote address was on "Cadence, Quality, and Design"
  http://us.pycon.org/2010/conference/schedule/event/122/
and one lesson I learned was the usefulness of having predictable schedules. It's still something I'm thinking about, and I thought it might be interesting in this context.

Again regarding documentation, how does one learn how to do this?

http://www.google.com/search?client=safari&rls=en&q=depiction+site:openbabel.org&ie=UTF-8&oe=UTF-8

finds all of two pages, both for pybel saying it uses OASA. Will pybel in 2.2.4 (or 2.3?) use the internal depicter instead of OASA? I see it will support SVG, but will it also support a bitmap format of some sort?



> Also MacOSX and Ruby. Also Cygwin, ...

Indeed, but my client doesn't use those. ;)

> Don't forget Pybel is part of OpenBabel.

*smacks* *head*

D'oh! I was thinking about the C++ interface here, comparing it to Python. Brain reset.

> One other feature you haven't mentioned is that it has a plugin
> architecture for fingerprints, formats, operations, charge models and
> so forth (it's the same architecture in each case).

Despite my subject line, this thread is more "OB features my client is interested in, based on talking with them."

While I know that it's possible, I can't find the documentation about it. How does one develop a new fingerprint? During testing can the new .so be in the user's directory space, or must extensions be in the OpenBabel installation?


                                Andrew
                                [hidden email]



Re: state of OB features

Geoffrey Hutchison

On Jun 8, 2010, at 10:43 AM, Andrew Dalke wrote:

> On the topic of documentation quality (which Geoff asked about), how does one find this out?
> Go to http://openbabel.org/wiki/Develop and there's mention of FP2, FP3 and FP4 but not MACCS fingerprints.

Point taken.

> (BTW, what's the reason that OB prefers these point releases, like
>  - 2009-07-31 Open Babel 2.2.3 Released
>  - 2009-07-10 Open Babel 2.2.2 Released
>  - 2009-02-03 Open Babel 2.2.1 Released
>  - 2008-07-04 Open Babel 2.2.0 Released
> ? There have been some major changes across those releases, which would seem to warrant 2.3, 2.4, and higher.)

Our version number policy is that plugins (e.g., new fingerprints, formats, etc.) can be added along with "point releases" since they only affect one minor section of the code. In other words, adding MACCS fingerprints doesn't make any other part of the code more or less stable.

Major releases (1.x, 2.x, 3.x) reflect breaks in backwards compatibility with other code. There will be a v3.0, and classes and methods will be removed and reorganized.

Minor releases (2.0.x, 2.1.x, 2.2.x, 2.3.x) reflect the addition of major features (force fields, 3D coordinate generation, 2D depiction, re-written stereo engine, etc.) and API calls. Nothing is removed, although binaries may need to be recompiled. New plugin types can be added as API enhancements. (For example, v2.3 will bring point charge models, so we can calculate charges beyond Gasteiger, now including MMFF94, QEq and the new QTPIE method.)

Point releases reflect bug fixes and added plugins. The "plugin" nature is critical, since we (or others) can add code modules without affecting anyone. We could add a dozen fingerprints, descriptors, formats, and they're all accessed the same way. Users and code can test for the presence of different plugins and fail gracefully.
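
The policy above can be summarized as a small sketch. The function name `release_kind` is made up for illustration, and it assumes plain three-part "major.minor.point" version strings.

```python
def release_kind(old, new):
    """Classify the step between two 'major.minor.point' version strings."""
    o = tuple(int(part) for part in old.split("."))
    n = tuple(int(part) for part in new.split("."))
    if n[0] != o[0]:
        return "major: backwards compatibility may break"
    if n[1] != o[1]:
        return "minor: new features and API calls, nothing removed"
    return "point: bug fixes and new plugins only"


for old, new in (("2.2.2", "2.2.3"), ("2.2.3", "2.3.0"), ("2.3.0", "3.0.0")):
    print(old, "->", new, "|", release_kind(old, new))
```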

If I used another version numbering scheme, we could easily be on Babel 5.0 or Babel 2010 SP2.3. I understand adding 3D generation and 2D generation are big features, but our version numbering also emphasizes stability. We don't break other people's programs lightly, and in turn, we're used by over 30 open source projects and uncounted others.

> One of the difficulties I have in making my report is that all of the libraries have things in development, available for the next release. Is there a roadmap/timeline for that? The one at http://openbabel.org/wiki/Roadmap seems out of date since the last edit was Dec 2006.

Version 2.3 should be released this summer. The code is already production-quality, but we obviously like to clean up bug reports and ensure our API enhancements are enough to support users for another year.

> http://www.google.com/search?client=safari&rls=en&q=depiction+site:openbabel.org&ie=UTF-8&oe=UTF-8
> finds all of two pages, both for pybel saying it uses OASA. Will pybel in 2.2.4 (or 2.3?) use the internal depicter instead of OASA? I see it will support SVG, but will it also support a bitmap format of some sort?

Since the wiki covers released versions (to prevent confusion), there are no pages on the 2.3 release. The OB code will not support a bitmap format, as that would require an additional dependency. I suspect Pybel will integrate the internal depiction routines for v2.3.

> While I know that it's possible, I can't find the documentation about it. How does one develop a new fingerprint? During testing can the new .so be in the user's directory space, or must extensions be in the OpenBabel installation?

The easiest way to develop a new fingerprint is to copy an existing one as a model. You raise a good point that there is not yet an example fingerprint code. For testing, the new .so can be in a user-specified location using the environment variable BABEL_LIBDIR. This is how I do *my* development.

Hope that helps,
-Geoff
------------------------------------------------------------------------------
ThinkGeek and WIRED's GeekDad team up for the Ultimate
GeekDad Father's Day Giveaway. ONE MASSIVE PRIZE to the
lucky parental unit.  See the prize list and enter to win:
http://p.sf.net/sfu/thinkgeek-promo
_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss

Re: state of OB features

Noel O'Boyle
Administrator
In reply to this post by Andrew Dalke
On 8 June 2010 15:43, Andrew Dalke <[hidden email]> wrote:

> On Jun 8, 2010, at 2:43 AM, Noel O'Boyle wrote:
>>>  OB also supports using a molecule as the query rather than a SMARTS.
>>
>> Hmm...not sure about this. Does it?
>
> I thought it did, but you and others disagree. I'll look again.
>
>> MACCS key is in there. There is also support for user-defined fingerprints.
>
> Huh. That was added last year with 2.2.1 and I missed it. Thanks!
>
> On the topic of documentation quality (which Geoff asked about), how does one find this out?
>
> Go to http://openbabel.org/wiki/Develop and there's mention of FP2, FP3 and FP4 but not MACCS fingerprints.
>
> Go to http://openbabel.org/wiki/Developer:API and search 2.2.0 documentation for MACCS and it isn't there. There's no link to the 2.2.1 documentation and for some reason there's a link to the 2.1.x beta API docs, which would be very out of date now, and pointless, yes?

We do appreciate feedback on this, as we receive very little. I
usually google "openbabel fingerprints" which points you to
http://openbabel.org/wiki/Tutorial:Fingerprints. I'll fix the other
pages.

> (BTW, what's the reason that OB prefers these point releases, like
>  - 2009-07-31 Open Babel 2.2.3 Released
>  - 2009-07-10 Open Babel 2.2.2 Released
>  - 2009-02-03 Open Babel 2.2.1 Released
>  - 2008-07-04 Open Babel 2.2.0 Released
> ? There's been some major changes during those releases, so would seem to warrant 2.3, 2.4, and higher.)

The numbering is based on API changes, not features (or marketing!).
The next version is 2.3.0 for example.

>>>  OB does not do depiction. For that case people should turn to other libraries, such as OASA.
>>
>> OB can do depiction, at least in the development version.
>
> One of the difficulties I have in making my report is that all of the libraries have things in development, available for the next release. Is there a roadmap/timeline for that? The one at http://openbabel.org/wiki/Roadmap seems out of date since the last edit was Dec 2006.

There isn't.

> At the last Python conference, Mark Shuttleworth's keynote address was on "Cadence, Quality, and Design"
>  http://us.pycon.org/2010/conference/schedule/event/122/
> and one lesson I learned was the usefulness of having predictable schedules. It's still something I'm thinking about, and I thought might be interesting in this context.

I don't think this is feasible, unless we have more developers who
will fix bugs.

> Again regarding documentation, how does one learn how to do this?
>
> http://www.google.com/search?client=safari&rls=en&q=depiction+site:openbabel.org&ie=UTF-8&oe=UTF-8
>
> finds all of two pages, both for pybel saying it uses OASA. Will pybel in 2.2.4 (or 2.3?) use the internal depicter instead of OASA? I see it will support SVG, but will it also support a bitmap format of some sort?

Our documentation focuses on the released version.

I'm not sure right now what Pybel will do; I think I may support both
OASA and the internal depictor but it depends on the time constraints.

>> Also MacOSX and Ruby. Also Cygwin, ...
>
> Indeed, but my client doesn't use those. ;)
>
>> Don't forget Pybel is part of OpenBabel.
>
> *smacks* *head*
>
> D'oh! I was thinking about the C++ interface here, comparing it to Python. Brain reset.
>
>> One other feature you haven't mentioned is that it has a plugin
>> architecture for fingerprints, formats, operations, charge models and
>> so forth (it's the same architecture in each case).
>
> Despite my subject line, this thread is more "OB features my client is interested in, based on talking with them."
>
> While I know that it's possible, I can't find the documentation about it. How does one develop a new fingerprint? During testing can the new .so be in the user's directory space, or must extensions be in the OpenBabel installation?

I'm not sure.

>
>                                Andrew
>                                [hidden email]
>
>
>


Re: state of OB features

Andrew Dalke
In reply to this post by Chris Morley-3
On Jun 8, 2010, at 4:56 AM, Chris Morley wrote:
> I think this is right for v3000, for which the support is fairly
> basic. v2000 has more but is not complete, e.g. no S groups.

So far I haven't found anyone that says they support all of v3000. Then again, I've heard that the spec isn't complete, so what OB does is what everyone else does. I haven't yet come across a v3000 file in the wild, so it's not that important.

> There is automatic recognition of which of several computational
> chemistry programs a .out and .log files came from.

 My understanding of the format support, admittedly incomplete, is that one specifies the format, and that's mapped to the proper shared library. There's nothing which arbitrates which shared library to use. So I assume here you mean there's a single shared library which understands different .out and .log files?

Cheers!

                                Andrew
                                [hidden email]




Re: state of OB features

Geoffrey Hutchison

On Jun 8, 2010, at 11:20 AM, Andrew Dalke wrote:

>> There is automatic recognition of which of several computational
>> chemistry programs a .out and .log files came from.
>
> My understanding of the format support, admittedly incomplete, is that one specifies the format, and that's mapped to the proper shared library. There's nothing which arbitrates which shared library to use. So I assume here you mean there's a single shared library which understands different .out and .log files?

No. There's a shared library which "sniffs" the file contents to work out which format's shared library to load, and then passes the file on. The "outformat" will recognize:
* GAMESS-US
* Gaussian
* MOPAC
* PWSCF
* GULP
* Q-Chem
* ADF
* NWChem
* MPQC
* Molpro
* Jaguar

There's no reason the code can't support more detection patterns, but these are the easiest, since these programs identify themselves at the start of the file, and many QM users use a generic extension like .out or .log.
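
The idea of content sniffing can be sketched in a few lines of Python. This is only an illustration of the technique, not Open Babel's code: the marker strings, the `PROGRAM_MARKERS` table and the `sniff_program` function are all assumptions made for the example, and the real detection patterns may differ.

```python
# Hypothetical marker strings. The real detection patterns live in the
# Open Babel source and are not reproduced here.
PROGRAM_MARKERS = [
    ("GAMESS-US", "GAMESS"),
    ("Gaussian", "Gaussian"),
    ("MOPAC", "MOPAC"),
    ("Q-Chem", "Q-Chem"),
    ("NWChem", "NWChem"),
]


def sniff_program(text, max_lines=50):
    """Return the first program whose marker appears near the top of `text`."""
    for line in text.splitlines()[:max_lines]:
        for program, marker in PROGRAM_MARKERS:
            if marker in line:
                return program
    # Nothing recognized: the caller has to fall back to an explicit format.
    return None


sample = " Entering Gaussian System, Link 0=g03\n Initial command:\n"
print(sniff_program(sample))
```

Only the top of the file is examined, which matches the point that these programs announce themselves early in their output.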

Hope that helps,
-Geoff

Re: state of OB features

Andrew Dalke
In reply to this post by Geoffrey Hutchison
On Jun 8, 2010, at 6:31 AM, Geoffrey Hutchison wrote:
> As Noel said, we do have support for stereochemistry around double bonds in SMILES. Stereochemistry is much improved thanks to Noel and Tim Vandermeersch in the soon-to-be-released v2.3. (SMARTS support for double-bond stereo is another matter.)

Is there an expected date for that? If it's within the next couple of weeks then I can put in the 2.3 information.

> Yes. On any file format like PDB or XYZ which does not support bond types, perception is run to determine connectivity and bond order. For PDB, this is also done first via residue names. Bond perception can also be turned off via the command-line or programmatically (e.g., some users run MD simulations and have their own topology file).

Where is this documented? Searching for "PDB perception" on the OpenBabel site doesn't find much of anything about the algorithm used. My own history of working with the PDB says it's a lot of work to get those details, and a quick look at the OEChem release notes has things like:

 - Added PDB support for the following:
   - sidechain recognition for the RNA residue ‘YG’ and ‘H2U’
   - naming of PDB residue ‘BME’
   - the N-terminal modification ‘FOR’
   - the cofactor ‘FMT’ (which is “formic acid” or “formate”)

but I don't see mention of the quality/robustness of the PDB reader in OB.


> Less than 1% of the time do I have to specify a format type manually. Formats can be guessed from file extensions, and for some file types (e.g., quantum packages that like the .out, .log, or .dat extensions), OB will attempt to guess the format from contents.


The examples I've seen are all like

====== straight openbabel
import openbabel as ob

obconversion = ob.OBConversion()
obconversion.SetInFormat("sdf")

obmol = ob.OBMol()

notatend = obconversion.ReadFile(obmol, "benzodiazepine.sdf.gz")
while notatend:
    ...
    notatend = obconversion.Read(obmol)

==== pybel
import pybel

for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):
    ...

===


where the format is explicitly specified. The documentation at

http://openbabel.org/dev-api/classOpenBabel_1_1OBConversion.shtml
under "To add automatic format conversion to an existing program."

uses

      ifstream ifs(filename); //Original code
      OBConversion conv;
      OBFormat* inFormat = conv.FormatFromExt(filename);
      OBFormat* outFormat = conv.GetFormat("ORIG");
      istream* pIn = &ifs;
      stringstream newstream;
      if (inFormat && outFormat)
      {
         conv.SetInAndOutFormats(inFormat,outFormat);
         conv.Convert(pIn,&newstream);
         pIn=&newstream;
      }

which allows automatic format detection based on the extension, but it's a lot of boilerplate code.

I didn't realize I could leave the format name out and the code would autodetect.
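
For comparison, extension-based guessing takes very little code on the Python side. The sketch below is an assumption-laden illustration, not Open Babel's implementation: `guess_format` and `COMPRESSION_SUFFIXES` are hypothetical names, and it assumes that the extension itself is an acceptable format name, which will not hold for every format.

```python
import os

# Hypothetical list of wrappers to look through before reading the extension.
COMPRESSION_SUFFIXES = (".gz",)


def guess_format(filename):
    """Return the lowercase extension as a format name, or None if there is none."""
    root, ext = os.path.splitext(os.path.basename(filename))
    if ext.lower() in COMPRESSION_SUFFIXES:
        # Look one level deeper: "x.sdf.gz" -> "x.sdf" -> ".sdf".
        root, ext = os.path.splitext(root)
    return ext[1:].lower() or None


print(guess_format("benzodiazepine.sdf.gz"))
print(guess_format("tpy.mol2"))
```

The result could then be passed as the first argument of pybel.readfile in place of a hard-coded "sdf".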

>> OB also supports using a molecule as the query rather than a SMARTS.
>
> Well, you can output a SMILES from a molecule and use that as a SMARTS. That's a unit test, so we can guarantee that always works. As Chris said, there's also the fastsearch format.

Perhaps that's what I was looking at. I'll have to dig into that again.

> It's not currently exposed to users, but the OBChemTsfm class is used to handle pH-dependent protonation. It can handle this task too. The syntax is basically reaction SMILES.

I'm assuming this will be exposed when someone volunteers to do it? ;)

> We're always open to feedback about areas of documentation needing clarification. Telling us it's sketchy and/or incomplete doesn't help much. Pointers to areas needing improvement will be met with applause (and fixes).

I thought it was pretty clear that the documentation in OpenBabel was sketchy, in comparison to some of the other toolkits, like OEChem or ChemAxon. I've mentioned a few of these places in this and my other responses.


> As Noel mentioned, we *are* pybel. So I think we win that comparison. Yes, the C++ interface is slightly more verbose, but that's also true of C++ versus Python in general.

I've been thinking about this since replying to Noel's email. OpenBabel publishes two different Python APIs - the one with the C++ interface and the Pybel interface.

The Zen of Python includes

  There should be one-- and preferably only one --obvious way to do it.

Is pybel the preferred way to do things on the Python level?


> We keep an audit log. From the command-line you get a summary:
>
> [ghutchis@Iridium]: babel tpy-Ru.sdf tpy.mol2
> 1 molecule converted
> 1 info messages 23 audit log messages
>
> You can programmatically interrogate the error log to get the warnings, severity level, etc. The audit level is intended to cover any code which may change chemical interpretation (e.g., Kekulization, adding implicit hydrogens, bond perception, etc.).

That's also what OpenEye does, but getting access to the error log, synchronized with the reader, is nasty hard. Can someone show me how to get that? For example, if Pybel is the preferred way to get this data, then how do I get the error logs for each molecule in

for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):

 ?


> Hope that helps,

It does. Thanks!


                                Andrew
                                [hidden email]




Re: state of OB features

Noel O'Boyle
Administrator
On 8 June 2010 17:03, Andrew Dalke <[hidden email]> wrote:

> On Jun 8, 2010, at 6:31 AM, Geoffrey Hutchison wrote:
>> As Noel said, we do have support for stereochemistry around double bonds in SMILES. Stereochemistry is much improved thanks to Noel and Tim Vandermeersch in the soon-to-be-released v2.3. (SMARTS support for double-bond stereo is another matter.)
>
> Is there an expected date for that? If it's within the next couple of weeks then I can put in the 2.3 information.
>
>> Yes. On any file format like PDB or XYZ which does not support bond types, perception is run to determine connectivity and bond order. For PDB, this is also done first via residue names. Bond perception can also be turned off via the command-line or programmatically (e.g., some users run MD simulations and have their own topology file).
>
> Where is this documented? Searching for "PDB perception" on the OpenBabel site doesn't find much of anything about the algorithm used. My own history of working with the PDB says it's a lot of work to get those details, and a quick look at the OEChem release notes has things like:
>
>  - Added PDB support for the following:
>   - sidechain recognition for the RNA residue ‘YG’ and ‘H2U’
>   - naming of PDB residue ‘BME’
>   - the N-terminal modification ‘FOR’
>   - the cofactor ‘FMT’ (which is “formic acid” or “formate”)
>
> but I don't see mention of the quality/robustness of the PDB reader in OB.
>
>
>> Less than 1% of the time do I have to specify a format type manually. Formats can be guessed from file extensions, and for some file types (e.g., quantum packages that like the .out, .log, or .dat extensions), OB will attempt to guess the format from contents.
>
>
> The examples I've seen are all like
>
> ====== straight openbabel
> import openbabel as ob
>
> obconversion = ob.OBConversion()
> obconversion.SetInFormat("sdf")
>
>
> obmol = ob.OBMol()
>
> notatend = obconversion.ReadFile(obmol, "benzodiazepine.sdf.gz")
> while notatend:
>    ...
>    notatend = obconversion.Read(obmol)
>
>
> ==== pybel
>
> import  pybel
>
>
> for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):
>    ...
>
> ===
>
>
> where the format is explicitly specified. The documentation at
>
> http://openbabel.org/dev-api/classOpenBabel_1_1OBConversion.shtml
> under "To add automatic format conversion to an existing program."
>
> uses
>
>      ifstream ifs(filename); //Original code
>      OBConversion conv;
>      OBFormat* inFormat = conv.FormatFromExt(filename);
>      OBFormat* outFormat = conv.GetFormat("ORIG");
>      istream* pIn = &ifs;
>      stringstream newstream;
>      if (inFormat && outFormat)
>      {
>         conv.SetInAndOutFormats(inFormat,outFormat);
>         conv.Convert(pIn,&newstream);
>         pIn=&newstream;
>      }
>
> which allows automatic format detection based on the extension, but it's a lot of boilerplate code.
>
> I didn't realize I could leave the format name out and the code would autodetect.
>
>>> OB also supports using a molecule as the query rather than a SMARTS.
>>
>> Well, you can output a SMILES from a molecule and use that as a SMARTS. That's a unit test, so we can guarantee that always works. As Chris said, there's also the fastsearch format.
>
> Perhaps that's what I was looking at. I'll have to dig into that again.
>
>> It's not currently exposed to users, but the OBChemTsfm class is used to handle pH-dependent protonation. It can handle this task too. The syntax is basically reaction SMILES.
>
> I'm assuming this will be exposed when someone volunteers to do it? ;)
>
>> We're always open to feedback about areas of documentation needing clarification. Telling us it's sketchy and/or incomplete doesn't help much. Pointers to areas needing improvement will be met with applause (and fixes).
>
> I thought it was pretty clear that the documentation in OpenBabel was sketchy, in comparison to some of the other toolkits, like OEChem or ChemAxon. I've mentioned a few of these places in this and my other responses.
>
>
>> As Noel mentioned, we *are* pybel. So I think we win that comparison. Yes, the C++ interface is slightly more verbose, but that's also true of C++ versus Python in general.
>
> I've been thinking about this since replying to Noel's email. OpenBabel publishes two different Python APIs - the one with the C++ interface and the Pybel interface.
>
> The Zen of Python includes
>
>  There should be one-- and preferably only one --obvious way to do it.

I'm well aware of the Zen of Python - it governed my design decisions.
If you think that this means Pybel should not exist, I would disagree,
and I think many users would too.

> Is pybel the preferred way to do things on the Python level?

That requires a poll of users. Personally, IMHO, I don't know why you
would use the bindings directly where you could use Pybel. But usually
I end up using a combination.

>> We keep an audit log. From the command-line you get a summary:
>>
>> [ghutchis@Iridium]: babel tpy-Ru.sdf tpy.mol2
>> 1 molecule converted
>> 1 info messages 23 audit log messages
>>
>> You can programmatically interrogate the error log to get the warnings, severity level, etc. The audit level is intended to cover any code which may change chemical interpretation (e.g., Kekulization, adding implicit hydrogens, bond perception, etc.).
>
> That's also what OpenEye does, but getting access to the error log, synchronized with the reader, is nasty hard. Can someone show me how to get that? For example, if Pybel is the preferred way to get this data, then how do I get the error logs for each molecule in
>
> for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):
>
>  ?

This will take me a while to dig into.

>> Hope that helps,
>
> It does. Thanks!
>
>
>                                Andrew
>                                [hidden email]
>
>
>


Re: state of OB features

Andrew Dalke
In reply to this post by Geoffrey Hutchison
On Jun 8, 2010, at 9:04 AM, Geoffrey Hutchison wrote:
> Our version number policy is that plugins (e.g., new fingerprints, formats, etc.) can be added along with "point releases" since they only affect one minor section of the code. In other words, adding MACCS fingerprints doesn't make any other part of the code more or less stable.

I know the Python policy best, which is

x level releases are backwards incompatible
x.y level releases add new features
x.y.z level releases are bug and security fixes only

The difference seems to be in this:

> The point release reflects improved bug-fixing and added plugins. The "plugin" nature is critical, since we (or others) can add code modules without affecting anyone. We could add a dozen fingerprints, descriptors, formats, and they're all accessed the same way. Users and code can test for the presence of different plugins and fail gracefully.

This means that OB's version numbering depends on the architecture, which seems to be an unexpected coupling. Take for example Eclipse, which is massively built on a plug-in architecture. I don't think their version scheme is dependent only on changes to the core plug-in framework.

In any case, I understand the scheme now.

> our version numbering also emphasizes stability. We don't break other people's programs lightly, and in turn, we're used by over 30 open source projects and uncounted others.

I'll point out that stability doesn't depend on version numbers; it's a process thing. Python is used by a lot more projects than OpenBabel, and with more developers, while having larger steps in their version numbers.

> The code is already production-quality, but we obviously like to clean up bug reports and ensure our API enhancements are enough to support users for another year.

> Since the wiki covers released versions (to prevent confusion), there are no pages on the 2.3 release. The OB code will not support a bitmap format, as that would require an additional dependency. I suspect Pybel will integrate the internal depiction routines for v2.3.

How do you get people to try out the new features, to see whether they're production-quality and to gather bug reports, if they have no access to documentation about what's available?

Historically speaking, the wiki has had references to development versions. You can see traces of that at

  http://openbabel.org/wiki/Developer:API

   Development Snapshots

    -  2.1.x beta API Documentation (Updated monthly)
     - This is intended for developers working with the bleeding-edge
         development code from the SVN repository trunk.


> The easiest way to develop a new fingerprint is to copy an existing one as a model. You raise a good point that there is not yet an example fingerprint code. For testing, the new .so can be in a user-specified location using the environment variable BABEL_LIBDIR. This is how I do *my* development.

Where is the documentation for BABEL_LIBDIR ? I looked at all 28 hits for "BABEL_LIBDIR" from Google, but didn't find any description of this.

I found a few mentions in the changelog, including this from 2005-09-11

        * src/dlhandler_unix.cpp: Add support for multiple search paths
        for shared format modules. Uses environmental variable
        BABEL_LIBDIR which can be a colon separated list of paths
        (e.g. /usr/lib/openbabel:/usr/local/lib/openbabel ...)

Note that the code also allows "\n" and "\r" as separators, though I can't figure out why. It also looks like at some point " " was allowed as a separator, but that was removed a few months ago in the dev version because it broke when people had paths with a space in the name.
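
To make the separator rules concrete, here is a small Python sketch of how such a value would split under the behaviour described above (colon, "\n" and "\r" as separators, space not). It mirrors the description only and is not a translation of dlhandler_unix.cpp; `plugin_search_paths` is a hypothetical name.

```python
import os
import re


def plugin_search_paths(environ=None):
    """Return the directories listed in BABEL_LIBDIR, in order."""
    if environ is None:
        environ = os.environ
    raw = environ.get("BABEL_LIBDIR", "")
    # Colon, newline and carriage return all act as separators. A space does
    # not, so directory names containing spaces survive intact.
    return [path for path in re.split(r"[:\r\n]", raw) if path]


example = {"BABEL_LIBDIR": "/usr/lib/openbabel:/home/me/my plugins"}
print(plugin_search_paths(example))
```

Passing a plain dictionary instead of os.environ makes it easy to try values without changing the real environment.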

Cheers!


                                Andrew
                                [hidden email]




Re: state of OB features

Andrew Dalke
In reply to this post by Noel O'Boyle
On Jun 8, 2010, at 10:25 AM, Noel O'Boyle wrote:
> I'm well aware of the Zen of Python - it governed my design decisions.
> If you think that this means Pybel should not exist, I would disagree,
> and I think many users would too.

Not at all. I've been talking about how the Pybel/Cinfony interface is by far the easiest API to use among all of the toolkits. The above was contextual lead-up to my next question.

(Besides, that Zen point doesn't mean there can't be two ways.)

>> Is pybel the preferred way to do things on the Python level?
>
> That requires a poll of users. Personally, IMHO, I don't know why you
> would use the bindings directly where you could use Pybel. But usually
> I end up using a combination.

Users, speak up! :)


                                Andrew
                                [hidden email]




Re: state of OB features

Noel O'Boyle
Administrator
In reply to this post by Noel O'Boyle
>>> We keep an audit log. From the command-line you get a summary:
>>>
>>> [ghutchis@Iridium]: babel tpy-Ru.sdf tpy.mol2
>>> 1 molecule converted
>>> 1 info messages 23 audit log messages
>>>
>>> You can programmatically interrogate the error log to get the warnings, severity level, etc. The audit level is intended to cover any code which may change chemical interpretation (e.g., Kekulization, adding implicit hydrogens, bond perception, etc.).
>>
>> That's also what OpenEye does, but getting access to the error log, synchronized with the reader, is nasty hard. Can someone show me how to get that? For example, if Pybel is the preferred way to get this data, then how do I get the error logs for each molecule in
>>
>> for mol in pybel.readfile("sdf", "benzodiazepine.sdf.gz"):
>>
>>  ?
>
> This will take me a while to dig into.

There is a global instance of the error log object at openbabel.obErrorLog:

Here's an example of usage, but see the API docs or dir() the object
for further methods:

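    import os
    import pybel

    sdf_path = os.path.join("not-backed-up", "all-sdf.sdf")
    for i, mol in enumerate(pybel.readfile("sdf", sdf_path)):
        # i is the molecule index; the second number comes from the global error log
        print("%d %d" % (i, pybel.ob.obErrorLog.GetErrorMessageCount()))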
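
To tie messages to individual molecules, one option is to compare the count before and after each record is read. The sketch below is generic Python and does not import Open Babel: `with_new_message_counts` is a hypothetical helper, and the stand-in reader and log exist only for the demonstration. It assumes that the count grows as the run proceeds and that messages are logged while a record is being read, neither of which is confirmed in this thread.

```python
def with_new_message_counts(items, get_count):
    """Yield (item, number of messages logged while that item was produced)."""
    iterator = iter(items)
    while True:
        before = get_count()
        try:
            # For a lazy reader, the actual parsing happens inside next().
            item = next(iterator)
        except StopIteration:
            return
        yield item, get_count() - before


# Stand-in reader and log, used only to show the behaviour.
fake_log = {"count": 0}


def fake_reader():
    for name, messages in [("mol-a", 0), ("mol-b", 2), ("mol-c", 1)]:
        fake_log["count"] += messages
        yield name


for name, new_messages in with_new_message_counts(fake_reader(), lambda: fake_log["count"]):
    print("%s %d" % (name, new_messages))
```

With Pybel, the reader would be the pybel.readfile(...) call and the counter would be pybel.ob.obErrorLog.GetErrorMessageCount, passed without parentheses.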

- Noel
