Join sdf files

Pascal Muller-3
Hello,

I would like to merge several sdf files (2D). Some molecules are
duplicated (same id), but the copies may carry different sdf fields,
or the same field with different values.
E.g., a molecule may be present twice, with these fields:
<id>
<activity> = 12
<property A>
for the first occurrence,

and for the second occurrence:
<id>
<id> (the same field twice, with the same value)
<activity> = 16
<property B>

I would like just one occurrence of the molecule in my final sdf (no
duplicates), with cleaned and merged fields, like:
<id>
<activity> = 12; 16; mean = 14
[or <activity> = 12; 16 and <mean_activity> = 14]
<property A>
<property B>

I can easily extract the data, clean / merge them with bash commands or
small perl scripts, remove duplicates, and create a new sdf with the
cleaned fields, but it's a multi-step process, and the field names
will be different the next time I need to do it...

So, I would like to know if somebody knows of an existing program that
can do that? (like JoinSDFiles.pl from MayaChemTools, but that one just
behaves like a cat command!)
Or any link to an Open Babel example program which I could modify?
(I'm not able to write a C or python program, but I'm able to
slightly modify one according to my needs...)

Many thanks for any advice,
Regards,
Pascal

_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss
Re: Join sdf files

Chris Morley-3
On 25/08/2010 11:44, Pascal Muller wrote:

> I would like to merge several sdf files (2D).
> [...]

You will need a program/script (perhaps using pybel) to do the whole
job, but the babel command line option -C can do some of what you
want. It combines properties/structure of molecules with the same
name, but it would not do the cleaning and merging. Unfortunately it
has been broken for some time, but I have repaired it in the
development code ready for the upcoming release.

Chris
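
For readers who want a starting point for the "program/script" mentioned above, here is a minimal sketch in plain Python. It does not use Open Babel or pybel; it only treats the sdf as text, so it is an illustration of the merge logic rather than the approach Chris had in mind. The function names (parse_sdf, merge_records, write_sdf) and the parameters id_field and mean_field are hypothetical, chosen to match the field names in the question. It assumes every record ends with a "$$$$" line, keeps the structure block of the first occurrence of each id, drops repeated identical values, and computes the mean over the distinct values that remain.

```python
import re

# Data header lines look like "> <name>" (optionally with extra text around).
TAG_RE = re.compile(r"^>.*?<([^>]+)>")


def parse_sdf(text):
    """Return a list of (molblock, [(field_name, value), ...]) records.

    Anything after the last "$$$$" line is ignored.
    """
    records, block, fields, in_data = [], [], [], False
    for line in text.splitlines():
        if line.strip() == "$$$$":  # end of the current record
            pairs = [(name, " ".join(vals)) for name, vals in fields]
            records.append(("\n".join(block), pairs))
            block, fields, in_data = [], [], False
            continue
        match = TAG_RE.match(line)
        if match:  # start of a new data field
            in_data = True
            fields.append((match.group(1), []))
        elif in_data:
            if line.strip():  # value line; blank lines end the value
                fields[-1][1].append(line.strip())
        else:
            block.append(line)  # still inside the structure block
    return records


def merge_records(records, id_field="id", mean_field="activity"):
    """Merge records sharing the same id; returns {id: {"block", "fields"}}."""
    merged = {}
    for block, fields in records:
        # Assumption: fall back to the title line when the id field is absent.
        title = block.split("\n", 1)[0]
        key = next((v for n, v in fields if n == id_field), title)
        entry = merged.setdefault(key, {"block": block, "fields": {}})
        for name, value in fields:
            values = entry["fields"].setdefault(name, [])
            if value not in values:  # drop repeated identical values
                values.append(value)
    for entry in merged.values():
        values = entry["fields"].get(mean_field)
        if not values:
            continue
        try:
            numbers = [float(v) for v in values]
        except ValueError:
            continue  # non-numeric values: leave the field as it is
        mean = sum(numbers) / len(numbers)
        entry["fields"]["mean_" + mean_field] = ["%g" % mean]
    return merged


def write_sdf(merged):
    """Serialize the merged records back to sdf text."""
    out = []
    for entry in merged.values():
        out.append(entry["block"])
        for name, values in entry["fields"].items():
            out.append(">  <%s>" % name)
            out.append("; ".join(values))
            out.append("")
        out.append("$$$$")
    return "\n".join(out) + "\n"
```

To use it, concatenate the contents of the input files into one string and call write_sdf(merge_records(parse_sdf(text))). Because the field names are passed as parameters, they can be changed the next time without touching the merge logic. If two occurrences carry the same activity value and both should count towards the mean, the de-duplication step would need to be relaxed for that field.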


