[Open Babel] Complete User docs

drc-2
Comments/suggestions/typos/etc very welcome

OpenBabel User Documentation

The aim of this document is to provide real-world examples of the syntax needed to use OpenBabel. It is not a developer's guide.

To get help on using OpenBabel from the command line, type babel -H at the Terminal prompt:

PROMPT>babel -H

This will output the general syntax followed by a list of conversion options and the file formats currently supported.

Open Babel converts chemical structures from one file format to another

Usage: babel <input spec> <output spec> [Options]

Each spec can be a file whose extension decides the format.
Optionally the format can be specified by preceding the file by
-i<format-type> e.g. -icml, for input and -o<format-type> for output

See below for available format-types, which are the same as the
file extensions and are case independent.
If no input or output file is given stdin or stdout are used instead.

More than one input file can be specified and their names can contain
wildcard chars (* and ?).The molecules are aggregated in the output file.

Conversion options
  -f <#> Start import at molecule # specified
  -l <#> End import at molecule # specified
  -t All input files describe a single molecule
  -e Continue with next object after error, if possible
  -z Compress the output with gzip
  -H Outputs this help text
  -Hxxx (xxx is file format ID e.g. -Hcml) gives format info
  -Hall Outputs details of all formats
  -V Outputs version number
  -F Outputs the available fingerprint types
  -m Produces multiple output files, to allow:
     Splitting: e.g.        babel infile.mol new.smi -m
       puts each molecule into new1.smi new2.smi etc
     Batch conversion: e.g. babel *.mol -osmi -m
       converts each input file to a .smi file
For conversions of molecules
   Additional options :
   -d Delete Hydrogens
   -h Add Hydrogens
   -p Add Hydrogens appropriate for pH
   -b Convert dative bonds e.g.[N+]([O-])=O to N(=O)=O
   -c Center Coordinates
   -j Join all input molecules into a single output molecule
   -s"smarts" Convert only molecules matching SMARTS:
   -v"smarts" Convert only molecules NOT matching SMARTS:

Interface to OBAPI internals
 API options, e.g. ---errorlevel 2
  errorlevel # to control logging and reporting
 
The following file formats are recognized:
  alc -- Alchemy format
  bgf -- MSI BGF format
  box -- Dock 3.5 Box format
  bs -- Ball and Stick format
  c3d1 -- Chem3D Cartesian 1 format
  c3d2 -- Chem3D Cartesian 2 format
  caccrt -- Cacao Cartesian format
  cache -- CAChe MolStruct format [Write-only]
  cacint -- Cacao Internal format [Write-only]
  car -- Accelrys/MSI Biosym/Insight II CAR format [Read-only]
  ccc -- CCC format [Read-only]
  cht -- Chemtool format [Write-only]
  cml --  Chemical Markup Language
  cmlr --  CML Reaction format
  com -- Gaussian 98/03 Cartesian Input [Write-only]
  copy -- Copies raw text [Write-only]
  crk2d -- Chemical Resource Kit diagram format (2D)
  crk3d -- Chemical Resource Kit 3D format
  csr -- Accelrys/MSI Quanta CSR format [Write-only]
  cssr -- CSD CSSR format [Write-only]
  ct -- ChemDraw Connection Table format
  dmol -- DMol3 coordinates format
  ent -- Protein Data Bank format
  feat -- Feature format
  fh -- Fenske-Hall Z-Matrix format [Write-only]
  fix -- SMILES FIX format [Write-only]
  fpt -- Fingerprint format [Write-only]
  fract -- Free Form Fractional format
  fs -- FastSearching
  g03 -- Gaussian98/03 Output [Read-only]
  g98 -- Gaussian98/03 Output [Read-only]
  gam -- GAMESS Output [Read-only]
  gamin -- GAMESS Input [Write-only]
  gamout -- GAMESS Output [Read-only]
  gau -- Gaussian 98/03 Cartesian Input [Write-only]
  gpr -- Ghemical format
  gr96 -- GROMOS96 format [Write-only]
  hin -- HyperChem HIN format
  inp -- GAMESS Input [Write-only]
  ins -- ShelX format [Read-only]
  jin -- Jaguar input format [Write-only]
  jout -- Jaguar output format [Read-only]
  mdl -- MDL MOL format
  mmd -- MacroModel format
  mmod -- MacroModel format
  mol -- MDL MOL format
  mol2 -- Sybyl Mol2 format
  mopcrt -- MOPAC Cartesian format
  mopout -- MOPAC Output format [Read-only]
  mpd -- Sybyl descriptor format [Write-only]
  mpqc -- MPQC output format [Read-only]
  mpqcin -- MPQC simplified input format [Write-only]
  nw -- NWChem input format [Write-only]
  nwo -- NWChem output format [Read-only]
  pc --  PubChem format  [Read-only]
  pdb -- Protein Data Bank format
  pov -- POV-Ray input format [Write-only]
  pqs -- Parallel Quantum Solutions format
  prep -- Amber Prep format [Read-only]
  qcin -- Q-Chem input format [Write-only]
  qcout -- Q-Chem output format [Read-only]
  report -- Open Babel report format [Write-only]
  res -- ShelX format [Read-only]
  rxn -- MDL RXN format
  sd -- MDL MOL format
  sdf -- MDL MOL format
  smi -- SMILES format
  tmol -- TurboMole Coordinate format
  txyz -- Tinker MM2 format [Write-only]
  unixyz -- UniChem XYZ format
  vmol -- ViewMol format
  xed -- XED format [Write-only]
  xml --  General XML format [Read-only]
  xyz -- XYZ cartesian coordinates format
  yob -- YASARA.org YOB format
  zin -- ZINDO input format [Write-only]

For many of the file types there are additional options. These can be listed using:

PROMPT>babel -Hall

alc -- Alchemy format
             No comments yet
bgf -- MSI BGF format
             No comments yet
box -- Dock 3.5 Box format
             No comments yet
Specification http://dock.compbio.ucsf.edu/
bs -- Ball and Stick format
             No comments yet
Specification http://ocwww.chemie.uni-linz.ac.at/mueller/ball_and_stick.html
c3d1 -- Chem3D Cartesian 1 format
             No comments yet
c3d2 -- Chem3D Cartesian 2 format
             No comments yet
caccrt -- Cacao Cartesian format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification http://www.chembio.uoguelph.ca/oakley/310/cacao/cacao.htm
cache -- CAChe MolStruct format [Write-only]
             No comments yet
cacint -- Cacao Internal format [Write-only]
             No comments yet
   Specification http://www.chembio.uoguelph.ca/oakley/310/cacao/cacao.htm
car -- Accelrys/MSI Biosym/Insight II CAR format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
ccc -- CCC format [Read-only]
             No comments yet
cht -- Chemtool format [Write-only]
             No comments yet
Specification http://ruby.chemie.uni-freiburg.de/~martin/chemtool/chemtool.html

cml --  Chemical Markup Language
 XML format. This implementation uses libxml2.
 Write options for CML: -x[flags] (e.g. -x1ac)
 1  output CML1 (rather than CML2)
 a  output array format for atoms and bonds
 h  use hydrogenCount for all hydrogens
 m  output metadata
 x  omit XML and namespace declarations
 N<prefix> add namespace prefix to elements
Specification at: http://wwmm.ch.cam.ac.uk/moin/ChemicalMarkupLanguage
cmlr --  CML Reaction format
 Minimal implementation
 This implementation uses libxml2.
 Write options (e.g. -x1ac)
 1  output CML V1.0  or
 2  output CML V2.0 (default)
 a  output array format for atoms and bonds
 l  molecules in list
 h  use hydrogenCount for all hydrogens
 x  omit XML declaration
 N<prefix> add namespace prefix to elements
com -- Gaussian 98/03 Cartesian Input [Write-only]
             No comments yet
Specification at: http://www.gaussian.com/g_ur/m_input.htm
copy -- Copies raw text [Write-only]
 Objects can be chemically filtered without the risk
 of losing any additional information they contain,
 since no format conversion is done.
 Note that XML files may be missing non-object elements
 at the start or end and so may no longer be well formed.
crk2d -- Chemical Resource Kit diagram format (2D)
             No comments yet
Specification at: http://crk.sourceforge.net/
crk3d -- Chemical Resource Kit 3D format
             No comments yet
Specification at: http://crk.sourceforge.net/
csr -- Accelrys/MSI Quanta CSR format [Write-only]
             No comments yet
cssr -- CSD CSSR format [Write-only]
             No comments yet
ct -- ChemDraw Connection Table format
             No comments yet
dmol -- DMol3 coordinates format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
ent -- Protein Data Bank format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html
feat -- Feature format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
fh -- Fenske-Hall Z-Matrix format [Write-only]
             No comments yet
fix -- SMILES FIX format [Write-only]
             No comments yet
fpt -- Fingerprint format [Write-only]
 Constructs and displays fingerprints and (for multiple input objects)
 the Tanimoto coefficient and whether a superstructure of the first object
 Options e.g. -xfFP3 -xn128
  f<id> fingerprint type
  N# fold to specified number of bits, 32, 64, 128, etc.
  h  hex output when multiple molecules
  F  displays the available fingerprint types
fract -- Free Form Fractional format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
fs -- FastSearching
 Uses molecular fingerprints in an index file.
 Writing to the fs format makes an index (a very slow process)
   babel datafile.xxx index.fs
 Reading from the fs format does a fast search for:
   Substructure
     babel index.fs -sSMILES outfile.yyy   or
     babel datafile.xxx -ifs -sSMILES outfile.yyy
   Molecular similarity based on Tanimoto coefficient
     babel index.fs -sSMILES outfile.yyy -t0.7  (Tanimoto >0.7)
     babel index.fs -sSMILES outfile.yyy -t15   (best 15 molecules)
   The structure spec can be a molecule from a file: -Spatternfile.zzz
 Write Options (when making index) e.g. -xfFP3
  f# Fingerprint type
  N# Fold fingerprint to # bits
 Read Options (when searching) e.g. -at0.7
  t# Do similarity search: #mols or # as min Tanimoto
  a  Add Tanimoto to title
  l# Maximum number of candidates. Default<4000>
g03 -- Gaussian98/03 Output [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.gaussian.com/
g98 -- Gaussian98/03 Output [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.gaussian.com/
gam -- GAMESS Output [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.msg.ameslab.gov/GAMESS/doc.menu.html
gamin -- GAMESS Input [Write-only]
             No comments yet
Specification at: http://www.msg.ameslab.gov/GAMESS/doc.menu.html
gamout -- GAMESS Output [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.msg.ameslab.gov/GAMESS/doc.menu.html
gau -- Gaussian 98/03 Cartesian Input [Write-only]
             No comments yet
Specification at: http://www.gaussian.com/g_ur/m_input.htm
gpr -- Ghemical format
             Open source molecular modelling
Specification at: http://www.uku.fi/~thassine/ghemical/
gr96 -- GROMOS96 format [Write-only]
        Write Options e.g. -xn
        n output nm (not Angstroms)
hin -- HyperChem HIN format
              No comments yet
inp -- GAMESS Input [Write-only]
             No comments yet
Specification at: http://www.msg.ameslab.gov/GAMESS/doc.menu.html
ins -- ShelX format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://shelx.uni-ac.gwdg.de/SHELX/
jin -- Jaguar input format [Write-only]
Specification at: http://www.schrodinger.com/
jout -- Jaguar output format [Read-only]
       Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.schrodinger.com/
mdl -- MDL MOL format
 Reads and writes V2000 and V3000 versions
 Write Options, e.g. -x3
  2  output V2000 (default) or
  3  output V3000 (used for >999 atoms/bonds)
 Specification at: http://www.mdl.com/downloads/public/ctfile/ctfile.jsp
mmd -- MacroModel format
             No comments yet
mmod -- MacroModel format
             No comments yet
mol -- MDL MOL format
 Reads and writes V2000 and V3000 versions
 Write Options, e.g. -x3
  2  output V2000 (default) or
  3  output V3000 (used for >999 atoms/bonds)
 Specification at: http://www.mdl.com/downloads/public/ctfile/ctfile.jsp
mol2 -- Sybyl Mol2 format
             No comments yet
Specification at: http://www.tripos.com/data/support/mol2.pdf
mopcrt -- MOPAC Cartesian format
        Options e.g. -xs
        s  Output single bonds only
        b  Disable bonding entirely
mopout -- MOPAC Output format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
mpd -- Sybyl descriptor format [Write-only]
             [Molec_name]\t[atomtype];[layer]-[frequency]-[neighbour_type];
             Options: e.g. -xnc
             n prefix molecule names with name of file
             c use XML style separators instead
             i use IDX atom types of babel internal
Specification at: http://dx.doi.org/10.1021/ci034207y

mpqc -- MPQC output format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.mpqc.org/mpqc-html/mpqcinp.html
mpqcin -- MPQC simplified input format [Write-only]
             No comments yet
Specification at: http://www.mpqc.org/mpqc-html/mpqcinp.html
nw -- NWChem input format [Write-only]
             No comments yet
Specification at: http://www.emsl.pnl.gov/docs/nwchem/
nwo -- NWChem output format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.emsl.pnl.gov/docs/nwchem/
pc --  PubChem format  [Read-only]
 Minimal extraction of chemical structure information only.
Specification at: ftp://ftp.ncbi.nlm.nih.gov/pubchem/data_spec/pubchem.xsd
pdb -- Protein Data Bank format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification http://www.rcsb.org/pdb/docs/format/pdbguide2.2/guide2.2_frame.html
pov -- POV-Ray input format [Write-only]
             No comments yet
Specification at: http://www.povray.org/
pqs -- Parallel Quantum Solutions format
             No comments yet
Specification at: http://www.pqs-chem.com/
prep -- Amber Prep format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.amber.ucsf.edu/amber/formats.html
qcin -- Q-Chem input format [Write-only]
             No comments yet
Specification at: http://www.q-chem.com/
qcout -- Q-Chem output format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.q-chem.com/
report -- Open Babel report format [Write-only]
             No comments yet
res -- ShelX format [Read-only]
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://shelx.uni-ac.gwdg.de/SHELX/
rxn -- MDL RXN format
sd -- MDL MOL format
 Reads and writes V2000 and V3000 versions
 Write Options, e.g. -x3
  2  output V2000 (default) or
  3  output V3000 (used for >999 atoms/bonds)
 Specification at: http://www.mdl.com/downloads/public/ctfile/ctfile.jsp

sdf -- MDL MOL format
 Reads and writes V2000 and V3000 versions
 Write Options, e.g. -x3
  2  output V2000 (default) or
  3  output V3000 (used for >999 atoms/bonds)
 Specification at: http://www.mdl.com/downloads/public/ctfile/ctfile.jsp
smi -- SMILES format
             A linear text format which can describe the connectivity
             and chirality of a molecule
             Write Options e.g. -xt
             -n no molecule name
             -t molecule name only
            -r radicals lower case eg ethyl is Cc
Specification at: http://www.daylight.com/dayhtml/smiles/
tmol -- TurboMole Coordinate format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://www.turbomole.com/
txyz -- Tinker MM2 format [Write-only]
             No comments yet
Specification at: http://dasher.wustl.edu/tinker/
unixyz -- UniChem XYZ format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
vmol -- ViewMol format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
Specification at: http://viewmol.sourceforge.net/
xed -- XED format [Write-only]
             No comments yet
xml --  General XML format [Read-only]
 Calls a particular XML format depending on the XML namespace,
 or the default format (which is probably CML).
 This implementation uses libxml2.
xyz -- XYZ cartesian coordinates format
        Read Options e.g. -as
        s  Output single bonds only
        b  Disable bonding entirely
yob -- YASARA.org YOB format
             No comments yet
             Specification at: http://www.yasara.org
zin -- ZINDO input format [Write-only]
             No comments yet



File Conversion

To convert mymols.sdf to SMILES format:

PROMPT>babel -isdf  'mymols.sdf' -osmi 'outputfile.smi'

You may need to include the full path to the files, e.g. '/Users/username/Desktop/mymols.sdf'. If no input or output format is specified, OpenBabel will try to assign the file type based on the file suffix.
OpenBabel cannot generate coordinates, so whilst a conversion from SMILES to SDF will produce a file, the resulting SDF will not contain coordinates.
Similarly, OpenBabel cannot generate a 3D structure from a 2D structure file. (If you would like to add this functionality, you would be very welcome.)
If you want to remove all hydrogens when doing the conversion, the command would be:

PROMPT>babel -isdf  'mymols.sdf' -osmi 'outputfile.smi' -d

If you want to add all hydrogens when doing the conversion, the command would be:

PROMPT>babel -isdf  'mymols.sdf' -osmi 'outputfile.smi' -h

If you want to add hydrogens appropriate for pH 7.4 when doing the conversion, the command would be:

PROMPT>babel -isdf  'mymols.sdf' -osmi 'outputfile.smi' -p

The protonation appears to be done on an atom-by-atom basis, so molecules with multiple ionizable centers will have all centers ionized.

Of course, you don’t actually need to change the file type to modify the hydrogens. If you want to add all hydrogens and keep the SDF format, the command would be:

PROMPT>babel -isdf  'mymols.sdf' -osdf 'mymols_H.sdf' -h

Some functional groups, e.g. nitro or sulphone, can be represented either as [N+]([O-])=O or N(=O)=O. To convert the dative bond form [N+]([O-])=O to N(=O)=O throughout, use -b:

PROMPT>babel -isdf  'mymols.sdf'  -osmi 'outputfile.smi' -b

If you only want to convert a subset of molecules you can define them using -f and -l, so to convert molecules 2-4 of the file mymols.sdf type:

PROMPT>babel   'mymols.sdf' -f 2 -l 4 -osdf 'outputfile.sdf'
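
The -f and -l options can also be used to work through a large file in fixed-size pieces. The following is only a sketch: print_chunk_commands is a hypothetical helper (not part of OpenBabel), the chunk_N.sdf output names are made up for illustration, and you have to supply the total number of molecules yourself. It prints the babel commands rather than running them, so you can inspect them first.

```shell
#!/bin/sh
# Dry run: print (do not execute) one babel command per chunk of molecules.
# print_chunk_commands is a hypothetical helper, not part of OpenBabel.
print_chunk_commands() {
    infile=$1   # input file, e.g. mymols.sdf
    total=$2    # number of molecules in the file (supplied by the caller)
    size=$3     # number of molecules per chunk
    first=1
    n=1
    while [ "$first" -le "$total" ]; do
        last=$((first + size - 1))
        # Clamp the final chunk to the last molecule in the file.
        if [ "$last" -gt "$total" ]; then
            last=$total
        fi
        echo "babel '$infile' -f $first -l $last -osdf 'chunk_$n.sdf'"
        first=$((last + 1))
        n=$((n + 1))
    done
}

# Example: a 10-molecule file split into chunks of 4 molecules.
print_chunk_commands mymols.sdf 10 4
```

If the printed commands look right, they can be pasted at the prompt or piped to sh.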

Alternatively you can select a subset matching a SMARTS pattern, so to select all molecules containing bromobenzene use:

PROMPT>babel 'mymols.sdf' -osdf 'selected.sdf' -s 'c1ccccc1Br'

You can also select the subset that does not match a SMARTS pattern, so to select all molecules not containing bromobenzene use:

PROMPT>babel 'mymols.sdf' -osdf 'selected.sdf' -v 'c1ccccc1Br'


You can of course combine options, so to join molecules and add hydrogens, type:

PROMPT>babel 'mymols.sdf' -osdf 'myjoined.sdf' -h -j

The output file can be compressed with gzip, but note that if you don’t specify the “.gz” suffix it will not be added, which could cause problems when you try to open the file.

PROMPT> babel 'mymols.sdf' -osdf 'outputfile.sdf.gz' -z
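
Because the “.gz” suffix is not added for you, a small wrapper can guard against forgetting it. This is a minimal sketch: ensure_gz is a hypothetical helper name, not an OpenBabel feature, and the example only prints the babel command it would build.

```shell
#!/bin/sh
# ensure_gz: hypothetical helper that appends ".gz" to a file name
# unless the name already ends in ".gz".
ensure_gz() {
    case $1 in
        *.gz) printf '%s\n' "$1" ;;
        *)    printf '%s.gz\n' "$1" ;;
    esac
}

# Build the output name, then show the command that would be run.
out=$(ensure_gz outputfile.sdf)
echo "babel 'mymols.sdf' -osdf '$out' -z"
```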

Fingerprints

You can see the available fingerprints by typing the following command:

PROMPT>babel -F
FP2 -- Indexes linear fragments up to 7 atoms.
FP3 -- SMARTS patterns specified in the file patterns.txt

At present there are two types of fingerprints. FP2 indexes linear fragments of up to 7 atoms.
The other fingerprint type, FP3, uses a series of SMARTS queries that are stored in /usr/local/share/openbabel/patterns.txt (you can add your own SMARTS queries to this file).

For relatively small datasets (up to tens of thousands of molecules) it is possible to do similarity searches without building a similarity index. Larger datasets (up to hundreds of thousands of molecules) can be
searched rapidly once the index has been built.

On small datasets these fingerprints can be used in a variety of ways. For example, the command

PROMPT> babel 'mymols.sdf' -ofpt

>MOL_00000067
>MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
>MOL_00000105   Tanimoto from MOL_00000067 = 0.833333
>MOL_00000296   Tanimoto from MOL_00000067 = 0.425926
>MOL_00000320   Tanimoto from MOL_00000067 = 0.534884
>MOL_00000328   Tanimoto from MOL_00000067 = 0.511111
>MOL_00000338   Tanimoto from MOL_00000067 = 0.522727
>MOL_00000354   Tanimoto from MOL_00000067 = 0.534884
>MOL_00000378   Tanimoto from MOL_00000067 = 0.489362
>MOL_00000391   Tanimoto from MOL_00000067 = 0.489362
10 molecules converted

will give you the Tanimoto coefficient between the first molecule in mymols.sdf and each of the subsequent ones. You don’t have to have all the structures in the same file or
format, so the following command gives you the Tanimoto coefficient between a SMILES string in mysmiles.smi and all the molecules in mymols.sdf:

PROMPT> babel 'mysmiles.smi' 'mymols.sdf' -ofpt

>MOL_00000067   Tanimoto from first mol = 0.0888889
>MOL_00000083   Tanimoto from first mol = 0.0869565
>MOL_00000105   Tanimoto from first mol = 0.0888889
>MOL_00000296   Tanimoto from first mol = 0.0714286
>MOL_00000320   Tanimoto from first mol = 0.0888889
>MOL_00000328   Tanimoto from first mol = 0.0851064
>MOL_00000338   Tanimoto from first mol = 0.0869565
>MOL_00000354   Tanimoto from first mol = 0.0888889
>MOL_00000378   Tanimoto from first mol = 0.0816327
>MOL_00000391   Tanimoto from first mol = 0.0816327
11 molecules converted
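
The -ofpt listings above are plain text, so they can be post-processed with standard tools. As a sketch, the filter below keeps only the molecules whose Tanimoto coefficient is at or above a chosen threshold. filter_tanimoto is a hypothetical helper name, and it assumes the output layout is exactly as shown in this document (name prefixed with ">", coefficient as the last field). The sample input is copied from the first listing.

```shell
#!/bin/sh
# filter_tanimoto: hypothetical helper, reads "babel ... -ofpt" text on
# stdin and prints "name coefficient" for lines at or above the threshold.
filter_tanimoto() {
    awk -v min="$1" '
        / Tanimoto from / {
            name = substr($1, 2)      # drop the leading ">"
            if ($NF + 0 >= min)       # last field is the coefficient
                print name, $NF
        }'
}

# Sample lines taken from the listing in this document.
filter_tanimoto 0.8 <<'EOF'
>MOL_00000067
>MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
>MOL_00000105   Tanimoto from MOL_00000067 = 0.833333
>MOL_00000296   Tanimoto from MOL_00000067 = 0.425926
EOF
```

With the sample above, only MOL_00000083 and MOL_00000105 pass the 0.8 threshold. In real use you would pipe the babel output into the function instead of using the inline sample.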

If you wanted to know the similarity between only the substituted bromobenzenes in mymols.sdf, you might combine options like this:

PROMPT> babel 'mymols.sdf' -ofpt -s 'c1ccccc1Br'
>MOL_00000067
>MOL_00000083   Tanimoto from MOL_00000067 = 0.810811
>MOL_00000105   Tanimoto from MOL_00000067 = 0.833333

You can change the fingerprint type using the following command:

PROMPT> babel 'mymols.sdf' -ofpt -xfFP3

On larger datasets it is necessary to first build the index using the command

PROMPT> babel mymols.sdf -ofs

This builds mymols.fs with the default fingerprint, unfolded. To use it to find the top 5 matches to the molecule in target.sdf:

PROMPT> babel mymols.fs results.sdf -Starget.sdf -at5

or to get the matches with Tanimoto>0.6 to 1,2-dicyanobenzene:

PROMPT> babel mymols.fs results.sdf -sN#Cc1ccccc1C#N -at0.6

You can also do substructure searching using the index, so this command will find all molecules containing 1,2-dicyanobenzene and return the results as SMILES strings:

PROMPT> babel mymols.fs -ifs -sN#Cc1ccccc1C#N results.smi

If all you want are the molecule names, then adding -xt will return just the molecule names:

PROMPT> babel mymols.fs -ifs -sN#Cc1ccccc1C#N results.smi -xt



OpenBabel Tools

There are a number of included tools that have been built using OpenBabel.

obprop calculates a couple of simple molecular properties (molecular weight and ring count).

PROMPT>obprop 'mymols.sdf'  > 'outputfile.txt'

PROMPT> cat  outputfile.txt

name MOL_00000067
mol_weight 191.989
num_rings 1
$$$$
name MOL_00000083
mol_weight 191.989
num_rings 1
$$$$
name MOL_00000105
mol_weight 191.989
num_rings 1
$$$$
name MOL_00000296
mol_weight 207.077
num_rings 1
$$$$
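
The record layout shown above (one "key value" pair per line, records ending in $$$$) is easy to flatten into one line per molecule for use in a spreadsheet. This is a sketch only: obprop_to_table is a hypothetical helper name, and it assumes molecule names contain no spaces and that the output has exactly the three keys shown here.

```shell
#!/bin/sh
# obprop_to_table: hypothetical helper, reads obprop-style text on stdin
# and prints one tab-separated line per molecule: name, weight, rings.
obprop_to_table() {
    awk '
        $1 == "name"       { name = $2 }
        $1 == "mol_weight" { mw = $2 }
        $1 == "num_rings"  { rings = $2 }
        $1 == "$$$$"       { print name "\t" mw "\t" rings }
    '
}

# Sample records taken from the listing in this document.
obprop_to_table <<'EOF'
name MOL_00000067
mol_weight 191.989
num_rings 1
$$$$
name MOL_00000296
mol_weight 207.077
num_rings 1
$$$$
EOF
```

In real use the input would come from the file written by obprop, e.g. obprop_to_table < outputfile.txt.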

obgrep can be used for structure-based searching for molecules inside multi-molecule files (e.g., SMILES, SDF, etc.) or across multiple files.

     -c    Print the number of matches

     -f    Full match, print matching-molecules only when the number of heavy
           atoms is also equal to the number of atoms in the SMARTS pattern

     -i format
           Specifies input and output format

     -n    Only print the name of the molecules

     -t #  Print a molecule only if the pattern occurs # times inside the
           molecule

     -v    Invert the matching, print non-matching molecules

To print the names of those molecules containing bromobenzene:

PROMPT> obgrep  -n 'c1ccccc1Br'  'mymols.sdf'
MOL_00000067
MOL_00000083
MOL_00000105
MOL_00016985
MOL_00042466
MOL_00045017
MOL_00077464
MOL_00191850
MOL_00857068
MOL_01812494
MOL_02660600
MOL_02683063
MOL_02683851
MOL_02683853
MOL_02683854
MOL_03411551
MOL_03428533
MOL_03428552
MOL_03789141
MOL_04038961

To simply count the number of molecules that match:

PROMPT> obgrep  -c 'c1ccccc1Br'  'mymols.sdf'
20

To select only those compounds where the pattern occurs twice, use:

PROMPT> obgrep  -n -t 2 'c1ccccc1Br'  'mymols.sdf'
MOL_04038961


obchiral prints molecular chirality information:

PROMPT> obchiral 'mymols.sdf'
Molecule 1:  mol_1
Atom 2 Is Chiral C3
Volume= 8.35262
Atom refs= 1 2 3 4
Clockwise? 0
Molecule 2: mol_2
Atom 2 Is Chiral C3
Volume= -8.43506
Atom refs= 1 2 3 4
Clockwise? 0
Atom 3 Is Chiral N3
Volume= 2.15701
Atom refs= 1 2 3 4
Clockwise? 0
Molecule 3:  mol_3
Atom 2 Is Chiral C3
Volume= -8.37849
Atom refs= 1 2 3 4
Clockwise? 0
Atom 3 Is Chiral N3
Volume= 1.84595
Atom refs= 1 2 3 4
Clockwise? 0
Atom 19 Is Chiral C3
Volume= 10.331
Atom refs= 1 2 3 4
Clockwise? 0

This information can be redirected into a file like this:

PROMPT> obchiral 'mymols.sdf' > 'outputfile.txt'

obfit -- superimpose molecules based on a pattern. obfit superimposes molecules using a quaternion fit. The atoms used to fit the two molecules are defined by the SMARTS pattern
given by the user. It is useful for aligning a congeneric series of molecules on a common structural scaffold for 3D-QSAR studies. It can also be useful for displaying the results of
conformational generation.

PROMPT> obfit 'c1ccccc1Br' static.sdf mymols.sdf

obrotate -- batch-rotate dihedral angles matching SMARTS patterns
The obrotate program rotates the torsional (dihedral) angle of a specified bond in molecules to that defined by the user. In other words, it does the same as a user setting an angle
in a molecular modelling package, but much faster and in batch mode (i.e. across multiple molecules in a file). The four atom IDs required are indexes into the SMARTS pattern,
which starts at atom 0 (zero). The angle supplied is in degrees. The two atoms used to set the dihedral angle, <atom1> and <atom4>, do not need to be connected to the atoms of
the bond, <atom2> and <atom3>, in any way. The order of the atoms matters -- the portion of the molecule attached to <atom1> and <atom2> remains fixed, but the portion
bonded to <atom3> and <atom4> moves.
Let's say that you want to define the conformation of a large number of molecules with a pyridyl scaffold and substituted with an aliphatic chain at the 3-position, for example for
docking or 3D-QSAR purposes.

To set the value of the first dihedral angle to 90 degrees:

 PROMPT> obrotate 'c1ccncc1CCC' pyridines.sdf 5 6 7 8 90

Here 6 and 7 define the bond to rotate in the SMARTS pattern, i.e., c1-C, and atoms 5 and 8 define the particular dihedral angle to rotate.

Since the atoms to define the dihedral do not need to be directly connected, the nitrogen in the pyridine can be used:

 PROMPT> obrotate 'c1ccncc1CCC' pyridines.sdf 4 6 7 8 90

Keep the pyridyl ring fixed and move the aliphatic chain:

 PROMPT> obrotate 'c1ccncc1CCC' pyridines.sdf 5 6 7 8 90

Keep the aliphatic chain fixed and move the pyridyl ring:
 PROMPT> obrotate 'c1ccncc1CCC' pyridines.sdf 8 7 6 5 90




_______________________________________________
OpenBabel-discuss mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/openbabel-discuss