Chemistry::Mok - molecular awk interpreter


    use Chemistry::Mok;
    my $code = '/CS/g{ $n++; $l += $match->bond_map(0)->length }
        END { printf "Average C-S bond length: %.3f\n", $l/$n; }';

    my $mok = Chemistry::Mok->new($code);
    $mok->run({ format => 'mdlmol' }, glob("*.mol"));


This module is the engine behind the mok program. See mok(1) for a detailed description of the language. Mok is part of the PerlMol project.


  • Chemistry::Mok->new($code, %options)
  • Compile the code and return a Chemistry::Mok object. Available options:

    • package
    • If the package option is given, the code runs in the Chemistry::Mok::UserCode::$options{package} package instead of the Chemistry::Mok::UserCode::Default package. Specifying a package name is recommended if you have more than one mok object and you are using global variables, in order to avoid namespace clashes.

    • pattern_format
    • The name of the format which will be used for parsing slash-delimited patterns that don't define an explicit format. Mok versions until 0.16 only used the 'smiles' format, but newer versions can use other formats such as 'smarts', 'midas', 'formula_pattern', and 'sln', if available. The default is 'smarts'.
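The two constructor options can be combined as in the following minimal sketch. The package names CountCS and CountCO, the variable names, and the two patterns are made up for illustration. It assumes that Chemistry::Mok and a reader for the 'mdlmol' format are installed and that the current directory contains *.mol files.

    use strict;
    use warnings;
    use Chemistry::Mok;

    # Both snippets use a global counter called $n. Compiling each one
    # into its own package (hypothetical names below) keeps the two
    # counters from clashing.
    my $cs_code = '/CS/g { $n++ }
        END { printf "C-S matches: %d\n", $n || 0 }';
    my $co_code = '/CO/g { $n++ }
        END { printf "C-O matches: %d\n", $n || 0 }';

    # pattern_format selects how the slash-delimited patterns are parsed.
    my $cs_mok = Chemistry::Mok->new($cs_code,
        package        => 'CountCS',
        pattern_format => 'smiles',
    );
    my $co_mok = Chemistry::Mok->new($co_code,
        package        => 'CountCO',
        pattern_format => 'smiles',
    );

    # Run each compiled program over the same set of files.
    my @files = glob('*.mol');
    $cs_mok->run({ format => 'mdlmol' }, @files);
    $co_mok->run({ format => 'mdlmol' }, @files);

Without the package option, both objects would share the Chemistry::Mok::UserCode::Default package, so the second program would keep adding to the first program's $n.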

  • $mok->run($options, @args)
  • Run the code on the filenames contained in @args. $options is a hash reference with runtime options. Available options:

    • build_3d
    • Generate 3D coordinates using Chemistry::3DBuilder.

    • aromatize
    • "Aromatize" each molecule as it is read. This is needed, for example, for matching SMARTS patterns that use aromaticity or ring primitives.

    • delete_dummies
    • Delete dummy atoms after reading each molecule. A dummy atom is defined as an atom with an unknown symbol (i.e., it doesn't appear on the periodic table), or an atomic number of zero.

    • find_bonds
    • If set to a true value, find bonds. Use it when reading files that have 3D coordinates but no bond information, to detect the bonds if needed (for example, if you want to match a pattern that includes bonds). If the file has explicit bonds, mok will not try to find the bonds, but it will reassign the bond orders from scratch.

    • format
    • The format used when calling $mol_class->read. If not given, $mol_class->read tries to identify the format automatically.

    • mol_class
    • The molecule class used for reading the files. Defaults to Chemistry::Mol.
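As a rough illustration of the runtime options, the sketch below counts aromatic carbons in coordinate-only files. The 'xyz' format name, the *.xyz file names, and the $aromatic_carbons counter are assumptions made for this example. It presumes that a reader for that format (such as Chemistry::File::XYZ) is installed.

    use strict;
    use warnings;
    use Chemistry::Mok;

    # Count atoms matching the aromatic-carbon SMARTS primitive "c".
    my $code = '/c/g { $aromatic_carbons++ }
        END { printf "Aromatic carbons: %d\n", $aromatic_carbons || 0 }';
    my $mok = Chemistry::Mok->new($code);

    # Runtime options are passed as a hash reference, before the file names.
    my %run_options = (
        format         => 'xyz',  # assumed format name; omit to autodetect
        find_bonds     => 1,      # files have coordinates but no bonds
        aromatize      => 1,      # needed for the aromatic primitive above
        delete_dummies => 1,      # drop atoms with unknown symbols
    );

    $mok->run(\%run_options, glob('*.xyz'));

Here find_bonds supplies the connectivity that the files lack, and aromatize is applied because the pattern relies on aromaticity.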






Ivan Tubert-Brohman


Copyright (c) 2005 Ivan Tubert-Brohman. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.