chemlab.core

This package contains general functions and the most basic data containers, such as Atom, Molecule and System, plus some utility functions to create and edit common Systems.

The Atom class

class chemlab.core.Atom(type, r, export=None)

Create an Atom instance. Atom is a generic container for particle data.

Parameters

type: str
Atomic symbol
r: {np.ndarray [3], list [3]}
Atomic coordinates in nm
export: dict, optional
Additional export information.

Example

>>> Atom('H', [0.0, 0.0, 0.0])

In the next example we attach additional data to the Atom instance. chemlab.io.GroIO can use this information when exporting in the gro format.

>>> Atom('H', [0.0, 0.0, 0.0], {'groname': 'HW1'})
type
Type:str

The atomic symbol e.g. Ar, H, O.

r
Type:np.ndarray(3) of floats

Atomic position in nm.

mass
Type:float

Mass in atomic mass units.

charge
Type:float

Charge in electron charge units.

export
Type:dict

Dictionary containing additional information when importing data from various formats.

See also

chemlab.io.gro.GroIO

fields
Type:tuple

This is a class attribute. It holds the names of the attributes that constitute the Atom and is used to iterate over the Atom attributes at runtime.

copy()

Return a copy of the original Atom.

classmethod from_fields(**kwargs)

Create an Atom instance from a set of fields. This is a slightly faster way to initialize an Atom.

Example

>>> Atom.from_fields(type='Ar',
                     r=np.array([0.0, 0.0, 0.0]),
                     mass=39.948,
                     export={})

The Molecule class

class chemlab.core.Molecule(atoms, bonds=None, export=None)

Molecule is a data container for a set of N Atoms.

Parameters

atoms: list of Atom instances
Atoms that constitute the Molecule. Beware that the data is copied, so subsequent changes to the Atom instances are not reflected in the Molecule.
export: dict, optional
Export information for the Molecule
r_array
Type:np.ndarray((N,3), dtype=float)
Derived from:Atom

An array with the coordinates of each Atom.

type_array
Type:np.ndarray(N, dtype=str)
Derived from:Atom

An array containing the chemical symbols of the constituent atoms.

m_array
Type:np.ndarray(N, dtype=float)
Derived from:Atom

Array of masses.

charge_array
Type:np.ndarray(N, dtype=float)
Derived from:Atom

Array of the charges present on the atoms.

atom_export_array
Type:np.ndarray(N, dtype=object) array of dicts
Derived from:Atom

Array of Atom.export dicts.

n_atoms
Type:int

Number of atoms present in the molecule.

export
Type:dict

Export information for the whole Molecule.

bonds
Type:np.ndarray((NBONDS,2), dtype=int)

An array in which each row holds the indices of two atoms connected by a bond. Example: [[0 1] [0 2] [3 4]]

mass
Type:float

Mass of the whole molecule in amu.

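center_of_mass
Type:np.ndarray(3) of floats

Center of mass of the molecule, in nm.

geometric_center
Type:np.ndarray(3) of floats

Geometric center of the molecule, in nm.
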
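As a rough illustration of how these two quantities relate to the Atom-derived arrays, here is a minimal numpy sketch. It does not use chemlab itself. The coordinates and masses are illustrative values, and taking the geometric center as the unweighted mean of the positions is an assumption of this sketch.

```python
import numpy as np

# Illustrative water-like geometry in nm (values chosen for the example).
r_array = np.array([[0.0, 0.0, 0.0],
                    [0.1, 0.0, 0.0],
                    [0.0, 0.1, 0.0]])
m_array = np.array([15.999, 1.008, 1.008])  # masses in amu

# Total mass of the molecule.
mass = m_array.sum()

# Mass-weighted average of the positions, shape (3,).
center_of_mass = (m_array[:, np.newaxis] * r_array).sum(axis=0) / mass

# Unweighted average of the positions, shape (3,).
geometric_center = r_array.mean(axis=0)
```

Both results are 3-component vectors, which is why an expression such as center_of_mass[0] selects the x component.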
formula
Type:str

The molecular formula of the Molecule, e.g. "H2O".
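A formula string of this kind can be derived from type_array by counting symbols. The sketch below is only an illustration: the alphabetical ordering of the symbols is an assumption and may differ from the ordering chemlab uses.

```python
from collections import Counter

type_array = ['O', 'H', 'H']

# Count how many times each chemical symbol occurs.
counts = Counter(type_array)

# Join symbols alphabetically, omitting the count when it is 1.
formula = ''.join(symbol + (str(n) if n > 1 else '')
                  for symbol, n in sorted(counts.items()))
```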

copy()

Return a copy of the molecule instance

classmethod from_arrays(**kwargs)

Create a Molecule from a set of Atom-derived arrays (the Molecule attributes marked "Derived from:Atom" above). Only r_array and type_array are required; the others are optional.

>>> Molecule.from_arrays(r_array=np.array([[0.0, 0.0, 0.0],
                                           [1.0, 0.0, 0.0],
                                           [0.0, 1.0, 0.0]]),
                         type_array=np.array(['O', 'H', 'H']))
molecule(H2O)

Initializing a molecule in this way can be much faster than the default initialization method.

guess_bonds()

Guess the molecular bonds by using covalent radii information.
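One common way to guess bonds from covalent radii is to bond two atoms when their distance is below the sum of their radii plus a small tolerance. The following numpy sketch illustrates that idea. It is not chemlab's implementation: the function name guess_bonds_sketch, the tolerance value and the approximate radii are all assumptions made for the example.

```python
import numpy as np

# Approximate covalent radii in nm (assumed values for this sketch).
COVALENT_RADII = {'H': 0.031, 'O': 0.066}

def guess_bonds_sketch(r_array, type_array, tolerance=0.01):
    """Return an (NBONDS, 2) array of bonded atom index pairs."""
    radii = np.array([COVALENT_RADII[t] for t in type_array])
    # Pairwise distance matrix, shape (N, N).
    diff = r_array[:, np.newaxis, :] - r_array[np.newaxis, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Per-pair bonding threshold: sum of radii plus tolerance.
    threshold = radii[:, np.newaxis] + radii[np.newaxis, :] + tolerance
    # Keep only the upper triangle so each pair is reported once.
    i, j = np.nonzero(np.triu(dist < threshold, k=1))
    return np.column_stack((i, j))

r_array = np.array([[0.0, 0.0, 0.0],
                    [0.096, 0.0, 0.0],
                    [-0.024, 0.093, 0.0]])
bonds = guess_bonds_sketch(r_array, ['O', 'H', 'H'])
```

The result has the same layout as the bonds attribute: one row per bond, two atom indices per row.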

move_to(r)

Translate the molecule to a new position r.

tojson()

Return a json string representing the Molecule. This is useful for serialization.

The System class

class chemlab.core.System(molecules, box_vectors=None)

A data structure containing information about a set of N Molecules and NA Atoms.

Parameters

molecules: list of molecules
Molecules that constitute the System. The data is copied to the System, so subsequent changes to the Molecule instances are not reflected in the System.
box_vectors: np.ndarray((3,3), dtype=float), optional
The periodic box of the System, given as 3 box vectors. This lets you specify a box of arbitrary shape.

The System class has attributes derived both from the Molecule and the Atom class.

r_array
Type:np.ndarray((NA, 3), dtype=float)
Derived from:Atom

Atomic coordinates.

m_array
Type:np.ndarray(NA, dtype=float)
Derived from:Atom

Atomic masses.

type_array
Type:np.ndarray(NA, dtype=object) array of str
Derived from:Atom

Array of all the atomic symbols. It can be used to select certain atoms in a system.

charge_array
Type:np.ndarray(NA, dtype=float)
Derived from:Atom

Array of the charges present on the atoms.

Example

Suppose you have a box of water defined by the System s. To select all the oxygen atoms you can use numpy boolean indexing:

>>> oxygens = s.type_array == 'O'
# oxygens is an array of booleans of length NA where
# each True corresponds to an oxygen atom i.e:
# [True, False, False, True, False, False]

You can use the oxygens array to access other properties:

>>> o_coordinates = s.r_array[oxygens]
>>> o_indices = np.arange(s.n_atoms)[oxygens]
bonds
Type:np.ndarray((NBONDS, 2), dtype=int)
Derived from:Molecule

An array of index pairs. Each row holds the indices of two bonded atoms.

atom_export_array
Type:np.ndarray(NA, dtype=object) array of dict
Derived from:Atom
mol_export
Type:np.ndarray(N, dtype=object) array of dict
Derived from:Molecule

Export information relative to the molecule.

box_vectors
Type:np.ndarray((3,3), dtype=float) or None

The three vectors that define the periodic box of the system.

Example

To define an orthorhombic box of size 3, 4, 5 nm:

>>> np.array([[3.0, 0.0, 0.0],  # Vector a
              [0.0, 4.0, 0.0],  # Vector b
              [0.0, 0.0, 5.0]]) # Vector c
n_mol
Type:int

Number of molecules.

n_atoms
Type:int

Number of atoms.

mol_indices
Type:np.ndarray(N, dtype=int)

Gives the starting index for each molecule in the atomic arrays. For example, in a System comprised of 3 water molecules:

>>> s.mol_indices
[0, 3, 6]
>>> s.type_array[0:3]
['O', 'H', 'H']

This array is used internally to retrieve all the Molecule derived data. Do not modify unless you know what you’re doing.

mol_n_atoms
Type:np.ndarray(N, dtype=int)

Contains the number of atoms present in each molecule
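The relationship between mol_indices and mol_n_atoms can be illustrated with plain numpy: the starting index of each molecule is the running sum of the sizes of the molecules before it. This is a sketch of the bookkeeping only, not chemlab code.

```python
import numpy as np

# Three water molecules, 3 atoms each.
mol_n_atoms = np.array([3, 3, 3])
type_array = np.array(['O', 'H', 'H'] * 3)

# Starting index of each molecule: 0 followed by the cumulative sizes.
mol_indices = np.concatenate(([0], np.cumsum(mol_n_atoms)[:-1]))

# Slice out the atoms that belong to the second molecule (index 1).
start = mol_indices[1]
end = start + mol_n_atoms[1]
second_molecule_types = type_array[start:end]
```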

add(mol)

Add the molecule mol to a System initialized through System.empty.

atom_to_molecule_indices(selection)

Given the indices over atoms, return the indices over molecules. If an atom is selected, the whole molecule that contains it is selected too.

Parameters

selection: np.ndarray((N,), dtype=int) | np.ndarray((NATOMS,), dtype=bool)
Either an index array or a boolean selection array over the atoms

Returns

np.ndarray((N,), dtype=int) an array of molecular indices.
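The mapping from atoms to molecules can be sketched with np.searchsorted on the mol_indices array. The helper name atom_to_molecule_indices_sketch is hypothetical, and returning each molecule only once is an assumption of this example rather than documented behavior.

```python
import numpy as np

def atom_to_molecule_indices_sketch(mol_indices, selection):
    """Map an atom selection (indices or boolean mask) to molecule indices."""
    selection = np.asarray(selection)
    if selection.dtype == bool:
        # Convert a boolean mask into the indices of its True entries.
        selection = np.flatnonzero(selection)
    # For each atom, find the last molecule whose start is <= the atom index.
    owners = np.searchsorted(mol_indices, selection, side='right') - 1
    # Report every selected molecule once, in increasing order.
    return np.unique(owners)

mol_indices = np.array([0, 3, 6])          # 3 water molecules
by_index = atom_to_molecule_indices_sketch(mol_indices, [0, 1, 8])

mask = np.zeros(9, dtype=bool)
mask[4] = True                              # one hydrogen of the second water
by_mask = atom_to_molecule_indices_sketch(mol_indices, mask)
```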

copy()

Return a copy of the current system.

classmethod empty(n_mol, n_atoms, box_vectors=None)

Initialize an empty System containing n_mol Molecules and n_atoms Atoms. The molecules can be added by using the method add().

Example

How to initialize a system of 3 water molecules:

# water is assumed to be an existing Molecule instance with 3 atoms
s = System.empty(3, 9)
for i in range(3):
    s.add(water)

classmethod from_arrays(**kwargs)

Initialize a System from its constituent arrays. It is the fastest way to initialize a System, and is well suited for reading one or more large Systems from data files.

Parameters

The following parameters are required:

  • r_array
  • type_array
  • mol_indices

To further speed up the initialization process you can optionally pass the other derived arrays:

  • m_array
  • mol_n_atoms
  • atom_export_array
  • mol_export

Example

Our classic example of 3 water molecules:

r_array = np.random.random((9, 3))  # one row of x, y, z per atom
type_array = ['O', 'H', 'H', 'O', 'H', 'H', 'O', 'H', 'H']
mol_indices = [0, 3, 6]
System.from_arrays(r_array=r_array, type_array=type_array,
                   mol_indices=mol_indices)

classmethod from_json(string)

Create a System instance from a json string. Such strings are produced by the method chemlab.core.System.tojson().

get_molecule(index)

Get the Molecule instance corresponding to the molecule at index.

This method is useful for accessing Molecule properties that are computed on each access, such as Molecule.formula and Molecule.center_of_mass.

guess_bonds()

Guess the bonds within the molecules that constitute the system.

mol_to_atom_indices(indices)

Given the indices over molecules, return the indices over atoms.

Parameters

indices: np.ndarray((N,), dtype=int)
Array of integers between 0 and System.n_mol

Returns

np.ndarray((N,), dtype=int) the indices of all the atoms belonging to the selected molecules.
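The reverse mapping, from molecules to atoms, expands each molecule into its range of atomic indices. The sketch below uses only numpy; the helper name mol_to_atom_indices_sketch is hypothetical and the code is not taken from chemlab.

```python
import numpy as np

def mol_to_atom_indices_sketch(mol_indices, mol_n_atoms, indices):
    """Return the atom indices of all molecules listed in indices."""
    ranges = [np.arange(mol_indices[i], mol_indices[i] + mol_n_atoms[i])
              for i in indices]
    if not ranges:
        return np.array([], dtype=int)
    return np.concatenate(ranges)

mol_indices = np.array([0, 3, 6])   # 3 water molecules
mol_n_atoms = np.array([3, 3, 3])
atoms = mol_to_atom_indices_sketch(mol_indices, mol_n_atoms, [0, 2])
```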

remove_atoms(indices)

Remove the atoms positioned at indices. The molecule containing the atom is removed as well.

For example, in a system of 10 water molecules (30 atoms), removing the atoms at indices 0, 1 and 29 removes the first and last water molecules.

Parameters

indices: np.ndarray((N,), dtype=int)
Array of integers between 0 and System.n_atoms
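The water example above can be reproduced with numpy alone. The sketch labels every atom with the molecule that owns it, then drops all atoms of any molecule that contains a selected atom. It illustrates the described behavior and is not chemlab's implementation.

```python
import numpy as np

n_mol = 10
mol_n_atoms = np.full(n_mol, 3)             # 10 waters, 3 atoms each

# Molecule index of every atom: [0, 0, 0, 1, 1, 1, ..., 9, 9, 9].
mol_of_atom = np.repeat(np.arange(n_mol), mol_n_atoms)

indices = np.array([0, 1, 29])              # atoms to remove

# Molecules that own at least one of the selected atoms.
removed_molecules = np.unique(mol_of_atom[indices])

# Boolean mask over atoms: True for atoms that survive the removal.
keep_atoms = ~np.isin(mol_of_atom, removed_molecules)
```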
remove_molecules(indices)

Remove the molecules positioned at indices.

For example, if you have a system comprised of 10 water molecules you can remove the first, fifth and ninth by using:

system.remove_molecules([0, 4, 8])

Parameters

indices: np.ndarray((N,), dtype=int)
Array of integers between 0 and System.n_mol
reorder_molecules(new_order)

Reorder the molecules in the system according to new_order.

Parameters

new_order: np.ndarray((NMOL,), dtype=int)
An array of integers containing the new order of the system.
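Reordering molecules means permuting the atomic arrays block by block and rebuilding the per-molecule bookkeeping. The following numpy sketch shows the idea on a small mixed system. The helper name reorder_sketch is hypothetical and the code is an illustration, not the library's implementation.

```python
import numpy as np

def reorder_sketch(atom_array, mol_indices, mol_n_atoms, new_order):
    """Permute an Atom-derived array so molecules follow new_order."""
    mol_indices = np.asarray(mol_indices)
    mol_n_atoms = np.asarray(mol_n_atoms)
    new_order = np.asarray(new_order)
    # Atom indices of each molecule, concatenated in the new molecule order.
    atom_order = np.concatenate(
        [np.arange(mol_indices[i], mol_indices[i] + mol_n_atoms[i])
         for i in new_order])
    # Rebuild sizes and starting indices for the reordered system.
    new_n_atoms = mol_n_atoms[new_order]
    new_indices = np.concatenate(([0], np.cumsum(new_n_atoms)[:-1]))
    return atom_array[atom_order], new_indices, new_n_atoms

# Water, argon, water.
type_array = np.array(['O', 'H', 'H', 'Ar', 'O', 'H', 'H'])
new_types, new_indices, new_n_atoms = reorder_sketch(
    type_array, [0, 3, 4], [3, 1, 3], [1, 0, 2])
```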
sort()

Sort the molecules in the system according to their molecular formula.
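Sorting by formula amounts to computing a molecule order from the formula strings and then reordering the molecules accordingly. A minimal sketch of computing such an order is shown below. The alphabetical comparison and the stable sort are assumptions of this example and may not match what chemlab does.

```python
import numpy as np

# One formula string per molecule, in the current molecule order.
formulas = np.array(['H2O', 'Ar', 'H2O', 'Ar'])

# Stable sort keeps molecules with the same formula in their original order.
new_order = np.argsort(formulas, kind='stable')
sorted_formulas = formulas[new_order]
```

An order computed this way has the same form as the new_order argument of reorder_molecules.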

tojson()

Serialize a System instance using json.

Routines to manipulate Systems

chemlab.core.subsystem_from_molecules(orig, selection)

Create a system from the orig system by picking the molecules specified in selection.

Parameters

orig: System
The system from where to extract the subsystem
selection: np.ndarray of int or np.ndarray(N) of bool
selection can be either a list of molecular indices to select or a boolean array whose elements are True in correspondence of the molecules to select (it is usually the result of a numpy comparison operation).

Example

In this example we can see how to select the molecules whose center of mass lies in the region of space x > 0.1:

s = System(...) # It is a set of 10 water molecules

select = []
for i in range(s.n_mol):
    if s.get_molecule(i).center_of_mass[0] > 0.1:
        select.append(i)

subs = subsystem_from_molecules(s, np.array(select))

Note

The API for operating on molecules is not yet fully developed. In the future there will be smarter ways to filter molecule attributes instead of looping and using System.get_molecule.

chemlab.core.subsystem_from_atoms(orig, selection)

Generate a subsystem containing the atoms specified by selection. If an atom belongs to a molecule, the whole molecule is selected.

Example

This function can be useful when selecting a part of a system based on positions. For example, this snippet selects the part of the system (a set of molecules) containing atoms whose x coordinate is bigger than 1.0 nm:

s = System(...)
# r_array has shape (NA, 3), so column 0 holds the x coordinates
subs = subsystem_from_atoms(s, s.r_array[:, 0] > 1.0)

Parameters

orig: System
Original system.
selection: np.ndarray of int or np.ndarray(NA) of bool
A boolean array that is True when the ith atom has to be selected or a set of atomic indices to be included.

Returns

A new System instance.

chemlab.core.merge_systems(sysa, sysb, bounding=0.2)

Generate a system by merging sysa and sysb.

Overlapping molecules are removed by cutting the molecules of sysa that have atoms near the atoms of sysb. The cutoff distance is defined by the bounding parameter.

Parameters

sysa: System
First system
sysb: System
Second system
bounding: float or False
Extra space used when cutting molecules in sysa to make space for sysb. If it is False, no overlap handling will be performed.
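To make the role of the cutoff concrete, the sketch below finds the molecules of a first system that have at least one atom closer than bounding to any atom of a second system. This is only one possible reading of the overlap rule described above. The helper name overlapping_molecules_sketch and the pairwise-distance criterion are assumptions, and chemlab's actual overlap test may differ.

```python
import numpy as np

def overlapping_molecules_sketch(r_a, mol_of_atom_a, r_b, bounding=0.2):
    """Indices of molecules in A with an atom within bounding of any atom in B."""
    # Distances between every atom of A and every atom of B, shape (NA_a, NA_b).
    diff = r_a[:, np.newaxis, :] - r_b[np.newaxis, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # Atoms of A that are too close to at least one atom of B.
    close_atoms = (dist < bounding).any(axis=1)
    return np.unique(mol_of_atom_a[close_atoms])

# System A: two diatomic molecules. System B: a single atom near the first one.
r_a = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0],
                [1.0, 0.0, 0.0], [1.1, 0.0, 0.0]])
mol_of_atom_a = np.array([0, 0, 1, 1])
r_b = np.array([[0.15, 0.0, 0.0]])

to_remove = overlapping_molecules_sketch(r_a, mol_of_atom_a, r_b)
```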

Routines to create Systems

chemlab.core.crystal(positions, molecules, group, cellpar=[1.0, 1.0, 1.0, 90, 90, 90], repetitions=[1, 1, 1])

Build a crystal from atomic positions, space group and cell parameters.

Parameters

positions: list of coordinates
A list of the atomic positions
molecules: list of Molecule
The molecules corresponding to the positions. Each molecule is translated to all the equivalent positions.
group: int | str
Space group given either as its number in International Tables or as its Hermann-Mauguin symbol.
repetitions:
Repetition of the unit cell in each direction
cellpar:
Unit cell parameters

This function was taken and adapted from the spacegroup module found in ASE.

The spacegroup module was originally developed by Jesper Friis.

chemlab.core.random_lattice_box(mol_list, mol_number, size, spacing=np.array([0.3, 0.3, 0.3]))

Make a box by placing the molecules specified in mol_list on random points of an evenly spaced lattice.

Using a lattice automatically ensures that no two molecules are overlapping.

Parameters

mol_list: list of Molecule instances
A list of each kind of molecules to add to the system.
mol_number: list of int
The number of molecules to place for each kind.
size: np.ndarray((3,), float)
The box size in nm
spacing: np.ndarray((3,), float), optional, default [0.3 0.3 0.3]
The lattice spacing in nm.

Returns

A System instance.
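The placement idea can be sketched with numpy alone: build an evenly spaced lattice inside the box, then pick distinct lattice sites at random. The helper name random_lattice_points_sketch, the seed parameter and the way the number of sites per axis is computed are assumptions of this example, not chemlab's implementation.

```python
import numpy as np

def random_lattice_points_sketch(n_points, size, spacing=(0.3, 0.3, 0.3), seed=None):
    """Pick n_points distinct sites of an evenly spaced lattice inside a box."""
    size = np.asarray(size, dtype=float)
    spacing = np.asarray(spacing, dtype=float)
    # Number of lattice sites that fit along each axis.
    n_sites = np.floor(size / spacing).astype(int)
    # Integer coordinates of every lattice site, shape (n_total, 3).
    axes = [np.arange(n) for n in n_sites]
    grid = np.stack(np.meshgrid(*axes, indexing='ij'), axis=-1).reshape(-1, 3)
    if n_points > len(grid):
        raise ValueError("not enough lattice sites for the requested points")
    # Choose distinct sites, so no two molecules share a position.
    rng = np.random.default_rng(seed)
    chosen = rng.choice(len(grid), size=n_points, replace=False)
    return grid[chosen] * spacing

points = random_lattice_points_sketch(100, [2.0, 2.0, 2.0], seed=0)
```

Because sites are drawn without replacement, every selected position is at least one lattice spacing away from the others along some axis.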

Example

Typical box with 1000 water molecules randomly placed in a box of size [2.0 2.0 2.0]:

from chemlab.core import random_lattice_box
from chemlab.db import ChemlabDB

# Example water molecule
water = ChemlabDB().get('molecule', 'example.water')

s = random_lattice_box([water], [1000], [2.0, 2.0, 2.0])