There are a lot of file formats used and produced by chemistry applications. Each program has his way to store geometries, trajectories, energies and properties etc. chemlab tries to encompass all of those different properties by using a lightweight way to handle such differences.
The classes responsible for the I/O are subclasses of chemlab.io.handlers.IOHandler. These handlers work all in the same way, here is an example of GroHandler:
from chemlab.io import GroIO
infile = GroIO('waterbox.gro')
system = infile.read('system')
# Modify system as you wish...
outfile = GroIO('waterbox_out.gro')
outfile.write('system', system)
You first create the handler instance for a certain format and then you can read a certain feature provided by the handler. In this example we read and write the system feature.
Some file formats may have some extra data for each atom, molecule or system. For example the ”.gro” file formats have his own way to call the atoms in a water molecule: OW, HW1, HW2. To handle such issues, you can write this information in the export arrays contained in the data structures, such as Atom.export, Molecule.export, and their array-based counterparts Molecule.atom_export_array, System.mol_export and System.atom_export_array.
Those attributes are especially important where you write in some data format, since you may have to provide those attribute when you initialize your Atom, Molecule and System.
You can easily open a data file without even having to search his format handler by using the utility function chemlab.io.datafile():
from chemlab.io import datafile
sys = datafile('waterbox.gro').read('system')
t, coords = datafile('traj.xtc').read('trajectory')
See also
Implementing or improving an existing IOHandler is a great way to partecipate in chemlab development. Fortuately, it’s extremely easy to setup one of them.
It boils down to a few steps:
Here is an example of the xyz handler:
import numpy as np
from chemlab.io.handlers import IOHandler
from chemlab.core import Molecule
class XyzIO(IOHandler):
'''The XYZ format is described in this wikipedia article
http://en.wikipedia.org/wiki/XYZ_file_format.
**Features**
.. method:: read("molecule")
Read the coordinates in a :py:class:`~chemlab.core.Molecule` instance.
.. method:: write("molecule", mol)
Writes a :py:class:`~chemlab.core.Molecule` instance in the XYZ format.
'''
can_read = ['molecule']
can_write = ['molecule']
def __init__(self, filename):
self.filename = filename
def read(self, feature):
self.check_feature(feature, "read")
lines = open(self.filename).readlines()
num = int(lines[0])
title = lines[1]
if feature == 'title':
return title
if feature == 'molecule':
type_array = []
r_array = []
for l in lines[2:]:
type, x, y, z = l.split()
r_array.append([float(x),float(y),float(z)])
type_array.append(type)
r_array = np.array(r_array)/10 # To nm
type_array = np.array(type_array)
return Molecule.from_arrays(r_array=r_array, type_array=type_array)
def write(self, feature, mol):
self.check_feature(feature, "write")
lines = []
if feature == 'molecule':
lines.append(str(mol.n_atoms))
lines.append('Generated by chemlab')
for t, (x, y, z) in zip(mol.type_array, mol.r_array):
lines.append(' %s %.6f %.6f %.6f' %
(t, x*10, y*10, z*10))
open(self.filename, 'w').write('\n'.join(lines))
A few remarks:
check_feature() before performing read/write. This will check that the feature is present in the can_read/can_write list;
Molecule.from_arrays() and System.from_arrays();
EdrIO handler does not read Molecule or System at all;
You can definitely take inspiration from the handlers included in chemlab, Supported File Formats.