# The full list of properties is located at
# https://github.com/editorconfig/editorconfig/wiki/EditorConfig-Properties.
root = true
[*.py]
charset = utf-8
indent_style = space
indent_size = 4
[.travis.yml]
indent_style = space
indent_size = 2
......@@ -3,7 +3,7 @@ dist: trusty
language: python
python:
- 2.7
- 3.2
- 3.4
- 3.6
addons:
apt:
......@@ -12,9 +12,31 @@ addons:
- swig
cache:
pip: true
before_install:
- pip install -r requirements.txt
install:
- python setup.py install
- pip install .
before_script:
- |
export DOCS_BRANCH_NAME=master
export DOCS_REPO_NAME=cclib.github.io
export DOCS_REPO_OWNER=cclib
export DOCS_ROOT_DIR="${TRAVIS_BUILD_DIR}"/doc/sphinx
export DOCS_BUILD_DIR="${DOCS_ROOT_DIR}"/_build/html
export THEME_DIR="${DOCS_ROOT_DIR}"/_themes
- install -dm755 "${THEME_DIR}"
script:
- sh travis/run_travis_tests.sh
- bash travis/run_travis_tests.sh
- bash travis/build_docs.bash
- env | sort
- |
if [[ "${TRAVIS_BRANCH}" == master && "${TRAVIS_PULL_REQUEST}" == false && $TRAVIS_PYTHON_VERSION == 3.6 ]];
then
# Commits to master that are not pull requests, that is, only actual
# addition of code to master, should deploy the documentation.
bash ${TRAVIS_BUILD_DIR}/travis/deploy_docs_travis.bash
fi
On behalf of the cclib development team, we are pleased to announce the release of cclib 1.5.3, which is now available for download from https://cclib.github.io. This is a minor update to version 1.5 that includes some new functionality and attributes, as well as bug fixes and small improvements.
On behalf of the cclib development team, we are pleased to announce the release of cclib 1.6, which is now available for download from https://cclib.github.io. This is a new minor version that includes two new parsers, some new functionality and attributes, as well as bug fixes and small improvements.
cclib is an open source library, written in Python, for parsing and interpreting the results of computational chemistry packages. It currently parses output files from 13 different programs: ADF, DALTON, Firefly, GAMESS (US), GAMESS-UK, Gaussian, Jaguar, Molpro, MOPAC, NWChem, ORCA, Psi and QChem.
cclib is an open source library, written in Python, for parsing and interpreting the results of computational chemistry packages. It currently parses output files from 15 different programs: ADF, DALTON, Firefly, GAMESS (US), GAMESS-UK, Gaussian, Jaguar, Molpro, MOLCAS, MOPAC, NWChem, ORCA, Psi, QChem and Turbomole.
Among other data, cclib extracts:
......@@ -28,7 +28,7 @@ If you need help, find a bug, want new features or have any questions, please se
If your published work uses cclib, please support its development by citing the following article:
N. M. O'Boyle, A. L. Tenderholt, K. M. Langner, cclib: a library for package-independent computational chemistry algorithms, J. Comp. Chem. 29 (5), 839-845 (2008)
You can also specifically reference this version of cclib as:
Eric Berquist, Karol M. Langner, Noel M. O'Boyle, and Adam L. Tenderholt. Release of cclib version 1.5. 2016. https://dx.doi.org/10.5281/zenodo.60670
Eric Berquist, Karol M. Langner, Noel M. O'Boyle, and Adam L. Tenderholt. Release of cclib version 1.6. 2018. https://dx.doi.org/10.5281/zenodo.1407790
Regards,
The cclib development team
......
Changes in cclib-1.6
Features:
* New parser: cclib can now parse Molcas files (Kunal Sharma)
* New parser: cclib can now parse Turbomole files (Christopher Rowley, Kunal Sharma)
* New script: ccframe writes data table files from logfiles (Felipe Schneider)
* New method: stoichiometry builds the chemical formula of a system (Jaime Rodríguez-Guerra)
* Support package version in metadata for most parsers
* Support time attribute and BOMD output in Gaussian, NWChem, ORCA and QChem
* Support grads and metadata attributes in ORCA (Jonathon Vandezande)
* Experimental support for CASSCF output in ORCA (Jonathon Vandezande)
* Added entry in metadata for successful completion of jobs
* Updated test file versions to ORCA 4.0
* Update minimum Python3 version to 3.4
Bugfixes:
* Fixed parsing ORCA output with linear molecules (Jonathon Vandezande)
* Fixed parsing NWChem output with incomplete SCF
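A rough illustration of several of the new 1.6 features listed above (the logfile name is hypothetical; which attributes and metadata entries appear depends on the job that was parsed):
from cclib.io import ccread
data = ccread("benzene_bomd.out")         # hypothetical BOMD output file
print(data.metadata["package"])           # e.g. "NWChem"
print(data.metadata["package_version"])   # new in 1.6 for most parsers
print(data.metadata["success"])           # True only on normal termination
if hasattr(data, "time"):
    print(data.time)                      # per-step times for BOMD trajectories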
Changes in cclib-1.5.3
Features:
......@@ -13,20 +32,20 @@ Bugfixes:
* Fixed closed shell determination for Gaussian (Jaime Rodríguez-Guerra)
* Fixed parsing of natom for >9999 atoms in Gaussian (Jaime Rodríguez-Guerra)
* Fixed parsing of ADF jobs with no title
* Fixed parsing of charge and core electrons when usin ECPs in QChem
* Fixed parsing of charge and core electrons when using ECPs in QChem
* Fixed parsing of scfvalues for malformed output in Gaussian
Changes in cclib-1.5.2:
Features:
* Support for writing Molden and WFX files (Sagar Gaur)
* Support for thermochesmitry attributes in ORCA (Jonathon Vandezande)
* Support for thermochemistry attributes in ORCA (Jonathon Vandezande)
* Support for chelpg atomic charges in ORCA (Richard Gowers)
* Updated test file versions to GAMESS-US 2017 (Sagar Gaur)
* Added option to print full arrays with ccget (Sagar Gaur)
Bugfixes:
* Fxied polarizability parsing bug in DALTON (Maxim Stolyarchuk)
* Fixed polarizability parsing bug in DALTON (Maxim Stolyarchuk)
* Fixed IRC parsing in Gaussian for large trajectories (Dénes Berta, LaTruelle)
* Fixed coordinate parsing for heavy elements in ORCA (Jonathon Vandezande)
* Fixed parsing of large mocoeffs in fixed width format for QChem (srtlg)
......@@ -46,7 +65,7 @@ Bugfixes:
* Restore alias cclib.parser.ccopen for backwards compatibility
* Fixed parsing thermochemistry for single atoms in QChem
* Fixed handling of URLs (Alexey Alnatanov)
* Fixed Atom object creation in biopython bridge (Nitish Garg)
* Fixed Atom object creation in Biopython bridge (Nitish Garg)
* Fixed ccopen when working with multiple files
Changes in cclib-1.5:
......@@ -57,7 +76,7 @@ Features:
* New attribute time tracks coordinates for dynamics jobs (Ramon Crehuet)
* New attribute metadata holds miscellaneous information not in other attributes (bwang2453)
* Extract moments attribute for Gaussian (Geoff Hutchison)
* Extract atombasis for ADF in simple cases (Felix Plaser)
* Extract atombasis for ADF in simple cases (Felix Plasser)
* License change to BSD 3-Clause License
Bugfixes:
......@@ -75,7 +94,7 @@ Features:
Bugfixes:
* Fix for non-standard basis sets in DALTON
* Fix for non-standard MO coefficient printin in GAMESS
* Fix for non-standard MO coefficient printing in GAMESS
Changes in cclib-1.4:
......@@ -97,10 +116,10 @@ Bugfixes:
* Fix parsing basis section for Molpro job generated by Avogadro
* Fix parsing multi-job Gaussian output with different orbitals (Geoff Hutchison)
* Fix parsing ORCA geometry optimization with improper internal coordinates (glideht)
* Fix units in atom corodinates parsed from GAMESS-UK files (mwykes)
* Fix units in atom coordinates parsed from GAMESS-UK files (mwykes)
* Fix test for vibrational frequencies in Turbomole (mwykes)
* Fix parsing vibration symmetries for Molpro (mwykes)
* Fix parsing egenvectors in GAMESS-US (Alexis Otero-Calvis)
* Fix parsing eigenvectors in GAMESS-US (Alexis Otero-Calvis)
* Fix duplicate parsing of symmetry labels for Gaussian (Martin Peeks)
Changes in cclib-1.3.1:
......@@ -140,7 +159,7 @@ Bugfixes:
Changes in cclib-1.2:
Features:
* Move project to github
* Move project to GitHub
* Transition to Python 3 (Python 2.7 will still work)
* Add a multifile mode to ccget script
* Extract vibrational displacements for ORCA
......@@ -150,8 +169,8 @@ Features:
Gaussian09, Molpro 2012 and ORCA 3.0.1
Bugfixes:
* Ignore unicode errors in logfiles
* Handle Guassian jobs with terse output (basis set count not reported)
* Ignore Unicode errors in logfiles
* Handle Gaussian jobs with terse output (basis set count not reported)
* Handle Gaussian jobs using IndoGuess (Scott McKechnie)
* Handle Gaussian file with irregular ONIOM gradients (Tamilmani S)
* Handle ORCA file with SCF convergence issue (Melchor Sanchez)
......@@ -164,9 +183,9 @@ Changes in cclib-1.1:
Features:
* Add progress info for all parsers
* Support ONIOM calculations in Gaussian (Karen Hemelsoet)
* New attribute atomcharges extracts Mulliken and Lowdin atomic
* New attribute atomcharges extracts Mulliken and Löwdin atomic
charges if present
* New attribute atomspins extracts Mulliken and Lowdin atomic spin
* New attribute atomspins extracts Mulliken and Löwdin atomic spin
densities if present
* New thermodynamic attributes: freeenergy, temperature, enthalpy
(Edward Holland)
......@@ -252,7 +271,7 @@ Features:
* GAMESS-US parser: added 'etoscs' (CIS calculations)
* Jaguar parser: added 'mpenergies' (LMP2 calculations)
* Jaguar parser: added 'etenergies' and 'etoscs' (CIS calculations)
* New method: Lowdin Population Analysis (LPA)
* New method: Löwdin Population Analysis (LPA)
* Tests: unittests can be run from the Python interpreter, and for
a single parser; the number of "passed" tests is also counted and shown
......@@ -270,7 +289,7 @@ Features:
* API addition: 'gbasis' holds the Gaussian basis set
* API addition: 'coreelectrons' contains the number of core electrons
in each atom's pseudopotential
* API addition: 'mpenergies' holds the Moller-Plesset corrected
* API addition: 'mpenergies' holds the Møller-Plesset corrected
molecular electronic energies
* API addition: 'vibdisps' holds the Cartesian displacement vectors
* API change: 'mocoeffs' is now a list of rank 2 arrays, rather than a
......
......@@ -55,7 +55,7 @@ To test, trying importing cclib at the Python prompt. You should see something s
History is saved to ~/.pyhistory.
>>> import cclib
>>> cclib.__version__
'1.5.3'
'1.6'
To run the unit tests, change directory into INSTALLDIR/test and run the following command:
......
### cclib
[![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.50324.svg)](https://dx.doi.org/10.5281/zenodo.60670)
[![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.1407790.svg)](https://dx.doi.org/10.5281/zenodo.1407790)
[![PyPI version](http://img.shields.io/pypi/v/cclib.svg?style=flat)](https://pypi.python.org/pypi/cclib)
[![GitHub release](https://img.shields.io/github/release/cclib/cclib.svg?style=flat)](https://github.com/cclib/cclib/releases)
[![build status](http://img.shields.io/travis/cclib/cclib/master.svg?style=flat)](https://travis-ci.org/cclib/cclib)
......
......@@ -23,7 +23,7 @@ To this end, cclib provides a number of bridges to help transfer data to other l
as well as example methods that take parsed data as input.
"""
__version__ = "1.5.3"
__version__ = "1.6"
from cclib import parser
from cclib import progress
......
......@@ -5,20 +5,23 @@
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Contains all writers for standard chemical representations"""
"""Contains all writers for standard chemical representations."""
from cclib.io.cjsonreader import CJSON as CJSONReader
from cclib.io.cjsonwriter import CJSON as CJSONWriter
from cclib.io.cmlwriter import CML
from cclib.io.xyzwriter import XYZ
from cclib.io.moldenwriter import MOLDEN
from cclib.io.wfxwriter import WFXWriter
from cclib.io.cjsonreader import CJSON as CJSONReader
from cclib.io.xyzreader import XYZ as XYZReader
from cclib.io.xyzwriter import XYZ as XYZWriter
# This allows users to type:
# from cclib.io import ccframe
# from cclib.io import ccopen
# from cclib.io import ccread
# from cclib.io import ccwrite
# from cclib.io import URL_PATTERN
from cclib.io.ccio import ccframe
from cclib.io.ccio import ccopen
from cclib.io.ccio import ccread
from cclib.io.ccio import ccwrite
......
......@@ -10,12 +10,12 @@
from __future__ import print_function
import atexit
import io
import os
import sys
import re
from tempfile import NamedTemporaryFile
import atexit
# Python 2->3 changes the default file object hierarchy.
if sys.version_info[0] == 2:
......@@ -23,12 +23,12 @@ if sys.version_info[0] == 2:
from urllib2 import urlopen, URLError
else:
import io
fileclass = io.IOBase
from urllib.request import urlopen
from urllib.error import URLError
from cclib.parser import logfileparser
from cclib.parser import data
......@@ -38,19 +38,23 @@ from cclib.parser.gamessparser import GAMESS
from cclib.parser.gamessukparser import GAMESSUK
from cclib.parser.gaussianparser import Gaussian
from cclib.parser.jaguarparser import Jaguar
from cclib.parser.molcasparser import Molcas
from cclib.parser.molproparser import Molpro
from cclib.parser.mopacparser import MOPAC
from cclib.parser.nwchemparser import NWChem
from cclib.parser.orcaparser import ORCA
from cclib.parser.psiparser import Psi
from cclib.parser.psi3parser import Psi3
from cclib.parser.psi4parser import Psi4
from cclib.parser.qchemparser import QChem
from cclib.parser.turbomoleparser import Turbomole
from cclib.io import cjsonreader
from cclib.io import cjsonwriter
from cclib.io import cmlwriter
from cclib.io import xyzwriter
from cclib.io import moldenwriter
from cclib.io import wfxwriter
from cclib.io import xyzreader
from cclib.io import xyzwriter
try:
from cclib.bridge import cclib2openbabel
......@@ -58,6 +62,11 @@ try:
except ImportError:
_has_cclib2openbabel = False
try:
import pandas as pd
except ImportError:
# Fail silently for now.
pass
# Regular expression for validating URLs
URL_PATTERN = re.compile(
......@@ -78,8 +87,7 @@ URL_PATTERN = re.compile(
# after finding GAMESS in case the more specific phrase is found.
# 2. Molpro log files don't have the program header, but always contain
# the generic string 1PROGRAM, so don't break here either to be cautious.
# 3. The Psi header has two different strings with some variation
# 4. "MOPAC" is used in some packages like GAMESS, so match MOPAC20##
# 3. "MOPAC" is used in some packages like GAMESS, so match MOPAC20##
#
# The triggers are defined by the tuples in the list below like so:
# (parser, phrases, flag whether we should break)
......@@ -92,23 +100,32 @@ triggers = [
(GAMESSUK, ["G A M E S S - U K"], True),
(Gaussian, ["Gaussian, Inc."], True),
(Jaguar, ["Jaguar"], True),
(Molcas, ["MOLCAS"], True),
(Molpro, ["PROGRAM SYSTEM MOLPRO"], True),
(Molpro, ["1PROGRAM"], False),
(MOPAC, ["MOPAC20"], True),
(NWChem, ["Northwest Computational Chemistry Package"], True),
(ORCA, ["O R C A"], True),
(Psi, ["PSI", "Ab Initio Electronic Structure"], True),
(Psi3, ["PSI3: An Open-Source Ab Initio Electronic Structure Package"], True),
(Psi4, ["Psi4: An Open-Source Ab Initio Electronic Structure Package"], True),
(QChem, ["A Quantum Leap Into The Future Of Chemistry"], True),
(Turbomole, ["TURBOMOLE"], True),
]
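The trigger tuples above are consumed by guess_filetype; a simplified sketch of that scan (an illustration only, not the actual implementation in ccio.py) could look like this:
def guess_filetype_sketch(inputfile):
    # Check every line against each trigger; return immediately when a
    # trigger with its break flag set matches, otherwise keep the last match.
    filetype = None
    for line in inputfile:
        for parser, phrases, do_break in triggers:
            if all(phrase in line for phrase in phrases):
                filetype = parser
                if do_break:
                    return filetype
    return filetype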
outputclasses = {
readerclasses = {
'cjson': cjsonreader.CJSON,
'json': cjsonreader.CJSON,
'xyz': xyzreader.XYZ,
}
writerclasses = {
'cjson': cjsonwriter.CJSON,
'json': cjsonwriter.CJSON,
'cml': cmlwriter.CML,
'xyz': xyzwriter.XYZ,
'molden': moldenwriter.MOLDEN,
'wfx': wfxwriter.WFXWriter
'wfx': wfxwriter.WFXWriter,
'xyz': xyzwriter.XYZ,
}
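These two registries back the reader fallback in ccopen and the writer lookup in ccwrite; for instance, a parsed job can be converted to XYZ roughly as follows (the filenames are placeholders, and the keyword names are assumed to match _determine_output_format below):
from cclib.io import ccread, ccwrite
data = ccread("water_b3lyp.out")                               # hypothetical logfile
ccwrite(data, outputtype="xyz", outputdest="water_b3lyp.xyz")  # uses writerclasses['xyz']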
......@@ -171,9 +188,9 @@ def ccopen(source, *args, **kargs):
*args, **kargs - arguments and keyword arguments passed to filetype
Returns:
one of ADF, DALTON, GAMESS, GAMESS UK, Gaussian, Jaguar, Molpro, MOPAC,
NWChem, ORCA, Psi, QChem, CJSON or None (if it cannot figure it out or
the file does not exist).
one of ADF, DALTON, GAMESS, GAMESS UK, Gaussian, Jaguar,
Molpro, MOPAC, NWChem, ORCA, Psi3, Psi/Psi4, QChem, CJSON or None
(if it cannot figure it out or the file does not exist).
"""
inputfile = None
......@@ -255,12 +272,17 @@ def ccopen(source, *args, **kargs):
# Proceed to return an instance of the logfile parser only if the filetype
# could be guessed. Need to make sure the input file is closed before creating
# an instance, because parsers will handle opening/closing on their own.
# If the input file is a CJSON file and not a standard compchemlog file, don't
# guess the file.
if kargs.get("cjson", False):
filetype = cjsonreader.CJSON
else:
filetype = guess_filetype(inputfile)
filetype = guess_filetype(inputfile)
# If the input file isn't a standard compchem log file, try one of
# the readers, falling back to Open Babel.
if not filetype:
if kargs.get("cjson"):
filetype = readerclasses['cjson']
elif source and not is_stream:
ext = os.path.splitext(source)[1][1:].lower()
for extension in readerclasses:
if ext == extension:
filetype = readerclasses[extension]
# Proceed to return an instance of the logfile parser only if the filetype
# could be guessed. Need to make sure the input file is closed before creating
......@@ -275,6 +297,9 @@ def ccopen(source, *args, **kargs):
else:
inputfile.seek(0, 0)
if not is_stream:
if is_listofstrings:
if filetype == Turbomole:
source = sort_turbomole_outputs(source)
inputfile.close()
return filetype(source, *args, **kargs)
return filetype(inputfile, *args, **kargs)
......@@ -381,8 +406,8 @@ def _determine_output_format(outputtype, outputdest):
# First check outputtype.
if isinstance(outputtype, str):
extension = outputtype.lower()
if extension in outputclasses:
outputclass = outputclasses[extension]
if extension in writerclasses:
outputclass = writerclasses[extension]
else:
raise UnknownOutputFormatError(extension)
else:
......@@ -393,9 +418,92 @@ def _determine_output_format(outputtype, outputdest):
extension = os.path.splitext(outputdest.name)[1].lower()
else:
raise UnknownOutputFormatError
if extension in outputclasses:
outputclass = outputclasses[extension]
if extension in writerclasses:
outputclass = writerclasses[extension]
else:
raise UnknownOutputFormatError(extension)
return outputclass
def path_leaf(path):
"""
Split the path and return the filename. Works irrespective of '\' or '/'
appearing in the path, and also when the path ends with '/' or '\'.
Inputs:
path - a string path of a logfile.
Returns:
tail - for 'directory/subdirectory/logfilename', this returns 'logfilename'.
os.path.basename(head) - for 'directory/subdirectory/logfilename/', this returns 'logfilename'.
"""
head, tail = os.path.split(path)
return tail or os.path.basename(head)
def sort_turbomole_outputs(filelist):
"""
Sort a list of Turbomole output files into the order defined below, which
is needed for proper parsing; any unrecognized files are appended at the end.
Inputs:
filelist - a list of Turbomole log files needed to be parsed.
Returns:
sorted_list - a sorted list of Turbomole files needed for proper parsing.
"""
sorting_order = {
'basis' : 0,
'control' : 1,
'mos' : 2,
'alpha' : 3,
'beta' : 4,
'job.last' : 5,
'coord' : 6,
'gradient' : 7,
'aoforce' : 8,
}
known_files = []
unknown_files = []
sorted_list = []
for fname in filelist:
filename = path_leaf(fname)
if filename in sorting_order:
known_files.append([fname, sorting_order[filename]])
else:
unknown_files.append(fname)
for i in sorted(known_files, key=lambda x: x[1]):
sorted_list.append(i[0])
if unknown_files:
sorted_list.extend(unknown_files)
return sorted_list
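A quick usage sketch (the paths are hypothetical; only the basenames matter for the ordering):
files = ["calc/gradient", "calc/notes.txt", "calc/control", "calc/basis"]
print(sort_turbomole_outputs(files))
# -> ['calc/basis', 'calc/control', 'calc/gradient', 'calc/notes.txt']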
def ccframe(ccobjs, *args, **kwargs):
"""Returns a pandas.DataFrame of data attributes parsed by cclib from one
or more logfiles.
Inputs:
ccobjs - an iterable of either cclib jobs (from ccopen) or data (from
job.parse()) objects
Returns:
a pandas.DataFrame
"""
logfiles = []
for ccobj in ccobjs:
# Is ccobj a job object (unparsed) or a ccData object (parsed)?
if isinstance(ccobj, logfileparser.Logfile):
jobfilename = ccobj.filename
ccdata = ccobj.parse()
elif isinstance(ccobj, data.ccData):
jobfilename = None
ccdata = ccobj
else:
raise ValueError
attributes = ccdata.getattributes()
attributes.update({
'jobfilename': jobfilename
})
logfiles.append(pd.Series(attributes))
return pd.DataFrame(logfiles)
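For example, assuming pandas is installed and the logfile names are placeholders:
from cclib.io import ccopen, ccframe
jobs = [ccopen("job1.out"), ccopen("job2.out")]    # hypothetical logfiles
df = ccframe(jobs)                                 # one row per logfile
print(df[["jobfilename", "natom", "charge"]])      # columns depend on parsed attributes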
......@@ -27,7 +27,8 @@ class CJSON:
def read_cjson(self):
inputfile = self.filename
json_data = json.loads(open(inputfile).read())
with open(inputfile) as cjsonfile:
json_data = json.loads(cjsonfile.read())
# Actual update of attribute dictionary happens here
self.construct(json_data)
......
......@@ -38,9 +38,8 @@ class CJSON(filewriter.Writer):
name = os.path.basename(os.path.splitext(path)[0])
return name
def generate_repr(self):
"""Generate the CJSON representation of the logfile data."""
def as_dict(self):
""" Build a Python dict with the CJSON data"""
cjson_dict = dict()
# Need to decide on a number format.
cjson_dict['chemical json'] = 0
......@@ -138,7 +137,11 @@ class CJSON(filewriter.Writer):
if _has_openbabel:
cjson_dict['properties']['molecular mass'] = self.pbmol.molwt
cjson_dict['diagram'] = self.pbmol.write(format='svg')
return cjson_dict
def generate_repr(self):
"""Generate the CJSON representation of the logfile data."""
cjson_dict = self.as_dict()
if self.terse:
return json.dumps(cjson_dict, cls=NumpyAwareJSONEncoder)
else:
......@@ -150,13 +153,14 @@ class CJSON(filewriter.Writer):
object: Python dictionary which is being appended with the key value.
key: cclib attribute name.
Returns:
Returns:
None. The dictionary is modified to contain the attribute with the
cclib keyname as key
"""
if hasattr(self.ccdata, key):
object[ccData._attributes[key].jsonKey] = getattr(self.ccdata, key)
class NumpyAwareJSONEncoder(json.JSONEncoder):
"""A encoder for numpy.ndarray's obtained from the cclib attributes.
For all other types the json default encoder is called.
......
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Generic file reader and related tools"""
class Reader(object):
"""Abstract class for reader objects."""
def __init__(self, source, *args, **kwargs):
"""Initialize the Reader object.
This should be called by a subclass in its own __init__ method.
Inputs:
source - A single filename, stream [TODO], or list of filenames/streams [TODO].
"""
if isinstance(source, str):
self.filename = source
else:
raise ValueError
def parse(self):
"""Read the raw contents of the source into the Reader."""
# TODO This cannot currently handle streams.
with open(self.filename) as handle:
self.filecontents = handle.read()
return None
def generate_repr(self):
"""Convert the raw contents of the source into the internal representation.
This should be overridden by all subclasses inheriting from
Reader.
"""
raise NotImplementedError(
'generate_repr is not implemented for ' + str(type(self)))
......@@ -7,6 +7,11 @@
"""Generic file writer and related tools"""
import logging
from math import sqrt
from collections import Iterable
try:
from cclib.bridge import makeopenbabel
import openbabel as ob
......@@ -15,9 +20,6 @@ try:
except ImportError:
_has_openbabel = False
from math import sqrt
from collections import Iterable
from cclib.parser.utils import PeriodicTable
......@@ -86,10 +88,20 @@ class Writer(object):
def _make_openbabel_from_ccdata(self):
"""Create Open Babel and Pybel molecules from ccData.
"""
if not hasattr(self.ccdata, 'charge'):
logging.warning("ccdata object does not have charge, setting to 0")
_charge = 0
else:
_charge = self.ccdata.charge
if not hasattr(self.ccdata, 'mult'):
logging.warning("ccdata object does not have spin multiplicity, setting to 1")
_mult = 1
else:
_mult = self.ccdata.mult
obmol = makeopenbabel(self.ccdata.atomcoords,
self.ccdata.atomnos,
charge=self.ccdata.charge,
mult=self.ccdata.mult)
charge=_charge,
mult=_mult)
if self.jobfilename is not None:
obmol.SetTitle(self.jobfilename)
return (obmol, pb.Molecule(obmol))
......
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""A reader for XYZ (Cartesian coordinate) files."""
from cclib.io import filereader
from cclib.parser.data import ccData
from cclib.parser.utils import PeriodicTable
class XYZ(filereader.Reader):
"""A reader for XYZ (Cartesian coordinate) files."""
def __init__(self, source, *args, **kwargs):
super(XYZ, self).__init__(source, *args, **kwargs)
self.pt = PeriodicTable()
def parse(self):
super(XYZ, self).parse()
self.generate_repr()
return self.data
def generate_repr(self):
"""Convert the raw contents of the source into the internal representation."""
assert hasattr(self, 'filecontents')
it = iter(self.filecontents.splitlines())
# Ordering of lines:
# 1. number of atoms
# 2. comment line
# 3. line of at least 4 columns: 1 is atomic symbol (str), 2-4 are atomic coordinates (float)
# repeat for the number of atoms
# (4. optional blank line)
# repeat for multiple sets of coordinates
all_atomcoords = []
while True:
try:
line = next(it)
if line.strip() == '':
line = next(it)
tokens = line.split()
assert len(tokens) >= 1
natom = int(tokens[0])
comment = next(it)
lines = []
for _ in range(natom):
line = next(it)
tokens = line.split()
assert len(tokens) >= 4
lines.append(tokens)
assert len(lines) == natom
atomsyms = [line[0] for line in lines]
atomnos = [self.pt.number[atomsym] for atomsym in atomsyms]
atomcoords = [line[1:4] for line in lines]
# Everything beyond the fourth column is ignored.
all_atomcoords.append(atomcoords)
except StopIteration:
break
attributes = {
'natom': natom,
'atomnos': atomnos,
'atomcoords': all_atomcoords,
}
self.data = ccData(attributes)
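A minimal usage sketch, assuming water.xyz is a small file in the layout described above:
from cclib.io.xyzreader import XYZ as XYZReader
# water.xyz (hypothetical contents):
# 3
# water molecule
# O  0.000  0.000  0.117
# H  0.000  0.757 -0.471
# H  0.000 -0.757 -0.471
data = XYZReader("water.xyz").parse()
print(data.natom)       # 3
print(data.atomnos)     # [8, 1, 1]
print(data.atomcoords)  # one nested set of coordinates per frame in the file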
......@@ -72,6 +72,9 @@ class XYZ(filewriter.Writer):
for i in indices:
xyzblock.append(self._xyz_from_ccdata(i))
# Ensure an extra newline at the very end.
xyzblock.append('')
return '\n'.join(xyzblock)
def _xyz_from_ccdata(self, index):
......
# -*- coding: utf-8 -*-
#
# Copyright (c) 2017, the cclib development team
#
# This file is part of cclib (http://cclib.github.io) and is distributed under
# the terms of the BSD 3-Clause License.
"""Calculate properties of nuclei based on data parsed by cclib."""
import logging
import numpy as np
try:
import periodictable as pt
except ImportError:
# Fail silently for now.
pass
try:
import scipy.constants as spc
except ImportError:
# Fail silently for now.
pass
from cclib.method.calculationmethod import Method
from cclib.parser.utils import PeriodicTable
def get_most_abundant_isotope(element):
"""Given a `periodictable` element, return the most abundant
isotope.
"""
most_abundant_isotope = element.isotopes[0]
abundance = 0
for iso in element:
if iso.abundance > abundance:
most_abundant_isotope = iso
abundance = iso.abundance
return most_abundant_isotope
def get_isotopic_masses(charges):
"""Return the masses for the given nuclei, respresented by their
nuclear charges.
"""
masses = []
for charge in charges:
el = pt.elements[charge]
isotope = get_most_abundant_isotope(el)
mass = isotope.mass
masses.append(mass)
return np.array(masses)
class Nuclear(Method):
"""A container for methods pertaining to atomic nuclei."""
def __init__(self, data, progress=None, loglevel=logging.INFO, logname="Log"):
self.required_attrs = ('natom','atomcoords','atomnos','charge')
super(Nuclear, self).__init__(data, progress, loglevel, logname)
def __str__(self):
"""Return a string representation of the object."""
return "Nuclear"
def __repr__(self):
"""Return a representation of the object."""
return "Nuclear"
def stoichiometry(self):
"""Return the stoichemistry of the object according to the Hill system"""
pt = PeriodicTable()
elements = [pt.element[ano] for ano in self.data.atomnos]
counts = {el: elements.count(el) for el in set(elements)}
formula = ""
elcount = lambda el, c: "%s%i" % (el, c) if c > 1 else el
if 'C' in elements:
formula += elcount('C', counts['C'])
counts.pop('C')
if 'H' in elements:
formula += elcount('H', counts['H'])
counts.pop('H')
for el, c in sorted(counts.items()):
formula += elcount(el, c)
if getattr(self.data, 'charge', 0):
magnitude = abs(self.data.charge)
sign = "+" if self.data.charge > 0 else "-"
formula += "(%s%i)" % (sign, magnitude)
return formula
def repulsion_energy(self, atomcoords_index=-1):
"""Return the nuclear repulsion energy."""
nre = 0.0
for i in range(self.data.natom):
ri = self.data.atomcoords[atomcoords_index][i]
zi = self.data.atomnos[i]
for j in range(i+1, self.data.natom):
rj = self.data.atomcoords[atomcoords_index][j]
zj = self.data.atomnos[j]
d = np.linalg.norm(ri-rj)
nre += zi*zj/d
return nre
def center_of_mass(self, atomcoords_index=-1):
"""Return the center of mass."""
charges = self.data.atomnos
coords = self.data.atomcoords[atomcoords_index]
masses = get_isotopic_masses(charges)
mwc = coords * masses[:, np.newaxis]
numerator = np.sum(mwc, axis=0)
denominator = np.sum(masses)
return numerator / denominator
def moment_of_inertia_tensor(self, atomcoords_index=-1):
"""Return the moment of inertia tensor."""
charges = self.data.atomnos
coords = self.data.atomcoords[atomcoords_index]
masses = get_isotopic_masses(charges)
moi_tensor = np.empty((3, 3))
moi_tensor[0][0] = np.sum(masses * (coords[:, 1]**2 + coords[:, 2]**2))
moi_tensor[1][1] = np.sum(masses * (coords[:, 0]**2 + coords[:, 2]**2))
moi_tensor[2][2] = np.sum(masses * (coords[:, 0]**2 + coords[:, 1]**2))
moi_tensor[0][1] = np.sum(masses * coords[:, 0] * coords[:, 1])
moi_tensor[0][2] = np.sum(masses * coords[:, 0] * coords[:, 2])
moi_tensor[1][2] = np.sum(masses * coords[:, 1] * coords[:, 2])
moi_tensor[1][0] = moi_tensor[0][1]
moi_tensor[2][0] = moi_tensor[0][2]
moi_tensor[2][1] = moi_tensor[1][2]
return moi_tensor
def principal_moments_of_inertia(self, units='amu_bohr_2'):
"""Return the principal moments of inertia in 3 kinds of units:
1. [amu][bohr]^2
2. [amu][angstrom]^2
3. [g][cm]^2
and the principal axes.
"""
choices = ('amu_bohr_2', 'amu_angstrom_2', 'g_cm_2')
units = units.lower()
if units not in choices:
raise ValueError("Invalid units, pick one of {}".format(choices))
moi_tensor = self.moment_of_inertia_tensor()
principal_moments, principal_axes = np.linalg.eigh(moi_tensor)
if units == 'amu_bohr_2':
conv = 1
if units == 'amu_angstrom_2':
bohr2ang = spc.value('atomic unit of length') / spc.angstrom
conv = bohr2ang ** 2
if units == 'g_cm_2':
amu2g = spc.value('unified atomic mass unit') * spc.kilo
conv = amu2g * (spc.value('atomic unit of length') * spc.centi) ** 2
return conv * principal_moments, principal_axes
def rotational_constants(self, units='ghz'):
"""Compute the rotational constants in 1/cm or GHz."""
choices = ('invcm', 'ghz')
units = units.lower()
if units not in choices:
raise ValueError("Invalid units, pick one of {}".format(choices))
principal_moments = self.principal_moments_of_inertia()[0]
bohr2ang = spc.value('atomic unit of length') / spc.angstrom
xfamu = 1 / spc.value('electron mass in u')
xthz = spc.value('hartree-hertz relationship')
rotghz = xthz * (bohr2ang ** 2) / (2 * xfamu * spc.giga)
if units == 'ghz':
conv = rotghz
if units == 'invcm':
ghz2invcm = spc.giga * spc.centi / spc.c
conv = rotghz * ghz2invcm
return conv / principal_moments
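Putting the nuclear methods together (this requires the optional periodictable and scipy packages; the logfile name is hypothetical and the module path cclib.method.nuclear is assumed):
from cclib.io import ccread
from cclib.method.nuclear import Nuclear
data = ccread("water_freq.out")                # hypothetical logfile
nuc = Nuclear(data)
print(nuc.stoichiometry())                     # "H2O", in Hill-system order
print(nuc.center_of_mass())                    # uses most-abundant-isotope masses
print(nuc.rotational_constants(units="ghz"))   # or units="invcm"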
......@@ -20,12 +20,15 @@ from cclib.parser.gamessparser import GAMESS
from cclib.parser.gamessukparser import GAMESSUK
from cclib.parser.gaussianparser import Gaussian
from cclib.parser.jaguarparser import Jaguar
from cclib.parser.molcasparser import Molcas
from cclib.parser.molproparser import Molpro
from cclib.parser.mopacparser import MOPAC
from cclib.parser.nwchemparser import NWChem
from cclib.parser.orcaparser import ORCA
from cclib.parser.psiparser import Psi
from cclib.parser.psi3parser import Psi3
from cclib.parser.psi4parser import Psi4
from cclib.parser.qchemparser import QChem
from cclib.parser.turbomoleparser import Turbomole
from cclib.parser.data import ccData
......
......@@ -123,6 +123,22 @@ class ADF(logfileparser.Logfile):
break
line = next(inputfile)
version_searchstr = "Amsterdam Density Functional (ADF)"
if version_searchstr in line:
startidx = line.index(version_searchstr) + len(version_searchstr)
trimmed_line = line[startidx:].strip()[:-1]
# The package version is normally a year with revision
# number (such as 2013.01), but it may also be a random
# string (such as a version control branch name).
match = re.search(r"([\d\.]{4,7})", trimmed_line)
if match:
package_version = match.groups()[0]
self.metadata["package_version"] = package_version
else:
# This isn't as well-defined, but the field shouldn't
# be left empty.
self.metadata["package_version"] = trimmed_line.strip()
# In ADF 2014.01, there are (INPUT FILE) messages, so we need to use just
# the lines that start with 'Create' and run until the title or something
# else we are sure is the calculation proper. It would be good to combine
......@@ -1151,3 +1167,6 @@ class ADF(logfileparser.Logfile):
self.polarizabilities.append(polarizability)
line = next(inputfile)
if line[:24] == ' Buffered I/O statistics':
self.metadata['success'] = True
......@@ -10,6 +10,8 @@
from __future__ import print_function
import re
import numpy
from cclib.parser import logfileparser
......@@ -76,12 +78,17 @@ class DALTON(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# extract the version number first
# Extract the version number and optionally the Git revision
# number.
if line[4:30] == "This is output from DALTON":
if line.split()[5] == "release" or line.split()[5] == "(Release":
self.metadata["package_version"] = line.split()[6][6:]
else:
self.metadata["package_version"] = line.split()[5]
rs = r"([0-9]{4})\D\s?\w*\s?([0-9]{1})"
match = re.search(rs, line)
package_version = ".".join(match.groups())
self.metadata["package_version"] = package_version
# Don't add revision information to the main package version for now.
if "Last Git revision" in line:
revision = line.split()[4]
# Is the basis set from a single library file, or is it
# manually specified? See before_parsing().
......@@ -1173,6 +1180,9 @@ class DALTON(logfileparser.Logfile):
# self.set_attribute('etoscs', etoscs)
self.set_attribute('etsecs', etsecs)
if line[:37] == ' >>>> Total wall time used in DALTON:':
self.metadata['success'] = True
# TODO:
# aonames
# aooverlaps
......
......@@ -170,7 +170,7 @@ class ccData(object):
_dictsofarrays = ["atomcharges", "atomspins"]
# Possible statuses for optimization steps.
# OPT_UNKNOWN should not be used after parsing, unless for unfinished computations.
# OPT_UNKNOWN is the default and means optimization is in progress.
# OPT_NEW is set for every new optimization (e.g. PES, IRCs, etc.)
# OPT_DONE is set for the last step of an optimisation that converged.
# OPT_UNCONVERGED is set for every unconverged step (e.g. should be mutually exclusive with OPT_DONE)
......
......@@ -83,9 +83,20 @@ class GAMESS(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# extract the version number first
if line.find("GAMESS VERSION") >= 0:
self.metadata["package_version"] = line.split()[4] + line.split()[5] + line.split()[6]
# Extract the version number. If the calculation is from
# Firefly, its version number comes before a line that looks
# like the normal GAMESS version number...
if "Firefly version" in line:
match = re.search(r"Firefly version\s([\d.]*)\D*(\d*)\s*\*", line)
if match:
version, build = match.groups()
package_version = "{}.b{}".format(version, build)
self.metadata["package_version"] = package_version
if "GAMESS VERSION" in line:
# ...so avoid overwriting it if Firefly already set this field.
if "package_version" not in self.metadata:
tokens = line.split()
self.metadata["package_version"] = ' '.join(tokens[4:-1])
if line[1:12] == "INPUT CARD>":
return
......@@ -1431,3 +1442,8 @@ class GAMESS(logfileparser.Logfile):
i, j = coord_to_idx[tokens[1][0]], coord_to_idx[tokens[1][1]]
polarizability[i, j] = tokens[3]
self.polarizabilities.append(polarizability)
if line[:30] == ' ddikick.x: exited gracefully.'\
or line[:41] == ' EXECUTION OF FIREFLY TERMINATED NORMALLY'\
or line[:40] == ' EXECUTION OF GAMESS TERMINATED NORMALLY':
self.metadata['success'] = True
......@@ -47,6 +47,19 @@ class GAMESSUK(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number and optionally the revision number.
if "version" in line:
search = re.search(r"\sversion\s*(\d\.\d)", line)
if search:
self.metadata["package_version"] = search.groups()[0]
if "Revision" in line:
revision = line.split()[1]
# Don't add revision information to the main package version for now.
# if "package_version" in self.metadata:
# package_version = "{}.r{}".format(self.metadata["package_version"],
# revision)
# self.metadata["package_version"] = package_version
if line[1:22] == "total number of atoms":
natom = int(line.split()[-1])
self.set_attribute('natom', natom)
......@@ -653,3 +666,6 @@ class GAMESSUK(logfileparser.Logfile):
line = inputfile.next()
self.set_attribute('nooccnos', occupations)
if line[:33] == ' end of G A M E S S program at':
self.metadata['success'] = True
......@@ -135,11 +135,16 @@ class Gaussian(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number first
# Extract the version number: "Gaussian 09, Revision D.01"
# becomes "09revisionD.01".
if line.strip() == "Cite this work as:":
line = inputfile.next()
self.metadata["package_version"] = line.split()[1][:-1]+ \
'revision'+line.split()[-1][:-1]
tokens = line.split()
self.metadata["package_version"] = ''.join([
tokens[1][:-1],
'revision',
tokens[-1][:-1],
])
# This block contains some general information as well as coordinates,
# which could be parsed in the future:
......@@ -1800,3 +1805,6 @@ class Gaussian(logfileparser.Logfile):
if not hasattr(self, 'optdone'):
self.optdone = []
self.optdone.append(len(self.optstatus) - 1)
if line[:31] == ' Normal termination of Gaussian':
self.metadata['success'] = True
......@@ -60,9 +60,14 @@ class Jaguar(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# Extract the version number first
# Extract the package version number.
if "Jaguar version" in line:
self.metadata["package_version"] = line.split()[3][:-1]
tokens = line.split()
# Don't add revision information to the main package
# version for now.
# package_version = "{}.r{}".format(tokens[3][:-1], tokens[5])
package_version = tokens[3][:-1]
self.metadata["package_version"] = package_version
# Extract the basis set name
if line[2:12] == "basis set:":
......@@ -701,3 +706,7 @@ class Jaguar(logfileparser.Logfile):
line = next(inputfile)
strength = float(line.split()[-1])
self.etoscs.append(strength)
if line[:20] == ' Total elapsed time:' \
or line[:18] == ' Total cpu seconds':
self.metadata['success'] = True
......@@ -18,6 +18,11 @@ import random
import sys
import zipfile
if sys.version_info.major == 2:
getargspec = inspect.getargspec
else:
getargspec = inspect.getfullargspec
import numpy
from cclib.parser import utils
......@@ -233,6 +238,8 @@ class Logfile(object):
self.metadata = {}
self.metadata["package"] = self.logname
self.metadata["methods"] = []
# Indicate if the computation has completed successfully
self.metadata['success'] = False
# Periodic table of elements.
......@@ -273,7 +280,7 @@ class Logfile(object):
raise AttributeError("Class %s has no extract() method." % self.__class__.__name__)
if not callable(self.extract):
raise AttributeError("Method %s._extract not callable." % self.__class__.__name__)
if len(inspect.getargspec(self.extract)[0]) != 3:
if len(getargspec(self.extract)[0]) != 3:
raise AttributeError("Method %s._extract takes wrong number of arguments." % self.__class__.__name__)
# Save the current list of attributes to keep after parsing.
......@@ -407,11 +414,18 @@ class Logfile(object):
"""
if check and hasattr(self, name):
try:
assert getattr(self, name) == value
numpy.testing.assert_equal(getattr(self, name), value)
except AssertionError:
self.logger.warning("Attribute %s changed value (%s -> %s)" % (name, getattr(self, name), value))
setattr(self, name, value)
def append_attribute(self, name, value):
"""Appends a value to an attribute."""
if not hasattr(self, name):
self.set_attribute(name, [])
getattr(self, name).append(value)
def skip_lines(self, inputfile, sequence):
"""Read trivial line types and check they are what they are supposed to be.
......
......@@ -265,7 +265,7 @@ class Molpro(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
# extract the version number first
# Extract the package version number.
if "Version" in line:
self.metadata["package_version"] = line.split()[1]
......@@ -896,3 +896,6 @@ class Molpro(logfileparser.Logfile):
if not hasattr(self, 'grads'):
self.grads = []
self.grads.append(grad)
if line[:25] == ' Variable memory released':
self.metadata['success'] = True
......@@ -86,6 +86,25 @@ class MOPAC(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
if "(Version:" in line:
# Part of the full version can be extracted from here, but is
# missing information about the bitness.
package_version = line[line.find("MOPAC") + 5:line.find("(")]
package_version = package_version[:4]
if "BETA" in line:
package_version = package_version + " BETA"
self.metadata["package_version"] = package_version
# Don't store the full package version in metadata yet; we haven't
# settled on a field for it.
if "For non-commercial use only" in line:
tokens = line.split()
tokens = tokens[8:]
assert len(tokens) == 2
package_version_full = tokens[0]
if tokens[1] != "**":
package_version_full = '-'.join(tokens)[:-2]
# Extract the atomic numbers and coordinates from the optimized geometry
# note that cartesian coordinates section occurs multiple times in the file, and we want to end up using the last instance
# also, note that the section labeled cartesian coordinates doesn't have as many decimal places as the one used here
......@@ -232,3 +251,6 @@ class MOPAC(logfileparser.Logfile):
# Partial charges and dipole moments
# Example:
# NET ATOMIC CHARGES
if line[:16] == '== MOPAC DONE ==':
self.metadata['success'] = True
......@@ -42,9 +42,12 @@ class NWChem(logfileparser.Logfile):
def extract(self, inputfile, line):
"""Extract information from the file object inputfile."""
#extract the version number first
if "Northwest Computational" in line:
self.metadata["package_version"] = line.split()[5]
# Extract the version number.
if "nwchem branch" in line:
self.metadata["package_version"] = line.split()[3]
# Don't add revision information to the main package version for now.
if "nwchem revision" in line:
revision = line.split()[3]
# This is printed in the input module, so should always be the first coordinates,
# and contains some basic information we want to parse as well. However, this is not
......@@ -335,8 +338,10 @@ class NWChem(logfileparser.Logfile):
if line.strip() in ("The SCF is already converged", "The DFT is already converged"):
if self.linesearch:
return
self.scftargets.append(self.scftargets[-1])
self.scfvalues.append(self.scfvalues[-1])
if hasattr(self, 'scftargets'):
self.scftargets.append(self.scftargets[-1])
if hasattr(self, 'scfvalues'):
self.scfvalues.append(self.scfvalues[-1])
# The default (only?) SCF algorithm for Hartree-Fock is a preconditioned conjugate
# gradient method that apparently "always" converges, so this header should reliably
......@@ -673,7 +678,10 @@ class NWChem(logfileparser.Logfile):
else:
self.homos.append(-1)
# This is where the full MO vectors are printed, but a special directive is needed for it:
# This is where the full MO vectors are printed, but a special
# directive is needed for it in the `scf` or `dft` block:
# print "final vectors" "final vectors analysis"
# which gives:
#
# Final MO vectors
# ----------------
......@@ -1038,8 +1046,45 @@ class NWChem(logfileparser.Logfile):
polarizability.append(line.split()[1:])
self.polarizabilities.append(numpy.array(polarizability))
if line[:18] == ' Total times cpu:':
self.metadata['success'] = True
if line.strip() == "NWChem QMD Module":
self.is_BOMD = True
# Born-Oppenheimer molecular dynamics (BOMD): time.
if "QMD Run Information" in line:
self.skip_line(inputfile, 'd')
line = next(inputfile)
assert "Time elapsed (fs)" in line
time = float(line.split()[4])
self.append_attribute('time', time)
# BOMD: geometry coordinates when `print low`.
if line.strip() == "DFT ENERGY GRADIENTS":
if self.is_BOMD:
self.skip_lines(inputfile, ['b', 'atom coordinates gradient', 'xyzxyz'])
line = next(inputfile)
atomcoords_step = []
while line.strip():
tokens = line.split()
assert len(tokens) == 8
atomcoords_step.append([float(c) for c in tokens[2:5]])
line = next(inputfile)
self.atomcoords.append(atomcoords_step)
def before_parsing(self):
"""NWChem-specific routines performed before parsing a file.
"""
# The only reason we need this identifier is if `print low` is
# set in the input file, which we assume is likely for a BOMD
# trajectory. This will enable parsing coordinates from the
# 'DFT ENERGY GRADIENTS' section.
self.is_BOMD = False
def after_parsing(self):
"""NWChem-specific routines for after parsing file.
"""NWChem-specific routines for after parsing a file.