Testing on molecules

From now on, we will define several classes to help us with the calculation of molecular integrals. We will start with the minimal structure of these classes and extend them throughout this chapter. The most recent version of these classes can be found in subsection 4.4.1.

We have written our own routine for calculating overlap integrals. To confirm its functionality, we shall do some tesing on real molecules. For this, we define several classes to make a convenient interface for molecules and basis sets.

The Atom Class

Chemists work often with atomic symbols, which are not intuitive for computers. They would prefer atomic numbers. Therefore, we at first define a dictionary which has atomic symbols as keys and atomic numbers as values:

ATOMIC_NUMBER = {
    'H': 1, 'He': 2, 
    'Li': 3, 'Be': 4, 'B': 5, 'C': 6, 'N': 7, 'O': 8, 'F': 9, 'Ne': 10,
    'Na': 11, 'Mg': 12, 'Al': 13, 'Si': 14, 'P': 15, 'S': 16, 'Cl': 17, 'Ar': 18,
    'K': 19, 'Ca': 20, 
        'Sc': 21, 'Ti': 22, 'V': 23, 'Cr': 24, 'Mn': 25, 
        'Fe': 26, 'Co': 27, 'Ni': 28, 'Cu': 29, 'Zn': 30,
        'Ga': 31, 'Ge': 32, 'As': 33, 'Se': 34, 'Br': 35, 'Kr': 36,
    'Rb': 37, 'Sr': 38,
        'Y': 39, 'Zr': 40, 'Nb': 41, 'Mo': 42, 'Tc': 43, 
        'Ru': 44, 'Rh': 45, 'Pd': 46, 'Ag': 47, 'Cd': 48,
        'In': 49, 'Sn': 50, 'Sb': 51, 'Te': 52, 'I': 53, 'Xe': 54,
    'Cs': 55, 'Ba': 56,
        'La': 57, 'Ce': 58, 'Pr': 59, 'Nd': 60, 'Pm': 61, 'Sm': 62, 'Eu': 63, 
        'Gd': 64, 'Tb': 65, 'Dy': 66, 'Ho': 67, 'Er': 68, 'Tm': 69, 'Yb': 70, 
        'Lu': 71, 'Hf': 72, 'Ta': 73, 'W': 74, 'Re': 75, 
        'Os': 76, 'Ir': 77, 'Pt': 78, 'Au': 79, 'Hg': 80,
        'Tl': 81, 'Pb': 82, 'Bi': 83, 'Po': 84, 'At': 85, 'Rn': 86,
    'Fr': 87, 'Ra': 88,
        'Ac': 89, 'Th': 90, 'Pa': 91, 'U': 92, 'Np': 93, 'Pu': 94, 'Am': 95,
        'Cm': 96, 'Bk': 97, 'Cf': 98, 'Es': 99, 'Fm': 100, 'Md': 101, 'No': 102,
        'Lr': 103, 'Rf': 104, 'Db': 105, 'Sg': 106, 'Bh': 107, 
        'Hs': 108, 'Mt': 109, 'Ds': 110, 'Rg': 111, 'Cn': 112, 
        'Nh': 113, 'Fl': 114, 'Mc': 115, 'Lv': 116, 'Ts': 117, 'Og': 118,
}

We save this dictionary in a file called atomic_data.py.

The Atom class should represent an atom with a specific symbol Vand coordinate. The class should therefore include the following attributes:

  • atomic_number: A dictionary with keys corresponding to atomic symbols and values corresponding to atomic numbers.
  • symbol: The atomic symbol of the atom.
  • coord: The coordinate of the atom.
  • atnum: The atomic number corresponding to the symbol of the atom.

The class includes one method, __init__, which initializes a new atom with the given symbol and coordinate. The optional argument unit specifies the unit of the given coordinates and can be either A (Ångström) or B (Bohr).

import numpy as np
from atomic_data import ATOMIC_NUMBER
class Atom:
    """
    A class representing an atom with a specific symbol and coordinate.

    Attributes:
        atomic_number (dict): A dictionary with keys corresponding to 
            atomic symbols and values corresponding to atomic numbers.
        symbol (str): The atomic symbol of the atom.
        coord (list[float]): The coordinate of the atom.
        atnum (int): The atomic number corresponding to the symbol of the atom.

    Methods:
        __init__(self, symbol: str, coord: list[float]) -> None: 
            Initializes a new atom with the given symbol and coordinate.
    """

    def __init__(self, symbol: str, coord: list[float], unit='B') -> None:
        """
        Initializes a new `atom` object.

        Parameters:
            symbol (str): The atomic symbol of the atom.
            coord (list): The coordinate of the atom.

        Returns:
            None
        """
        self.symbol = symbol
        self.coord = np.array(coord)
        self.unit = unit
        self.atnum = ATOMIC_NUMBER[self.symbol]

This class is saved in a file called atom.py.

Because we are chemists, we are not satisfied with only atoms. We want to combined them into molecules, hence a Molecule class is needed.

The Molecule Class

The Molecule class should represent a molecule, therefore, the following attributes can be useful:

  • atomlist: A list of Atom objects representing the atoms in the molecule.
  • natom: The number of atoms in the molecule.
  • basisfunctions: A list of Gaussian objects representing the basis functions of the molecule.
  • S: A matrix representing the overlap integrals between basis functions.
import numpy as np
import copy
from atom import Atom
from basis_set import BasisSet

a0 = 0.529177210903  # Bohr radius in angstrom
class Molecule:
    """
    A class representing a molecule.

    Attributes:
        atomlist (list): A list of `atom` objects representing the atoms 
            in the molecule.
        natom (int): The number of atoms in the molecule.
        basisfunctions (list): A list of `Gaussian` objects representing 
            the basis functions of the molecule.
        S (ndarray): A matrix representing the overlap integrals between 
            basis functions.

    Methods:
        __init__(self) -> None: Initializes a new `molecule` object.
        set_atomlist(self,a: list) -> None: Sets the `atomlist` attribute 
            to the given list of `atom` objects.
        read_from_xyz(self,filename: str) -> None: Reads the coordinates of 
            the atoms in the molecule from an XYZ file.
        get_basis(self, name: str = "sto-3g") -> None: Computes the 
            basis functions for the molecule using the specified basis set.
        get_S(self) -> None: Computes the overlap integrals between 
            basis functions and sets the `S` attribute.
    """

    def __init__(self) -> None:
        """
        Initializes a new `Molecule` object.

        Returns:
            None
        """
        self.atomlist = []
        self.natom = len(self.atomlist)
        self.basisfunctions = []
        self.S = None

    def set_atomlist(self, a: list) -> None:
        """
        Sets the `atomlist` attribute to the given list of `Atom` objects.

        Parameters:
            a (list): A list of `Atom` objects representing the atoms 
                in the molecule.

        Returns:
            None
        """
        self.atomlist = []
        for at in a:
            if at.unit == 'A':
                at.coord = at.coord / a0
            elif at.unit == 'B':
                pass
            else:
                raise ValueError('Invalid unit for atom coordinates.')
            self.atomlist.append(at)
        self.natom = len(self.atomlist)

    def read_from_xyz(self, filename: str) -> None:
        """
        Reads the coordinates of the atoms in the molecule from an XYZ file.

        Parameters:
            filename (str): The name of the XYZ file to read.

        Returns:
            None
        """
        with open(filename, "r") as f:
            for line in f:
                tmp = line.split()
                if len(tmp) == 4:
                    symbol = tmp[0]
                    coord = np.array([float(x) for x in tmp[1:]]) / a0
                    at = Atom(symbol, coord)
                    self.atomlist.append(at)
        self.natom = len(self.atomlist)

    def get_basis(self, name: str = "sto-3g") -> None:
        """
        Computes the basis functions for the molecule using the 
        specified basis set.

        Parameters:
            name (str): The name of the basis set to use. Default is "sto-3g".

        Returns:
            None
        """
        self.basisfunctions = []
        # Initialize BasisSet instance
        basis = BasisSet(name=name)
        # Generate unique list of symbols
        elementlist = set([at.symbol for at in self.atomlist])
        # Return basis dictionary
        basis = basis.get_basisfunctions(elementlist)
        for at in self.atomlist:
            bfunctions = basis[at.symbol]
            for bf in bfunctions:
                newbf = copy.deepcopy(bf)
                newbf.set_A(at.coord)
                self.basisfunctions.append(newbf)
    def get_S(self) -> None:
        """
        Computes the overlap integrals between basis functions and sets 
        the `S` attribute.

        Returns:
            None
        """
        nbf = len(self.basisfunctions)
        self.S = np.zeros((nbf, nbf))
        for i in np.arange(0, nbf):
            for j in np.arange(i, nbf):
                self.S[i,j] = self.basisfunctions[i].S(self.basisfunctions[j])
                self.S[j,i] = self.S[i,j]

We have utilized the BasisSet class to initialize basis sets for our molecule, which we will define now.

The BasisSet Class

The BasisSet class should contain a set of GTOs and be able to calculate molecular integrals between them. Therefore, we shall define the Gaussian class at first for the integral handling.

import numpy as np
import json
import os
from atomic_data import ATOMIC_NUMBER
import S
class Gaussian:
    """
    A class representing a Cartesian Gaussian function for molecular integrals.
    """

    def __init__(self, A, exps, coefs, ijk):
        """
        Initialize the Gaussian function with given parameters.

        Parameters:
        A (array-like): The origin of the Gaussian function.
        exps (array-like): A list of exponents.
        coefs (array-like): A list of coefficients.
        ijk (tuple): A tuple representing the angular momentum components 
            (l, m, n).
        """
        self.A = np.asarray(A)
        self.exps = np.asarray(exps)
        self.coefs = np.asarray(coefs)
        self.ijk = ijk
        self.get_norm_constants()

    def set_A(self, A):
        """
        Set the origin of the Gaussian function.

        Parameters:
        A (array-like): The origin of the Gaussian function.
        """
        self.A = np.asarray(A)

    def get_norm_constants(self):
        """
        Calculate the normalization constants for the Gaussian function.
        """
        self.norm_const = np.zeros(self.coefs.shape)
        for i, alpha in enumerate(self.exps):
            a = S.s_ij(self.ijk[0], self.ijk[0], alpha, alpha, 
                      self.A[0], self.A[0])
            b = S.s_ij(self.ijk[1], self.ijk[1], alpha, alpha, 
                      self.A[1], self.A[1])
            c = S.s_ij(self.ijk[2], self.ijk[2], alpha, alpha, 
                      self.A[2], self.A[2])
            self.norm_const[i] = 1.0 / np.sqrt(a * b * c)

    def __str__(self):
        """
        Generate a string representation of the Gaussian function.

        Returns:
        str: A string representation of the Gaussian function.
        """
        strrep = "Cartesian Gaussian function:\n"
        strrep += "Exponents = {}\n".format(self.exps)
        strrep += "Coefficients = {}\n".format(self.coefs)
        strrep += "Origin = {}\n".format(self.A)
        strrep += "Angular momentum: {}".format(self.ijk)
        return strrep

    def S(self, other):
        """
        Calculate the overlap integral between this Gaussian and 
        another Gaussian function.

        Parameters:
        other (Gaussian): Another Gaussian function.

        Returns:
        float: The overlap integral value.
        """
        overlap = 0.0
        for ci, alphai, normi in zip(self.coefs, self.exps, 
                                     self.norm_const):
            for cj, alphaj, normj in zip(other.coefs, other.exps, 
                                         other.norm_const):
                a = S.s_ij(self.ijk[0], other.ijk[0], alphai, alphaj, 
                           self.A[0], other.A[0])
                b = S.s_ij(self.ijk[1], other.ijk[1], alphai, alphaj,
                           self.A[1], other.A[1])
                c = S.s_ij(self.ijk[2], other.ijk[2], alphai, alphaj,
                           self.A[2], other.A[2])
                overlap += ci * cj * normi * normj * a * b * c
        return overlap

Currently, this class can only calculate overlap integrals. We will extend this class with other molecular integrals in the comming lectures.

We can now construct the BasisSet class. It should be able to read basis sets in JSON format and instance objects from the Gaussian class.

class BasisSet:
    # Dictionary that maps angular momentum to a list of (i,j,k) tuples 
    # representing the powers of x, y, and z
    cartesian_power = {
        0: [(0, 0, 0)],
        1: [(1, 0, 0), (0, 1, 0), (0, 0, 1)],
        2: [(1, 1, 0), (1, 0, 1), (0, 1, 1), (2, 0, 0), (0, 2, 0), (0, 0, 2)],
    }

    def __init__(self, name="sto-3g"):
        """
        Initialize a new basisSet object with the given name.

        Parameters:
        name (str): The name of the basis set to use.
        """
        self.name = name

    def get_basisfunctions(self, elementlist, path="."):
        """
        Generate the basis functions for a list of elements.

        Parameters:
        elementlist (list): A list of element symbols.
        path (str): The path to the directory containing the basis set files.

        Returns:
        dict: A dictionary mapping element symbols to lists of 
            Gaussian basis functions.
        """
        try:
            # Load the basis set data from a JSON file
            with open(
                os.path.join(path, f"{self.name}.json"), "r",
            ) as basisfile:
                basisdata = json.load(basisfile)
        except FileNotFoundError:
            print("Basis set file not found!")
            return None

        basis = {}  # Initialize dictionary containing basis sets

        for element in elementlist:
            basisfunctions = []
            # Get the basis function data for the current element 
            # from the JSON file
            basisfunctionsdata = basisdata["elements"][
                str(ATOMIC_NUMBER[element])
            ]["electron_shells"]
            for bfdata in basisfunctionsdata:
                for i, angmom in enumerate(bfdata["angular_momentum"]):
                    exps = [float(e) for e in bfdata["exponents"]]
                    coefs = [float(c) for c in bfdata["coefficients"][i]]
                    # Generate Gaussian basis functions for each 
                    # angular momentum component
                    for ikm in self.cartesian_power[angmom]:
                        basisfunction = Gaussian(np.zeros(3), exps, coefs, ikm)
                        # Normalize the basis functions using the S method 
                        # of the Gaussian class
                        norm = basisfunction.S(basisfunction)
                        basisfunction.coefs = basisfunction.coefs / np.sqrt(norm)
                        basisfunctions.append(basisfunction)
            basis[element] = basisfunctions
        return basis