Using Machine learning force fields (MLFF) in SPARC-X-API
Training Machine Learned Force Fields using SPARC and the SPARC-X-API
The SPARC source code has the ability to train MLFF on-the-fly using
the SOAP descriptor and Bayesian Linear Regression. This feature can
be accessed via the API by passing the appropriate sparc flags to the
SPARC
calculator object. An exhaustive list of MLFF parameters are
available in the SPARC
documentation.
The primary flag of interest is the MLFF_FLAG
which must be set to 1
to train a MLFF from scratch. This needs to be included in a parameter dictionary that will later be passed to a SPARC calculator instance:
calc_params = {
"EXCHANGE_CORRELATION": "GGA_PBE",
"KPOINT_GRID": [1,1,1],
"MESH_SPACING": 0.35,
"TOL_SCF": 0.0001,
"MAXIT_SCF": 100,
"ELEC_TEMP_TYPE": "fermi-dirac",
"ELEC_TEMP": 116,
"PRINT_RESTART_FQ": 10,
"PRINT_ATOMS": 1,
"PRINT_FORCES": 1,
"SPIN_TYP": 0,
"MLFF_FLAG": 1,
"MLFF_INITIAL_STEPS_TRAIN": 3,
}
This parameter dictionary primes an on-the-fly simulation. The MLFF_INITIAL_STEPS_TRAIN
keyword specifies the number of reference structures that will be added to the training set before the model is first trained. References structures can be generated using the SPARC calculator in FileIO mode with C SPARC’s internal relaxation or MD algorithms. Adding
calc_params['RELAX_FLAG'] = 1
or
calc_params['MD_FLAG'] = 1
to the parameter dictionary will train an ML model on-the-fly during
relaxation or MD. Additional MD related parameters will be necessary
such as the timestep and method. See the SPARC documentation for
details. The simulations can be triggered by calling the familiar
property functions on an ASE
Atoms
object. Here, we use a water
molecule:
from sparc.calculator import SPARC
from ase.build import molecule
water = molecule('H2O', vacuum=7)
water.pbc = [False,False,False]
water.calc = SPARC(**calc_params)
energy = water.get_potential_energy()
This triggers an on-the-fly relaxation or MD simulation. Additional details are available in the SPARC documentation regarding hyperparameters and recommended settings.
Pairing SPARC and the SPARC-X-API with external algorithms via iPI socket communication
The iPI communication protocol embedded in SPARC and accessed via the
SPARC-X-API allows users to treat SPARC as a DFT backend for many
additional python libraries. For example, the reference structures for
training MLFF on-the-fly can be generated via external algorithms such
as ASE
optimizers or MD engines. The DFT energies, forces, and
stresses are computed by SPARC and parsed by the API to allow
positions to be updated externally. The code from the previous section
can be modified to use the ASE
implementation of the BFGS algorithm:
from sparc.calculator import SPARC
from ase.build import molecule
from ase.optimize import BFGS
water = molecule('H2O', vacuum=7)
water.pbc = [False,False,False]
calc_params = {
"EXCHANGE_CORRELATION": "GGA_PBE",
"KPOINT_GRID": [1,1,1],
"MESH_SPACING": 0.35,
"TOL_SCF": 0.0001,
"MAXIT_SCF": 100,
"ELEC_TEMP_TYPE": "fermi-dirac",
"ELEC_TEMP": 116,
"PRINT_RESTART_FQ": 10,
"PRINT_ATOMS": 1,
"PRINT_FORCES": 1,
"SPIN_TYP": 0,
"MLFF_FLAG": 1,
"MLFF_INITIAL_STEPS_TRAIN": 3,
}
with SPARC(use_socket=True, **calc_params) as calc:
water.calc = calc
dyn = BFGS(water, trajectory = 'water-bfgs-opt.traj')
dyn.run(fmax=0.05)
A MLFF is still trained on-the-fly using SPARC DFT energies, forces, and stresses; but the structures are generated from the ASE
optimization algorithm.