Client Name: Prof. Paul Popelier, The University of Manchester
Client Web / Social Media:
High End Compute has worked on a number of commissions with Professor Popelier, including submissions to ARCHER eCSE (for various optimisation and parallelisation opportunities) and help supervising post-docs (on machine learning of quantum chemistry functions). A commission concerning FFLUX and MORFI is under discussion.
FFLUX is a biomolecular force field that stands apart from its peers by virtue of its unique methodology. FFLUX is based on the theoretical framework of quantum chemical topology (QCT) and utilises modern machine learning techniques in the form of kriging models. This union results in an atomistic force field that is fully polarisable, multipolar and flexible and, therefore, highly accurate. The machine learning is carefully validated and acts on pre-calculated atomic properties, which lets FFLUX make accurate predictions at far less computational cost than the first-principles calculations it draws its information from. However, its computational cost is still higher than that of traditional force fields, so the FFLUX code requires significant reductions in run time.
Improving MORFI's Lot
Central to the research of Prof. Popelier's group is Quantum Chemical Topology (QCT), an imaginative, minimal and rigorous approach to extracting chemical information, in the form of atomic and bond properties, from quantum systems at the atomistic level.
Three strands of development emerge from these atomic and bond properties:
- the construction of the novel force field FFLUX (via machine learning),
- the identification of subsystems that behave energetically like the total system (i.e. rigorous interpretative chemistry compatible with the underpinning quantum reality),
- and Linear Free Energy Relationships with an eye on bulk property prediction (pKa in aqueous solution, for example).
The need to improve atomistic biomolecular force fields remains acute. Fortunately, the abundance of contemporary computing power enables an overhaul of the architecture of current force fields. Taking advantage of these advances, FFLUX is a fully polarizable, multipolar, atomistic, next-generation force field that is being built from machine learning (i.e., kriging) models and quantum mechanics. FFLUX deals with dispersive interactions in a unique manner, as we obtain ab initio atomistic correlation energies through our homemade software MORFI. However, evaluating the two-particle density matrix, which our calculation needs, can be very expensive. Obtaining atomic correlation energies in reasonable times therefore requires highly optimized and parallelized software on top of state-of-the-art hardware.
How HEC Helped
MORFI is a Fortran code that allocates its arrays statically, many of them with four dimensions.
HEC profiled the code and confirmed that a nested DO loop was taking over 95% of the run time for the provided examples. We examined this loop nest in more detail and identified opportunities to reduce memory overhead generally and to parallelise the loops using OpenMP (there being a significant cost in copying large 4D arrays where their scope is PRIVATE). Working with the research group members, we identified potential parallelism at the outermost DO loop level, which we implemented by careful use of OpenMP, noting the need for various REDUCTION clauses. The final version has a run time of 69 minutes on Haswell, compared to the original time of 56 hours on the same chip set, with a parallel efficiency of 86% when using 23 of the available 24 cores.
Because of a mix of the memory requirements and compiling for a given set of nodes, the Popelier Research Group had previously been accustomed to a run time of 3 days and 3 hours on SandyBridge/IvyBridge architectures. HEC provided an appropriate Makefile and batch scripts that first compile the code on the target compute node and then run it, so the group has been able to use all of the available architectures in their cluster. Comparing Haswell (best times with the new code) to SandyBridge/IvyBridge (the only previous choice) gives a whopping 65x improvement. Similar improvements have been seen for other test example sizes, with one example giving an 87 times improvement.
HEC also showed that the numerical results remained in agreement to at least 11 significant figures across all architectures and numbers of OpenMP threads.
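The outer-loop strategy described above can be sketched as follows. This is a minimal illustration written in C rather than MORFI's Fortran, and it is not taken from MORFI: the extent `N`, the function `element` and the loop body are hypothetical stand-ins. The pragma corresponds to `!$OMP PARALLEL DO REDUCTION(+:total)` in Fortran.

```c
#define N 16  /* hypothetical extent of each of the four dimensions */

/* Hypothetical stand-in for one element of a four-dimensional quantity. */
static double element(int i, int j, int k, int l)
{
    return 1.0 / (1.0 + i + 2.0 * j + 3.0 * k + 4.0 * l);
}

/* Serial reference: a plain four-deep loop nest. */
double nested_sum_serial(void)
{
    double total = 0.0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                for (int l = 0; l < N; l++)
                    total += element(i, j, k, l);
    return total;
}

/* Threaded version: only the outermost loop is divided among threads. */
double nested_sum_omp(void)
{
    double total = 0.0;

    /* reduction(+:total) gives every thread its own partial sum and adds
       the partial sums together once the loop has finished, so there is
       no data race on the accumulator. The loop indices are declared
       inside the loops and are therefore private to each thread. */
    #pragma omp parallel for reduction(+:total) schedule(static)
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                for (int l = 0; l < N; l++)
                    total += element(i, j, k, l);
    return total;
}
```

When built with an OpenMP flag (for example `-fopenmp` with GCC), `nested_sum_omp` distributes the outer iterations over the available threads. Without the flag the pragma is ignored and both functions run the same serial loop. Because the reduction changes the order of the additions, the two results may differ in the last few bits.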
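The quoted factors follow directly from the run times above, as the short C sketch below shows. The function names are hypothetical, and the efficiency calculation assumes the usual definition (speed-up relative to a single-thread run of the same code, divided by the number of threads), which the text does not spell out.

```c
/* Speed-up as the ratio of old to new wall-clock time (same units). */
double speedup(double old_minutes, double new_minutes)
{
    return old_minutes / new_minutes;
}

/* Assumed definition: efficiency = threaded speed-up / thread count,
   so the threaded speed-up is efficiency * threads. */
double threaded_speedup(double efficiency, int threads)
{
    return efficiency * threads;
}

/* Convert days and hours to minutes. */
double to_minutes(double days, double hours)
{
    return (days * 24.0 + hours) * 60.0;
}

/* Figures taken from the text:
     speedup(to_minutes(3, 3), 69)  is about 65.2  (SandyBridge/IvyBridge to Haswell)
     speedup(to_minutes(0, 56), 69) is about 48.7  (original to final code on Haswell)
     threaded_speedup(0.86, 23)     is about 19.8  (86% efficiency on 23 cores) */
```

Under that assumption, threading accounts for a factor of roughly 20 on Haswell. If the original Haswell run was single-threaded, the remainder of the 48.7x gain on that chip would come from the serial and memory improvements.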
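A check of this kind can be sketched in C as below. This is not HEC's actual test: the function name is hypothetical, and a relative difference of at most 10^-digits is used as one common approximation to "agreement to that many significant figures".

```c
/* Absolute value without <math.h>, so no maths library needs linking. */
static double abs_d(double x)
{
    return x < 0.0 ? -x : x;
}

/* Return 1 if a and b agree to roughly `digits` significant figures,
   judged by |a - b| <= 10^(-digits) * max(|a|, |b|); otherwise return 0. */
int agree_to_digits(double a, double b, int digits)
{
    double tol = 1.0;
    for (int d = 0; d < digits; d++)
        tol /= 10.0;                      /* tol = 10^(-digits) */

    double scale = abs_d(a) > abs_d(b) ? abs_d(a) : abs_d(b);
    return abs_d(a - b) <= tol * scale;
}
```

A reference result from one architecture and thread count would be compared against every other run, for example with `agree_to_digits(reference, candidate, 11)`.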
Feedback from Popelier Research Group
"Thanks to the work and support of HIGH END COMPUTE LTD, our homemade software is now highly parallelized and much less memory hungry. Thanks to these improvements, dispersion energies for much larger systems can be assessed and we are one step closer to our final goal: highly realistic biomolecular simulations using dispersion forces obtained through first principles."
“Molecular Simulation by knowledgeable Quantum Atoms”, P.L.A. Popelier, Phys. Scripta, 91, 033007 (16 pages) (2016).