1. ABOUT THE DATASET
------------
Title: Dataset supporting the article 'Pairwise additivity and three-body contributions for Density Functional Theory-based protein-ligand binding energies.'
Creators: Mauricio Cafiero (ORCID: 0000-0002-4895-1783), Charlotte Schulze
Organisation: University of Reading
Rights-holders: University of Reading, Charlotte Schulze.
Publication Year: 2023
Description: All component data for calculating Density Functional Theory-based protein-ligand binding energies using three different approaches: the total energy, a sum of pairwise energies,
and a sum of pairwise and three-body energies. Data is also presented comparing three treatments of basis set superposition error: no corrections, global corrections, and local
corrections. Timings are also given for each type of correction. Finally, a selection of four-body energy terms is given as an example. All calculations were performed using Gaussian 16. Raw data for
the manuscript "Pairwise additivity and three-body contributions for Density Functional Theory-based protein-ligand binding energies" by Schulze and Cafiero.
Cite as: Cafiero, Mauricio and Schulze, Charlotte (2023): Dataset supporting the article 'Pairwise additivity and three-body contributions for Density Functional Theory-based protein-ligand binding energies.'
University of Reading. Dataset. https://doi.org/10.17864/1947.000511
Related publication: C. Schulze and M. Cafiero, 'Pairwise additivity and three-body contributions for Density Functional Theory-based protein-ligand binding energies.' Submitted,
ACS Chemical Theory and Computation.
Contact: m.cafiero@reading.ac.uk
2. TERMS OF USE
------------
Copyright 2023 University of Reading, Charlotte Schulze. This dataset is licensed under a Creative Commons Attribution 4.0 International Licence:
https://creativecommons.org/licenses/by/4.0/.
3. PROJECT AND FUNDING INFORMATION
------------
Title: A novel dynamic Density Functional Theory method for analysing multi-scale ligand/protein interactions
Dates: September 2022 - August 2023
Funding organisation: The Royal Society of Chemistry
Grant no.: Research Enablement Grant (E21-9051333819)
Title: Computational drug design for Parkinson's Disease treatments
Dates: July 2023 - September 2023
Funding organisation: DAAD RISE Worldwide
Grant no.: GB-CH_ME-5660
4. CONTENTS
------------
File listing
Pairwise-additivity-Cafiero-2023.xlsx
This file contains all component data for calculating Density Functional Theory-based protein-ligand binding energies using three different approaches: the total energy, a sum of pairwise energies,
and a sum of pairwise and three-body energies. Data is also presented comparing three treatments of basis set superposition error: no corrections, global corrections, and local
corrections. Timings are also given for each type of correction. Finally, a selection of four-body energy terms is given as an example. All calculations were performed using the Gaussian 16 software (www.Gaussian.com).
Tab Contents
S1. IEs Ligand-protein interaction energies (IEs) calculated in three ways with eighteen DFT methods. All values in kcal/mol.
S2. Total IEs Ligand-protein interaction energies (IEs) calculated in three ways with eighteen DFT methods. All values in kcal/mol.
S3. 2B + 3B terms Individual components for calculating the two- and three-body interaction energies for eighteen DFT methods. PCM solvent=water. All values in kcal/mol.
S4. PAIRS Individual components for calculating the two-body interaction energies (not including L-DOPA) needed for calculating the three-body interactions for eighteen DFT methods. PCM solvent=water. All values in kcal/mol.
S5. Full BSSE Comparison of full counterpoise corrections for basis set superposition error (BSSE) with 'local' counterpoise corrections and with no counterpoise correction, for a GGA, a meta-GGA and a global hybrid meta-GGA
DFT method. The basis set is aug-cc-pVDZ for BMK and tHCTH and 6-31G* for M06L. The final column is the core time for a full interaction energy calculation between L-DOPA and Phe142. Energy values in kcal/mol and time in minutes.
S6. 4 BODY Individual components for calculating the four-body interaction energies, including the three-body interactions needed, for eighteen DFT methods. PCM solvent=water. All values in kcal/mol.
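The tab names listed above can be checked against the workbook without opening it in a spreadsheet program. The following is a minimal sketch using only the Python standard library. It relies on the fact that an .xlsx file is a ZIP archive whose xl/workbook.xml part lists the worksheets, and it assumes the file uses the default (transitional) SpreadsheetML namespace. The function name list_tab_names is hypothetical and is not part of the dataset.

```python
import zipfile
import xml.etree.ElementTree as ET

# Default (transitional) SpreadsheetML namespace; an assumption about how the file was saved.
MAIN_NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def list_tab_names(xlsx_path):
    """Return the worksheet names in workbook order, read from xl/workbook.xml."""
    with zipfile.ZipFile(xlsx_path) as archive:
        root = ET.fromstring(archive.read("xl/workbook.xml"))
    return [sheet.get("name") for sheet in root.iter(MAIN_NS + "sheet")]


# Example call (requires the dataset file in the working directory):
# print(list_tab_names("Pairwise-additivity-Cafiero-2023.xlsx"))
```

The exact tab names in the file may differ slightly from the shortened labels in the table above, so it is safer to read them from the workbook than to hard-code them.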
Acronyms Variables
PCM Polarizable Continuum Model
GGA Generalized Gradient Approximation
HF Hartree-Fock
BP Binding pocket
BSSE Basis set superposition error
DFT Density Functional Theory
5. METHODS
-----------
The methods below are adapted from the manuscript named above, which has been submitted for review to the journal named above. All referenced figures and equations can be found in the manuscript.
The binding site for the SULT1A3 enzyme was extracted from the crystal structure (PDB ID: 2A3R) with dopamine bound. The binding site was defined as all amino acid residues
with an atom within 3 angstroms of the bound ligand, and included Ala148, Asp86, Glu146, His149, His108, Lys106, Phe142, Phe24, Phe81, and Pro47 (see Figure 2). All residues
were capped with an -OH or an -H in order to maintain the physiological charge. The bound ligand dopamine was modified into L-DOPA, and the structure was optimized using
BMK/cc-pVDZ. In this optimization, the N-C(alpha)-C backbone of each residue was fixed in order to maintain the overall structure of the binding site from the crystal structure,
and all other atoms were allowed to relax. The optimization included solvation by water using the polarizable continuum model (PCM). This optimized structure was used for all
subsequent calculations.
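The distance criterion used to define the binding site can be illustrated with a short sketch. This is not the script used to prepare the dataset. The function name, the toy coordinates, the residue label "Gly999", and the choice of an inclusive cutoff (<=) are all assumptions made for illustration.

```python
import math


def residues_within_cutoff(ligand_atoms, residue_atoms, cutoff=3.0):
    """Return names of residues with at least one atom within `cutoff` angstroms of any ligand atom.

    ligand_atoms:  list of (x, y, z) tuples in angstroms
    residue_atoms: dict mapping residue name -> list of (x, y, z) tuples in angstroms
    """
    selected = []
    for name, atoms in residue_atoms.items():
        # A residue qualifies as soon as one atom-atom distance is inside the cutoff.
        if any(math.dist(a, l) <= cutoff for a in atoms for l in ligand_atoms):
            selected.append(name)
    return selected


# Toy coordinates (hypothetical, not taken from the crystal structure).
ligand = [(0.0, 0.0, 0.0), (1.4, 0.0, 0.0)]
residues = {
    "Phe142": [(0.0, 2.5, 0.0), (0.0, 6.0, 0.0)],  # one atom 2.5 angstroms from the ligand
    "Gly999": [(10.0, 10.0, 10.0)],                # far from the ligand
}
print(residues_within_cutoff(ligand, residues))  # ['Phe142']
```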
Several “families” of DFT methods were used in this study: HCTH, tHCTH, and tHCTHhyb, along with the related BMK functional; BLYP, B3LYP, CAM-B3LYP, and
the empirical dispersion corrected CAM-B3LYP-D3; M06L, M06, M062X, and the empirical dispersion-corrected M062X-D3, along with the related MN12SX functional; PBE,
PBE1PBE, LC-wHPBE, and the related TPSS functional. The SVWN functional and the Hartree-Fock (HF) method were also tested for comparison. All energy calculations were
performed with the aug-cc-pVDZ basis set, except the M06L calculations, which were run with the 6-31G* basis set for comparison with the work of Ukisik et al.
The total interaction energy calculations (EQ. 1 in the manuscript) were performed with the eighteen DFT methods described above and the HF method. Counterpoise (CP) corrections were
applied as in Figure 1a in the manuscript, wherein the energy calculations for the binding site included ghost atoms from the ligand, and the energy calculation for the ligand included ghost atoms
from the entire binding site.
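As an illustration of how the tabulated components combine, the sketch below assumes the standard supermolecular form of a counterpoise-corrected interaction energy: the energy of the complex minus the energies of the two fragments, each computed in the presence of the other's ghost atoms. The exact form of EQ. 1 is given in the manuscript and is not reproduced here. The function name and the input energies are made up, and the conversion from hartree to kcal/mol is only needed if starting from energies in hartree.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # standard conversion factor


def cp_interaction_energy(e_complex, e_site_ghost_ligand, e_ligand_ghost_site):
    """Counterpoise-corrected interaction energy in kcal/mol; all inputs in hartree.

    e_complex:            energy of binding site + ligand
    e_site_ghost_ligand:  energy of the binding site with ghost atoms on the ligand
    e_ligand_ghost_site:  energy of the ligand with ghost atoms on the binding site
    """
    delta = e_complex - e_site_ghost_ligand - e_ligand_ghost_site
    return delta * HARTREE_TO_KCAL_PER_MOL


# Made-up energies for illustration only (not values from the dataset).
ie = cp_interaction_energy(-100.050, -60.020, -40.010)
print(round(ie, 2))  # -12.55
```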
The two-body energy calculations (EQ. 2) were performed with the eighteen DFT methods described above and the HF method. Global CP corrections were applied as in Figure 1b
for three sample series (BMK/aug-cc-pVDZ, tHCTH/aug-cc-pVDZ, and M06L/6-31G*), wherein energy calculations for the ligand included ghost atoms from the entire binding site, and energy
calculations for each amino acid residue included ghost atoms on the ligand and the other nine residues. Local CP corrections were applied as in Figure 1c for all eighteen DFT methods
and HF, wherein energy calculations for the ligand included ghost atoms from the i-th amino acid residue only, and energy calculations for the i-th amino acid residue included ghost
atoms on the ligand only. Ten two-body terms were calculated per DFT method.
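The difference between the global and local CP schemes for the two-body terms can be summarised as a small lookup of which fragments contribute ghost atoms to each monomer calculation. The sketch below is bookkeeping only and does not generate Gaussian input. The function name ghost_sets and the scheme labels "global" and "local" are hypothetical.

```python
RESIDUES = ["Ala148", "Asp86", "Glu146", "His149", "His108",
            "Lys106", "Phe142", "Phe24", "Phe81", "Pro47"]
LIGAND = "L-DOPA"


def ghost_sets(residue, scheme):
    """Return {fragment: list of fragments supplying ghost atoms} for one ligand-residue pair."""
    if scheme == "local":
        # Each monomer sees ghost atoms from its pair partner only.
        return {LIGAND: [residue], residue: [LIGAND]}
    if scheme == "global":
        # Each monomer sees ghost atoms from everything else in the binding site.
        others = [r for r in RESIDUES if r != residue]
        return {LIGAND: list(RESIDUES), residue: [LIGAND] + others}
    raise ValueError(f"unknown scheme: {scheme!r}")


for scheme in ("local", "global"):
    sets = ghost_sets("Phe142", scheme)
    print(scheme, {frag: len(ghosts) for frag, ghosts in sets.items()})
```

With ten residues, the local scheme gives one ghost fragment per monomer, while the global scheme gives ten, which is consistent with the longer timings reported for global corrections.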
The three-body energy calculations (EQ. 3) were performed with the eighteen DFT methods described above and the HF method. Local CP corrections were applied as in Figure 1d
for all DFT methods, wherein energy calculations for the ligand included ghost atoms from the i-th and j-th amino acid residues, the energy calculations for the i-th amino acid
residue included ghost atoms on the ligand and j-th residue, and the energy calculations for the j-th amino acid residue included ghost atoms on the ligand and i-th residue. The
three-body calculations included 45 three-body energies and 45 two-body energies (none including the ligand) per DFT method.
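The count of 45 follows from the number of distinct residue pairs among ten residues. The sketch below enumerates those pairs and shows one common way to isolate a non-additive three-body contribution under a standard many-body expansion: the interaction energy of the ligand-residue-residue trimer minus its three pairwise interaction energies. Whether this matches EQ. 3 term for term should be checked against the manuscript. The function name and the numerical inputs are made up.

```python
from itertools import combinations

N_RESIDUES = 10  # residues in the binding site


def three_body_term(ie_lij, ie_li, ie_lj, ie_ij):
    """Non-additive three-body contribution (same units as the inputs).

    ie_lij: interaction energy of the ligand + residue i + residue j trimer
    ie_li:  ligand-residue i pair interaction energy
    ie_lj:  ligand-residue j pair interaction energy
    ie_ij:  residue i-residue j pair interaction energy (no ligand)
    """
    return ie_lij - ie_li - ie_lj - ie_ij


# Every unordered residue pair (i, j) gives one trimer and one ligand-free pair.
pairs = list(combinations(range(N_RESIDUES), 2))
print(len(pairs))  # 45

# Made-up values in kcal/mol, for illustration only.
print(three_body_term(-20.0, -8.0, -9.5, -1.5))  # -1.0
```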
The four-body energy calculations (EQ. 3) were performed with BMK/aug-cc-pVDZ. Local CP corrections were applied in analogy to the three-body calculations shown in Figure 1d,
wherein energy calculations for the ligand included ghost atoms from the i-th, j-th and k-th amino acid residues, the energy calculations for the i-th amino acid residue included
ghost atoms on the ligand and j-th and k-th residues, the energy calculations for the j-th amino acid residue included ghost atoms on the ligand and i-th and k-th residues, and the
energy calculations for the k-th amino acid residue included ghost atoms on the ligand and i-th and j-th residues. Only two sample four-body terms have been calculated as examples.
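A quick count shows why only sample four-body terms are practical. Assuming each n-body term contains the ligand plus n-1 of the ten residues, the number of terms at each order is a binomial coefficient. The function name n_terms is hypothetical.

```python
from math import comb

N_RESIDUES = 10  # residues in the binding site


def n_terms(order, n_residues=N_RESIDUES):
    """Number of `order`-body terms made of the ligand plus (order - 1) residues."""
    return comb(n_residues, order - 1)


for order in (2, 3, 4):
    print(order, n_terms(order))
# 2 10
# 3 45
# 4 120
```

A complete four-body treatment would therefore need 120 tetramer calculations per method, each with its own set of ghost-atom monomer and sub-cluster calculations.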
All calculations above were performed with the Gaussian 16 software.