Protein Partnerships Unlocked by Measuring Surface ‘chaos’ in Molecular Models

Scientists are increasingly focused on understanding how proteins and peptides bind, a central challenge in biophysics. Tyler Grear and Donald J. Jacobs, from the Department of Physics and Optical Science at the University of North Carolina at Charlotte, alongside co-authors, present a novel approach using peripheral surface information (PSI) entropy to quantify the statistical variability of non-interacting surface proportions during protein-peptide complex formation. Their research demonstrates that favourable binding events correlate with low-entropy states, revealing preferential surface configurations and suggesting an underlying principle governing molecular recognition. By combining computational modelling with experimental data from WW domains, the team established PSI entropy as a valuable thermoinformatic descriptor, potentially offering new insights into the evolutionary pressures shaping protein-peptide interactions and accelerating the design of novel binding molecules.

Nonlocal effects define low-entropy states in protein-peptide binding landscapes, promoting specificity and affinity

Reliably identifying favorable protein-peptide binding events remains a central challenge in biophysics, with continued uncertainty surrounding how nonlocal effects shape the global energy landscape. We introduce peripheral surface information (PSI) entropy, SΨ, a quantitative measure of the statistical variability in apolar and charged non-interacting surface (NIS) proportions across conformational ensembles.
Using energy-directed molecular docking via HADDOCK3 and explicit-solvent molecular dynamics simulations, we demonstrate that favorable binding partners exhibit emergent, low-entropy N-states (discrete macrostates in NIS state space) indicative of preferential apolar/charged surface configurations. Across dozens of peptides and multiple receptor systems (WW, PDZ, and MDM2 domains), dominant N-states persisted under varied docking parameters and initial conditions.

An experimental meta-ensemble of WW domains from 36 high-resolution structures confirmed the presence of dominant NIS modes independent of in silico methodology, suggesting an evolutionary selection pressure toward specific NIS fingerprints. These findings establish SΨ as a thermoinformatic descriptor that encodes favorable binding constraints into unique statistical signatures of the NIS.

The ability to reliably identify favorable protein-peptide binding events remains an unsolved problem in biophysics with implications for bioengineering, drug discovery, and our fundamental understanding of biomolecular recognition. We introduce peripheral surface information (PSI) entropy, SΨ, a novel thermoinformatic measure that captures emergent organization of apolar and charged non-interacting surface regions across conformational ensembles.

Uniting structural, energetic, and evolutionary perspectives, PSI entropy reveals conserved surface configurations that were confirmed in 36 experimentally characterized complexes of the WW-domain model system, further supporting the conclusion that the observed phenomenon is independent of in silico docking methodology. This work establishes a quantitative framework for exploiting peripheral surface organization in predictive modeling, enabling new strategies for active-site characterization and targeted peptide engineering.

Ideas that the non-interacting surface (NIS) during protein-protein association can contribute, in a non-negligible manner, to binding affinity calculations have progressed over the last few decades. They were largely debated within the context of allosteric effects and entropy-driven solvent reorganization.

In the mid-2000s, researchers began to discuss the possibility of an extended interface, asking whether nonlocal interactions can meaningfully affect binding strength, specificity, or both. Volumetric and calorimetric measurements, together with nuclear magnetic resonance (NMR) spectroscopy, have enabled quantitative monitoring of these nonlocal influences on binding mechanisms.

Additionally, changes in heat capacity upon formation of a complex can provide information on nonlocal effects. In systems governed by complex weak interactions, enthalpy-entropy compensation can limit the discriminative value of ΔG-based descriptors. Free-energy functions that are overly concentrated on the binding interface often miss the global thermodynamic context provided by the environment.

Here, the heterogeneity of free-energy contributions is related to information entropy, establishing that conformations corresponding to favorable energetics can be characterized by unique statistical signatures that are encoded on the peripheral surface. Since the mid-2010s, researchers have called for quantitative descriptions of NIS characteristics in order to build a global model of binding affinity, seeking an Archimedean point for the problem of free-energy determination.

Furthermore, new entropic considerations are expected for flexible complexes, where the correlation of buried surface area (BSA) and binding affinity does not hold. This demands new global metrics derived from entire conformational ensembles rather than representative binding poses or energy profiles.

It has been shown that the percentages of apolar and charged NIS (denoted Na and Nc, respectively) exhibit significant correlations with binding affinity, and NIS effects have been verified through alanine scanning. It has also been observed that the proportions of Na and Nc are conserved over orthologous complexes, indicating an evolutionary selection pressure.

Assuming evolution has performed the functional optimization a priori, an effective peptide engineering protocol should harness the fingerprints that emerge from this selective pressure. A new information-theoretic entropy is proposed that encodes NIS properties over ensembles of molecular complexes consistent with favorable global binding conditions.

This emergent peripheral surface information (PSI) entropy is also verified to exist independently of in silico methods, using ensembles of experimentally resolved molecular complexes. For each complex, the per-residue solvent accessible surface area (SASA) was converted to relative solvent accessibility (RSA) using NACCESS residue maximum ASA values; see Supporting Material Table S1 for the full set of ASAmax values.

Residues with RSA ≥ 0.05 were assigned to three chemical classes: apolar (A), charged (C), and polar (P). Interface residues were identified geometrically by a 5.0 Å heavy-atom distance cutoff between receptor Chain A and partner Chain B, and all such interacting residues were excluded. The remaining integer tuple of counts (nA, nC, nP) defines a macrostate label for each microstate, with NIS apolar and charged proportions denoted Na and Nc, respectively.
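The labelling rule above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the `Residue` record, the function names, and in particular the `RESIDUE_CLASS` assignment are assumptions, since the article does not list which residues belong to each class. The reference maximum ASA values (Table S1 in the source) are not reproduced here and must be supplied by the caller.

```python
from typing import Dict, Iterable, NamedTuple, Tuple

# ASSUMPTION: illustrative class assignment (A = apolar, C = charged, P = polar).
# The article does not state the mapping; swap in the one you actually use.
RESIDUE_CLASS: Dict[str, str] = {
    "ALA": "A", "CYS": "A", "GLY": "A", "PHE": "A", "ILE": "A",
    "LEU": "A", "MET": "A", "PRO": "A", "TRP": "A", "VAL": "A",
    "ASP": "C", "GLU": "C", "LYS": "C", "ARG": "C",
    "ASN": "P", "GLN": "P", "SER": "P", "THR": "P", "HIS": "P", "TYR": "P",
}

RSA_CUTOFF = 0.05  # residues with RSA >= 0.05 are counted (from the text)


class Residue(NamedTuple):
    """Hypothetical per-residue record for one docked complex (microstate)."""
    name: str           # three-letter residue code
    sasa: float         # solvent accessible surface area, in Å^2
    asa_max: float      # reference maximum ASA for this residue type, in Å^2
    at_interface: bool  # True if the residue was flagged as an interface residue


def nis_macrostate(residues: Iterable[Residue]) -> Tuple[int, int, int]:
    """Return the integer macrostate label (nA, nC, nP) for one microstate."""
    counts = {"A": 0, "C": 0, "P": 0}
    for res in residues:
        if res.at_interface:
            continue  # interface residues are excluded from the NIS
        if res.sasa / res.asa_max < RSA_CUTOFF:
            continue  # buried residues do not count toward the surface
        counts[RESIDUE_CLASS[res.name]] += 1
    return counts["A"], counts["C"], counts["P"]


def nis_proportions(label: Tuple[int, int, int]) -> Tuple[float, float]:
    """Return (Na, Nc), the apolar and charged fractions of the NIS."""
    n_a, n_c, n_p = label
    total = n_a + n_c + n_p
    if total == 0:
        raise ValueError("no non-interacting surface residues in this microstate")
    return n_a / total, n_c / total
```

The functions return fractions; multiply by 100 if percentages are preferred.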

Because the three chemical-class fractions are conserved (they sum to 1), the polar component was considered redundant once nA and nC were specified. The docking software used to generate the results was HADDOCK 3.0 (HADDOCK3). This method was selected due to its modular and open-source framework allowing for customization of parameters/energy functions, providing maximal control over conformational ensemble generation.

The standard HADDOCK3 pipeline consists of 3 steps: 1) rigid-body docking; 2) semi-flexible refinement in torsion angle space; and 3) molecular dynamics simulation (MDS) in explicit solvent. Parameters were initially set to the defaults unless explicitly stated in this description. Corresponding to the steps above, there are 3 energy-based HADDOCK scoring (HS) functions given by

\[
\mathrm{HS} =
\begin{cases}
0.01\,E_{\mathrm{vdw}} + 1.0\,E_{\mathrm{elec}} + 1.0\,E_{\mathrm{desolv}} + 0.01\,E_{\mathrm{AIR}} - 0.01\,\mathrm{BSA}, & \text{Step 1} \\
1.0\,E_{\mathrm{vdw}} + 1.0\,E_{\mathrm{elec}} + 1.0\,E_{\mathrm{desolv}} + 0.1\,E_{\mathrm{AIR}} - 0.01\,\mathrm{BSA}, & \text{Step 2} \\
1.0\,E_{\mathrm{vdw}} + 0.2\,E_{\mathrm{elec}} + 1.0\,E_{\mathrm{desolv}} + 0.1\,E_{\mathrm{AIR}}, & \text{Step 3}
\end{cases}
\tag{1}
\]

where Evdw and Eelec represent non-bonded van der Waals and Coulomb intermolecular energies, respectively.

Edesolv is an empirical desolvation term, BSA is the buried surface area upon complexation in Å², and EAIR is the restraint violation energy, which expresses the agreement between experimental and back-calculated data. All energies were computed using the optimized potentials for liquid simulations (OPLS) force field, reported in units of kJ/mol.
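To make the stage-dependent weighting concrete, the three scoring functions can be written as one small Python function. This is only a sketch of the arithmetic using the weights quoted in the text; it is not HADDOCK3's own API, and the names `HS_WEIGHTS` and `haddock_score` are hypothetical.

```python
from typing import Dict

# Weights for each pipeline step, as quoted in the text.
# Step 3 has no BSA term, so its BSA weight is set to 0.0.
HS_WEIGHTS: Dict[int, Dict[str, float]] = {
    1: {"vdw": 0.01, "elec": 1.0, "desolv": 1.0, "air": 0.01, "bsa": -0.01},
    2: {"vdw": 1.0, "elec": 1.0, "desolv": 1.0, "air": 0.1, "bsa": -0.01},
    3: {"vdw": 1.0, "elec": 0.2, "desolv": 1.0, "air": 0.1, "bsa": 0.0},
}


def haddock_score(step: int, e_vdw: float, e_elec: float, e_desolv: float,
                  e_air: float, bsa: float) -> float:
    """Weighted sum of energy terms (and BSA in Å^2) for step 1, 2, or 3."""
    if step not in HS_WEIGHTS:
        raise ValueError("step must be 1, 2, or 3")
    w = HS_WEIGHTS[step]
    return (w["vdw"] * e_vdw + w["elec"] * e_elec + w["desolv"] * e_desolv
            + w["air"] * e_air + w["bsa"] * bsa)
```

Note how the same pose can score very differently between steps: electrostatics dominates the rigid-body step, while van der Waals terms carry full weight from refinement onward.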

For the purpose of establishing the methods, the exemplar model system was the WW domain (PDB ID 2LTW, experimental method NMR), a small, highly flexible 36-residue protein. The exemplar ensemble contained Ω = 227 microstates that partitioned into N = 67 distinct macrostates, each with multiplicity g(Na, Nc) (indicated by color in the original figure).

The (Na, Nc) coordinates were used for visualization, while the full (nA, nC, nP) tuple distinguishes macrostates. The existence of a non-uniform occupancy in N-space indicates that, during an energy-directed search for bound conformations, a preferential subset of peripheral NIS patterns was populated.

Introduced by Claude Shannon in 1948, information entropy quantifies the amount of information contained in a message or the unpredictability (variability) of a system. For a system with a finite, countable number of microstates, Ω, Shannon’s information entropy is written as $S = -K \sum_{i} p_i \log_2(p_i)$, where K is a positive scaling (normalization) factor.

Choosing K = 1 gives the Shannon entropy in binary digits (bits). In this work, the unnormalized macrostate entropy is first defined in bits, followed by the introduction of a physically motivated K from contact statistics. Each docking ensemble populates a discrete set of N-states (macrostates) in N-space.
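As a quick sense of scale, the entropy in bits is bounded by the number of populated macrostates. The worked bound below assumes K = 1 and uses the N = 67 macrostates reported for the exemplar WW ensemble. The lower limit corresponds to every microstate falling in a single macrostate, and the upper limit to uniform occupancy.

```latex
% Assumes K = 1 (bits) and N = 67 macrostates, as reported for the exemplar ensemble.
0 \;\le\; S \;\le\; \log_2 N,
\qquad
\log_2 67 \approx 6.07\ \text{bits}
```

Any measured value well below about 6.07 bits for this ensemble would therefore reflect a non-uniform, preferential occupancy of macrostates.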

Let Ni be the i-th macrostate in N-space with coordinates (Na, Nc), and take pi as the empirical probability g(Ni)/Ω, where g(Ni) counts how many microstates belong to that macrostate. This mapping of binding events to macrostate-level N-space allows for a peripheral surface information (PSI) entropy, SΨ, to be expressed as

\[
S'_{\Psi} = -K \sum_{i=1}^{N} \frac{g(N_i)}{\Omega} \log_2 \frac{g(N_i)}{\Omega}
\tag{2}
\]

where the prime notation indicates the intermediate unnormalized expression.
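The unnormalized entropy is straightforward to compute once every microstate carries a macrostate label. The sketch below is an illustration under stated assumptions rather than the authors' implementation: the function name `psi_entropy_bits` is hypothetical, labels may be any hashable value such as an (nA, nC, nP) tuple, and the physically motivated K from contact statistics is not reproduced, so `k` defaults to 1 (bits).

```python
import math
from collections import Counter
from typing import Hashable, Iterable


def psi_entropy_bits(labels: Iterable[Hashable], k: float = 1.0) -> float:
    """Unnormalized PSI entropy: -k * sum_i p_i * log2(p_i), p_i = g_i / omega.

    `labels` holds one macrostate label per microstate in the ensemble.
    """
    multiplicity = Counter(labels)      # g(N_i) for each populated macrostate
    omega = sum(multiplicity.values())  # total number of microstates
    if omega == 0:
        raise ValueError("ensemble is empty")
    return -k * sum((g / omega) * math.log2(g / omega)
                    for g in multiplicity.values())
```

Only populated macrostates appear in the counter, so no zero-probability terms enter the sum. An ensemble concentrated in one macrostate gives zero entropy, and the value grows as occupancy spreads out.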

S′Ψ serves as a descriptor of the ensemble’s statistical state rather than a direct component of the classical Gibbs free energy. The guiding hypothesis is that energetically favorable recognition is accompanied by reduced variability of peripheral NIS macrostates, and therefore a lower measured PSI entropy. This was treated as a working assumption rather than an a priori guarantee.

Calculation of peripheral surface information macrostates and exclusion of interface residues are critical for accurate modeling

Peripheral surface information (PSI) entropy was quantified by first calculating per-residue solvent accessible surface area using the NACCESS program, converting these values to relative solvent accessibility. Residues exhibiting relative solvent accessibility greater than or equal to 0.05 were then classified into three chemical categories: apolar, charged, and polar, with specific maximum ASA values detailed in Supporting Material Table S1.

Interface residues were rigorously defined using a geometric criterion of a 5.0 Å heavy-atom distance cutoff between receptor Chain A and the partner Chain B, ensuring these interacting residues were excluded from subsequent analysis. Each remaining microstate was characterised by an integer tuple representing the counts of apolar, charged, and polar residues, defining a macrostate label.
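The geometric interface criterion can be illustrated with a short, dependency-free Python sketch. The data layout (a dictionary from residue number to heavy-atom coordinates) and the function name `interface_residues` are assumptions made for illustration. Whether the cutoff is inclusive is not stated in the source, so `<=` is used here as a further assumption.

```python
import math
from typing import Dict, List, Set, Tuple

Coord = Tuple[float, float, float]
# Hypothetical layout: residue number -> heavy-atom coordinates in Å
Chain = Dict[int, List[Coord]]

CUTOFF = 5.0  # Å, heavy-atom distance cutoff quoted in the text


def interface_residues(chain_a: Chain, chain_b: Chain,
                       cutoff: float = CUTOFF) -> Tuple[Set[int], Set[int]]:
    """Return the residues of each chain that have at least one heavy atom
    within `cutoff` of any heavy atom of the other chain."""
    near_a: Set[int] = set()
    near_b: Set[int] = set()
    for res_a, atoms_a in chain_a.items():
        for res_b, atoms_b in chain_b.items():
            # ASSUMPTION: the cutoff is treated as inclusive.
            if any(math.dist(p, q) <= cutoff for p in atoms_a for q in atoms_b):
                near_a.add(res_a)
                near_b.add(res_b)
    return near_a, near_b
```

This brute-force pairwise search is adequate for small systems such as a 36-residue domain and a short peptide; larger complexes would normally use a spatial index instead.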

Since the sum of these three fractions always equals one, the polar component was deemed redundant, simplifying the analysis to focus on the apolar and charged proportions, denoted as Na and Nc respectively. This allowed for the calculation of PSI entropy, a novel thermoinformatic measure capturing the statistical variability of these non-interacting surface regions across conformational ensembles.

Energy-directed molecular docking was performed using HADDOCK3, followed by explicit-solvent molecular dynamics simulations to generate conformational ensembles for dozens of peptides and receptor systems including WW, PDZ, and MDM2 domains. Dominant N-states, representing discrete macrostates within the NIS state space, consistently persisted across varied docking parameters and initial conditions, demonstrating robustness of the observed phenomenon. Furthermore, an experimental meta-ensemble comprising 36 high-resolution WW domain structures independently confirmed the presence of these dominant NIS modes, suggesting an evolutionary pressure towards specific surface fingerprints.

WW domain macrostate definition via apolar and charged surface partitioning reveals conformational ensembles

Across dozens of peptides and receptor systems, including WW, PDZ, and MDM2 domains, dominant N-states persisted despite varied docking parameters and initial conditions. Analysis of the WW domain system revealed that 227 microstates assembled into 67 distinct macrostates when defining N-space based on apolar and charged non-interacting surface proportions.

The multiplicity function, quantifying microstates within each macrostate, demonstrated a structured occupancy pattern across the generated ensembles. Peripheral surface information entropy, a new thermoinformatic descriptor, was calculated using Shannon’s information entropy formula, initially in bits before incorporating a physically motivated scaling factor derived from contact statistics.

The research established a method for defining macrostates using integer tuples of apolar (nA), charged (nC), and polar (nP) residue counts on the non-interacting surface, excluding residues within a 5.0 Å distance of the receptor-partner interface. The WW domain exemplar system, utilizing the PDB ID 2LTW structure, generated 3000 rigid-body conformations, retaining the top 400 for subsequent refinement.

Following semi-flexible refinement and iRMSD filtering at 10 Å, the ensemble was reduced to 227 complexes, revealing the emergent organization of NIS patterns. Dual energy-distribution plots further characterized the interfacial energetics of the ensembles. The unnormalized macrostate entropy, S′Ψ, was expressed as the negative sum, over macrostates, of each macrostate probability multiplied by its base-2 logarithm, providing a descriptor of the ensemble's statistical state.
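The two selection steps described above (keep the best-scoring poses, then filter by iRMSD) can be sketched as follows. This is a hedged illustration only: the `Pose` record and function names are hypothetical, the sketch assumes that a lower score is better and that "filtering at 10 Å" means keeping poses with iRMSD of at most 10 Å, and the reference structure used for iRMSD is not specified in the article.

```python
from typing import List, NamedTuple


class Pose(NamedTuple):
    """Hypothetical record for one docked conformation."""
    pose_id: str
    score: float  # docking score; ASSUMPTION: lower is better
    irmsd: float  # interface RMSD in Å, relative to an unspecified reference


def select_top(poses: List[Pose], k: int) -> List[Pose]:
    """Keep the k best-scoring poses (e.g. the top 400 of 3000)."""
    return sorted(poses, key=lambda p: p.score)[:k]


def filter_irmsd(poses: List[Pose], max_irmsd: float = 10.0) -> List[Pose]:
    """Keep poses whose iRMSD does not exceed the threshold."""
    return [p for p in poses if p.irmsd <= max_irmsd]
```

In the reported workflow the iRMSD filter is applied after semi-flexible refinement, so the `irmsd` values would come from the refined poses rather than the rigid-body ones.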

The study hypothesized that energetically favorable recognition events correlate with reduced variability in peripheral NIS macrostates, leading to lower measured PSI entropy, and this was investigated through the analysis of the generated ensembles. An experimental meta-ensemble of WW domains, comprising 36 high-resolution structures, confirmed the presence of dominant NIS modes independent of computational methodology, suggesting evolutionary selection pressure towards specific NIS fingerprints.

Dominant low-entropy surface states correlate with peptide binding affinity and stability

Scientists have established a new thermoinformatic descriptor, termed peripheral surface information (PSI) entropy, to quantify statistical variability in the proportions of apolar and charged non-interacting surfaces across conformational ensembles of peptides. Investigations utilising energy-directed molecular docking and explicit-solvent molecular dynamics simulations demonstrate that peptides with favourable binding affinities exhibit emergent, low-entropy states indicative of preferred apolar and charged surface configurations.

These low-entropy states, or N-states, represent discrete macrostates within the space of non-interacting surface configurations and were observed to persist across diverse docking parameters and initial conditions. Across multiple receptor systems, including WW, PDZ, and MDM2 domains, and dozens of peptides, a consistent pattern emerged where dominant N-states were present.

An analysis of WW domains, utilising experimental data from 36 high-resolution structures, confirmed the existence of these dominant non-interacting surface modes independent of computational methods, suggesting a potential evolutionary pressure towards specific surface fingerprints. Further validation involved a cross-fertilization procedure, where proper and improper peptide candidates were docked to different protein receptors, consistently revealing lower PSI entropy for favourable binding partners.

The authors acknowledge that ensemble sizes and macrostate counts varied between replicas, introducing some measurement uncertainty, but the observed ordering of PSI entropy, lower for proper candidates, remained consistent. These findings establish PSI entropy as a valuable tool for characterising binding constraints and identifying unique statistical signatures on non-interacting surfaces.

The consistent identification of dominant N-states suggests a fundamental principle governing favourable peptide binding, potentially informing future drug design and protein engineering efforts. Future research could focus on expanding the application of PSI entropy to more complex systems and exploring its relationship to other biophysical properties of protein-peptide interactions.

👉 More information
🗞 Harnessing the Peripheral Surface Information Entropy from Globular Protein-Peptide Complexes
🧠 ArXiv: https://arxiv.org/abs/2602.00498

Rohail T.
