What is PDBcor?

PDBcor is an automated and unbiased method for detecting and analyzing correlated motions in experimental multi-state protein structures. It relies on torsion angle and distance statistics and does not require any structure superposition. Protein conformers are clustered, and correlations are then extracted as the mutual information between the resulting clusterings. Correlations extracted with PDBcor can be used in subsequent assays, including NMR multi-state structure optimization and validation. Further information is available in the reference publication.

How does PDBcor work?

An input structure bundle (the supplied PDB file, shown as the gray protein ensemble on the left of Figure 1) is first subjected to significance thresholding, which filters out spurious, insignificant correlations. In the illustrative example (second panel from the left), conformers exist in two states and are shown as black points in a scatter plot of two arbitrary distances (for example, the first is the distance between residues X and Y and the second is the distance between residues X and Z). As expected for a protein existing in two states, the distances center around two state-specific values (shown as two gray clusters). During significance thresholding, random displacement of atoms broadens the edges of the states, so that states separated by less than the amplitude of the noise lose their separation (third panel from the left). Next, inter-residue distances are used to cluster the conformers for each residue with a Gaussian mixture model (GMM). In the illustrative example, conformers are clustered according to the distances from residue X to residues Y and Z (second panel from the right). In PDBcor itself, the distances from the selected residue X to all other residues are considered for the clustering. Repeating this procedure for each residue yields a set of N clustering vectors. Finally, a pairwise comparison of the resulting clustering vectors based on their mutual information yields an interpretable correlation matrix (panel on the right).

Figure 1. Graphical abstract presenting the PDBcor data analysis workflow.
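The workflow above can be illustrated with a small, self-contained sketch. It is not PDBcor's implementation: it assumes exactly two states, clusters on a single distance per residue instead of the distances to all other residues, and uses a hand-written one-dimensional EM fit in place of a full GMM library. The function names (`add_noise`, `fit_gmm_1d`, `mutual_information`) and all numbers are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter


def add_noise(values, amplitude, rng):
    """Displace each value by Gaussian noise with standard deviation `amplitude`."""
    return [v + rng.gauss(0.0, amplitude) for v in values]


def fit_gmm_1d(values, n_iter=50):
    """Fit a two-component 1D Gaussian mixture by EM; return a hard label per value."""
    lo, hi = min(values), max(values)
    mu = [lo, hi]                                  # deterministic initial means
    var = [max((hi - lo) ** 2 / 16.0, 1e-6)] * 2   # broad initial variances
    weight = [0.5, 0.5]
    resp = [0.5] * len(values)
    for _ in range(n_iter):
        # E-step: posterior probability that each value belongs to component 1
        resp = []
        for v in values:
            p = [weight[k] / math.sqrt(2 * math.pi * var[k])
                 * math.exp(-(v - mu[k]) ** 2 / (2 * var[k])) for k in (0, 1)]
            total = p[0] + p[1]
            resp.append(p[1] / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and variance of each component
        for k in (0, 1):
            r = resp if k == 1 else [1.0 - x for x in resp]
            n_k = sum(r)
            if n_k < 1e-9:
                continue                           # empty component: keep old parameters
            mu[k] = sum(ri * v for ri, v in zip(r, values)) / n_k
            var[k] = max(sum(ri * (v - mu[k]) ** 2 for ri, v in zip(r, values)) / n_k, 1e-6)
            weight[k] = n_k / len(values)
    return [1 if x > 0.5 else 0 for x in resp]


def mutual_information(labels_a, labels_b):
    """Mutual information (in nats) between two clustering vectors of equal length."""
    n = len(labels_a)
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    count_ab = Counter(zip(labels_a, labels_b))
    return sum(c / n * math.log(c * n / (count_a[a] * count_b[b]))
               for (a, b), c in count_ab.items())


# Hypothetical ensemble: 40 conformers alternating between two states.
# One distance per residue stands in for "distances to all other residues".
rng = random.Random(0)
states = [i % 2 for i in range(40)]
dist_x = [5.0 + 4.0 * s for s in states]   # residue X follows the state
dist_y = [3.0 + 3.0 * s for s in states]   # residue Y follows the same state

for amplitude in (0.2, 5.0):               # small vs. large noise amplitude
    labels_x = fit_gmm_1d(add_noise(dist_x, amplitude, rng))
    labels_y = fit_gmm_1d(add_noise(dist_y, amplitude, rng))
    print(amplitude, round(mutual_information(labels_x, labels_y), 3))
```

Mutual information depends only on how the two label vectors co-vary, not on which cluster happens to be called 0 or 1, which is why clustering vectors obtained independently for each residue can be compared directly.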

What can PDBcor results indicate?

The correlation matrix highlights statistically significant correlations that stand out from the background. There are two types of correlation matrix: the distance correlation matrix, which takes distances as input, and the angular correlation matrix, which takes dihedral angles as input. Both analyses can detect correlated motion, but distance correlation extraction is more sensitive to protein motion. As a guide for interpreting PDBcor results, Figure 2 shows a series of protein structure ensembles that exhibit different levels of correlation: non-correlated, locally correlated, and globally correlated.

Figure 2. Proteins at different levels of structural correlation: non-correlated (a), locally correlated (b), and globally correlated (c) ensembles, with their respective distance correlation matrix heatmaps (d, e, f).
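To make the difference between the two input types concrete, the sketch below computes one distance and one dihedral angle from Cartesian coordinates. It is only an illustration: the helper names (`distance`, `dihedral`) and the coordinates are made up, and this section does not specify which atoms PDBcor uses for either quantity. The sign of the dihedral depends on the convention chosen; the usual atan2 formulation is used here.

```python
import math


def sub(a, b):
    """Component-wise difference a - b of two 3D points."""
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def distance(p, q):
    """Euclidean distance between two points (input type of the distance matrix)."""
    d = sub(p, q)
    return math.sqrt(dot(d, d))


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points (input type of the angular matrix)."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)      # normals of the two planes
    x = dot(n1, n2)
    y = math.sqrt(dot(b2, b2)) * dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Made-up coordinates for four consecutive atoms in two conformers.
conformer_a = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)]
conformer_b = [(1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)]

for name, atoms in (("a", conformer_a), ("b", conformer_b)):
    print(name, distance(atoms[0], atoms[3]), dihedral(*atoms))
```

In a distance-based analysis each conformer would contribute values such as the first printed number, and in an angular analysis values such as the second; the clustering and mutual information steps are then the same for both.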