A. Because we exclude low-resolution structures, EncoMPASS currently describes >65% of the membrane protein structures contained in the PDB.

A. Chain A is a sequence neighbor of chain B if they contain a similar number of transmembrane segments and the sequence identity based on the sequence alignment between A and B is ≥ 0.85. Likewise, chain A is a structure neighbor of chain B if they contain a similar number of transmembrane segments and the TM-score of the structure alignment between A and B is ≥ 0.6. Note that the sequence and structure alignments are not symmetric procedures and that the TM-score is not a symmetric operator, so the fact that chain A is a neighbor of chain B does not guarantee that chain B is a neighbor of chain A (see below).

A. Structural relationship is assessed using a rigid cutoff: two chains are structurally related if their structural alignment has a TM-score ≥ 0.6. The TM-score is not a symmetric operator, meaning that it does not have the property TM-score(A,B) = TM-score(B,A). Thus, it is possible that protein A is considered structurally related to protein B, but not the opposite. The user can redefine the similarity threshold as needed, yet this stands as a reminder of the kind of inconsistencies that can be encountered when using fixed cutoffs with an asymptotically correct estimator. Moreover, neither the sequence nor the structure alignments are performed via symmetric algorithms. Thus, the alignment of chain A on chain B is in general not the same as the alignment of chain B on chain A.
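As a minimal sketch of how a fixed cutoff interacts with an asymmetric score, the Python snippet below applies the 0.6 threshold to a pair of directed TM-scores. The two score values and the helper name `directed_neighbors` are invented for illustration and are not taken from EncoMPASS; the check on the number of transmembrane segments is omitted.

```python
TM_SCORE_CUTOFF = 0.6  # structure-neighbor threshold quoted in the text

# Hypothetical directed scores: key (X, Y) holds the score of X aligned on Y.
tm_scores = {("A", "B"): 0.62, ("B", "A"): 0.57}


def directed_neighbors(scores, cutoff=TM_SCORE_CUTOFF):
    """Return the set of ordered pairs (X, Y) where X is a neighbor of Y."""
    return {pair for pair, score in scores.items() if score >= cutoff}


neighbors = directed_neighbors(tm_scores)

# Pairs for which the relationship holds in one direction only.
one_way = {(x, y) for (x, y) in neighbors if (y, x) not in neighbors}
print(one_way)
```

With these made-up values, A is a neighbor of B but B is not a neighbor of A, which is the kind of inconsistency a rigid cutoff can produce.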

A. In the Standard Symmetry Detection section, you will find the output from the programs CE-Symm and SymD when they are executed with their default (or author-selected) set of parameters. The Multi-Step Symmetry Detection section instead shows the result of a more sophisticated procedure where both programs have been executed multiple times with different sets of parameters in order to increase their sensitivity without compromising specificity. The results are then filtered to exclude irrelevant symmetries and a final result is selected so as to present the most information about the symmetric relationships in the given structure.

A. The Multi-Step Symmetry Detection method is not yet available for beta-barrels. Furthermore, for alpha-helical proteins, it filters out symmetries comprising repeats with <2 TM helices or repeats that are entirely outside of the membrane-embedded region of the protein. These criteria have been defined so as to focus on the potentially functionally meaningful symmetries in the membrane.

A. The Multi-Step Symmetry Detection section aims to provide complete information about each defined symmetry - i.e. it reports the ranges and multiple alignments of the repeats associated with a symmetry. SymD does not provide information about individual repeats and there is no obvious way for such information to be extracted. Hence, while we use the results from SymD to enhance the abilities of CE-Symm for detecting a symmetry, we do not report its raw output in the Multi-Step Symmetry Detection section.

A. To illustrate this, consider the following example:

(4F35.D_43-140,4F35.C_43-140)(4F35.D_259-362,4F35.C_259-362)

(4F35.D_43-140,4F35.D_259-362)

(4F35.C_43-140,4F35.C_259-362)

Each line corresponds to the repeats associated with one axis of symmetry. Thus, in 4F35 we have found three axes of symmetry. Furthermore, each bracket defines the regions that have been aligned to each other. For example, the first axis links regions 4F35.D_43-140 and 4F35.D_259-362 to regions 4F35.C_43-140 and 4F35.C_259-362, highlighting the fact that the protein is a dimer, i.e. C2-symmetric. The second and third axes refer to a C2 symmetry within each protomer.
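The notation above can be read programmatically. The following sketch parses each line into its brackets and regions; the token layout assumed by the regular expression (a 4-character PDB code, a dot, a chain identifier, an underscore, and a residue range) is inferred from the 4F35 example only, and the function name `parse_axis` is hypothetical.

```python
import re

# Assumed token layout: <4-char PDB code>.<chain>_<first residue>-<last residue>
REGION = re.compile(r"([A-Za-z0-9]{4})\.([A-Za-z0-9]+)_(\d+)-(\d+)")


def parse_axis(line):
    """Parse one symmetry-axis line into a list of brackets.

    Each bracket is a list of (pdb, chain, start, end) regions that
    have been aligned to each other.
    """
    groups = []
    for body in re.findall(r"\(([^()]*)\)", line):
        regions = []
        for token in body.split(","):
            match = REGION.fullmatch(token.strip())
            if match is None:
                raise ValueError(f"unrecognized region: {token!r}")
            pdb, chain, start, end = match.groups()
            regions.append((pdb, chain, int(start), int(end)))
        groups.append(regions)
    return groups


lines = [
    "(4F35.D_43-140,4F35.C_43-140)(4F35.D_259-362,4F35.C_259-362)",
    "(4F35.D_43-140,4F35.D_259-362)",
    "(4F35.C_43-140,4F35.C_259-362)",
]
axes = [parse_axis(line) for line in lines]

# Chains touched by each axis: two chains for the dimer axis,
# a single chain for each intra-protomer axis.
chains_per_axis = [
    sorted({chain for group in axis for (_, chain, _, _) in group})
    for axis in axes
]
print(chains_per_axis)
```

Listing the chains per axis separates the inter-chain axis (chains C and D) from the two axes that lie within a single protomer.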

A. C1 means that no symmetry was detected.

A. R is the order assigned by CE-Symm to cases where there is an open symmetry (as opposed to a point group symmetry such as C2 or D3) that is described either by negligible translation (rotational repeats) or by negligible rotation between repeats (translational repeats).

A. Some structures contain different symmetries in different structural regions. These symmetries are not hierarchically related to each other, and can each generate hierarchies of symmetric subregions. In order to keep them separated, such symmetries are annotated and separated by the word "and".

A. Lowercase letters represent amino acids that were not included in the calculation of the RMSD and TM-scores between the repeats. They are either aligned with a gap or not close enough to their corresponding amino acids to be considered aligned.

A. We are in the process of defining filter criteria that help us exclude the symmetries that have no functional relevance, which in turn allows us to use more permissive options for CE-Symm and SymD. We plan to include beta-barrels in the future.

A. These options reveal more of the raw output of the corresponding symmetry analysis program.

A. In the results from CE-Symm, the Multi-step Symmetry Detection Analysis, and the Symmetry Inferred From Neighbors sections, the RMSD and the TM-score are calculated after superimposing all hierarchically-related repeats onto the first repeat. In the results from SymD, the RMSD and the TM-score are calculated over the aligned residues after superimposing the transformed protein (with a transformation defined by the reported symmetry axis, angle and translation) onto the original structure.

A. A key step in the production of all data presented in EncoMPASS is the pairwise structure alignment of each single-chain subunit from each complex with all other topologically similar chains. To ensure the accuracy of the alignments, we imposed a cutoff on the resolution of the structures. However, we are exploring ways to include lower-resolution structures in EncoMPASS without compromising the accuracy of the present structure alignments.

A. There are some rare exceptions caused by inconsistencies, repetitions or ambiguities in the coordinate files, such as unknown residue names or repeated residue indices (with no alternate location indicators) in the same chain. We also exclude structures with extended gaps in residue numbering (likely to be chimeric structures).

A. A transmembrane domain is defined as a continuous range of amino acids containing at least one segment of secondary structure with Cα atoms inside the OPM-defined boundaries of the lipid bilayer by at least 1 Å. This definition includes membrane crossings and same-side membrane insertions, but not small membrane loops.

A. No. OPM, PDBTM and EncoMPASS have three different definitions of transmembrane domains (or segments). Specifically, OPM defines a transmembrane segment as a continuous segment of secondary structure that is at least partially contained inside the lipid bilayer boundaries. This implies that a kinked TM helix crossing the membrane can correspond to two different TM segments (if the kink region is extended enough to be considered a loop). When compared on the common set of membrane protein structures, OPM, PDBTM and EncoMPASS agree on ~60% of assignments of TM domains.

A. Many existing structure classifications and protein structure databases (such as DALI, SCOP and CATH) take structural domains as the fundamental units for their structural analyses. However, membrane proteins usually have only 1 or 2 structural domains, and domain fusion is rare. Moreover, the definition of a structural domain is controversial. On the other hand, comparing the structures of whole complexes can result in very low sensitivity to structural similarity. We therefore use single-chain subunits as the fundamental unit. Chains are uniquely defined. Moreover, the program Fr-TM-Align is able to produce accurate structural alignments of chains with multiple structural domains.

A. EncoMPASS aims to maximize the accuracy of the similarity assessments it produces. Despite the reasonable accuracy of the sequence identity and TM-score estimators, the number of pairs of structures mistakenly considered to be related (false positives) increases with decreasing topological similarity. To limit the number of false positives, we only include alignments of chains with similar numbers of transmembrane segments (according to our definition).

A. Two chains are considered to be topologically similar when the smaller of their two estimated numbers of transmembrane domains is >75% of the larger. We are aware that this is an approximation that precludes some interesting and important comparisons. Yet the current strategy is efficient and consistent with the philosophy of sacrificing sensitivity in favor of specificity. We are currently working on a more extensive definition of a topology class.
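As an illustration, the criterion can be written as a small predicate. This sketch assumes one particular reading of the rule, namely that the smaller of the two transmembrane domain counts must be strictly greater than 75% of the larger one; the function name is hypothetical and is not part of EncoMPASS.

```python
def topologically_similar(n_tm_a: int, n_tm_b: int) -> bool:
    """Return True if the smaller TM count is >75% of the larger one.

    Assumed reading of the criterion; integer arithmetic avoids
    floating-point rounding at the 75% boundary.
    """
    smaller, larger = sorted((n_tm_a, n_tm_b))
    # smaller / larger > 0.75  <=>  4 * smaller > 3 * larger
    return 4 * smaller > 3 * larger


print(topologically_similar(7, 8))    # 7 is 87.5% of 8
print(topologically_similar(6, 8))    # 6 is exactly 75% of 8, so not >75%
print(topologically_similar(12, 12))  # identical counts
```

Under this reading, a 6-TM chain and an 8-TM chain are not compared, which is an example of the comparisons the approximation excludes.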

A. The structure alignment program Fr-TM-Align relies on the TM-score to generate its results. TM-score is independent of the size of the proteins being aligned, meaning that a given value of the TM-score will always imply the same degree of similarity regardless of the size of the two structures. This is not true for the RMSD (e.g., an RMSD of 2.5 Å does not have the same meaning for an alignment of two 600-residue long proteins and two 40-residue long proteins).
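The size independence comes from a length-dependent distance scale. The sketch below follows the standard published definition of the TM-score, TM = (1/L) Σ 1/(1 + (d_i/d0)²) with d0 = 1.24 (L − 15)^(1/3) − 1.8, where L is the length of the chain used for normalization and d_i are the distances between aligned Cα pairs after superposition. It is not the Fr-TM-Align implementation, the function names are hypothetical, and the rejection of very short chains is a simplification made here.

```python
def d0(norm_length: int) -> float:
    """Length-dependent distance scale (in Å) of the TM-score."""
    if norm_length < 19:
        # Below 19 residues this formula gives a non-positive scale;
        # short chains need special handling that is not covered here.
        raise ValueError("chain too short for this sketch")
    return 1.24 * (norm_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(distances, norm_length: int) -> float:
    """TM-score from aligned-pair distances (Å), normalized by norm_length."""
    scale = d0(norm_length)
    return sum(1.0 / (1.0 + (d / scale) ** 2) for d in distances) / norm_length


# Fifty perfectly superposed pairs, normalized by chains of different length.
distances = [0.0] * 50
print(tm_score(distances, 100))  # normalized by a 100-residue chain
print(tm_score(distances, 150))  # normalized by a 150-residue chain
```

Because the same set of aligned pairs is normalized by the length of one chain or the other, the two values differ, which also illustrates why the score of A on B need not equal the score of B on A.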

A. No. Both RMSD and TM-score are calculated only on the pairs of Cα atoms that have been aligned by the program Fr-TM-Align. This subset of atoms and their coordinates can be downloaded from the web page corresponding to the chain of interest.
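For reference, an RMSD restricted to aligned pairs can be sketched as follows. The coordinates are assumed to be already superimposed, the input format (a list of coordinate pairs) is an assumption made for this example, and the function name is hypothetical.

```python
import math


def rmsd_aligned(pairs) -> float:
    """RMSD (Å) over aligned Cα pairs only.

    `pairs` is a list of ((x, y, z), (x, y, z)) tuples for residues that
    the alignment matched; unaligned residues are simply not listed.
    """
    if not pairs:
        raise ValueError("no aligned pairs")
    total = 0.0
    for a, b in pairs:
        total += sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.sqrt(total / len(pairs))


# Two aligned pairs, 3 Å and 4 Å apart respectively.
pairs = [((0.0, 0.0, 0.0), (0.0, 0.0, 3.0)),
         ((1.0, 0.0, 0.0), (1.0, 4.0, 0.0))]
print(rmsd_aligned(pairs))
```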

A. Sequence identity is not uniquely defined. We report it as the number of matches divided by the total number of matches and mismatches, thus ignoring the parts of the sequence alignment containing gaps. This estimator is therefore a sequence-similarity equivalent of RMSD.
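A minimal sketch of this estimator is shown below. It assumes that the alignment is given as two equal-length strings with `-` marking gaps; the input format, the example sequences and the function name are assumptions made for illustration.

```python
def sequence_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Matches / (matches + mismatches), skipping columns that contain a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = mismatches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # gapped columns do not count either way
        if x.upper() == y.upper():
            matches += 1
        else:
            mismatches += 1
    total = matches + mismatches
    return matches / total if total else 0.0


# Made-up alignment: 4 matches, 1 mismatch, 2 gapped columns.
print(sequence_identity("MKT-LLV", "MRTALL-"))
```

Note that the two gapped columns do not lower the value, so a short gap-free core can score highly even when large parts of the chains are unaligned.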

A. Currently this is not possible, but we are working on adding this feature.

A. During the upload of the database, coordinate files may have been renamed. Please change the name of the coordinate file in PyMOL to make it correspond to the coordinate file you downloaded.

| General category | Criteria | Complex | Chain | Expected variable type | Description |
|---|---|---|---|---|---|
| Structure information | PDB | Yes | Yes | String | The 4-letter PDB code for a complex or the chain code in the format XXXX_Y, where the PDB code is designated with Xs and Y is the case-sensitive chain identifier |
| | Protein name | Yes | Yes | String | The full or partial name of the protein as it appears in the OPM database |
| | Protein type (alpha/beta) | Yes | Yes | String | Either alpha or beta, indicating the primary secondary structure content |
| | Num. of chains | Yes | No | Integer | The number of chains in the complex |
| | Num. of TM chains | Yes | No | Integer | The number of membrane-spanning chains in the complex |
| | Num. of TM domains | Yes | No | Integer | The number of transmembrane crossings of the chain |
| | Num. of amino acids | Yes | Yes | Integer | The number of residues in the structure |
| | Resolution | Yes | Yes | Float | The resolution of the structure |
| | Num. of sequence neighbors | No | Yes | Integer | The number of sequence homologues (≥85% identity) within the database |
| | Num. of structural neighbors | No | Yes | Integer | The number of structural homologues (TM-Score ≥ 0.6) within the database |
| | Num. of all neighbors | No | Yes | Integer | The number of sequence or structural homologues within the database |
| CE-Symm results | Order | Yes | Yes | String | Symmetry order such as C2, D3, for point group symmetries and H or R for helical or repeated symmetries |
| | Num. of levels | Yes | Yes | Integer | The total number of levels of symmetry detected for multiple, hierarchically organized symmetries |
| | Num. of repeats | Yes | Yes | Integer | The number of symmetry-related structural repeats within the structure |
| | Repeat length | Yes | Yes | Integer | The average number of residues in a symmetry-related structural repeat |
| | Coverage | Yes | Yes | Float | The fraction of all amino acids in the structure that contribute to symmetry-related repeats |
| | Angle | Yes | Yes | Float | The angle [degree] of a symmetry transformation |
| | Translation | Yes | Yes | Float | The length [Å] of the component of the translation vector associated with a symmetry transformation that is parallel to the symmetry axis (i.e., screw translation) |
| | RMSD | Yes | Yes | Float | The average root mean square deviation [Å] of the Cα atoms calculated by superposing all repeats in a hierarchical symmetry |
| | TM-Score | No | Yes | Float | The TM-score calculated by superposing all repeats in a hierarchical symmetry |
| SymD results | Order | Yes | Yes | Integer | The number of structural repeats found within the structure (with no distinction between circular, dihedral or helical symmetries) |
| | Coverage | Yes | Yes | Float | The fraction of all amino acids in the structure that contribute to symmetry-related repeats |
| | Angle | Yes | Yes | Float | The angle [degree] of a symmetry transformation |
| | Translation | Yes | Yes | Float | The length [Å] of the component of the translation vector associated with a symmetry transformation that is parallel to the symmetry axis (i.e., screw translation) |
| | RMSD | Yes | Yes | Float | The root mean square deviation of the Cα atoms [Å] calculated over the symmetry-related residues after superimposing the coordinates transformed by the symmetry transformation onto the initial structure coordinates |
| | TM-Score | Yes | Yes | Float | The TM-score calculated after superimposing the coordinates transformed by the symmetry transformation onto the initial structure coordinates |
| | Z-TM-Score | Yes | Yes | Float | The z-score of the TM-score associated with the symmetry according to SymD |
| MSSD results | Order | Yes | Yes | String | Symmetry order such as C2, D3, for point group symmetries and H or R for helical or repeated symmetries |
| | Num. of levels | Yes | Yes | Integer | The total number of levels of symmetry detected for multiple, hierarchically organized symmetries |
| | Num. of repeats | Yes | Yes | Integer | The number of symmetry-related structural repeats within the structure |
| | Repeat length | Yes | Yes | Integer | The average number of residues in a symmetry-related structural repeat |
| | Coverage | Yes | Yes | Float | The fraction of all amino acids in the structure that contribute to symmetry-related repeats |
| | Angle | Yes | Yes | Float | The angle [degree] of a symmetry transformation |
| | Translation | Yes | Yes | Float | The length [Å] of the component of the translation vector associated with a symmetry transformation that is parallel to the symmetry axis (i.e., screw translation) |
| | RMSD | Yes | Yes | Float | The average root mean square deviation [Å] of the Cα atoms calculated by superposing all repeats in a hierarchical symmetry |
| | TM-Score | Yes | Yes | Float | The TM-score calculated by superposing all repeats in a hierarchical symmetry |
| | Repeat topology (parallel/antiparallel) | Yes | Yes | String | Either parallel or antiparallel, indicating the relative topology of the repeats in the membrane |
| | Angle with membrane normal | Yes | Yes | Float | The angle [degree] between the symmetry axis and the membrane normal |
| Inferred symmetry results | Template PDB | No | Yes | String | Chain PDB code (XXXX_Y) of the structure that was used as a source for the symmetry information |
| | Order | No | Yes | String | Symmetry order such as C2, D3, for point group symmetries and H or R for helical or repeated symmetries |
| | Num. of levels | No | Yes | Integer | The total number of levels of symmetry detected for multiple, hierarchically organized symmetries |
| | Num. of repeats | No | Yes | Integer | The number of symmetry-related structural repeats within the structure |
| | Repeat length | No | Yes | Float | The average number of residues in a symmetry-related structural repeat |
| | Angle | No | Yes | Float | The angle [degree] of a symmetry transformation |
| | Translation | No | Yes | Float | |
| | RMSD | No | Yes | Float | The average root mean square deviation [Å] of the Cα atoms calculated by superposing all repeats in a hierarchical symmetry |
| | TM-Score | No | Yes | Float | The TM-score calculated by superposing all repeats in a hierarchical symmetry |
| | Repeat topology (parallel/antiparallel) | No | Yes | String | Either parallel or antiparallel, indicating the relative topology of the repeats in the membrane |
| | Angle with membrane normal | No | Yes | Float | The angle [degree] between the symmetry axis and the membrane normal |

NINDS Copyright 2017