RCSB PDB Embedding Search provides a structure similarity search method based on protein 3D structure embeddings.
These embeddings are stored in a vector database optimized for real-time retrieval and scalable performance,
enabling efficient searches across large and expanding structural datasets, including both experimentally determined PDB structures and predicted AlphaFold DB models.
While the server does not generate residue-level structural alignments, it supports rapid, large-scale searches with sensitivity comparable to traditional structural alignment approaches.
Search Parameters
RCSB PDB: Search 3D structures in the RCSB.org database, including experimental entries and Computed Structure Models.
AlphaFold DB: Search 3D structures in AlphaFold DB.
Search By
Chain: Use a PDB chain as a query for the search.
Assembly: Use a PDB assembly as a query for the search.
UniProt: Use a UniProt accession as a query for the search.
PDB ID: Four-character identifier of a PDB entry.
Chain: Instance identifier of the PDB chain, defined by the mmCIF term label_asym_id.
Assembly: Assembly identifier, defined by the mmCIF term _pdbx_struct_assembly.id.
Return
Chains: Returns PDB chains.
Assemblies: Returns PDB assemblies.
# results: Number of search results to return.
Similarity Type
Local: Uses unmodified scores from the embedding model, so structures that are similar only in a local region may still rank highly. Useful for finding matches between local regions of 3D structures.
Global: Scores are scaled by the ratio of the residue counts of the two structures, using the factor: min(query_length, target_length) / max(query_length, target_length). Useful for forcing global matches between 3D structures.
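The Global scaling factor can be illustrated with a short Python sketch. The function names length_factor and global_score are hypothetical, and the sketch assumes that the raw embedding score is simply multiplied by the factor; the exact way the server applies the factor is not stated here.

```python
def length_factor(query_length: int, target_length: int) -> float:
    """Return min(query_length, target_length) / max(query_length, target_length)."""
    if query_length <= 0 or target_length <= 0:
        raise ValueError("residue counts must be positive")
    return min(query_length, target_length) / max(query_length, target_length)


def global_score(raw_score: float, query_length: int, target_length: int) -> float:
    """Scale a raw (Local) score by the length factor.

    Assumption: the factor is applied by multiplication.
    """
    return raw_score * length_factor(query_length, target_length)


if __name__ == "__main__":
    # A 100-residue query against a 400-residue target: factor is 0.25.
    print(length_factor(100, 400))
    # Equal lengths leave the raw score unchanged: factor is 1.0.
    print(global_score(0.8, 150, 150))
```

Under this assumption, a short query that matches only one domain of a much longer target keeps its full score in Local mode but is penalized in Global mode, while structures of similar size are barely affected.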
Switch to upload form: Use a coordinate file as a query for the search.
Format: Select the coordinate file format (.pdb, .cif, or .bcif).
Chain ID: Use a specific chain of the structure (optional). If omitted, all chains are used.