pallatom.helpers.atom_utils¶
Attributes¶
Classes¶
Protein | Protein structure representation.
Functions¶
atom37_to_atom5 | Extract the 5 backbone+Cβ atoms from the atom37 representation.
atom37_to_cb | Full pipeline: atom37 → atom5 → Cβ / pseudo-Cβ.
center_positions | Center 'atom_positions' on the center of mass of the Cα atoms.
get_cb_coords | Extract Cβ (slot 4) from atom5, replacing missing Cβ (Gly) with pseudo-Cβ.
make_fixed_size | Pad features to a fixed sequence length (currently along axis 0 only).
make_np_example | Make a dictionary of non-batched numpy protein features.
protein_from_pdb | Parse a PDB file (ATOM records only) into a Protein using the atom37 layout.
pseudo_cb | Compute a virtual Cβ from backbone geometry (Gly-safe).
to_pdb | Convert a Protein instance to a PDB string.
Module Contents¶
- class pallatom.helpers.atom_utils.Protein¶
Protein structure representation.
- aatype: jaxtyping.Int[numpy.ndarray, num_res]¶
- atom_mask: jaxtyping.Float[numpy.ndarray, num_res num_atom_type]¶
- atom_positions: jaxtyping.Float[numpy.ndarray, num_res num_atom_type 3]¶
- b_factors: jaxtyping.Float[numpy.ndarray, num_res num_atom_type]¶
- chain_index: jaxtyping.Int[numpy.ndarray, num_res]¶
- residue_index: jaxtyping.Int[numpy.ndarray, num_res]¶
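The field shapes above can be illustrated with a small stand-in container. This is only a sketch: `ProteinSketch` and `empty_protein` are hypothetical names, and both the frozen dataclass and the choice of 37 atom types per residue are assumptions based on the atom37 layout mentioned on this page, not the library's actual definition.

```python
import dataclasses

import numpy as np

NUM_ATOM_TYPES = 37  # assumption: num_atom_type follows the atom37 layout


@dataclasses.dataclass(frozen=True)
class ProteinSketch:
    """Hypothetical stand-in mirroring the documented Protein fields."""

    atom_positions: np.ndarray  # (num_res, num_atom_type, 3)
    aatype: np.ndarray          # (num_res,)
    atom_mask: np.ndarray       # (num_res, num_atom_type)
    residue_index: np.ndarray   # (num_res,)
    chain_index: np.ndarray     # (num_res,)
    b_factors: np.ndarray       # (num_res, num_atom_type)


def empty_protein(num_res: int) -> ProteinSketch:
    """Build an all-zero protein whose arrays have mutually consistent shapes."""
    return ProteinSketch(
        atom_positions=np.zeros((num_res, NUM_ATOM_TYPES, 3), dtype=np.float32),
        aatype=np.zeros(num_res, dtype=np.int64),
        atom_mask=np.zeros((num_res, NUM_ATOM_TYPES), dtype=np.float32),
        residue_index=np.arange(1, num_res + 1, dtype=np.int64),
        chain_index=np.zeros(num_res, dtype=np.int64),
        b_factors=np.zeros((num_res, NUM_ATOM_TYPES), dtype=np.float32),
    )
```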
- pallatom.helpers.atom_utils.atom37_to_atom5(atom37_positions: jaxtyping.Float[torch.Tensor, B N_res 37 3], atom37_mask: jaxtyping.Float[torch.Tensor, B N_res 37]) → tuple[jaxtyping.Float[torch.Tensor, B N_res 5 3], jaxtyping.Float[torch.Tensor, B N_res 5]]¶
Extract the 5 backbone+Cβ atoms from atom37 representation.
- Returns:
atom5_positions – shape (B, N_res, 5, 3)
atom5_mask – shape (B, N_res, 5)
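The extraction can be pictured as an index gather along the atom axis. The sketch below uses NumPy as a stand-in for the torch tensors in the signature, and `atom37_to_atom5_sketch` is a hypothetical name. The slot indices are taken from the `ATOM37_*` constants listed on this page. The truncated `atom_types` list on this page shows CB before O, so the mapping is worth checking against the source before relying on it.

```python
import numpy as np

# Slot indices as given by the ATOM37_* constants on this page.
ATOM37_N, ATOM37_CA, ATOM37_C, ATOM37_O, ATOM37_CB = 0, 1, 2, 3, 4

# Order of the atom5 slots: N, CA, C, O, CB (see ATOM5_NAMES).
ATOM5_SOURCE_SLOTS = [ATOM37_N, ATOM37_CA, ATOM37_C, ATOM37_O, ATOM37_CB]


def atom37_to_atom5_sketch(atom37_positions, atom37_mask):
    """Gather the five atom5 slots from atom37 positions and mask."""
    atom5_positions = atom37_positions[:, :, ATOM5_SOURCE_SLOTS, :]  # (B, N_res, 5, 3)
    atom5_mask = atom37_mask[:, :, ATOM5_SOURCE_SLOTS]               # (B, N_res, 5)
    return atom5_positions, atom5_mask
```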
- pallatom.helpers.atom_utils.atom37_to_cb(atom37_positions: jaxtyping.Float[torch.Tensor, B N_res 37 3], atom37_mask: jaxtyping.Float[torch.Tensor, B N_res 37]) → tuple[jaxtyping.Float[torch.Tensor, B N_res 3], jaxtyping.Bool[torch.Tensor, B N_res]]¶
Full pipeline: atom37 → atom5 → Cβ / pseudo-Cβ.
- Returns:
cb – shape (B, N_res, 3); real Cβ where available, pseudo-Cβ otherwise
pseudo_beta_mask – shape (B, N_res); True where a real Cβ is present, False where a pseudo-Cβ was used
- pallatom.helpers.atom_utils.center_positions(np_example)¶
Center ‘atom_positions’ on the center of mass of the Cα atoms.
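A minimal NumPy sketch of this kind of centering follows. It assumes that `np_example` is a dict with `atom_positions` and `atom_mask` entries shaped like the `Protein` fields, that the CA slot is index 1 (as in `ATOM37_CA`), and that absent atoms should stay at the origin. `center_on_ca_sketch` is a hypothetical name and none of these details are confirmed by the source.

```python
import numpy as np

CA_SLOT = 1  # matches ATOM37_CA on this page


def center_on_ca_sketch(np_example):
    """Return a copy of np_example with atom_positions centered on the mean CA position."""
    positions = np_example["atom_positions"]  # (num_res, num_atom_type, 3)
    mask = np_example["atom_mask"]            # (num_res, num_atom_type)

    ca_pos = positions[:, CA_SLOT, :]         # (num_res, 3)
    ca_mask = mask[:, CA_SLOT, None]          # (num_res, 1)

    # All CA atoms have the same mass, so the center of mass is a masked mean.
    center = (ca_pos * ca_mask).sum(axis=0) / max(ca_mask.sum(), 1.0)

    # Assumption: absent atoms are kept at the origin after centering.
    centered = (positions - center) * mask[..., None]

    out = dict(np_example)
    out["atom_positions"] = centered
    return out
```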
- pallatom.helpers.atom_utils.get_cb_coords(atom5_positions: jaxtyping.Float[torch.Tensor, B N_res 5 3], atom5_mask: jaxtyping.Float[torch.Tensor, B N_res 5], fill_pseudo: bool = True) → tuple[jaxtyping.Float[torch.Tensor, B N_res 3], jaxtyping.Bool[torch.Tensor, B N_res]]¶
Extract Cβ (slot 4) from atom5, replacing missing Cβ (Gly) with pseudo-Cβ.
- Parameters:
atom5_positions – shape (B, N_res, 5, 3)
atom5_mask – shape (B, N_res, 5); 1 where the atom is present
fill_pseudo – if True, compute a pseudo-Cβ wherever Cβ is absent
- Returns:
cb – shape (B, N_res, 3); real Cβ where available, pseudo-Cβ otherwise
pseudo_beta_mask – shape (B, N_res); True where a real Cβ is present, False where a pseudo-Cβ was used
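The selection between real and pseudo Cβ can be sketched as a masked `where`. The example uses NumPy in place of torch, and `select_cb_sketch` is a hypothetical helper that takes the fallback coordinates as an argument instead of computing them. How the real function computes its fallback is not shown here.

```python
import numpy as np

ATOM5_CB = 4  # slot index documented on this page


def select_cb_sketch(atom5_positions, atom5_mask, pseudo_cb_positions):
    """Use the real CB where the mask marks it present, else the supplied fallback."""
    real_cb = atom5_positions[:, :, ATOM5_CB, :]    # (B, N_res, 3)
    has_real_cb = atom5_mask[:, :, ATOM5_CB] > 0.5  # (B, N_res), boolean
    cb = np.where(has_real_cb[..., None], real_cb, pseudo_cb_positions)
    return cb, has_real_cb
```

The returned boolean array follows the documented convention: True marks residues whose Cβ came from the input, False marks residues that received the fallback.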
- pallatom.helpers.atom_utils.make_fixed_size(np_example, max_seq_length=500)¶
Pad features to a fixed sequence length (currently along axis 0 only).
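Padding along the residue axis might look like the sketch below. It assumes every entry of `np_example` is a NumPy array with residues on axis 0, that padding uses zeros, and that over-long inputs are rejected. `pad_axis0_sketch` is a hypothetical name, and the real function may handle these cases differently.

```python
import numpy as np


def pad_axis0_sketch(np_example, max_seq_length=500):
    """Zero-pad every array in np_example along axis 0 up to max_seq_length."""
    padded = {}
    for name, value in np_example.items():
        pad_len = max_seq_length - value.shape[0]
        if pad_len < 0:
            raise ValueError(
                f"{name} has {value.shape[0]} rows, more than {max_seq_length}"
            )
        # Pad only axis 0 and leave all trailing axes unchanged.
        pad_width = [(0, pad_len)] + [(0, 0)] * (value.ndim - 1)
        padded[name] = np.pad(value, pad_width, mode="constant")
    return padded
```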
- pallatom.helpers.atom_utils.make_np_example(coords_dict)¶
Make a dictionary of non-batched numpy protein features.
- pallatom.helpers.atom_utils.protein_from_pdb(pdb_path: str) → Protein¶
Parse a PDB file (ATOM records only) into a Protein using the atom37 layout.
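For readers unfamiliar with the PDB fixed-column layout, the sketch below shows how ATOM records can be read with plain string slicing. `parse_atom_records` is a hypothetical helper that works on text, not a path, and returns simple dicts. It does not build a `Protein` or map atoms into atom37 slots, and it is not the library's parser.

```python
def parse_atom_records(pdb_text):
    """Parse ATOM records from PDB text using the fixed-column layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM  "):
            continue  # skip HETATM, TER, REMARK and other record types
        atoms.append(
            {
                "atom_name": line[12:16].strip(),  # columns 13-16
                "res_name": line[17:20].strip(),   # columns 18-20
                "chain_id": line[21],              # column 22
                "res_seq": int(line[22:26]),       # columns 23-26
                "xyz": (
                    float(line[30:38]),            # columns 31-38
                    float(line[38:46]),            # columns 39-46
                    float(line[46:54]),            # columns 47-54
                ),
                "b_factor": float(line[60:66]),    # columns 61-66
            }
        )
    return atoms
```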
- pallatom.helpers.atom_utils.pseudo_cb(n: jaxtyping.Float[torch.Tensor, ... 3], ca: jaxtyping.Float[torch.Tensor, ... 3], c: jaxtyping.Float[torch.Tensor, ... 3]) → jaxtyping.Float[torch.Tensor, ... 3]¶
Compute a virtual Cβ from backbone geometry (Gly-safe).
- Uses the standard ideal-geometry recipe:
b = Cα - N (N→Cα bond vector)
d = C - Cα (Cα→C bond vector)
Cross them, then combine with ideal tetrahedral offsets.
This matches the AlphaFold2 / ESMFold convention exactly.
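The recipe above can be sketched in NumPy as follows. The three coefficients are the ones commonly used for virtual-Cβ placement in structure-modelling code such as ProteinMPNN. That `pseudo_cb` uses exactly these values is an assumption, and `pseudo_cb_sketch` is a hypothetical name.

```python
import numpy as np


def pseudo_cb_sketch(n, ca, c):
    """Place a virtual CB from N, CA and C coordinates (arrays of shape (..., 3))."""
    b = ca - n          # N -> CA bond vector
    d = c - ca          # CA -> C bond vector
    a = np.cross(b, d)  # normal to the N-CA-C plane
    # Assumed ideal-geometry coefficients; not confirmed for this module.
    return -0.58273431 * a + 0.56802827 * b - 0.54067466 * d + ca
```

With an idealized backbone (N at (-0.525, 1.363, 0), Cα at the origin, C at (1.526, 0, 0)), this places the virtual Cβ roughly 1.53 Å from Cα, which is close to the ideal Cα–Cβ bond length.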
- pallatom.helpers.atom_utils.to_pdb(prot: Protein) → str¶
Converts a Protein instance to a PDB string.
- Parameters:
prot – The protein to convert to PDB.
- Returns:
PDB string.
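To show what the output looks like at the record level, the sketch below formats a single 80-column ATOM line. `format_atom_line` is a hypothetical helper. The fixed occupancy of 1.00 and the way the element symbol is derived are simplifying assumptions, and the real function presumably also handles chain termination and other records.

```python
def format_atom_line(serial, atom_name, res_name, chain_id, res_seq, xyz, b_factor):
    """Format one fixed-width PDB ATOM record (80 columns)."""
    # Atom names shorter than four characters start in column 14.
    name = atom_name if len(atom_name) == 4 else f" {atom_name}"
    # Simplification: the first letter is the element for N, C, O and S atoms.
    element = atom_name[0]
    return "".join(
        [
            "ATOM".ljust(6),     # record name, columns 1-6
            f"{serial:>5}",      # atom serial number, columns 7-11
            " ",
            f"{name:<4}",        # atom name, columns 13-16
            " ",                 # alternate location indicator (unused)
            f"{res_name:>3}",    # residue name, columns 18-20
            " ",
            chain_id,            # chain identifier, column 22
            f"{res_seq:>4}",     # residue sequence number, columns 23-26
            " " * 4,             # insertion code plus three blank columns
            f"{xyz[0]:8.3f}{xyz[1]:8.3f}{xyz[2]:8.3f}",
            f"{1.0:6.2f}",       # occupancy, fixed at 1.00 in this sketch
            f"{b_factor:6.2f}",  # temperature factor, columns 61-66
            " " * 10,
            f"{element:>2}",     # element symbol, columns 77-78
            " " * 2,             # charge (unused)
        ]
    )
```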
- pallatom.helpers.atom_utils.ATOM37_C = 2¶
- pallatom.helpers.atom_utils.ATOM37_CA = 1¶
- pallatom.helpers.atom_utils.ATOM37_CB = 4¶
- pallatom.helpers.atom_utils.ATOM37_N = 0¶
- pallatom.helpers.atom_utils.ATOM37_O = 3¶
- pallatom.helpers.atom_utils.ATOM5_C = 2¶
- pallatom.helpers.atom_utils.ATOM5_CA = 1¶
- pallatom.helpers.atom_utils.ATOM5_CB = 4¶
- pallatom.helpers.atom_utils.ATOM5_ELEMENTS¶
- pallatom.helpers.atom_utils.ATOM5_N = 0¶
- pallatom.helpers.atom_utils.ATOM5_NAMES = ['N', 'CA', 'C', 'O', 'CB']¶
- pallatom.helpers.atom_utils.ATOM5_O = 3¶
- pallatom.helpers.atom_utils.PDB_CHAIN_IDS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'¶
- pallatom.helpers.atom_utils.PDB_MAX_CHAINS = 62¶
- pallatom.helpers.atom_utils.atom_types = ['N', 'CA', 'C', 'CB', 'O', 'CG', 'CG1', 'CG2', 'OG', 'OG1', 'SG', 'CD', 'CD1', 'CD2', 'ND1',...¶
- pallatom.helpers.atom_utils.restype_1to3¶
- pallatom.helpers.atom_utils.restype_3to1¶
- pallatom.helpers.atom_utils.restype_num = 21¶
- pallatom.helpers.atom_utils.restype_order¶
- pallatom.helpers.atom_utils.restypes = ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I', 'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V', 'X']¶
- pallatom.helpers.atom_utils.rigid_group_atom_positions¶
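The lookup constants above are typically used to translate between letters and integer indices. The sketch below copies `restypes` and `PDB_CHAIN_IDS` from this page. `restype_order_sketch`, `encode_sequence`, and `chain_id_for` are hypothetical names, and both the letter-to-index mapping and the handling of unknown letters as 'X' are assumptions about how `restype_order` is meant to be used.

```python
# Values copied from the constants listed on this page.
restypes = ['A', 'R', 'N', 'D', 'C', 'Q', 'E', 'G', 'H', 'I',
            'L', 'K', 'M', 'F', 'P', 'S', 'T', 'W', 'Y', 'V', 'X']
PDB_CHAIN_IDS = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789'
PDB_MAX_CHAINS = len(PDB_CHAIN_IDS)  # 62, as documented

# Assumption: restype_order maps each one-letter code to its index in restypes.
restype_order_sketch = {res: i for i, res in enumerate(restypes)}


def encode_sequence(seq):
    """Map a one-letter sequence to integer residue types; unknown letters become 'X'."""
    unknown = restype_order_sketch['X']
    return [restype_order_sketch.get(res, unknown) for res in seq]


def chain_id_for(chain_index):
    """Look up the PDB chain identifier for a zero-based chain index."""
    if not 0 <= chain_index < PDB_MAX_CHAINS:
        raise ValueError(
            f"chain_index {chain_index} is outside 0..{PDB_MAX_CHAINS - 1}"
        )
    return PDB_CHAIN_IDS[chain_index]
```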