@shahryary

This PR fixes two issues when running ipsae.py on Boltz-2 structures:

  1. plddt_AAAAA_model_0.npz and pae_AAAAA_model_0.npz were indexed with
    token_array.astype(bool), which assumes a 1:1 correspondence between the
    Boltz pLDDT/PAE arrays and the token_mask built from the mmCIF
    _atom_site table. For some Boltz-2 outputs this is not true and leads to:

    • IndexError in the Boltz block:
      IndexError: index XXX is out of bounds for axis 0 with size Y
    • A second IndexError later in the pDockQ calculation:
      mean_plddt = cb_plddt[list(pDockQ_unique_residues[chain1][chain2])].mean()
  2. cb_plddt could end up having a length different from the number of scored
    residues (numres), while downstream code assumes residue-level arrays of
    length numres (e.g. for pDockQ, ipSAE by residue).
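The first failure can be reproduced in isolation. The snippet below is a minimal sketch, not code from ipsae.py: the array sizes and the names `plddt_from_npz` and `positions` are made up for illustration, and it assumes the mask ends up being applied as integer positions (here via `np.flatnonzero`), which is what produces an "out of bounds" message rather than a boolean-shape error.

```python
import numpy as np

# Hypothetical sizes: the confidence file holds 4 values,
# while the mask built from the mmCIF has 5 entries.
plddt_from_npz = np.array([0.91, 0.88, 0.75, 0.80])
token_mask = np.array([1, 0, 1, 1, 1])

# Converting the mask to integer positions gives [0, 2, 3, 4];
# position 4 does not exist in a length-4 array.
positions = np.flatnonzero(token_mask.astype(bool))

try:
    plddt_from_npz[positions]
    error_message = None
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

Any mismatch where the mask is longer than the confidence array and selects a trailing position fails the same way.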

What this change does

For Boltz-1/Boltz-2 inputs we now:

  • Load plddt from plddt_*.npz, scale it to 0–100, and then:

    • If len(plddt) >= max(CA_atom_num)+1, treat it as per-atom and build
      residue-level plddt / cb_plddt using CA_atom_num / CB_atom_num
      (same strategy as the AF3 code path).
    • If len(plddt) == numres, treat it as per-residue and use it directly.
    • Otherwise, fall back to truncating/padding to numres with a warning so
      that downstream calculations never hit an out-of-bounds error.
  • Load pae from pae_*.npz and ensure pae_matrix is (numres, numres):

    • If the matrix is larger, truncate to [:numres, :numres].
    • If it is exactly numres x numres, use it as-is.
    • Otherwise, emit a warning and use the raw matrix.
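The branching described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the helper names `residue_plddt` and `residue_pae` are hypothetical, the stored pLDDT is assumed to be on a 0-1 scale, padding uses zeros (the description does not say which value is used), and the per-atom check also covers the CB indices as an extra guard that the description does not mention.

```python
import warnings
import numpy as np


def residue_plddt(plddt, ca_atom_num, cb_atom_num, numres):
    """Return (plddt, cb_plddt), each of length numres, on a 0-100 scale."""
    plddt = 100.0 * np.asarray(plddt, dtype=float)  # assumes values stored on 0-1
    ca = np.asarray(ca_atom_num)
    cb = np.asarray(cb_atom_num)

    # Per-atom case: every CA (and, as an extra guard, CB) index must be valid.
    if len(plddt) >= max(ca.max(), cb.max()) + 1:
        return plddt[ca], plddt[cb]

    # Per-residue case: use the values directly for both arrays.
    if len(plddt) == numres:
        return plddt, plddt.copy()

    # Fallback: truncate or zero-pad (padding value is an assumption).
    warnings.warn(f"pLDDT length {len(plddt)} does not match numres {numres}")
    fixed = np.zeros(numres)
    n = min(len(plddt), numres)
    fixed[:n] = plddt[:n]
    return fixed, fixed.copy()


def residue_pae(pae, numres):
    """Return the PAE matrix, cropped to (numres, numres) when it is large enough."""
    pae = np.asarray(pae, dtype=float)
    rows, cols = pae.shape
    if rows == numres and cols == numres:
        return pae
    if rows >= numres and cols >= numres:
        return pae[:numres, :numres]
    warnings.warn(f"PAE shape {pae.shape} is smaller than ({numres}, {numres})")
    return pae
```

For example, `residue_pae(np.ones((604, 604)), 600)` would return a 600 x 600 matrix, while a 590 x 590 input would be returned unchanged with a warning.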

This keeps the residue-level arrays (plddt, cb_plddt, pae_matrix)
consistent with numres wherever the input shapes allow it, and fixes the
Boltz-2 crashes I was seeing in practice.

Manual testing

  • Ran ipsae.py on Boltz-2 outputs (structure .cif, plddt_*.npz,
    pae_*.npz, confidence_*.json) where the previous version raised:

    • IndexError: index 604 is out of bounds for axis 0 with size 604
    • IndexError: index 600 is out of bounds for axis 0 with size 600

    With this patch, ipsae.py completes successfully and produces scores for
    all chain pairs (including pDockQ, pDockQ2, LIS and the various ipSAE
    variants).

The AF2/AF3 code paths are unchanged.
