@k-chrispens
📋 PR Checklist

  • This PR is tagged as a draft if it is still under development and not ready for review.

    This avoids auto-triggering the slower tests in the CI and needlessly wasting resources.

  • I have ensured that all my commits follow Angular commit message conventions.

    Format: <type>[optional scope]: <subject>
    Example: fix(af3): add missing crop transform to the af3 pipeline

    This affects semantic versioning as follows:

    • fix: patch version increment (0.0.1 → 0.0.2)
    • feat: minor version increment (0.0.1 → 0.1.0)
    • BREAKING CHANGE: major version increment (0.0.1 → 1.0.0)
    • All other types do not affect versioning

    The format ensures readable changelogs through auto-generation from commit messages.

  • I have run make format on the codebase before submitting the PR (this autoformats the code and lints it).

  • I have named the PR in Angular PR message format as well (cf. above), with a sensible tag line that summarizes all the changes in the PR.

    This is useful as the name of the PR is the default name of the commit that will be used if you merge with a squash & merge.
    Format: <type>[optional scope]: <subject>
    Example: fix(af3): add missing crop transform to the af3 pipeline


ℹ️ PR Description

What changes were made and why?

Exposed the altloc argument in parse, which enables the same altloc filtering with parse as with load_any. Keeping all altlocs (altloc="all") only works properly when add_missing_atoms=False, because template matching against the CCD fails otherwise.

How were the changes tested?

>>> from atomworks.io.parser import parse
Environment variable CCD_MIRROR_PATH not set. Will not be able to use function requiring this variable. To set it you may:
  (1) add the line 'export VAR_NAME=path/to/variable' to your .bashrc or .zshrc file
  (2) set it in your current shell with 'export VAR_NAME=path/to/variable'
  (3) write it to a .env file in the root of the atomworks.io repository
Environment variable PDB_MIRROR_PATH not set. Will not be able to use function requiring this variable. To set it you may:
  (1) add the line 'export VAR_NAME=path/to/variable' to your .bashrc or .zshrc file
  (2) set it in your current shell with 'export VAR_NAME=path/to/variable'
  (3) write it to a .env file in the root of the atomworks.io repository
>>> st = parse("./6b8x_final.cif", altloc="all", add_missing_atoms=False, fix_bond_types=False)
We can't fix formal charges without building from templates, as we need to know the true number of hydrogens bonded to a given atom, not the inferred number. This may lead to occasional inaccuracies after adding inter-residue bonds. To avoid this and fix formal charges, set `add_missing_atoms = True`.
>>> st["asym_unit"][0].query("altloc_id == 'B'")
AtomArray([
        Atom(np.array([30.831,  6.13 ,  9.697], dtype=float32), chain_id="A", res_id=24, ins_code="", res_name="ARG", hetero=False, atom_name="N", element="N", atom_id=193, b_factor=32.8, occupancy=0.53, charge=0, auth_seq_id="24", label_entity_id="1", altloc_id="B", is_polymer=True, chain_type=6, pn_unit_id="A", molecule_id=0, chain_entity=0, pn_unit_entity=0, molecule_entity=0, atomic_number=7),
        Atom(np.array([31.851,  5.451, 10.485], dtype=float32), chain_id="A", res_id=24, ins_code="", res_name="ARG", hetero=False, atom_name="CA", element="C", atom_id=195, b_factor=37.92, occupancy=0.53, charge=0, auth_seq_id="24", label_entity_id="1", altloc_id="B", is_polymer=True, chain_type=6, pn_unit_id="A", molecule_id=0, chain_entity=0, pn_unit_entity=0, molecule_entity=0, atomic_number=6),
        ... (hidden for brevity)

Additional Notes

Also added relevant docstrings!

Altlocs are now grabbed when  in  and
Copilot AI review requested due to automatic review settings December 18, 2025 04:26
Copilot AI left a comment

Pull request overview

This PR exposes the altloc (alternate location indicator) argument to the parse function, enabling users to control how alternate conformations are handled during structure parsing. Previously, this option was only available in load_any, and parse was hardcoded to use "first" altloc filtering.

Key changes:

  • Added altloc parameter to parse() function with default value "first" to maintain backward compatibility
  • Implemented validation to prevent incompatible parameter combinations (altloc='all' with add_missing_atoms or fix_bond_types)
  • Enhanced documentation for the altloc parameter in both parse() and load_any() functions
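The validation described in the second bullet can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `check_altloc_options` and the error message are hypothetical, and only the rule itself (altloc='all' is rejected together with add_missing_atoms or fix_bond_types) comes from the summary above.

```python
def check_altloc_options(
    altloc: str = "first",
    *,
    add_missing_atoms: bool,
    fix_bond_types: bool,
) -> None:
    """Hypothetical helper: reject option combinations that need CCD templates.

    Assumption: keeping every alternate location ("all") cannot be combined
    with steps that match residues against templates.
    """
    if altloc == "all" and (add_missing_atoms or fix_bond_types):
        raise ValueError(
            "altloc='all' requires add_missing_atoms=False and fix_bond_types=False"
        )


# The combination used in the test session of this PR passes the check.
check_altloc_options("all", add_missing_atoms=False, fix_bond_types=False)
```

Raising early keeps the failure close to the call site, instead of surfacing later as a template-matching error.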

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/atomworks/io/parser.py | Added altloc parameter to parse() function signature, docstring, validation logic, and propagated it through _parse_from_cif() and _parse_from_pdb() helper functions. Also includes a formatting improvement for an assert statement. |
| src/atomworks/io/utils/io_utils.py | Expanded documentation for the altloc parameter in load_any() to clarify the behavior of each option ("first", "occupancy", "all", or specific altloc ID). |
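To make the four altloc options concrete, here is a toy sketch that filters plain dictionaries rather than an AtomArray. It does not use the atomworks API. The function `filter_altlocs`, the per-residue grouping, and the exact semantics of "first" and "occupancy" are assumptions made for illustration, and the atom records are made-up toy data.

```python
from collections import defaultdict

NO_ALTLOC = {"", ".", "?"}  # assumed markers for "no alternate location"


def filter_altlocs(atoms, altloc="first"):
    """Toy altloc filter over dicts with chain_id, res_id, altloc_id, occupancy."""
    if altloc == "all":
        return list(atoms)  # keep every conformation
    if altloc not in ("first", "occupancy"):
        # Specific altloc ID: keep that ID plus atoms without any altloc.
        return [a for a in atoms if a["altloc_id"] in NO_ALTLOC or a["altloc_id"] == altloc]

    first_seen = {}  # residue -> first altloc ID encountered
    occupancy_sum = defaultdict(lambda: defaultdict(float))  # residue -> ID -> total
    for a in atoms:
        if a["altloc_id"] in NO_ALTLOC:
            continue
        residue = (a["chain_id"], a["res_id"])
        first_seen.setdefault(residue, a["altloc_id"])
        occupancy_sum[residue][a["altloc_id"]] += a["occupancy"]

    if altloc == "first":
        chosen = first_seen
    else:  # "occupancy": assumed to pick the ID with the highest summed occupancy
        chosen = {res: max(by_id, key=by_id.get) for res, by_id in occupancy_sum.items()}

    return [
        a
        for a in atoms
        if a["altloc_id"] in NO_ALTLOC
        or chosen[(a["chain_id"], a["res_id"])] == a["altloc_id"]
    ]


# Toy data: one residue with two conformations, one residue without altlocs.
toy_atoms = [
    {"chain_id": "A", "res_id": 24, "atom_name": "N", "altloc_id": "A", "occupancy": 0.47},
    {"chain_id": "A", "res_id": 24, "atom_name": "N", "altloc_id": "B", "occupancy": 0.53},
    {"chain_id": "A", "res_id": 24, "atom_name": "CA", "altloc_id": "A", "occupancy": 0.47},
    {"chain_id": "A", "res_id": 24, "atom_name": "CA", "altloc_id": "B", "occupancy": 0.53},
    {"chain_id": "A", "res_id": 25, "atom_name": "N", "altloc_id": "", "occupancy": 1.0},
]

kept_all = filter_altlocs(toy_atoms, "all")
kept_first = filter_altlocs(toy_atoms, "first")
kept_occupancy = filter_altlocs(toy_atoms, "occupancy")
kept_b = filter_altlocs(toy_atoms, "B")
```

With this toy input, "all" keeps all five atoms, while "first" keeps conformation A and "occupancy" and "B" both keep conformation B, each alongside the atom that has no altloc.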
