Relative Free Energy for (Unnatural) Amino Acids

Relative Free Energy for (Unnatural) Amino Acids calculates the difference in solvation free energy between two short peptide sequences in water, reported as a free energy in kcal/mol.

Solvation free energy is the free energy change when a molecule is moved from vacuum into water. It tells you how comfortable the peptide is in water: the more negative the value, the more favorable the solvation. This is the same calculation as Relative Free Energy, but you provide peptide sequences instead of SMILES strings, and you can swap in unnatural amino acids, meaning residues outside the twenty standard ones.

Write a natural amino acid with its usual one-letter code, for example GAG. For an unnatural amino acid you have three choices: a library name in angle brackets such as <pF-Phe>, a SMILES string in angle brackets such as <NC(C)(C)C(=O)O>, or a lowercase placeholder letter that you define separately in uaa_map, for example GxG together with uaa_map set to {"x": "pF-Phe"}. The built-in library names are pF-Phe, oF-Phe, mF-Phe, Sar, and N-Me-Ala.
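The three notations can be illustrated with a small tokenizer. This is a minimal sketch for reasoning about the notation, not the tool's own parser: `tokenize` and `classify` are hypothetical helpers, and treating every bracketed entry that is not a library name as SMILES (without validating it) is an assumption.

```python
import re

# Built-in library names as listed on this page.
LIBRARY_NAMES = {"pF-Phe", "oF-Phe", "mF-Phe", "Sar", "N-Me-Ala"}
# One-letter codes of the twenty standard amino acids.
NATURAL = set("ACDEFGHIKLMNPQRSTVWY")
# A token is either an angle-bracketed entry or a single letter.
TOKEN = re.compile(r"<([^<>]+)>|([A-Za-z])")


def classify(name):
    """Label an unnatural residue as a library name or (assumed) SMILES."""
    if name in LIBRARY_NAMES:
        return ("library", name)
    return ("smiles", name)  # assumption: not validated as real SMILES


def tokenize(sequence, uaa_map=None):
    """Split a sequence into (kind, value) residues (hypothetical helper)."""
    uaa_map = uaa_map or {}
    residues = []
    pos = 0
    while pos < len(sequence):
        match = TOKEN.match(sequence, pos)
        if match is None:
            raise ValueError(f"unexpected character {sequence[pos]!r} at {pos}")
        bracketed, letter = match.groups()
        if bracketed is not None:
            residues.append(classify(bracketed))
        elif letter in NATURAL:
            residues.append(("natural", letter))
        elif letter.islower() and letter in uaa_map:
            # Lowercase placeholders are resolved through uaa_map.
            residues.append(classify(uaa_map[letter]))
        else:
            raise ValueError(f"cannot resolve residue {letter!r} at {pos}")
        pos = match.end()
    return residues


print(tokenize("GxG", {"x": "pF-Phe"}))
```

Under these assumptions, `<pF-Phe>GH` and `xGH` with `{"x": "pF-Phe"}` resolve to the same residue list, which is the point of the placeholder form: it keeps long names or SMILES strings out of the sequence itself.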

It works by gradually morphing the first peptide into the second one inside water, an alchemical transformation, and measuring the free energy change along the way. Because the two peptides are similar, much of the calculation cancels out, which makes the result both cheaper to compute and more precise than running each one separately.
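The cancellation can be seen in the standard thermodynamic cycle for relative solvation free energies. The relation below is the general textbook form, not a statement of the exact legs this tool simulates, and the sign convention (B minus A) is an assumption, since this page does not state which direction is reported.

```latex
\Delta\Delta G_{\mathrm{solv}}
  = \Delta G_{\mathrm{solv}}(B) - \Delta G_{\mathrm{solv}}(A)
  = \Delta G^{\mathrm{water}}_{A \to B} - \Delta G^{\mathrm{vacuum}}_{A \to B}
```

Here $\Delta G_{\mathrm{solv}}(X)$ is the free energy of moving peptide $X$ from vacuum into water, and $\Delta G^{\mathrm{water}}_{A \to B}$ and $\Delta G^{\mathrm{vacuum}}_{A \to B}$ are the free energies of morphing $A$ into $B$ in each environment. Only the atoms that differ between the two peptides contribute to the morphing terms, which is why a small change is cheaper to compute than two full solvation calculations.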

When to use it

Use it when you are designing peptides and want to see how swapping one residue, often for an unnatural amino acid, changes how well the peptide sits in water. The two sequences should differ by only a small change so the morph from one to the other is well defined. If you are comparing ordinary small molecules rather than peptides, use Relative Free Energy instead.

Inputs

Input	Required	What it is
`sequence_a`	yes	The first peptide sequence. Use uppercase one-letter codes for natural amino acids (for example `GAG`). For unnatural amino acids use a library name in angle brackets, a SMILES string in angle brackets, or a lowercase placeholder defined in `uaa_map`.
`sequence_b`	yes	The second peptide sequence, written the same way as the first.
`uaa_map`	no	A mapping from placeholder letters to unnatural amino acid names or SMILES, for example `{"x": "pF-Phe"}`. Only needed if you used lowercase placeholders in your sequences.
`assign_protonation_states`	no, default `true`	Protonates both peptides automatically at the given pH. Turn off if your inputs are already protonated.
`ph`	no, default `7.0`	pH used to decide the protonation state.
`equil_length`	no, default `0.08` ns	Equilibration length per replica, in nanoseconds.
`prod_length`	no, default `0.4` ns	Production length per replica, in nanoseconds.
`platform`	no, default `CUDA`	Compute platform, one of `CUDA`, `OpenCL`, `CPU`, or `Reference`.
`protocol_repeats`	no, default `3`, minimum `1`	Number of independent repeats used for the uncertainty estimate. More repeats generally give a smaller, better-determined uncertainty.
`keep_dirs`	no, default `true`	Preserves the full simulation outputs so you can download them.

Longer simulation lengths and more repeats give more reliable numbers, but cost more runtime and credits.
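The table can be turned into a small client-side check before submitting. This is a sketch only: `build_input_data` is a hypothetical helper, not part of the SDK, and it enforces just the constraints documented above (allowed platforms, at least one repeat, and a `uaa_map` entry for every lowercase placeholder). The service may apply further validation of its own.

```python
import re

PLATFORMS = {"CUDA", "OpenCL", "CPU", "Reference"}

# Defaults copied from the Inputs table.
DEFAULTS = {
    "assign_protonation_states": True,
    "ph": 7.0,
    "equil_length": 0.08,  # ns per replica
    "prod_length": 0.4,    # ns per replica
    "platform": "CUDA",
    "protocol_repeats": 3,
    "keep_dirs": True,
}


def placeholders(sequence):
    """Lowercase letters outside angle brackets (bracketed SMILES are skipped)."""
    outside = re.sub(r"<[^<>]*>", "", sequence)
    return {ch for ch in outside if ch.islower()}


def build_input_data(sequence_a, sequence_b, uaa_map=None, **overrides):
    """Merge overrides onto the documented defaults and check them."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown inputs: {sorted(unknown)}")
    data = {"sequence_a": sequence_a, "sequence_b": sequence_b}
    data.update(DEFAULTS)
    data.update(overrides)
    if data["platform"] not in PLATFORMS:
        raise ValueError(f"platform must be one of {sorted(PLATFORMS)}")
    if data["protocol_repeats"] < 1:
        raise ValueError("protocol_repeats must be at least 1")
    missing = (placeholders(sequence_a) | placeholders(sequence_b)) - set(uaa_map or {})
    if missing:
        raise ValueError(f"placeholders missing from uaa_map: {sorted(missing)}")
    if uaa_map:
        data["uaa_map"] = dict(uaa_map)
    return data


print(build_input_data("GFG", "GxG", uaa_map={"x": "pF-Phe"}, protocol_repeats=5))
```

Sending the defaults explicitly is a design choice of this sketch; omitting them and letting the service fill them in should be equivalent according to the table.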

How to run it

Submit your two sequences from Azulene Studio, the Python SDK, or the CLI. New here? The Get started page walks through installing, logging in, and running a ready-made example first.

In Azulene Studio

Open Relative Free Energy for (Unnatural) Amino Acids from the tools list. On the Inputs and Parameters step, enter sequence A and sequence B, fill in the unnatural amino acid mapping if you used placeholder letters, and adjust the pH and simulation lengths if needed. Then continue to Review and Submit.

From the Python SDK

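from opal import jobs

# Compare YGH with the same peptide after swapping Y for the library residue pF-Phe.
result = jobs.submit(
    job_type="relative_fe_uaa",
    input_data={
        "sequence_a": "YGH",
        "sequence_b": "<pF-Phe>GH",
        "ph": 7.0,
        # Inputs left out (simulation lengths, platform, ...) use their defaults.
        "protocol_repeats": 3,
    },
)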

From the CLI

Pass the inputs as a JSON string.

opal jobs submit --job-type relative_fe_uaa \
  --input-data '{"sequence_a": "YGH", "sequence_b": "<pF-Phe>GH", "ph": 7.0, "protocol_repeats": 3}'
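Because the JSON payload contains double quotes and angle brackets, hand-quoting it for the shell is error prone. The sketch below builds the same command from a Python dict using only the flags shown above; `build_cli_command` is a hypothetical helper, and the example payload uses the placeholder form with `uaa_map`.

```python
import json
import shlex


def build_cli_command(input_data):
    """Return a shell-safe `opal jobs submit` command line (hypothetical helper)."""
    payload = json.dumps(input_data)
    # shlex.join quotes each argument so the shell passes the JSON through intact.
    return shlex.join(
        ["opal", "jobs", "submit",
         "--job-type", "relative_fe_uaa",
         "--input-data", payload]
    )


example_input = {
    "sequence_a": "GFG",
    "sequence_b": "GxG",
    "uaa_map": {"x": "pF-Phe"},
    "protocol_repeats": 3,
}
command = build_cli_command(example_input)
print(command)
```

The printed line can be pasted into a POSIX shell as is. Quoting rules differ on Windows shells, so treat this as a sketch for bash-like environments.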

Reading the result

The main output is dg_solvation, the free energy difference between the two peptides in water, in kcal/mol, reported with its unit in dg_solvation_unit. The result also includes uncertainty and uncertainty_unit, the error estimate on that number, and smiles, the pair of peptides that were compared, given as the molecular structures they resolved to.

Increasing protocol_repeats generally lowers the uncertainty. The result also carries replica_transition_statistics, a record of how well the simulation mixed between its stages, which helps you judge whether the run was reliable. If keep_dirs is on, the full simulation outputs can be downloaded from results_file.
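One way to read these fields together is sketched below. It assumes the result is available as a plain dict keyed by the field names above and that uncertainty is a single number, neither of which this page confirms. `summarize` is a hypothetical helper, the two-times-uncertainty threshold is a common rule of thumb rather than something the tool applies, and the numbers are made up for illustration.

```python
def summarize(result):
    """Format the headline number and flag whether it stands out from its error."""
    dg = result["dg_solvation"]
    err = result["uncertainty"]
    unit = result["dg_solvation_unit"]
    if result.get("uncertainty_unit", unit) != unit:
        raise ValueError("value and uncertainty are in different units")
    # Rule of thumb (assumption): a difference smaller than twice its
    # uncertainty is hard to distinguish from zero.
    resolved = abs(dg) > 2 * err
    return f"{dg:.2f} +/- {err:.2f} {unit}", resolved


# Illustrative values only, not real output from the tool.
example_result = {
    "dg_solvation": -1.25,
    "dg_solvation_unit": "kcal/mol",
    "uncertainty": 0.30,
    "uncertainty_unit": "kcal/mol",
}
text, resolved = summarize(example_result)
print(text, "(resolved)" if resolved else "(within noise)")
```

If the difference is within noise, longer prod_length or more protocol_repeats are the levers this page offers before drawing conclusions.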

Notes

Keep the simulation lengths short for a quick first run. For reliable numbers, use a longer prod_length and at least 3 repeats. This tool runs on a GPU, and runtime grows with the simulation lengths and the number of repeats. Keep the two sequences close to each other: if the peptides are too dissimilar, the morph from one to the other is not well defined and the result may not be reliable.
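To compare settings before spending credits, the simulated time per replica can be estimated from the inputs. This is back-of-the-envelope arithmetic under two assumptions: every repeat runs both equilibration and production, and cost scales roughly with simulated time. The number of replicas is not documented here, so the figure is per replica only, and `sampled_ns_per_replica` is a hypothetical helper.

```python
def sampled_ns_per_replica(equil_length=0.08, prod_length=0.4, protocol_repeats=3):
    """Simulated nanoseconds per replica, summed over all repeats."""
    return (equil_length + prod_length) * protocol_repeats


default_budget = sampled_ns_per_replica()
longer_budget = sampled_ns_per_replica(prod_length=0.8)
print(f"defaults: {default_budget:.2f} ns, doubled production: {longer_budget:.2f} ns")
print(f"relative cost: {longer_budget / default_budget:.2f}x")
```

Doubling prod_length costs less than doubling the total, because the equilibration length stays fixed.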