Protein Mutation ΔΔG Fold
Protein Mutation ΔΔG Fold predicts how a single amino acid substitution changes a protein's folding stability, reported as the change in folding free energy, ΔΔG_fold, in kcal/mol.
The sign convention is that a negative ΔΔG_fold means the mutation makes the fold more stable, and a positive value means it makes the fold less stable.
It works through a two-leg thermodynamic cycle. The same mutation is simulated twice: once in the fully folded protein, and once in a short, extended reference peptide that stands in for the unfolded state. That reference peptide is built from the local sequence around the mutation site, by default the three residues on each side. ΔΔG_fold is the free energy change in the folded leg minus the free energy change in the unfolded leg. Both legs use the same simulation engine and force field so that the comparison is fair.
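The window selection for the reference peptide can be sketched in a few lines of Python. This is an illustration only, not the tool's implementation: the function name `reference_window` is hypothetical, the site index is 0-based into a one-letter sequence string, and clipping the window at the chain termini is an assumption (the source does not say how sites near the ends are handled). Terminal capping is not modelled here.

```python
def reference_window(sequence, site_index, flank_size=3):
    """Return the local sequence window around a 0-based mutation site.

    Hypothetical helper: keeps flank_size residues on each side of the site.
    Assumption: the window is clipped when the site is close to a terminus.
    """
    if flank_size < 0:
        raise ValueError("flank_size must be non-negative")
    if not 0 <= site_index < len(sequence):
        raise IndexError("site_index is outside the sequence")
    start = max(0, site_index - flank_size)                 # clip at the N-terminus
    end = min(len(sequence), site_index + flank_size + 1)   # clip at the C-terminus
    return sequence[start:end]


# Illustrative sequence, not taken from any real input.
example_sequence = "MKTAYIAKQRQISFVK"
print(reference_window(example_sequence, site_index=5))                # 3 + 1 + 3 residues
print(reference_window(example_sequence, site_index=5, flank_size=0))  # the site alone
```

With the default flank of 3 the window holds 7 residues, and a flank of 0 leaves only the mutated residue.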
Alongside the natural twenty amino acids you can also mutate to any of fourteen unnatural amino acids: Aib, Sar, dA, dF, dW, dY, hPhe, Hyp, mePhe, meS, meT, Nle, Orn, and Phe_4F.
When to use it
Use it when you want to know whether a point mutation will stabilise or destabilise a protein, for example when engineering a more stable variant of an enzyme or antibody, or when exploring the effect of an unnatural amino acid at a specific site. In version one of this tool, the input protein must be a single chain. Multi-chain structures where the mutation is ambiguous are rejected.
Inputs
| Input | Required | What it is |
|---|---|---|
| protein_pdb | yes | The input protein structure as a PDB file. Single chain only in this version. |
| mutations | no | A comma-separated list of mutations, each written as CHAIN:RESID:TARGET, for example A:78:V. An unnatural amino acid target goes in square brackets, for example A:78:[Aib]. Provide either this or mutant_chain_helm, not both. |
| mutant_chain_helm | no | The full mutated chain written in HELM2 notation. The tool finds the mutations by aligning this sequence against the input PDB chain position by position. Provide either this or mutations, not both. |
| mutant_chain_id | no, default A | Which chain in the PDB the mutant_chain_helm sequence matches. Ignored when you use mutations directly. |
| unfolded_flank_size | no, default 3 | How many residues to keep on each side of the mutation site when building the unfolded reference peptide. The default of 3 gives a 7-residue window. A value of 0 collapses the peptide to a single capped residue. |
| unfolded_engine | no, default feflow | The engine for the unfolded leg. feflow is recommended and shares its method with the folded leg. rfe_legacy is the older path, kept as a fallback. |
| unfolded_relax_ns | no, default 0.005 ns | How long to relax each reference peptide before the free energy ramp, in nanoseconds. This lets the peptide settle into a random coil shape. |
| unfolded_relax_implicit_solvent | no, default true | If true, relaxes the reference peptide in a fast, simplified (implicit) water model. If false, uses explicit water, which is slower but matches the published quantitative protocol. |
| unfolded_relax_restrained_nvt_ns | no, default 0.0 ns | Length of an extra relaxation phase that holds the backbone fixed while the rest settles, in nanoseconds. Skipped by default. |
| unfolded_relax_unrestrained_nvt_ns | no, default 0.0 ns | Length of a further relaxation phase with the backbone free, run after the restrained phase, in nanoseconds. Skipped by default. |
| mode_preset | no, default smoke | A single knob that sets several fields at once. smoke keeps your own values for a quick test. aldeghi_2019_quantitative switches everything to the longer published protocol for accurate numbers. |
| random_seed | no, default 42 | A fixed seed so that a rerun gives the same result. |
| equil_length_ns | no, default 5.0 ns | Equilibration length per endpoint, in nanoseconds. The small value 0.005 is used for quick plumbing tests. |
| n_neq_switches_per_direction | no, default 50 | How many non-equilibrium switching runs to do in each direction. More switches reduce the statistical noise in the estimate. |
| protocol_repeats | no, default 3, minimum 1 | Number of independent repeats used for the uncertainty estimate. More repeats make that estimate more reliable. |
Longer simulation lengths and more repeats and switches give more reliable numbers, but cost more runtime and credits.
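To make the mutations format concrete, here is a minimal sketch of a client-side check you could run before submitting a job. It is not the tool's own parser: `parse_mutations` and `Mutation` are hypothetical names, and the sketch assumes one-letter codes for natural targets (as in the A:78:V example), a single-character chain ID, and no PDB insertion codes. The unnatural names are the fourteen listed earlier on this page.

```python
import re
from typing import NamedTuple

NATURAL = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: one-letter natural targets
UNNATURAL = {"Aib", "Sar", "dA", "dF", "dW", "dY", "hPhe", "Hyp",
             "mePhe", "meS", "meT", "Nle", "Orn", "Phe_4F"}


class Mutation(NamedTuple):
    chain: str
    resid: int
    target: str
    unnatural: bool


# CHAIN:RESID:TARGET, where TARGET is a letter or a bracketed unnatural name.
_PATTERN = re.compile(r"^([A-Za-z0-9]):(-?\d+):(?:\[(\w+)\]|([A-Z]))$")


def parse_mutations(text):
    """Parse a comma-separated mutations string into Mutation tuples."""
    parsed = []
    for item in text.split(","):
        match = _PATTERN.match(item.strip())
        if match is None:
            raise ValueError(f"not in CHAIN:RESID:TARGET form: {item!r}")
        chain, resid, bracketed, plain = match.groups()
        target = bracketed or plain
        allowed = UNNATURAL if bracketed else NATURAL
        if target not in allowed:
            raise ValueError(f"unknown target {target!r} in {item!r}")
        parsed.append(Mutation(chain, int(resid), target, bracketed is not None))
    return parsed


for mutation in parse_mutations("A:78:V, A:78:[Aib]"):
    print(mutation)
```

Catching a malformed string locally is cheaper than discovering it after a job has been queued, though the tool's own validation remains the authority on what it accepts.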
How to run it
Submit your protein and mutation from Azulene Studio, the Python SDK, or the CLI. New here? The Get started page walks through installing, logging in, and running a ready-made example first.
In Azulene Studio
Open Protein Mutation ΔΔG Fold from the tools list. On the Inputs and Parameters step, upload your protein PDB and enter the mutation you want, for example A:6:F. Optionally adjust the simulation settings or pick a mode_preset, then choose Review and Submit.
From the Python SDK
```python
from opal import jobs

# Submit one point mutation (chain A, residue 6, to F) with three repeats.
result = jobs.submit(
    job_type="protein_mutation_ddg_fold",
    input_data={
        "protein_pdb": "/path/to/your/protein.pdb",
        "mutations": "A:6:F",
        "protocol_repeats": 3,
    },
)
```

From the CLI
Pass the inputs as a JSON string. The protein PDB path is uploaded for you.

```bash
opal jobs submit --job-type protein_mutation_ddg_fold \
  --input-data '{"protein_pdb": "/path/to/your/protein.pdb", "mutations": "A:6:F", "protocol_repeats": 3}'
```

Reading the result
The main output is ddG_fold, the change in folding free energy on mutation, in kcal/mol, with its error estimate in sigma_total. A negative ddG_fold means the mutation makes the protein more stable, and a positive value means it makes the protein less stable.
The result also reports the two halves of the cycle separately. dG_folded, with its error sigma_folded and unit dG_folded_unit, is the free energy change for the mutation inside the folded protein. dG_unfolded, with its error sigma_unfolded and unit dG_unfolded_unit, is the same change measured on the short reference peptide that stands in for the unfolded state. The ddG_fold is the folded value minus the unfolded value.
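The arithmetic behind these fields can be illustrated with a short sketch. The subtraction follows the definition above. Everything else is an assumption made for illustration: combining the two leg errors in quadrature presumes they are independent, and the tool's own sigma_total may be computed differently, and the one-sigma cut-off used to call a result inconclusive is an arbitrary choice. The function names and the numbers are hypothetical, not real output.

```python
import math


def combine_legs(dg_folded, dg_unfolded, sigma_folded, sigma_unfolded):
    """Return (ddG_fold, combined error) from the two legs of the cycle."""
    ddg_fold = dg_folded - dg_unfolded  # folded value minus unfolded value
    # Assumption: independent leg errors, combined in quadrature.
    sigma = math.hypot(sigma_folded, sigma_unfolded)
    return ddg_fold, sigma


def interpret(ddg_fold, sigma):
    """Apply the sign convention, treating |ddG| <= sigma as inconclusive."""
    if abs(ddg_fold) <= sigma:  # assumption: one-sigma threshold
        return "within uncertainty"
    return "stabilising" if ddg_fold < 0 else "destabilising"


# Made-up values in kcal/mol, for illustration only.
ddg, sigma = combine_legs(dg_folded=-1.5, dg_unfolded=-0.5,
                          sigma_folded=0.3, sigma_unfolded=0.4)
print(f"ddG_fold = {ddg:.2f} +/- {sigma:.2f} kcal/mol: {interpret(ddg, sigma)}")
```

Checking the magnitude of ddG_fold against its error before reading the sign helps avoid over-interpreting a value that is indistinguishable from zero.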
Keep the default smoke preset and short simulation lengths for a quick first run. For numbers accurate enough to compare against published data, pick the aldeghi_2019_quantitative preset, which sets longer simulations, more switches, and an explicit water model for the reference peptide. This tool runs on a GPU, and runtime grows with the simulation lengths, the number of switches, and the number of repeats. In this version the input protein must be a single chain.