Covalent Docking

Covalent docking places a small molecule into a protein active site and forms a real chemical bond between the molecule and a chosen residue. A placed molecule with its bond formed is called a pose. This is different from ordinary docking, where the molecule is held in the pocket only by non-covalent attractions. Here the molecule has a reactive group, called a warhead, that forms a covalent bond with a target atom on a protein residue, such as the oxygen of a serine or the sulfur of a cysteine.

This tool builds poses with the bond formed, then scores each one with a three-stage scheme: it generates poses, rescores them with a classical physics model, and ranks them by a combined score. It recognises the common warhead classes, including boronic acids, acrylamides, nitriles, and sulfonyl fluorides.

Use it to study covalent inhibitors, the kind of drug molecule that bonds to its target instead of just binding loosely.

When to use it

Use it when your molecule is meant to form a covalent bond to a specific residue in the active site, and you want to know how it sits once that bond is in place and how favourable the fit looks. If your molecule does not form a bond and only sits in the pocket, use Docking instead. To dock several molecules one after another into the same site, mixing covalent and non-covalent steps, use Sequential Docking.

Inputs

Input	Required	What it is
`structure_file`	yes	Protein structure file in PDB or CIF format.
`drug_smiles`	yes	SMILES string of the covalent inhibitor. It must contain a reactive warhead.
`chain_id`	yes	Which protein chain holds the target residue, for example `A`.
`target_resname`	yes	Three-letter name of the residue the molecule bonds to. One of `SER`, `CYS`, `LYS`, `THR`, `HIS`, `TYR`.
`target_resid`	yes	Sequence number of the target residue in the chain.
`target_atom`	no	Name of the reactive atom on that residue, for example `OG` for serine or `SG` for cysteine. Worked out from the residue name if left blank.
`covalent_element`	no, default `B`	The warhead element on the ligand: `B` boronic acid, `C` acrylamide or nitrile, `S` sulfonyl, `P` phosphonate, `N` nitrogen.
`warhead_smarts`	no	A custom SMARTS pattern to find the warhead. Mark the reactive atom with the `:1` atom map.
`protonate`	no, default `true`	Sets the protonation state at the chosen pH. Turn off if your input is already prepared.
`ph`	no, default `7.4`	pH used to decide the protonation state. Allowed range `0.0` to `14.0`.
`n_conformers`	no, default `1`	Number of 3D shapes of the ligand to try. Higher helps flexible molecules but costs more runtime. Allowed range `1` to `10`.
`placement`	no, default `combined`	How the ligand is placed before bonding. One of `combined` (most thorough), `directed`, or `tetrahedral`.
`max_steps`	no, default `700`	Largest number of minimisation steps per orientation. Allowed range `100` to `2000`.
`keep_cofactors`	no	JSON list of cofactor residue names to keep, for example `["ZN", "HEM"]`.
`extra_chains`	no	JSON list of extra protein chain IDs to include, for example `["B", "C"]`.
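
The allowed values and ranges in the table can be checked on your side before you submit a job. The sketch below is one way to do that. `validate_inputs` is a hypothetical helper, not part of the opal SDK, and it is an assumption that the service applies the same checks.

```python
# Hypothetical pre-submission check. The limits are copied from the Inputs table.
ALLOWED_RESNAMES = {"SER", "CYS", "LYS", "THR", "HIS", "TYR"}
ALLOWED_ELEMENTS = {"B", "C", "S", "P", "N"}
ALLOWED_PLACEMENTS = {"combined", "directed", "tetrahedral"}
REQUIRED = ("structure_file", "drug_smiles", "chain_id", "target_resname", "target_resid")
RANGES = {"ph": (0.0, 14.0), "n_conformers": (1, 10), "max_steps": (100, 2000)}
CHOICES = {
    "target_resname": ALLOWED_RESNAMES,
    "covalent_element": ALLOWED_ELEMENTS,
    "placement": ALLOWED_PLACEMENTS,
}


def validate_inputs(input_data):
    """Return a list of problems found in input_data (empty if none)."""
    problems = []
    # Every required input must be present.
    for key in REQUIRED:
        if key not in input_data:
            problems.append(f"missing required input: {key}")
    # Inputs with a fixed set of choices must use one of them.
    for key, allowed in CHOICES.items():
        if key in input_data and input_data[key] not in allowed:
            problems.append(f"{key}={input_data[key]!r} is not one of {sorted(allowed)}")
    # Numeric inputs must fall inside their allowed range.
    for key, (low, high) in RANGES.items():
        if key in input_data and not low <= input_data[key] <= high:
            problems.append(f"{key}={input_data[key]} is outside {low} to {high}")
    return problems
```

An empty list means the inputs fit the table. Anything else lists what to fix before submitting.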

How to run it

Submit your own protein and inhibitor from Azulene Studio, the Python SDK, or the CLI. New here? The Get started page walks through installing, logging in, and running a ready-made example first.

In Azulene Studio

Open Covalent Docking from the tools list. On the Inputs and Parameters step, upload your protein structure file, enter the inhibitor SMILES, and set the chain ID and the target residue (name and number). Pick the warhead element if it is not a boronic acid, and adjust the sampling settings if you want. When you are done, Review and Submit.

From the Python SDK

from opal import jobs

result = jobs.submit(
    job_type="covalent_docking",
    input_data={
        "structure_file": "/path/to/your/protein.pdb",
        "drug_smiles": "OB(O)c1ccccc1",
        "chain_id": "A",
        "target_resname": "SER",
        "target_resid": 70,
        "target_atom": "OG",
        "covalent_element": "B",
    },
)
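
Optional inputs go into the same `input_data` dictionary. The sketch below sets up a cysteine target with an acrylamide warhead and a few optional settings. The residue number `145` and the SMILES are placeholders, and `build_cys_acrylamide_inputs` is a hypothetical helper. The table describes `keep_cofactors` as a JSON list, so the sketch encodes it with `json.dumps`. Whether the SDK also accepts a plain Python list is not stated here, so treat that part as an assumption.

```python
import json


def build_cys_acrylamide_inputs(structure_file, resid):
    """Inputs for an acrylamide bonding to a cysteine sulfur (illustrative values)."""
    return {
        "structure_file": structure_file,
        "drug_smiles": "C=CC(=O)Nc1ccccc1",   # placeholder acrylamide
        "chain_id": "A",
        "target_resname": "CYS",
        "target_resid": resid,
        "target_atom": "SG",                  # sulfur of cysteine
        "covalent_element": "C",              # acrylamide warhead bonds through carbon
        "n_conformers": 5,                    # more shapes for a flexible ligand
        "placement": "combined",
        # Assumption: the JSON list is passed as a JSON-encoded string.
        "keep_cofactors": json.dumps(["ZN"]),
    }


input_data = build_cys_acrylamide_inputs("/path/to/your/protein.pdb", 145)

# Submit exactly as in the example above:
# from opal import jobs
# result = jobs.submit(job_type="covalent_docking", input_data=input_data)
```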

From the CLI

Pass the inputs as a JSON string.

opal jobs submit --job-type covalent_docking \
  --input-data '{"structure_file": "/path/to/your/protein.pdb", "drug_smiles": "OB(O)c1ccccc1", "chain_id": "A", "target_resname": "SER", "target_resid": 70, "target_atom": "OG", "covalent_element": "B"}'
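
Hand-written JSON inside shell quotes is easy to get wrong. As a sketch, the same command can be assembled in Python, with `json.dumps` producing the JSON and `shlex.join` handling the quoting. Only the flags shown in the command above are used.

```python
import json
import shlex

input_data = {
    "structure_file": "/path/to/your/protein.pdb",
    "drug_smiles": "OB(O)c1ccccc1",
    "chain_id": "A",
    "target_resname": "SER",
    "target_resid": 70,
    "target_atom": "OG",
    "covalent_element": "B",
}

# Each list item is one shell argument, so the JSON stays a single argument.
command = [
    "opal", "jobs", "submit",
    "--job-type", "covalent_docking",
    "--input-data", json.dumps(input_data),
]

# Print a correctly quoted command line that can be pasted into a shell.
print(shlex.join(command))
```

To run the command directly instead of printing it, pass the list to `subprocess.run(command, check=True)`.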

Reading the result

The headline number is the composite score in kcal/mol, which is the classical physics score of the best pose. A more negative composite score means a better-fitting pose. Next to it is the covalent distance in Angstroms (`final_distance`), the distance between the warhead atom and the target atom in the final pose. A bond is considered formed when this distance is around 2.0 Angstroms or less, so this number confirms that the warhead actually reached the target atom.
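
The two headline numbers can be read together with a small check like the one below. This is a sketch: `summarise_headline` is a hypothetical helper, the example values are made up, and the 2.0 Angstrom cut-off is the rough guide from the text, not an exact limit.

```python
# Rough threshold from the text ("around 2.0 Angstroms or less").
BOND_CUTOFF_ANGSTROM = 2.0


def summarise_headline(composite_kcal_mol, final_distance, cutoff=BOND_CUTOFF_ANGSTROM):
    """Describe the composite score and covalent distance in one line."""
    status = "bond formed" if final_distance <= cutoff else "bond NOT formed"
    return f"{composite_kcal_mol:.2f} kcal/mol, {final_distance:.2f} A, {status}"


# Made-up values for illustration, not real output from a job.
print(summarise_headline(-8.3, 1.6))
print(summarise_headline(-8.3, 3.4))
```

A good composite score with a distance well above the cut-off suggests the warhead did not reach the target atom, so the pose should not be read as a bonded one.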

Below the headline numbers is a ranked table of the top poses. Each row has a rank (numbered from 0), a pose label, and that pose’s composite score in kcal/mol (`top_k_composite_kcal_mol`). The poses are also drawn as a bar chart of composite score per pose. You can sort the table, colour it by score, and export it to CSV. The downloaded result contains the docked structures as `pose_0.pdb`, `pose_1.pdb`, and so on.
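
The ranked table can be rebuilt from the scores if you want to work with it in Python. The sketch below assumes `top_k_composite_kcal_mol` is a list of scores in rank order and that rank `i` matches the file `pose_i.pdb`. Both are assumptions about the result layout, and `rank_table` is a hypothetical helper.

```python
def rank_table(top_k_composite_kcal_mol):
    """Build one row per pose: rank, assumed file name, score, and gap to the best score."""
    if not top_k_composite_kcal_mol:
        return []
    best = min(top_k_composite_kcal_mol)  # more negative means a better fit
    return [
        {
            "rank": rank,                      # ranks are numbered from 0
            "file": f"pose_{rank}.pdb",        # assumed rank-to-file mapping
            "composite_kcal_mol": score,
            "gap_to_best": score - best,       # 0.0 for the best pose
        }
        for rank, score in enumerate(top_k_composite_kcal_mol)
    ]


# Made-up scores for illustration.
for row in rank_table([-9.1, -8.4, -7.0]):
    print(row)
```

A small gap between the first two poses means the ranking between them is not clear-cut, so it is worth looking at both structures.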

When the job also runs the OPAL machine learning rescore step, the result instead carries a `poses` list, where each pose has a `dg_kcal_mol` value and a pose label, and a headline `best_dg_kcal_mol`. This is the same machine learning binding energy described in OPAL ML Score.
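
Code that reads results may need to handle both shapes. The sketch below assumes the result is a dictionary keyed by the field names mentioned in this section, and that a more negative `dg_kcal_mol` is better, as with the composite score. `headline_score` is a hypothetical helper.

```python
def headline_score(result):
    """Return (score_name, value) for either result shape."""
    if "poses" in result:
        # Machine learning rescore ran: prefer the reported headline value.
        if "best_dg_kcal_mol" in result:
            return "dg_kcal_mol", result["best_dg_kcal_mol"]
        # Otherwise fall back to the most negative value in the poses list.
        return "dg_kcal_mol", min(pose["dg_kcal_mol"] for pose in result["poses"])
    # Classical result: the best composite score is the most negative one.
    return "composite_kcal_mol", min(result["top_k_composite_kcal_mol"])


# Made-up results for illustration.
classical = {"top_k_composite_kcal_mol": [-9.1, -8.4]}
rescored = {"poses": [{"dg_kcal_mol": -7.2}, {"dg_kcal_mol": -6.5}], "best_dg_kcal_mol": -7.2}
print(headline_score(classical))
print(headline_score(rescored))
```

Returning the score name with the value keeps the two kinds of score from being compared with each other by mistake.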

Notes

Make sure the SMILES really contains a reactive warhead; otherwise the molecule has nothing to bond with. Pick the target residue carefully: the residue name, its sequence number, and the chain all have to match your structure. For flexible molecules, raise `n_conformers` so more shapes are tried. Keep any structural cofactors, such as a zinc ion, with `keep_cofactors` so the active site keeps its real shape.
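
A quick way to confirm that the chain, residue name, and residue number match your structure is to look the residue up in the file before submitting. The sketch below reads the fixed columns of PDB `ATOM` and `HETATM` records. It does not handle CIF files, and `find_residue_atoms` is a hypothetical helper, not part of the tool.

```python
def find_residue_atoms(pdb_text, chain_id, resname, resid):
    """Return the atom names of the matching residue, or [] if it is absent."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 26:
            continue
        # PDB fixed columns (1-based): atom name 13-16, residue name 18-20,
        # chain 22, residue number 23-26.
        try:
            line_resid = int(line[22:26])
        except ValueError:
            continue  # skip records with an unreadable residue number
        if (line[21] == chain_id
                and line[17:20].strip() == resname
                and line_resid == resid):
            atoms.append(line[12:16].strip())
    return atoms


# Usage, with the values from the examples above:
# with open("/path/to/your/protein.pdb") as handle:
#     atoms = find_residue_atoms(handle.read(), "A", "SER", 70)
# if "OG" not in atoms:
#     print("SER 70 on chain A with atom OG was not found")
```

An empty list means the three values do not point to a residue in the file. A list without the expected reactive atom, such as `OG` or `SG`, means `target_atom` needs checking.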