OpenFold3 Structure Prediction
OpenFold3 predicts the full 3D structure of a molecular complex from sequence. It is an all-atom model, which means it places every atom in 3D, and it is open source. On standard benchmarks it sits within experimental error of AlphaFold3 on single-chain proteins, and it is the only open model that matches AlphaFold3 on RNA.
It can fold proteins, RNA, DNA, and small-molecule ligands together in one run. You describe what you want to fold by listing chains inside a single request object. Each chain says what kind of molecule it is and gives its sequence. By default the output structure is written as a CIF file.
When to use it
Use it when you want a high-quality 3D model of a complex and you prefer an open model, or when your system includes RNA, where OpenFold3 is especially strong. It is a good general-purpose choice for folding proteins on their own, protein complexes, protein and nucleic acid complexes, and protein and ligand complexes.
Inputs
| Input | Required | What it is |
|---|---|---|
| request | yes | A single object describing what to fold. It holds a list of chains (each one a protein, rna, dna, or ligand), plus optional bonded_atom_pairs (explicit covalent links between atoms), templates, and runtime settings. Each chain has an id, a type, and a sequence. The runtime block controls options such as num_diffusion_samples (how many candidate structures to generate) and output_format (cif by default). For the power user route, set mode: "raw_af3_json" and pass a file directly to skip the normal input checks. |
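As a rough illustration of the request shape described above, the sketch below assembles a two-chain request (one protein, one RNA) as a plain Python dictionary and runs a few client-side checks before it is sent anywhere. The helper `build_request` is hypothetical and not part of any SDK, the unique-id check is an assumption rather than a documented rule, and the RNA sequence is an invented placeholder.

```python
import json

# Chain types named in the Inputs table.
ALLOWED_TYPES = {"protein", "rna", "dna", "ligand"}


def build_request(chains, num_diffusion_samples=5, output_format="cif"):
    """Hypothetical helper: wrap a list of chains in the request layout.

    The checks here are a local convenience only; they do not replace
    whatever validation the service itself performs.
    """
    seen_ids = set()
    for chain in chains:
        missing = {"id", "type", "sequence"} - chain.keys()
        if missing:
            raise ValueError(f"chain is missing fields: {sorted(missing)}")
        if chain["type"] not in ALLOWED_TYPES:
            raise ValueError(f"unsupported chain type: {chain['type']!r}")
        # Assumption: chain ids should be unique within one request.
        if chain["id"] in seen_ids:
            raise ValueError(f"duplicate chain id: {chain['id']!r}")
        seen_ids.add(chain["id"])
    return {
        "request": {
            "mode": "json",
            "chains": list(chains),
            "runtime": {
                "num_diffusion_samples": num_diffusion_samples,
                "output_format": output_format,
            },
        }
    }


payload = build_request(
    [
        {
            "id": "A",
            "type": "protein",
            "sequence": "MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAK",
        },
        # Placeholder RNA sequence, for illustration only.
        {"id": "B", "type": "rna", "sequence": "GGGAAACUUCGGUUUCCC"},
    ]
)

# The same object serialized to a JSON string.
payload_json = json.dumps(payload)
```

Building the dictionary in one place keeps the chain list easy to extend when you move from a single chain to a complex.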
How to run it
Submit your sequences from Azulene Studio, the Python SDK, or the CLI. New here? The Get started page walks through installing, logging in, and running a ready-made example first.
In Azulene Studio
Open OpenFold3 Structure Prediction from the tools list. On the Inputs and Parameters step, add each chain you want to fold by choosing its type and pasting its sequence. Adjust the run settings if you want, then continue to Review and Submit.
From the Python SDK
```python
from opal import jobs

result = jobs.submit(
    job_type="openfold3_prediction",
    input_data={
        "request": {
            "mode": "json",
            "chains": [
                {
                    "id": "A",
                    "type": "protein",
                    "sequence": "MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAK",
                }
            ],
            "runtime": {
                "num_diffusion_samples": 5,
                "output_format": "cif",
            },
        }
    },
)
```
From the CLI
Pass the inputs as a JSON string.
```bash
opal jobs submit --job-type openfold3_prediction \
  --input-data '{"request": {"mode": "json", "chains": [{"id": "A", "type": "protein", "sequence": "MNIFEMLRIDEGLRLKIYKDTEGYYTIGIGHLLTKSPSLNAAK"}], "runtime": {"num_diffusion_samples": 5, "output_format": "cif"}}}'
```
Reading the result
The main output is the predicted 3D structure of your complex, written by default as a CIF file (a plain text format that lists every atom and its position). You can download it from the Files tab of the result and open it in any molecular viewer. If you asked for several diffusion samples, the run produces several candidate structures.
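For a quick sanity check on a downloaded structure without opening a viewer, something like the sketch below can count atoms per chain. It assumes the file uses the common mmCIF layout, where atoms are listed in an `_atom_site` loop that includes a `label_asym_id` column, and that no value in that loop contains spaces. The function name and the tiny embedded sample are illustrative only; for real analysis a dedicated parser such as gemmi or Biopython is the safer choice.

```python
from collections import Counter


def count_atoms_per_chain(cif_text: str) -> Counter:
    """Count rows of the _atom_site loop, grouped by label_asym_id.

    Minimal sketch: splits rows on whitespace, so it assumes no quoted
    values with embedded spaces inside the atom table.
    """
    columns = []
    counts = Counter()
    in_atom_site = False
    for raw in cif_text.splitlines():
        line = raw.strip()
        if line.startswith("_atom_site."):
            # Column header, e.g. "_atom_site.label_asym_id".
            columns.append(line.split(".", 1)[1])
            in_atom_site = True
        elif in_atom_site:
            # A blank line, comment, or new category ends the atom table.
            if not line or line.startswith(("#", "loop_", "_", "data_")):
                break
            fields = line.split()
            if len(fields) == len(columns):
                counts[fields[columns.index("label_asym_id")]] += 1
    return counts


# Tiny hand-written sample in mmCIF style, for illustration only.
SAMPLE_CIF = """data_example
loop_
_atom_site.group_PDB
_atom_site.id
_atom_site.type_symbol
_atom_site.label_atom_id
_atom_site.label_comp_id
_atom_site.label_asym_id
_atom_site.label_seq_id
_atom_site.Cartn_x
_atom_site.Cartn_y
_atom_site.Cartn_z
ATOM 1 N N MET A 1 0.000 0.000 0.000
ATOM 2 C CA MET A 1 1.458 0.000 0.000
ATOM 3 N N GLY B 1 5.000 5.000 5.000
#
"""

atoms_by_chain = count_atoms_per_chain(SAMPLE_CIF)
```

If the chain ids and atom counts match what you submitted, the file is at least structurally what you expected.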
OpenFold3 also reports confidence scores that tell you how much to trust the prediction. In plain words:
- A per-residue confidence score, usually called pLDDT, says how sure the model is about the position of each part of a chain. Higher means more confident. High-confidence regions are usually well folded, while low-confidence regions are often flexible or uncertain.
- A predicted aligned error, usually called PAE, says how confident the model is about the position of one part of the structure relative to another. Lower error means the relative placement of two regions, for example two chains, is more trustworthy.
- An overall and an interface confidence score, usually called pTM and ipTM, summarize the whole structure and the part where chains meet, each on a 0 to 1 scale where higher is better.
The full set of scores and any extra files are available to download from the result. The exact names of each score in the downloaded data come from OpenFold3 itself.
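To make the scores above concrete, here is a minimal sketch of two common summaries: listing low-confidence residues from per-residue pLDDT values, and averaging PAE over the block that links two chains. It works on plain Python lists because the field names in the downloaded data are not specified here. The function names, the toy numbers, the 0 to 100 pLDDT scale, and the cutoff of 70 are all assumptions chosen for illustration.

```python
from statistics import mean


def low_confidence_residues(plddt, threshold=70.0):
    """Return 0-based indices of residues whose pLDDT is below threshold.

    Assumes pLDDT is on a 0 to 100 scale; the cutoff is a convention,
    not a value taken from OpenFold3.
    """
    return [i for i, score in enumerate(plddt) if score < threshold]


def mean_interchain_pae(pae, chain_ids, chain_a, chain_b):
    """Average PAE over all residue pairs that span chain_a and chain_b.

    pae is a square matrix (list of lists) and chain_ids gives the chain
    label of each row/column. Both directions (a to b, b to a) are
    included because PAE is not symmetric in general.
    """
    rows = [i for i, c in enumerate(chain_ids) if c == chain_a]
    cols = [j for j, c in enumerate(chain_ids) if c == chain_b]
    values = [pae[i][j] for i in rows for j in cols]
    values += [pae[j][i] for i in rows for j in cols]
    return mean(values)


# Toy data: two residues in chain A, two in chain B.
plddt = [92.0, 88.5, 61.0, 45.5]
chain_ids = ["A", "A", "B", "B"]
pae = [
    [0.5, 1.0, 8.0, 9.0],
    [1.0, 0.5, 7.0, 10.0],
    [8.5, 7.5, 0.5, 1.5],
    [9.5, 10.5, 1.5, 0.5],
]

flagged = low_confidence_residues(plddt)
interface_pae = mean_interchain_pae(pae, chain_ids, "A", "B")
```

In this toy case each chain is internally consistent (low PAE within a chain) while the placement of A relative to B is much less certain, which is the pattern to look for when judging whether a predicted interface can be trusted.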
A single protein chain is enough for a first run. Add more chains, of any supported type, to fold a complex. OpenFold3 runs on a GPU, and runtime grows with the number and length of the chains and the number of diffusion samples you ask for.