Cookbook
Warning
🚧 This page is under construction! We are still adding recipes for several Tasks. 🚧
Chai-1
Chai-1 is a high-accuracy, ligand-aware protein folding tool. It takes a FASTA file as input and outputs a directory of folded PDB structures. Chai-1 scores are output as .npz files.
Simple
Fold a protein from a sequence
ribbon.Chai1(
fasta_file = 'my_sequence.fasta', # A single input FASTA. If there are multiple sequences, they will be folded in the same structure.
output_dir = './out' # Where the outputs will be stored
).run()
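If your sequences live in Python rather than on disk, you can write the input FASTA first. The following is a minimal sketch using only the standard library. `write_fasta`, the chain names, and the sequences are hypothetical and not part of ribbon, and the header format that Chai-1 expects is not specified on this page, so adjust the names as needed.

```python
from pathlib import Path

def write_fasta(records, path, width=80):
    """Write {name: sequence} pairs to a FASTA file and return its path."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        # Split long sequences into fixed-width lines, as is conventional for FASTA.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    path = Path(path)
    path.write_text("\n".join(lines) + "\n")
    return path

# Hypothetical two-chain input (placeholder names and sequences).
fasta_path = write_fasta(
    {"chain_A": "MKTAYIAKQRQISFVKSHFSRQ", "chain_B": "GSHMLEDPVAGK"},
    "my_sequence.fasta",
)
```

The returned path can then be passed as `fasta_file`. Because both records are in one file, they would be folded in the same structure.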
Advanced
Fold a protein from a sequence, including two copies of a ligand
ribbon.Chai1(
fasta_file = 'my_sequence.fasta', # A single input FASTA. If there are multiple sequences, they will be folded in the same structure.
output_dir = './out', # Where the outputs will be stored
smiles_string = 'C1=CC=CC=C1', # SMILES string of our ligand
num_ligands = 2, # How many copies of our ligand?
device = 'gpu' # Run on GPU (necessary for Chai-1)
).run()
LigandMPNN
LigandMPNN is a ligand-aware tool for designing sequences for a backbone structure. It takes a list of PDB files as input and outputs one or more sequences that are predicted to fold into that structure. Although LigandMPNN is GPU-accelerated, it seems to run quickly on the CPU as well.
Simple
Design a sequence for a backbone:
ribbon.LigandMPNN(
structure_list = ['my_structure.pdb'], # List of PDB files
output_dir = './out', # Output directory
num_designs = 5 # How many sequences should we generate?
).run()
The output folder will have the following structure:
output_dir/
├─ backbones/ # Backbone structures with labeled AAs (but no sidechains)
├─ packed/ # (Optional) Backbone structures with packed sidechains
├─ seqs/ # A single FASTA containing the reference sequence and all designed sequences
└─ seqs_split/ # Individual FASTA files for each designed sequence
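To work with the designs in Python, you can read back the files under seqs_split/. This is a rough sketch that assumes only the folder layout shown above. `read_fasta` and `collect_designs` are hypothetical helpers, not ribbon functions, and no particular file extension is assumed.

```python
from pathlib import Path

def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records, header, chunks = [], None, []
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

def collect_designs(output_dir):
    """Gather every record found under output_dir/seqs_split/."""
    designs = []
    split_dir = Path(output_dir) / "seqs_split"
    # Every regular file in the folder is treated as FASTA.
    for fasta in sorted(p for p in split_dir.iterdir() if p.is_file()):
        designs.extend(read_fasta(fasta))
    return designs
```

After a run has finished, `collect_designs('./out')` would return one `(header, sequence)` pair per record found.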
Advanced
Design a homodimeric sequence, keeping crucial residues fixed.
This example uses the extra_args parameter to add extra parameters to your run command.
Note that this can inject arbitrary code into your container, so use it with caution!
ligandmpnn_task = ribbon.LigandMPNN(
structure_list = ['my_structure.pdb'], # List of PDB files
output_dir = './out', # Output directory
num_designs = 5, # How many sequences should we generate?
extra_args = '--fixed_residues "' + RESIDUES_TO_KEEP + '" --homo_oligomer 1' # Make sure to keep my catalytic residues, and make the two chains identical.
)
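Because extra_args is spliced into the run command, it helps to quote any value you do not fully control. The sketch below assumes that ribbon hands the string to a POSIX shell, which this page does not confirm. `build_extra_args` is a hypothetical helper, and the residue labels are placeholders.

```python
import shlex

# Placeholder residue selection: space-separated chain letter + residue number.
RESIDUES_TO_KEEP = "A12 A45 A101"

def build_extra_args(fixed_residues, homo_oligomer=True):
    """Assemble an extra_args string with the residue list shell-quoted."""
    # shlex.quote keeps the whole residue list as a single shell token.
    parts = ["--fixed_residues", shlex.quote(fixed_residues)]
    if homo_oligomer:
        parts += ["--homo_oligomer", "1"]
    return " ".join(parts)

extra_args = build_extra_args(RESIDUES_TO_KEEP)
print(extra_args)  # --fixed_residues 'A12 A45 A101' --homo_oligomer 1
```

If the assumption holds, shell metacharacters in the residue list stay inside the quoted argument instead of being run as commands.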
RaptorX-Single
RaptorX-Single is a fast protein folding tool. It can fold small structures in as little as 5 seconds, after an initial loading period. It takes as input a FASTA file (or directory containing FASTAs), and outputs a directory of folded PDB structures.
Simple
Fold a directory of FASTA files
ribbon.RaptorXSingle(
fasta_file_or_dir = './my_FASTA_directory/',
output_dir = './out'
).run()
Advanced
Fold a directory of FASTA files using a non-default model checkpoint (param). Run on the CPU (slower; GPU is default).
ribbon.RaptorXSingle(
fasta_file_or_dir = './my_FASTA_directory/',
output_dir = './out',
param = 'RaptorX-Single-ESM1b-ESM1v-ProtTrans-Ab.pt',
device='cpu'
).run()
The available model parameters are:
'RaptorX-Single-ESM1b.pt',
'RaptorX-Single-ESM1v.pt',
'RaptorX-Single-ProtTrans.pt',
'RaptorX-Single-ESM1b-ESM1v-ProtTrans.pt',
'RaptorX-Single-ESM1b-Ab.pt',
'RaptorX-Single-ESM1v-Ab.pt',
'RaptorX-Single-ProtTrans-Ab.pt',
'RaptorX-Single-ESM1b-ESM1v-ProtTrans-Ab.pt'
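A misspelled checkpoint name is easy to miss, so you may want to check param before starting a run. This is a small sketch built from the list above. `RAPTORX_PARAMS` and `check_param` are hypothetical names, not part of ribbon.

```python
# Checkpoint names copied from the list above.
RAPTORX_PARAMS = (
    "RaptorX-Single-ESM1b.pt",
    "RaptorX-Single-ESM1v.pt",
    "RaptorX-Single-ProtTrans.pt",
    "RaptorX-Single-ESM1b-ESM1v-ProtTrans.pt",
    "RaptorX-Single-ESM1b-Ab.pt",
    "RaptorX-Single-ESM1v-Ab.pt",
    "RaptorX-Single-ProtTrans-Ab.pt",
    "RaptorX-Single-ESM1b-ESM1v-ProtTrans-Ab.pt",
)

def check_param(param):
    """Return param unchanged if it is a listed checkpoint, else raise ValueError."""
    if param not in RAPTORX_PARAMS:
        options = ", ".join(RAPTORX_PARAMS)
        raise ValueError(f"Unknown checkpoint {param!r}; expected one of: {options}")
    return param
```

You could then write `param = check_param('RaptorX-Single-ESM1b-Ab.pt')` when building the task, so a typo fails before any container starts.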