Germinal

Germinal is a generative framework for designing high-affinity antibodies against specific protein epitopes. Unlike general-purpose protein design tools, Germinal is optimized specifically for antibody-format binders: it designs functional complementarity-determining regions (CDRs) onto user-specified frameworks while preserving favorable therapeutic developability profiles. On Tamarind Bio, Germinal uses AbLang2 as its primary language model for guidance, enabling high-performance sequence optimization without commercial licensing restrictions.

Methodology

Germinal employs a three-stage pipeline that jointly optimizes antibody structure and sequence, balancing structure prediction confidence against antibody sequence likelihood.

1. Hallucination (Design Stage)

The design stage inverts AlphaFold-Multimer to sample sequences that bind desired epitopes with high predicted confidence. Simultaneously, AbLang2 sequence likelihoods bias sampling toward naturally occurring antibody sequences. This dual-objective optimization navigates a trade-off between structural confidence and antibody naturalness. The hallucination proceeds through three phases:

Logits phase: Initial sequence exploration with gradually increasing language model influence
Softmax phase: Temperature annealing to refine sequence selection
Semi-greedy phase: Final sequence commitment with strong language model guidance
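
The temperature annealing in the softmax phase can be pictured with a toy example. The sketch below is not Germinal's code: the logits, the temperature values, and the function name `softmax_with_temperature` are illustrative assumptions. It only shows how lowering the temperature sharpens a per-position amino acid distribution toward a single choice.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities; a lower temperature gives a sharper distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for three candidate residues at one CDR position.
position_logits = [2.0, 1.0, 0.5]

# Hypothetical annealing schedule: high temperature first, low temperature last.
for temperature in (2.0, 1.0, 0.1):
    probs = softmax_with_temperature(position_logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

At the highest temperature the three options remain close in probability; at the lowest, nearly all probability mass sits on the top-scoring residue.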

Custom loss functions ensure realistic antibody binding conformations:

Paratope loss: Ensures binding occurs through CDR regions rather than framework residues
α-helix loss: Prevents CDRs from forming rigid helical structures
β-strand loss: Encourages flexible loop conformations characteristic of functional antibodies
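
As a rough intuition for what the paratope loss rewards, the sketch below computes the fraction of a binder's antigen-contacting residues that fall inside CDRs. It is a post-hoc check written for illustration, not the differentiable loss used during hallucination. The function name, the contact positions, and the choice of 0-based half-open ranges are assumptions; the CDR ranges follow from the example VHH lengths given under Custom Framework Settings.

```python
def cdr_contact_fraction(contact_positions, cdr_ranges):
    """Fraction of contacting binder positions that lie within CDR ranges.

    contact_positions: 0-based binder residue indices in contact with the antigen.
    cdr_ranges: list of (start, end) half-open index ranges, one per CDR.
    """
    contacts = set(contact_positions)
    if not contacts:
        return 0.0
    cdr_positions = set()
    for start, end in cdr_ranges:
        cdr_positions.update(range(start, end))
    return len(contacts & cdr_positions) / len(contacts)

# VHH layout implied by framework lengths 25,17,38,14 and CDR lengths 11,8,18.
example_cdrs = [(25, 36), (53, 61), (99, 117)]
# Hypothetical contacts: four inside CDRs and one in a framework region (index 70).
example_contacts = [27, 30, 101, 105, 70]
print(cdr_contact_fraction(example_contacts, example_cdrs))
```

A value near 1.0 means binding is mediated almost entirely by CDR residues, which is the behavior the paratope loss is meant to encourage.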

2. Sequence Optimization

Sequences passing initial structural filters proceed to AbMPNN (antibody-specific ProteinMPNN), which redesigns CDR residues not in direct contact with the antigen. This improves binder stability while preserving the binding interface.

3. Filtering & Validation

Final designs are co-folded using Chai-1 to provide an independent structural assessment. Strict confidence thresholds (ipTM, pLDDT) identify candidates with the highest probability of experimental success.

Continuous Generation

Each Germinal job operates as a continuous loop that repeatedly generates and evaluates designs until one accepted design is found. Each iteration:

1. Generates a new candidate through hallucination
2. Applies initial structural filters
3. Runs AbMPNN sequence optimization on passing candidates
4. Applies final Chai-1 confidence filters
5. Completes the job if the design passes all filters; otherwise, loops back to step 1

This means a single job may run many iterations internally before finding an accepted design. The failure_counts.csv output file tracks how many trajectories failed at each step, which is useful for understanding the acceptance rate for your specific target. To generate multiple designs, increase the Number of Designs parameter—this launches additional parallel jobs, each independently looping until it finds one accepted design.
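
The control flow of a single job can be sketched as follows. This is a schematic under stated assumptions, not Germinal's implementation: the four stage callables, the failure labels, and the `max_iterations` safety cap are all hypothetical, and the toy stand-ins replace the real structure prediction steps with simple numeric thresholds.

```python
from collections import Counter

def run_until_accepted(hallucinate, passes_initial, redesign, passes_final,
                       max_iterations=1000):
    """Loop over the pipeline stages until one design passes, tallying failures per stage."""
    failures = Counter()
    for _ in range(max_iterations):
        candidate = hallucinate()
        if not passes_initial(candidate):
            failures["initial_filters"] += 1
            continue
        redesigned = redesign(candidate)
        if not passes_final(redesigned):
            failures["final_filters"] += 1
            continue
        return redesigned, failures
    return None, failures  # cap reached without an accepted design

# Toy stand-ins: candidates are plain numbers and the filters are arbitrary thresholds.
scores = iter([0.2, 0.6, 0.9])
design, failure_counts = run_until_accepted(
    hallucinate=lambda: next(scores),
    passes_initial=lambda c: c > 0.5,
    redesign=lambda c: c,
    passes_final=lambda c: c > 0.8,
)
print(design, dict(failure_counts))
```

In this toy run, the first candidate fails the initial filters, the second fails the final filters, and the third is accepted, which mirrors the kind of per-stage tally that failure_counts.csv reports.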

Configuration

Required Settings

Setting	Type	Description
Task	`Dropdown`	VHH (Nanobody) or scFv (Single-Chain Variable Fragment)
Target PDB	`File`	Protein structure of your target (.pdb format)
Target Chain	`String`	Chain ID to design a binder against (e.g., `A`)

Design Parameters

Setting	Type	Default	Description
Hotspot Residues	`List`	Empty	Target residue numbers for the binder to focus on (e.g., `37,39,41,96,98`). Leave empty to allow the model to find optimal binding sites.
Framework	`Dropdown`	Generic	Generic: Use standard nanobody/scFv frameworks from Germinal. Custom: Upload your own framework structure.
Number of Designs	`Integer`	1	Number of parallel jobs to launch. Each job loops continuously until it finds one accepted design.
Omit Amino Acids	`String`	`C`	Amino acids to exclude from CDR designs (comma-separated). Cysteine is omitted by default to prevent unwanted disulfide bridges.
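
Before submitting a job, it can help to sanity-check the Hotspot Residues string locally. The helper below is a minimal sketch, not part of Germinal or Tamarind Bio; the function name `parse_hotspots` is hypothetical, and it assumes plain positive integer residue numbers (no insertion codes or negative numbering).

```python
def parse_hotspots(text):
    """Parse a comma-separated residue list such as '37,39,41,96,98' into sorted unique ints."""
    if not text.strip():
        return []  # empty input means no hotspots were specified
    residues = set()
    for token in text.split(","):
        token = token.strip()
        if not token.isdigit():
            raise ValueError(f"not a residue number: {token!r}")
        residues.add(int(token))
    return sorted(residues)

print(parse_hotspots("37,39,41,96,98"))
print(parse_hotspots(""))
```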

Custom Framework Settings

These settings appear only when Framework is set to Custom:

Setting	Type	Description
Framework Structure	`File`	Your antibody framework PDB. Germinal will only design CDR regions; framework residues remain fixed. The structure does not need to be in a bound pose.
Binder Chain	`String`	Chain ID of your framework to design CDRs for (e.g., `A`)
CDR Lengths	`String`	Comma-separated CDR region lengths. For VHH: `11,8,18` means HCDR1=11, HCDR2=8, HCDR3=18 residues. For scFv: provide 6 values for heavy then light chains (e.g., `8,8,13,6,6,9`).
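Framework Lengths	`String`	Comma-separated framework region lengths. For VHH: `25,17,38,14` (HFW1-4). For scFv: provide 7 values, e.g. `25,17,38,52,17,33,10`; with the heavy chain first, these are HFW1-3, then one value spanning HFW4 + linker + LFW1, then LFW2-4.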
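
Because incorrect length annotations cause design failures, it is worth checking the two lists against your framework sequence before uploading. The sketch below assumes the segments alternate framework, CDR, framework, and so on, which is consistent with the example values (one more framework value than CDR values); the function names are hypothetical.

```python
def expected_length(cdr_lengths, framework_lengths):
    """Total residues implied by alternating framework and CDR segments."""
    if len(framework_lengths) != len(cdr_lengths) + 1:
        raise ValueError("expected one more framework segment than CDR segments")
    return sum(cdr_lengths) + sum(framework_lengths)

def cdr_ranges(cdr_lengths, framework_lengths):
    """0-based half-open (start, end) index ranges for each CDR."""
    expected_length(cdr_lengths, framework_lengths)  # validates the segment counts
    ranges, position = [], 0
    for framework, cdr in zip(framework_lengths, cdr_lengths):
        position += framework
        ranges.append((position, position + cdr))
        position += cdr
    return ranges

vhh_cdrs, vhh_frameworks = [11, 8, 18], [25, 17, 38, 14]
scfv_cdrs, scfv_frameworks = [8, 8, 13, 6, 6, 9], [25, 17, 38, 52, 17, 33, 10]

print(expected_length(vhh_cdrs, vhh_frameworks), cdr_ranges(vhh_cdrs, vhh_frameworks))
print(expected_length(scfv_cdrs, scfv_frameworks))
```

For the example values, the VHH lengths imply a 131-residue binder and the scFv lengths imply 242 residues. If the total does not match the length of the chain in your framework PDB, one of the lists is likely wrong.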

scFv-Specific Settings (Custom Framework Only)

Setting	Type	Default	Description
VH First	`Boolean`	`true`	Whether the heavy chain appears first in the sequence
VH Length	`Integer`	—	Length of the variable heavy domain (include linker length)
VL Length	`Integer`	—	Length of the variable light domain

Advanced Settings

Setting	Type	Description
Use Rosetta	`Boolean`	Enable PyRosetta for additional biophysical scoring. Contact [email protected] if you have a license.

Best Practices

Epitope Selection Strategy

Germinal is well suited to designing binders that block specific protein-protein interactions (PPIs). For optimal results:

Provide hotspot residues corresponding to known functional interfaces to steer the design toward competitive binders
Focus on accessible epitopes: Surface-exposed residues with clear structural definition yield higher success rates
Consider epitope size: 3-8 hotspot residues typically provide sufficient guidance without over-constraining the design

Framework Selection

Generic frameworks are recommended for most use cases and have been validated across diverse targets
Custom frameworks enable designs on proprietary scaffolds with favorable developability profiles or humanization characteristics
When using custom frameworks, ensure accurate CDR and framework length annotations—incorrect values will cause design failures

Understanding Output Metrics

Designs are ranked by structural confidence scores from Chai-1 co-folding:

Metric	Threshold	Interpretation
ipTM	> 0.7	High confidence in predicted interface; strong indicator of binding potential
pLDDT (binder)	> 0.8	Well-folded CDR regions with defined structure
Interface contacts	Higher is better	More extensive interfaces correlate with tighter binding
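
The thresholds above can be applied to your own metric tables when triaging designs. The snippet below is an illustrative sketch: the dictionary keys `iptm` and `binder_plddt`, the design names, and the values are made up, and the actual column names in Germinal's CSV outputs may differ.

```python
def passes_thresholds(metrics, min_iptm=0.7, min_binder_plddt=0.8):
    """True when both confidence scores strictly exceed the thresholds in the table."""
    return metrics["iptm"] > min_iptm and metrics["binder_plddt"] > min_binder_plddt

# Hypothetical metric records for illustration only.
designs = [
    {"name": "design_a", "iptm": 0.82, "binder_plddt": 0.91},
    {"name": "design_b", "iptm": 0.65, "binder_plddt": 0.88},
    {"name": "design_c", "iptm": 0.75, "binder_plddt": 0.79},
    {"name": "design_d", "iptm": 0.90, "binder_plddt": 0.85},
]

# Keep passing designs and rank them by interface confidence, highest first.
kept = sorted((d for d in designs if passes_thresholds(d)),
              key=lambda d: d["iptm"], reverse=True)
print([d["name"] for d in kept])
```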

Amino Acid Omission

By default, cysteine (C) is excluded from CDR designs to prevent:

Unwanted disulfide bridge formation
Protein aggregation issues
Oxidation-related instability

You can omit additional amino acids (e.g., C,M to also exclude methionine) based on your expression system or stability requirements.
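
To see which residues remain available for CDR design under a given omission string, a small helper like the one below can be used. It is a sketch for local checking only; the function names are hypothetical and it assumes one-letter codes for the 20 standard amino acids.

```python
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

def parse_omitted(text):
    """Parse a string such as 'C,M' into a set of one-letter codes, rejecting non-standard ones."""
    omitted = {token.strip().upper() for token in text.split(",") if token.strip()}
    unknown = omitted - STANDARD_AMINO_ACIDS
    if unknown:
        raise ValueError(f"not standard one-letter codes: {sorted(unknown)}")
    return omitted

def allowed_residues(text):
    """Standard amino acids still permitted after applying the omission string."""
    return sorted(STANDARD_AMINO_ACIDS - parse_omitted(text))

print(len(allowed_residues("C")), len(allowed_residues("C,M")))
```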

Experimental Validation

In experimental testing, Germinal achieved success rates of roughly 4–22% (binders found per designs tested) across diverse protein targets, including:

Target	Type	Designs Tested	Binders Found	Best Affinity
PD-L1	Immune checkpoint	101	7	170 nM
IL3	Cytokine	46	2	560 nM
IL20	Cytokine	43	4	190 nM
BHRF1	Viral protein	52	11	140 nM

In these campaigns, testing 43 to 101 designs per target yielded between 2 and 11 binders, with best affinities of 140–560 nM. This makes Germinal practical for low-throughput experimental validation without requiring large-scale screening campaigns.
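
The per-target success rates follow directly from the table. The short calculation below reproduces them from the tabulated counts; the pooled rate across all four targets is an additional summary computed here, not a figure reported in the source.

```python
# (designs tested, binders found) per target, copied from the table above.
results = {"PD-L1": (101, 7), "IL3": (46, 2), "IL20": (43, 4), "BHRF1": (52, 11)}

success_rates = {
    target: 100.0 * found / tested for target, (tested, found) in results.items()
}
for target, rate in success_rates.items():
    print(f"{target}: {rate:.1f}%")

total_tested = sum(tested for tested, _ in results.values())
total_found = sum(found for _, found in results.values())
pooled_rate = 100.0 * total_found / total_tested
print(f"pooled: {total_found}/{total_tested} = {pooled_rate:.1f}%")
```

The rates come out to about 6.9% (PD-L1), 4.3% (IL3), 9.3% (IL20), and 21.2% (BHRF1).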

Output Format

Germinal generates an organized output directory containing all designs and their associated metrics:

runs/your_target_nb_20240101_120000/
├── final_config.yaml           # Complete run configuration
├── trajectories/               # Designs that passed hallucination but failed initial filters
│   ├── structures/             # PDB files
│   ├── plots/                  # Visualization plots
│   └── designs.csv             # Design metrics
├── redesign_candidates/        # Designs that were AbMPNN-redesigned but failed final filters
│   ├── structures/          
│   └── designs.csv           
├── accepted/                   # Designs that passed all filters (your top candidates)
│   ├── structures/          
│   └── designs.csv           
├── all_trajectories.csv        # Master CSV with all designs across all stages
└── failure_counts.csv          # Summary of trajectories failing at each step

Key Output Files

File	Description
`accepted/structures/*.pdb`	Final antibody-antigen complex structures for passing designs—these are your top candidates for experimental testing
`accepted/designs.csv`	Metrics and sequences for all accepted designs
`all_trajectories.csv`	Complete list of all designs with their metrics, pipeline stage reached, and structure file paths
`failure_counts.csv`	Diagnostic summary showing where designs failed in the pipeline

The all_trajectories.csv file is particularly useful for understanding design quality across the full run, as it contains in silico metrics for every design that passed the hallucination stage, regardless of whether it was ultimately accepted.
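
A downloaded run directory can be inspected with the Python standard library. The sketch below relies only on the paths shown in the layout above; the helper names are hypothetical, and the CSV column names used in the throwaway demo directory are placeholders rather than Germinal's actual headers.

```python
import csv
import tempfile
from pathlib import Path

def accepted_structures(run_dir):
    """Sorted paths of the PDB files under accepted/structures/."""
    return sorted(Path(run_dir).glob("accepted/structures/*.pdb"))

def read_rows(csv_path):
    """Read a CSV file into a list of dicts keyed by its header row."""
    with open(csv_path, newline="") as handle:
        return list(csv.DictReader(handle))

# Demo on a temporary directory that mimics part of the layout above.
with tempfile.TemporaryDirectory() as tmp:
    run_dir = Path(tmp)
    (run_dir / "accepted" / "structures").mkdir(parents=True)
    (run_dir / "accepted" / "structures" / "design_0.pdb").write_text("END\n")
    # Placeholder header and row; check the real file for its actual columns.
    (run_dir / "failure_counts.csv").write_text("stage,count\nexample_stage,3\n")
    structure_names = [path.name for path in accepted_structures(run_dir)]
    failure_rows = read_rows(run_dir / "failure_counts.csv")

print(structure_names, failure_rows)
```

For a real run, point `run_dir` at the downloaded output folder and inspect the keys of the first row to discover the available columns.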

Limitations

Target size: Memory constraints favor smaller proteins; large targets should be truncated to regions of interest
Protein epitopes only: Currently limited to protein targets (glycans, small molecules, and nucleic acids are not supported)
Computational cost: Each design iteration requires structure prediction and backpropagation, making generation computationally intensive
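
Truncating a large target to a region of interest can be done with a few lines of Python. The helper below is a minimal sketch, not a Tamarind Bio utility: the function names are hypothetical, it relies on the fixed-width columns of PDB ATOM/HETATM records, and it ignores insertion codes and alternate locations. Residue numbers are left unchanged, so hotspot residues chosen on the full structure still refer to the same positions.

```python
def truncate_pdb(pdb_text, chain_id, first_residue, last_residue):
    """Keep ATOM/HETATM records of one chain within an inclusive residue-number range."""
    kept = []
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        # Fixed-width PDB columns: chain ID is column 22, residue number is columns 23-26.
        if line[21] != chain_id:
            continue
        if first_residue <= int(line[22:26]) <= last_residue:
            kept.append(line)
    return "\n".join(kept + ["END"]) + "\n"

def _ca_line(serial, chain_id, residue_number):
    """Build a minimal, correctly aligned CA ATOM record for the demo."""
    return (f"ATOM  {serial:>5}  CA  ALA {chain_id}{residue_number:>4}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")

demo_pdb = "\n".join(
    [_ca_line(i, "A", 9 + i) for i in range(1, 6)]  # chain A, residues 10-14
    + [_ca_line(6, "B", 10)]                        # chain B, residue 10
)
truncated = truncate_pdb(demo_pdb, "A", 11, 13)
print(truncated)
```

When choosing the range, keep enough surrounding structure that the epitope remains in a realistic local environment.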
