Germinal is a generative framework for designing high-affinity antibodies against specific protein epitopes. Unlike general protein design tools, Germinal is optimized specifically for antibody-format binders: it designs functional complementarity-determining regions (CDRs) on user-specified frameworks while preserving favorable therapeutic developability profiles. On Tamarind Bio, Germinal uses AbLang2 as its primary language model for guidance, enabling high-performance sequence optimization without commercial licensing restrictions.

Methodology

Germinal employs a three-stage pipeline that jointly optimizes structure prediction confidence and antibody sequence likelihood.

1. Hallucination (Design Stage)

The design stage inverts AlphaFold-Multimer to sample sequences that bind desired epitopes with high predicted confidence. Simultaneously, AbLang2 sequence likelihoods bias sampling toward naturally occurring antibody sequences. This dual-objective optimization navigates a trade-off between structural confidence and antibody naturalness. The hallucination proceeds through three phases:
  • Logits phase: Initial sequence exploration with gradually increasing language model influence
  • Softmax phase: Temperature annealing to refine sequence selection
  • Semi-greedy phase: Final sequence commitment with strong language model guidance
Custom loss functions ensure realistic antibody binding conformations:
  • Paratope loss: Ensures binding occurs through CDR regions rather than framework residues
  • α-helix loss: Prevents CDRs from forming rigid helical structures
  • β-strand loss: Encourages flexible loop conformations characteristic of functional antibodies
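
To make the phase structure concrete, the three phases can be read as a schedule over the language-model weight and the softmax temperature. The sketch below is illustrative only and is not Germinal's implementation: the step counts, weights, temperatures, and the function names `schedule` and `total_loss` are all placeholder assumptions.

```python
# Hypothetical phase lengths in optimization steps; not Germinal's actual values.
LOGITS_STEPS = 50
SOFTMAX_STEPS = 30


def schedule(step: int) -> dict:
    """Return the phase, language-model weight, and softmax temperature for a step."""
    if step < LOGITS_STEPS:
        # Logits phase: language-model influence ramps linearly from 0 to 0.5.
        frac = step / (LOGITS_STEPS - 1)
        return {"phase": "logits", "lm_weight": 0.5 * frac, "temperature": 1.0}
    step -= LOGITS_STEPS
    if step < SOFTMAX_STEPS:
        # Softmax phase: temperature anneals geometrically from 1.0 to 0.01.
        frac = step / (SOFTMAX_STEPS - 1)
        return {"phase": "softmax", "lm_weight": 0.5, "temperature": 0.01 ** frac}
    # Semi-greedy phase: strong language-model guidance, near-discrete sequence.
    return {"phase": "semi_greedy", "lm_weight": 1.0, "temperature": 0.01}


def total_loss(structure_loss: float, lm_nll: float, lm_weight: float) -> float:
    """Combine a structure-confidence loss with a language-model negative log-likelihood."""
    return structure_loss + lm_weight * lm_nll
```

The additive form of `total_loss` is one simple way to express the trade-off between structural confidence and antibody naturalness; the actual objective may weight or combine the terms differently.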

2. Sequence Optimization

Sequences passing initial structural filters proceed to AbMPNN (antibody-specific ProteinMPNN), which redesigns CDR residues not in direct contact with the antigen. This improves binder stability while preserving the binding interface.
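
As a rough illustration of what "not in direct contact with the antigen" can mean, the sketch below selects CDR positions whose representative atom lies farther than a distance cutoff from every antigen atom. The 5 Å cutoff, the one-coordinate-per-residue simplification, and the function name `redesignable_positions` are assumptions for this example, not Germinal's documented criteria.

```python
import math


def redesignable_positions(cdr_coords, antigen_coords, cutoff=5.0):
    """Return CDR positions farther than `cutoff` angstroms from every antigen atom.

    cdr_coords maps a residue position to one (x, y, z) coordinate;
    antigen_coords is a list of (x, y, z) coordinates.
    """
    free = []
    for pos, xyz in sorted(cdr_coords.items()):
        in_contact = any(math.dist(xyz, atom) <= cutoff for atom in antigen_coords)
        if not in_contact:
            free.append(pos)
    return free
```

Positions returned by this function would be open to redesign, while contacting positions stay fixed to preserve the interface.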

3. Filtering & Validation

Final designs are co-folded using Chai-1 to provide an independent structural assessment. Strict confidence thresholds (ipTM, pLDDT) identify candidates with the highest probability of experimental success.

Continuous Generation

Each Germinal job operates as a continuous loop that repeatedly generates and evaluates designs until one accepted design is found. Each iteration:
  1. Generates a new candidate through hallucination
  2. Applies initial structural filters
  3. Runs AbMPNN sequence optimization on passing candidates
  4. Applies final Chai-1 confidence filters
  5. If the design passes all filters, the job completes; otherwise, it loops back to step 1
This means a single job may run many iterations internally before finding an accepted design. The failure_counts.csv output file tracks how many trajectories failed at each step, which is useful for understanding the acceptance rate for your specific target. To generate multiple designs, increase the Number of Designs parameter—this launches additional parallel jobs, each independently looping until it finds one accepted design.
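
The control flow of one job can be sketched as follows. This is a simplified model of the loop described above, not the actual Germinal code: the four callables are stand-ins for the pipeline stages, and `max_iters` is an added safety limit that the description does not mention.

```python
from collections import Counter


def run_job(hallucinate, passes_initial, redesign, passes_final, max_iters=1000):
    """Loop until one design passes every filter; return it with failure counts."""
    failures = Counter()
    for _ in range(max_iters):
        candidate = hallucinate()              # step 1: new candidate
        if not passes_initial(candidate):      # step 2: initial structural filters
            failures["initial_filters"] += 1
            continue
        redesigned = redesign(candidate)       # step 3: sequence optimization
        if not passes_final(redesigned):       # step 4: final confidence filters
            failures["final_filters"] += 1
            continue
        return redesigned, failures            # step 5: accepted, job completes
    return None, failures                      # hypothetical cap reached
```

The `failures` counter plays the role that failure_counts.csv plays in a real run: it records how many trajectories were rejected at each step before a design was accepted.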

Configuration

Required Settings

| Setting | Type | Description |
| --- | --- | --- |
| Task | Dropdown | VHH (Nanobody) or scFv (Single-Chain Variable Fragment) |
| Target PDB | File | Protein structure of your target (.pdb format) |
| Target Chain | String | Chain ID to design a binder against (e.g., A) |

Design Parameters

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| Hotspot Residues | List | Empty | Target residue numbers for the binder to focus on (e.g., 37,39,41,96,98). Leave empty to allow the model to find optimal binding sites. |
| Framework | Dropdown | Generic | Generic: Use standard nanobody/scFv frameworks from Germinal. Custom: Upload your own framework structure. |
| Number of Designs | Integer | 1 | Number of parallel jobs to launch. Each job loops continuously until it finds one accepted design. |
| Omit Amino Acids | String | C | Amino acids to exclude from CDR designs (comma-separated). Cysteine is omitted by default to prevent unwanted disulfide bridges. |

Custom Framework Settings

These settings appear only when Framework is set to Custom:
| Setting | Type | Description |
| --- | --- | --- |
| Framework Structure | File | Your antibody framework PDB. Germinal will only design CDR regions; framework residues remain fixed. The structure does not need to be in a bound pose. |
| Binder Chain | String | Chain ID of your framework to design CDRs for (e.g., A) |
| CDR Lengths | String | Comma-separated CDR region lengths. For VHH: 11,8,18 means HCDR1=11, HCDR2=8, HCDR3=18 residues. For scFv: provide 6 values for heavy then light chains (e.g., 8,8,13,6,6,9). |
| Framework Lengths | String | Comma-separated framework region lengths. For VHH: 25,17,38,14 (HFW1-4). For scFv: 25,17,38,52,17,33,10 (HFW1-4+linker+LFW1-4). |
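
Because incorrect length annotations cause design failures, it can help to check them against your framework sequence before submitting. The sketch below assumes that framework and CDR segments alternate along the sequence, starting and ending with a framework segment (so there is always one more framework value than CDR values); that interleaving and the function name `region_spans` are assumptions of this example.

```python
def region_spans(framework_lengths: str, cdr_lengths: str):
    """Return ([(label, start, end), ...], total_length) using 0-based, end-exclusive spans."""
    fw = [int(x) for x in framework_lengths.split(",")]
    cdr = [int(x) for x in cdr_lengths.split(",")]
    if len(fw) != len(cdr) + 1:
        raise ValueError("expected exactly one more framework length than CDR lengths")
    spans = []
    pos = 0
    for i, fw_len in enumerate(fw):
        spans.append((f"FW{i + 1}", pos, pos + fw_len))
        pos += fw_len
        if i < len(cdr):
            spans.append((f"CDR{i + 1}", pos, pos + cdr[i]))
            pos += cdr[i]
    return spans, pos


# VHH example values from the table above.
spans, total = region_spans("25,17,38,14", "11,8,18")
print(total)  # 131
```

Under this assumption, the VHH example lengths describe a 131-residue sequence and the scFv example lengths a 242-residue sequence; if the total does not match the length of your binder chain, the annotations need correcting.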

scFv-Specific Settings (Custom Framework Only)

| Setting | Type | Default | Description |
| --- | --- | --- | --- |
| VH First | Boolean | true | Whether the heavy chain appears first in the sequence |
| VH Length | Integer | | Length of the variable heavy domain (include linker length) |
| VL Length | Integer | | Length of the variable light domain |

Advanced Settings

| Setting | Type | Description |
| --- | --- | --- |
| Use Rosetta | Boolean | Enable PyRosetta for additional biophysical scoring. Contact [email protected] if you have a license. |

Best Practices

Epitope Selection Strategy

Germinal excels at blocking specific protein-protein interactions (PPIs). For optimal results:
  • Provide hotspot residues corresponding to known functional interfaces to steer the design toward competitive binders
  • Focus on accessible epitopes: Surface-exposed residues with clear structural definition yield higher success rates
  • Consider epitope size: 3-8 hotspot residues typically provide sufficient guidance without over-constraining the design
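
Before submitting, you may want to confirm that each hotspot number actually exists in the target chain and that the hotspot count falls in the suggested range. The sketch below is a local pre-check, not part of Germinal or Tamarind Bio; the function names are hypothetical. It reads the chain ID and residue number from the fixed columns of PDB ATOM records.

```python
def chain_residue_numbers(pdb_text: str, chain_id: str) -> set:
    """Collect residue numbers present in ATOM records of one chain."""
    numbers = set()
    for line in pdb_text.splitlines():
        # PDB fixed columns: chain ID is column 22, residue number is columns 23-26.
        if line.startswith("ATOM") and len(line) >= 26 and line[21] == chain_id:
            numbers.add(int(line[22:26]))
    return numbers


def parse_hotspots(text: str) -> list:
    """Parse a comma-separated hotspot string such as '37,39,41,96,98'."""
    return [int(tok) for tok in text.split(",") if tok.strip()]


def check_hotspots(hotspots: list, pdb_text: str, chain_id: str) -> list:
    """Return human-readable warnings; an empty list means no issues were found."""
    warnings = []
    present = chain_residue_numbers(pdb_text, chain_id)
    missing = [h for h in hotspots if h not in present]
    if missing:
        warnings.append(f"residues not found in chain {chain_id}: {missing}")
    if hotspots and not 3 <= len(hotspots) <= 8:
        warnings.append(f"{len(hotspots)} hotspots given; 3-8 is the suggested range")
    return warnings
```

An empty hotspot list produces no warnings, matching the option of leaving the field empty so the model chooses the binding site.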

Framework Selection

  • Generic frameworks are recommended for most use cases and have been validated across diverse targets
  • Custom frameworks enable designs on proprietary scaffolds with favorable developability profiles or humanization characteristics
  • When using custom frameworks, ensure accurate CDR and framework length annotations—incorrect values will cause design failures

Understanding Output Metrics

Designs are ranked by structural confidence scores from Chai-1 co-folding:
| Metric | Threshold | Interpretation |
| --- | --- | --- |
| ipTM | > 0.7 | High confidence in predicted interface; strong indicator of binding potential |
| pLDDT (binder) | > 0.8 | Well-folded CDR regions with defined structure |
| Interface contacts | Higher is better | More extensive interfaces correlate with tighter binding |
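
If you want to re-apply these thresholds to rows from an output CSV, a small filter like the one below may help. The column names `iptm` and `binder_plddt` are placeholders, since the actual column headers are not listed here, and sorting by ipTM is just one reasonable ranking choice.

```python
def passes_thresholds(design: dict, iptm_min: float = 0.7, plddt_min: float = 0.8) -> bool:
    """True when both confidence metrics exceed their thresholds."""
    return (float(design["iptm"]) > iptm_min
            and float(design["binder_plddt"]) > plddt_min)


def rank_designs(designs: list) -> list:
    """Keep passing designs and sort them by ipTM, highest first."""
    passing = [d for d in designs if passes_thresholds(d)]
    return sorted(passing, key=lambda d: float(d["iptm"]), reverse=True)
```

Values are converted with `float` so the functions work directly on strings read from a CSV file.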

Amino Acid Omission

By default, Cysteine (C) is excluded from CDR designs to prevent:
  • Unwanted disulfide bridge formation
  • Protein aggregation issues
  • Oxidation-related instability
You can omit additional amino acids (e.g., C,M to also exclude methionine) based on your expression system or stability requirements.
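
The effect of the omit string on the design alphabet can be illustrated as follows. This is a sketch of how such a setting could be parsed, with a hypothetical function name; it is not the parser Germinal uses.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard one-letter codes


def allowed_residues(omit: str = "C") -> str:
    """Return the residues still available for CDR design after omission."""
    omitted = {tok.strip().upper() for tok in omit.split(",") if tok.strip()}
    unknown = omitted - set(AMINO_ACIDS)
    if unknown:
        raise ValueError(f"not standard amino acid codes: {sorted(unknown)}")
    return "".join(aa for aa in AMINO_ACIDS if aa not in omitted)


print(allowed_residues("C,M"))  # ADEFGHIKLNPQRSTVWY
```

Each omitted residue shrinks the design space, so omit only what your expression system or stability requirements call for.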

Experimental Validation

Germinal achieves 4–22% experimental success rates across diverse protein targets, including:
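| Target | Type | Designs Tested | Binders Found | Best Affinity |
| --- | --- | --- | --- | --- |
| PD-L1 | Immune checkpoint | 101 | 7 | 170 nM |
| IL3 | Cytokine | 46 | 2 | 560 nM |
| IL20 | Cytokine | 43 | 4 | 190 nM |
| BHRF1 | Viral protein | 52 | 11 | 140 nM |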
These results demonstrate that testing 40–100 designs typically yields multiple nanomolar-affinity binders, making Germinal practical for low-throughput experimental validation without requiring large-scale screening campaigns.
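
The 4–22% range follows directly from the table when each row is read as designs tested, then binders found (101 and 7 for PD-L1, 46 and 2 for IL3, 43 and 4 for IL20, 52 and 11 for BHRF1). A short calculation, using only those numbers:

```python
# (designs tested, binders found) per target, taken from the table above.
RESULTS = {
    "PD-L1": (101, 7),
    "IL3": (46, 2),
    "IL20": (43, 4),
    "BHRF1": (52, 11),
}


def success_rates(results: dict) -> dict:
    """Return the percentage of tested designs that bound, per target."""
    return {target: 100.0 * hits / tested for target, (tested, hits) in results.items()}


for target, rate in success_rates(RESULTS).items():
    print(f"{target}: {rate:.1f}%")
# PD-L1: 6.9%
# IL3: 4.3%
# IL20: 9.3%
# BHRF1: 21.2%
```

The lowest rate (IL3) and the highest (BHRF1) give the quoted 4–22% range.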

Output Format

Germinal generates an organized output directory containing all designs and their associated metrics:
runs/your_target_nb_20240101_120000/
├── final_config.yaml           # Complete run configuration
├── trajectories/               # Designs that passed hallucination but failed initial filters
│   ├── structures/             # PDB files
│   ├── plots/                  # Visualization plots
│   └── designs.csv             # Design metrics
├── redesign_candidates/        # Designs that were AbMPNN-redesigned but failed final filters
│   ├── structures/          
│   └── designs.csv           
├── accepted/                   # Designs that passed all filters (your top candidates)
│   ├── structures/          
│   └── designs.csv           
├── all_trajectories.csv        # Master CSV with all designs across all stages
└── failure_counts.csv          # Summary of trajectories failing at each step

Key Output Files

| File | Description |
| --- | --- |
| accepted/structures/*.pdb | Final antibody-antigen complex structures for passing designs; these are your top candidates for experimental testing |
| accepted/designs.csv | Metrics and sequences for all accepted designs |
| all_trajectories.csv | Complete list of all designs with their metrics, pipeline stage reached, and structure file paths |
| failure_counts.csv | Diagnostic summary showing where designs failed in the pipeline |
The all_trajectories.csv file is particularly useful for understanding design quality across the full run, as it contains in silico metrics for every design that passed the hallucination stage, regardless of whether it was ultimately accepted.
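
A downloaded run directory can be inspected with the Python standard library alone. The sketch below relies only on the file layout shown above and assumes each CSV has a header row; it makes no assumption about specific column names, and `load_run` is a hypothetical helper name.

```python
import csv
from pathlib import Path


def load_run(run_dir):
    """Return (accepted design rows, accepted PDB paths, failure-count rows)."""
    run = Path(run_dir)
    with open(run / "accepted" / "designs.csv", newline="") as fh:
        accepted = list(csv.DictReader(fh))
    structures = sorted((run / "accepted" / "structures").glob("*.pdb"))
    with open(run / "failure_counts.csv", newline="") as fh:
        failure_rows = list(csv.DictReader(fh))
    return accepted, structures, failure_rows
```

Each row comes back as a dictionary keyed by the CSV header, so printing `accepted[0].keys()` is a quick way to see which metrics a given run reports.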

Limitations

  • Target size: Memory constraints favor smaller proteins; large targets should be truncated to regions of interest
  • Protein epitopes only: Currently limited to protein targets (glycans, small molecules, and nucleic acids are not supported)
  • Computational cost: Each design iteration requires structure prediction and backpropagation, making generation computationally intensive
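
For large targets, truncation to a region of interest can be done before upload. The sketch below keeps only the ATOM records of one chain within a residue range; it is a minimal example with a hypothetical function name, it drops HETATM and header records, and it does not check whether the cropped region remains structurally sensible on its own.

```python
def crop_pdb(pdb_text: str, chain_id: str, start: int, end: int) -> str:
    """Keep ATOM records of `chain_id` with residue numbers in [start, end]."""
    kept = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or len(line) < 26:
            continue
        # PDB fixed columns: chain ID is column 22, residue number is columns 23-26.
        if line[21] == chain_id and start <= int(line[22:26]) <= end:
            kept.append(line)
    kept.append("END")
    return "\n".join(kept) + "\n"
```

Residue numbering is left unchanged by this crop, so hotspot residue numbers chosen on the full structure still refer to the same residues.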