Computational biology / Research prototype

Boltz-2 Affinity Embedding Modeling

Per-target ligand affinity models built from Boltz-2 affinity-module embeddings and benchmarked against scalar model outputs.


Thumbnail artwork: Boltz protein structure render by jwohlwend/boltz, licensed under the MIT License. Displayed unmodified and scaled for layout.

Research Pipeline

Affinity Embedding Extraction

01 Extract
Run Boltz-2 to generate internal affinity-module embeddings rather than relying solely on the scalar output.
02 Baseline
Evaluate embeddings on ~900 typical ligand pairs with classification and regression models.
03 Augment
Append Ligand-Residue Interaction Profile Scoring Function (LRIP-SF) features to add structural information.
04 Generalize
Switch to peptide-target complexes to test predictor performance in an out-of-distribution domain.
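
Steps 02 and 03 can be sketched as follows. This is a minimal, standard-library illustration and not the project's code: the function names are hypothetical, and the 1-nearest-neighbour regressor is only a stand-in for whichever classification and regression models the pipeline actually uses.

```python
import math
import random


def augment(embedding, extra_features):
    """Step 03: append extra structural features (e.g. LRIP-SF) to an embedding."""
    return list(embedding) + list(extra_features)


def kfold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and deal them into n_folds nearly equal folds."""
    order = list(range(n_samples))
    random.Random(seed).shuffle(order)
    return [order[i::n_folds] for i in range(n_folds)]


def predict_1nn(train_x, train_y, query):
    """Stand-in regressor: return the label of the closest training vector."""
    best = min(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    return train_y[best]


def cross_validated_rmse(features, labels, n_folds=5, seed=0):
    """Step 02: pooled RMSE over held-out predictions from k-fold cross-validation."""
    squared_errors = []
    for held_out in kfold_indices(len(features), n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(features)) if i not in held]
        train_x = [features[i] for i in train_idx]
        train_y = [labels[i] for i in train_idx]
        for i in held_out:
            prediction = predict_1nn(train_x, train_y, features[i])
            squared_errors.append((prediction - labels[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Running cross_validated_rmse once on the raw embeddings and once on the augmented vectors, with the same seed, keeps the folds identical so the two scores can be compared directly.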

Problem

Published biomolecular models can expose useful internal representations, but it is not obvious whether those embeddings improve target-specific affinity prediction.

Approach

I built a Python pipeline that joins ULVSH labels, Boltz scalar outputs, docking features, and extracted affinity embeddings into comparable feature sets.
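
One way to picture the join is the sketch below. It assumes each source is available as a dictionary keyed by a shared ligand ID; the key names, the feature-set names, and both functions are hypothetical and chosen only for illustration. The point it shows is that every feature set is built from the same ligands and labels, which is what makes the sets comparable.

```python
def join_sources(labels, scalars, docking, embeddings):
    """Inner-join four per-ligand tables on their shared ligand ID."""
    shared = set(labels) & set(scalars) & set(docking) & set(embeddings)
    rows = {}
    for ligand_id in sorted(shared):
        rows[ligand_id] = {
            "label": labels[ligand_id],
            "scalar": [scalars[ligand_id]],        # single Boltz scalar output
            "docking": list(docking[ligand_id]),   # docking feature vector
            "embedding": list(embeddings[ligand_id]),
        }
    return rows


# Hypothetical feature-set definitions: each name maps to the blocks it concatenates.
FEATURE_SETS = {
    "scalar_only": ("scalar",),
    "embedding_only": ("embedding",),
    "embedding_plus_docking": ("embedding", "docking"),
}


def build_feature_set(rows, name):
    """Return (ligand IDs, feature matrix, labels) for one named feature set."""
    blocks = FEATURE_SETS[name]
    ids = sorted(rows)
    matrix = [[value for block in blocks for value in rows[i][block]] for i in ids]
    targets = [rows[i]["label"] for i in ids]
    return ids, matrix, targets
```

Ligands missing from any one source are dropped by the inner join, so a scalar-only baseline and an embedding model are always scored on exactly the same rows.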

Result

The pipeline writes target-specific datasets, manifests, cross-validation metrics, predictions, and saved model artifacts.
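
A per-target output step along these lines might look like the sketch below. The directory layout, file names, and manifest fields are assumptions made for illustration, not the project's actual schema, and saving the model artifact itself is left out.

```python
import json
from pathlib import Path


def write_run_outputs(out_dir, target, ligand_ids, fold_metrics, predictions):
    """Write metrics, predictions, and a manifest for one target (hypothetical layout)."""
    target_dir = Path(out_dir) / target
    target_dir.mkdir(parents=True, exist_ok=True)

    metrics_path = target_dir / "cv_metrics.json"
    metrics_path.write_text(json.dumps(fold_metrics, indent=2))

    predictions_path = target_dir / "predictions.json"
    predictions_path.write_text(json.dumps(predictions, indent=2))

    # The manifest records what was written so a later step can locate it.
    manifest = {
        "target": target,
        "n_ligands": len(ligand_ids),
        "files": {
            "cv_metrics": metrics_path.name,
            "predictions": predictions_path.name,
        },
    }
    manifest_path = target_dir / "manifest.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return manifest_path
```

Keeping one directory per target, each with its own manifest, lets downstream comparisons read a run's outputs without re-deriving file names.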