Computational biology / Research prototype

Boltz-2 Affinity Embedding Modeling

Per-target ligand affinity models built from Boltz-2 affinity-module embeddings and benchmarked against scalar model outputs.

Thumbnail artwork: Boltz protein structure render by jwohlwend/boltz, licensed under the MIT License. Displayed unmodified and scaled for layout.

Research Pipeline

Affinity Embedding Extraction

  1. Extract

    Run Boltz-2 and extract its internal affinity-module embeddings rather than relying solely on the scalar output.

  2. Baseline

    Evaluate embeddings on ~900 typical ligand pairs via classification and regression models.

  3. Augment

    Append Ligand-Residue Interaction Profile Scoring Function (LRIP-SF) features to add structural information.

  4. Generalize

    Switch to peptide-target complexes to test predictor performance in an out-of-distribution domain.
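
The Baseline step compares feature sets under cross-validation. The sketch below shows one way such a loop could look using only the standard library. The names `kfold_indices`, `cross_validate`, and `mean_baseline` are hypothetical, and the mean predictor is only a placeholder for whatever classification or regression model is actually fitted; it is not the project's model.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple

Row = Tuple[List[float], float]          # (feature vector, affinity label)
Model = Callable[[List[float]], float]   # maps features to a predicted label


def kfold_indices(n: int, k: int, seed: int = 0) -> List[List[int]]:
    """Shuffle indices 0..n-1 once and deal them into k near-equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root-mean-square error between labels and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mean_baseline(train: Sequence[Row]) -> Model:
    """Placeholder model (assumption): always predict the training-set mean."""
    mean = sum(y for _, y in train) / len(train)
    return lambda features: mean


def cross_validate(rows: Sequence[Row],
                   fit: Callable[[Sequence[Row]], Model],
                   k: int = 5,
                   seed: int = 0) -> List[float]:
    """Fit on k-1 folds, score RMSE on the held-out fold, once per fold."""
    scores = []
    for fold in kfold_indices(len(rows), k, seed):
        if not fold:  # happens only when k exceeds the number of rows
            continue
        held_out = set(fold)
        train = [row for i, row in enumerate(rows) if i not in held_out]
        test = [rows[i] for i in fold]
        model = fit(train)
        scores.append(rmse([y for _, y in test], [model(x) for x, _ in test]))
    return scores
```

Running the same `cross_validate` call with the same `seed` on a scalar-only feature set and on an embedding feature set keeps the folds identical, so the per-fold scores can be compared directly.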

Problem

Published biomolecular models can expose useful internal representations, but it is not obvious whether those embeddings improve target-specific affinity prediction.

Approach

I built a Python pipeline that joins ULVSH labels, Boltz-2 scalar outputs, docking features, and extracted affinity embeddings into comparable feature sets.
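
As an illustration of the join described above, the following sketch keys every source by ligand ID and concatenates the selected sources into one feature vector per labeled ligand. The function name `build_feature_set`, the source names, the ligand IDs, and all numeric values are made up for the example; the real pipeline's schema is not shown on this page.

```python
from typing import Dict, List, Tuple

Table = Dict[str, List[float]]  # ligand ID -> feature values


def build_feature_set(labels: Dict[str, float],
                      sources: Dict[str, Table],
                      use: List[str]) -> List[Tuple[str, List[float], float]]:
    """Inner-join the chosen sources with the labels on ligand ID.

    Returns (ligand ID, concatenated features, label) rows, sorted by ID,
    for ligands present in the labels and in every selected source.
    """
    shared = set(labels)
    for name in use:
        shared &= set(sources[name])
    return [
        (lig, [v for name in use for v in sources[name][lig]], labels[lig])
        for lig in sorted(shared)
    ]


# Toy data (hypothetical values, for illustration only).
labels = {"lig1": 6.5, "lig2": 5.1, "lig4": 7.0}
sources = {
    "boltz_scalar": {"lig1": [0.42], "lig2": [0.77]},
    "docking": {"lig1": [-7.1], "lig2": [-6.3], "lig3": [-5.0]},
    "embedding": {"lig1": [0.1, 0.2], "lig2": [0.3, 0.4]},
}

scalar_only = build_feature_set(labels, sources, ["boltz_scalar"])
combined = build_feature_set(labels, sources, ["boltz_scalar", "docking", "embedding"])
```

Because every feature set is built from the same inner join, swapping the `use` list changes only the columns, which is what makes the resulting sets comparable when they cover the same ligands.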

Result

The pipeline writes target-specific datasets, manifests, cross-validation metrics, predictions, and saved model artifacts.
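
One plausible shape for the per-target output is a directory of artifact files plus a manifest that records each file and its checksum. The sketch below assumes JSON-serializable artifacts and a `manifest.json` file name; the function `write_manifest`, the directory layout, the target name, and the metric values are hypothetical and not taken from the project.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def write_manifest(out_dir: Path, target: str, artifacts: Dict[str, Any]) -> Path:
    """Write each artifact to <out_dir>/<target>/<name>.json and index them.

    The manifest stores the file name and SHA-256 of every artifact so a
    later run can check that the files it reads are the ones that were written.
    """
    target_dir = out_dir / target
    target_dir.mkdir(parents=True, exist_ok=True)
    entries = {}
    for name, payload in artifacts.items():
        data = json.dumps(payload, sort_keys=True, indent=2).encode("utf-8")
        (target_dir / f"{name}.json").write_bytes(data)
        entries[name] = {
            "file": f"{name}.json",
            "sha256": hashlib.sha256(data).hexdigest(),
        }
    manifest_path = target_dir / "manifest.json"
    manifest_path.write_text(
        json.dumps({"target": target, "artifacts": entries}, indent=2),
        encoding="utf-8",
    )
    return manifest_path


# Usage with placeholder content in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = write_manifest(
        Path(tmp),
        "example_target",
        {"cv_metrics": {"rmse": [1.0, 2.0]}, "predictions": {"lig1": 6.4}},
    )
    manifest = json.loads(path.read_text(encoding="utf-8"))
```

Saved model artifacts are usually binary rather than JSON; they could be listed in the same manifest by hashing the bytes of the saved file instead.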