Train linear embedding adapters with triplet loss to align retrieval embeddings with your queries.
Open source by Santander AI Lab. An AI / machine learning Python library for retrieval-augmented generation (RAG). Part of Santander AI Open Source; see also santander.com.
LinearAdapterTrainer fine-tunes retrieval without retraining your embedding model. It learns a small linear transform that is applied to query embeddings at search time, nudging them closer to relevant chunks and away from irrelevant ones. Your vector index stays exactly as it is — you only adapt the query side.
It is built around two composable modules:
- Dataset generator: point it at a knowledge base and it produces (query, positive, negative) triplets, with configurable negative mining (semantically opposite, hard, or random) and a leakage-free train/val split.
- Linear adapter trainer: trains a PyTorch linear adapter on those triplets with triplet loss, then reports retrieval gains with precision@k, recall@k, MRR and nDCG.
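One plausible reading of "leakage-free" is that every query generated from a given chunk lands on the same side of the split, so validation never scores a chunk the adapter was trained to retrieve. The sketch below illustrates that idea only; `split_by_chunk` and the `(query, positive_id, negative_id)` tuple layout are hypothetical and not taken from the library.

```python
import random

def split_by_chunk(triplets, val_fraction=0.2, seed=0):
    """Split triplets so that no positive chunk id appears in both train and val.

    Assumption: each triplet is (query, positive_id, negative_id).
    Only the positive chunk is held out; negatives may come from anywhere.
    """
    chunk_ids = sorted({pos for _, pos, _ in triplets})
    rng = random.Random(seed)
    rng.shuffle(chunk_ids)
    n_val = max(1, round(len(chunk_ids) * val_fraction))
    val_ids = set(chunk_ids[:n_val])
    train = [t for t in triplets if t[1] not in val_ids]
    val = [t for t in triplets if t[1] in val_ids]
    return train, val

# Toy data: 5 chunks, 4 generated queries per chunk.
triplets = [
    (f"question {i} about chunk {c}", c, (c + 1) % 5)
    for c in range(5)
    for i in range(4)
]
train, val = split_by_chunk(triplets, val_fraction=0.2, seed=0)
train_chunks = {pos for _, pos, _ in train}
val_chunks = {pos for _, pos, _ in val}
```

Splitting on chunk ids rather than on individual triplets is what prevents near-duplicate queries about the same chunk from appearing in both sets.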
| Approach | Cost | Re-index corpus? | Reversible |
|---|---|---|---|
| Fine-tune the embedding model | High (GPU, data) | Yes | No |
| Linear query adapter (this repo) | Low (CPU-friendly) | No | Yes |
| Re-ranking model | Medium (latency) | No | Yes |
The adapter is a single matrix (initialized at identity), so training is fast,
stable, and easy to audit. At inference you simply do
adapted_query = adapter(query_embedding) before your usual nearest-neighbor
search.
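To make the identity initialization concrete, here is a minimal dependency-free sketch (plain lists instead of PyTorch tensors). The helper names `identity`, `apply_adapter` and `search` are illustrative assumptions, not the library's API. It shows that an untrained adapter leaves the query, and therefore the ranking, unchanged.

```python
import math

def identity(dim):
    """dim x dim identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

def apply_adapter(weight, vec):
    """Matrix-vector product: the whole adapter is this one linear map."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def search(query_vec, chunk_vecs, k=2):
    """Indices of the k chunks most similar to the query."""
    ranked = sorted(
        range(len(chunk_vecs)),
        key=lambda i: cosine(query_vec, chunk_vecs[i]),
        reverse=True,
    )
    return ranked[:k]

# Chunk embeddings stay untouched; only the query passes through the adapter.
chunks = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.6, 0.8, 0.0]]
query = [0.9, 0.1, 0.0]

W = identity(3)                      # state of the adapter before training
adapted = apply_adapter(W, query)
baseline_hits = search(query, chunks)
adapted_hits = search(adapted, chunks)
```

Training then moves `W` away from the identity only as far as the triplets justify, which is one reason the approach is easy to audit: the learned matrix can be compared directly against the identity.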
           Module 1: dataset                      Module 2: adapter
┌───────────────┐   ┌──────────────────┐      ┌────────────────────────┐
│ Knowledge base│──▶│ query generation │──┐   │ anchor = adapter(q)    │
│   (chunks)    │   │ negative mining  │  ├──▶│ triplet loss:          │
└───────────────┘   │ train/val split  │  │   │  pull anchor→positive  │
        │           └──────────────────┘  │   │  push anchor↮negative  │
        ▼                                 │   └────────────┬───────────┘
   embed chunks ────────────────────────────           evaluate
                                          precision@k · recall@k · MRR · nDCG
The triplet objective is:
L = max(0, d(adapter(query), positive) − d(adapter(query), negative) + margin)
where d is cosine (default) or Euclidean distance.
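As a worked instance of the objective, the sketch below evaluates the loss on toy two-dimensional vectors. It assumes cosine distance means `1 - cosine similarity` and uses an arbitrary margin of 0.2; neither value is taken from the library's defaults.

```python
import math

def cosine_distance(a, b):
    """Assumed definition: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2, distance=cosine_distance):
    """L = max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, distance(anchor, positive) - distance(anchor, negative) + margin)

anchor = [1.0, 0.0]     # stands in for adapter(query)
positive = [1.0, 0.0]   # relevant chunk
negative = [0.0, 1.0]   # irrelevant chunk

# Already well separated: d_pos = 0, d_neg = 1, so the loss clamps to 0.
satisfied = triplet_loss(anchor, positive, negative)

# Roles swapped: d_pos = 1, d_neg = 0, so the loss is 1 - 0 + 0.2.
violated = triplet_loss(anchor, negative, positive)

# Same triplet under Euclidean distance: 0 - sqrt(2) + 0.2 < 0, clamped to 0.
satisfied_l2 = triplet_loss(anchor, positive, negative, distance=euclidean_distance)
```

A triplet contributes gradient only while the positive is not yet closer than the negative by at least the margin, which is why hard negatives carry most of the training signal.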
# with uv (recommended)
uv sync --dev
# for the example notebook (scraping + plotting)
uv sync --group examples
# or with pip, choosing the backends you need
pip install "linear-adapter-trainer[sentence-transformers]" # local models
pip install "linear-adapter-trainer[openai]" # OpenAI API
pip install "linear-adapter-trainer[all]"
The core install is dependency-light (numpy, torch, tqdm). A
dependency-free HashingEmbedder and TemplateQueryGenerator let you run the
whole pipeline offline (great for CI and demos).
from linear_adapter_trainer import (
    AdapterTrainer, DatasetConfig, DatasetGenerator, KnowledgeBase,
    TemplateQueryGenerator, TrainingConfig,
)
from linear_adapter_trainer.embeddings import SentenceTransformerEmbedder
kb = KnowledgeBase.from_jsonl("examples/data/sample_kb.jsonl")
embedder = SentenceTransformerEmbedder("sentence-transformers/all-MiniLM-L6-v2")
# Module 1 — generate triplets
dataset = DatasetGenerator(
    knowledge_base=kb,
    embedder=embedder,
    query_generator=TemplateQueryGenerator(seed=0),  # or LLMQueryGenerator(...)
    config=DatasetConfig(queries_per_chunk=4, strategy="mixed", val_fraction=0.2),
).generate()
# Module 2 — train the adapter
result = AdapterTrainer(kb, embedder, TrainingConfig(epochs=30)).fit(dataset)
print(result.improvement) # delta per metric vs the base embeddings
result.adapter.save("adapter.pt")
At query time:
import numpy as np
from linear_adapter_trainer import LinearAdapter
adapter = LinearAdapter.load("adapter.pt")
query_vec = embedder.embed(["how do plants make energy?"])
adapted = adapter.transform(query_vec) # use this for nearest-neighbor search
Everything is driven by one TOML file (see examples/config.toml):
uv run linear-adapter generate examples/config.toml # build the dataset
uv run linear-adapter train examples/config.toml # train + report metrics
uv run linear-adapter evaluate examples/config.toml # base vs adapted
uv run linear-adapter run examples/config.toml # generate -> train
Example output (with a Sentence-Transformers backend on a paraphrased query set):
metric              base     adapted       delta
------------------------------------------------
precision@1       0.5200      0.7100     +0.1900
mrr               0.6310      0.8050     +0.1740
ndcg@10           0.6890      0.8420     +0.1530
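For readers unfamiliar with the reported metrics, the sketch below computes them for a single query using the standard binary-relevance definitions. It is an illustration under that assumption, not the library's evaluation code; the function names and the toy rankings are made up. Per-query values would be averaged over the validation set to produce a table like the one above.

```python
import math

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k results that are relevant."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k

def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant chunks found in the top-k results."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / len(relevant)

def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant result (0 if none); MRR is its mean over queries."""
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            return 1.0 / rank
    return 0.0

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance nDCG: discounted gain divided by the ideal ordering's gain."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, doc in enumerate(ranked[:k], start=1)
        if doc in relevant
    )
    ideal = sum(1.0 / math.log2(rank + 1) for rank in range(1, min(len(relevant), k) + 1))
    return dcg / ideal if ideal else 0.0

relevant = {"c2"}                       # the chunk this query was generated from
base_ranking = ["c7", "c2", "c5"]       # toy ranking with the base embeddings
adapted_ranking = ["c2", "c7", "c5"]    # toy ranking after the adapter

base_rr = reciprocal_rank(base_ranking, relevant)
adapted_rr = reciprocal_rank(adapted_ranking, relevant)
delta_rr = adapted_rr - base_rr         # the "delta" column, for one query
```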
The bundled examples/config.toml is offline (hashing embedder + template queries) so it runs anywhere. In that setup the baseline is already optimal, because the queries reuse chunk tokens, so the adapter correctly reports a ~0 delta. Model selection always includes the identity baseline, so the trained adapter can never score worse than your base embeddings. Switch to a semantic backend (and llm query generation) to see real gains.
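The "never worse than the base embeddings" property follows from treating the identity adapter as one more candidate during model selection. The sketch below shows that logic with made-up validation scores; `select_checkpoint` and the checkpoint names are hypothetical and do not describe the trainer's internals.

```python
def select_checkpoint(val_scores, baseline="identity"):
    """Pick the candidate with the best validation score.

    The baseline is the starting choice and is only replaced by a strictly
    better candidate, so the result never scores below it.
    """
    best = baseline
    for name, score in val_scores.items():
        if score > val_scores[best]:
            best = name
    return best

# Made-up validation MRR values for illustration only.
degraded_run = {"identity": 0.63, "epoch_1": 0.61, "epoch_2": 0.60}
improved_run = {"identity": 0.63, "epoch_1": 0.70, "epoch_2": 0.81}

kept = select_checkpoint(degraded_run)      # training hurt: keep the identity
chosen = select_checkpoint(improved_run)    # training helped: take the best epoch
```

Under this reading the guarantee applies to the validation metric used for selection; results on unseen queries can still differ.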
The notebook
examples/santander_retrieval_demo.ipynb
walks through the full workflow on a corpus scraped from the
Santander website:
- Scrape & chunk the site into a knowledge base (live, with a cached snapshot fallback so it always runs).
- Generate a QA dataset of triplets with an LLM (or offline templates).
- Train the linear adapter.
- Measure the base-vs-adapted retrieval gain with tables and plots.
uv sync --group examples
export OPENAI_API_KEY=sk-... # for natural, LLM-generated queries
uv run jupyter lab examples/santander_retrieval_demo.ipynb
The notebook prefers a Sentence-Transformers model + LLM queries (real gains)
and automatically falls back to the offline HashingEmbedder +
TemplateQueryGenerator when no model or API key is available.
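A fallback of this kind can be sketched with two standard-library checks: whether the `sentence_transformers` package is importable and whether `OPENAI_API_KEY` is set. This is an assumption about how such a check might look, not the notebook's code; `choose_backends` and the returned labels are hypothetical.

```python
import importlib.util
import os

def choose_backends(env=None):
    """Return (embedder, query_generator) labels based on what is available."""
    env = os.environ if env is None else env
    # find_spec returns None when a top-level package is not installed.
    has_model = importlib.util.find_spec("sentence_transformers") is not None
    has_key = bool(env.get("OPENAI_API_KEY"))
    embedder = "sentence-transformers" if has_model else "hashing"
    query_generator = "llm" if has_key else "template"
    return embedder, query_generator

offline = choose_backends(env={})
with_key = choose_backends(env={"OPENAI_API_KEY": "sk-test"})
```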
Set strategy in [dataset] to control how negatives are sampled:
- semantic_opposite: least similar chunks (far in latent space).
- hard: most similar but incorrect chunks (the strongest training signal).
- random: uniformly sampled easy negatives.
- mixed: a weighted blend (configure [dataset.mix]).
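The four strategies can be sketched as follows. The example assumes similarity is measured between the positive chunk and every other chunk with cosine similarity, and that the blend weights are a plain mapping from strategy name to weight; the library may rank against the query instead, and the actual keys of `[dataset.mix]` are not shown here.

```python
import math
import random

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def mine_negative(positive_id, chunk_vecs, strategy, rng):
    """Pick one negative chunk id for the chunk `positive_id`."""
    others = [i for i in range(len(chunk_vecs)) if i != positive_id]
    # Ascending similarity to the positive chunk: least similar first.
    by_similarity = sorted(
        others, key=lambda i: cosine(chunk_vecs[positive_id], chunk_vecs[i])
    )
    if strategy == "semantic_opposite":
        return by_similarity[0]
    if strategy == "hard":
        return by_similarity[-1]
    if strategy == "random":
        return rng.choice(others)
    raise ValueError(f"unknown strategy: {strategy}")

def mine_mixed(positive_id, chunk_vecs, weights, rng):
    """Sample a strategy according to `weights`, then mine with it."""
    strategies = list(weights)
    strategy = rng.choices(strategies, weights=[weights[s] for s in strategies], k=1)[0]
    return mine_negative(positive_id, chunk_vecs, strategy, rng)

chunk_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
rng = random.Random(0)

hard_neg = mine_negative(0, chunk_vecs, "hard", rng)                   # near-duplicate of chunk 0
opposite_neg = mine_negative(0, chunk_vecs, "semantic_opposite", rng)  # points the other way
random_neg = mine_negative(0, chunk_vecs, "random", rng)
mixed_neg = mine_mixed(
    0, chunk_vecs, {"hard": 0.5, "random": 0.3, "semantic_opposite": 0.2}, rng
)
```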
linear_adapter_trainer/
├── knowledge_base/ # ingestion + chunking
├── embeddings/ # pluggable backends (hashing, sentence-transformers, openai)
├── dataset/ # Module 1: query generation, negatives, split
├── adapter/ # Module 2: model, triplet loss, trainer
├── evaluation/ # precision@k, recall@k, MRR, nDCG, comparison
├── config.py # TOML config + factories
└── cli.py # command-line interface
See DOCUMENTATION.md for the full API reference and design
notes.
uv sync --dev
uv run ruff check .
uv run pytest
uv run python examples/quickstart.py
Contributions are welcome; please read CONTRIBUTING.md.
Notable changes are tracked in CHANGELOG.md.
Please report security issues responsibly — see .github/SECURITY.md
(contact security-opensource@gruposantander.com or open a GitHub Security
Advisory). Do not file public issues for vulnerabilities.
Apache License 2.0. See LICENSE and NOTICE.
Copyright (c) 2026 Santander Group. Originally authored by Pedro Martin Minguez
(Santander AI Lab).
If you use this software, please cite it — see CITATION.cff.
@software{linear_adapter_trainer_2026,
title = {LinearAdapterTrainer: linear embedding adapters with triplet loss for retrieval},
author = {{Santander AI Lab}},
year = {2026},
url = {https://github.com/SantanderAI/linear-adapter-trainer},
license = {Apache-2.0}
}