Open
Changes from 62 commits
64 commits
fff2f04
Model and API view for SourceImageThumbnail
loppear May 14, 2026
a2c26f6
Thumbnails: add field to captures serializer
loppear May 14, 2026
8077c2a
With internal prefix
loppear May 14, 2026
d11cfa3
black postcommit
loppear May 14, 2026
33f42df
black?
loppear May 14, 2026
04cb3b3
black settings
loppear May 14, 2026
e9abc52
Better handling for thumbnail invalidation and deletion.
loppear May 14, 2026
1a150a2
Merge remote-tracking branch 'origin/main' into feat/thumbnail-source…
loppear May 14, 2026
f94b5c8
Thumbnail lookup by label and check for changed settings width
loppear May 14, 2026
c7484ab
Move thumbnail generation to method on SourceImage
loppear May 15, 2026
182c035
Thumbnails: handle uploaded source images (without datasource), 302 r…
loppear May 15, 2026
a606ace
Merge remote-tracking branch 'origin/main' into feat/thumbnail-source…
loppear May 26, 2026
3e7739f
merge migrations
loppear May 26, 2026
36c14d6
UI capture data: add and use thumbnail property accessors
loppear May 26, 2026
ec4747b
Add thumbnails to api responses for events with captures
loppear May 26, 2026
9c3d349
revert UI capture.src property, only use thumbnails explicitly
loppear May 26, 2026
72dd63d
fix: prioritize API response size before natural size
annavik May 27, 2026
f18e374
Merge branch 'main' into feat/thumbnail-source-images
loppear May 28, 2026
a6b8cd1
Merge migrations
loppear May 28, 2026
b1ffd8d
Merge origin/main
loppear Jun 3, 2026
e9c302f
Rebase thumbnail migrations
loppear Jun 3, 2026
df98379
ami.utils.media.fetch_image_content for thumbnail image source
loppear Jun 3, 2026
4400a61
Merge branch 'main' into feat/thumbnail-source-images
mihow Jun 4, 2026
ecfec9b
fix(thumbnails): convert non-RGB source images to RGB before JPEG encode
mihow Jun 4, 2026
0c13f73
fix(thumbnails): handle SourceImage.last_modified=None on regen check
mihow Jun 4, 2026
138cfb5
fix(thumbnails): drop no-op f-strings; return 405 on list
mihow Jun 4, 2026
9fca069
fix(thumbnails): guard empty THUMBNAILS['SIZES'] config
mihow Jun 4, 2026
fff3ffa
fix(media): add finite default timeout to fetch_image_content
mihow Jun 4, 2026
6426c4d
fix(thumbnails): use upsert + drop pre-save delete-loop to fix concur…
mihow Jun 4, 2026
84f70ea
fix(thumbnails): CASCADE source_image FK and clean storage blob on ro…
mihow Jun 4, 2026
6903c58
chore: black formatting
mihow Jun 4, 2026
e1e422d
chore(ui): prettier strip trailing whitespace in capture.ts
mihow Jun 4, 2026
c3c70ea
docs(sourceimage): help_text for storage-derived fields; admin read-only
mihow Jun 4, 2026
25e5101
chore(migrations): collapse 0089 SET_NULL + 0090 CASCADE into single …
mihow Jun 4, 2026
22045e0
chore(migrations): drop over-defensive comment from collapsed 0089
mihow Jun 5, 2026
545873a
perf(thumbnails): serve direct storage URLs on warm cache; drop per-r…
mihow Jun 8, 2026
aeb46c0
fix(thumbnails): tolerate PIL aspect-ratio rounding in width predicate
mihow Jun 8, 2026
0d05fb4
Merge remote-tracking branch 'origin/main' into feat/thumbnail-source…
mihow Jun 9, 2026
209a1c7
Merge remote-tracking branch 'origin/feat/thumbnail-source-images' in…
mihow Jun 9, 2026
1b54cc8
refactor(thumbnails): push staleness into prefetch annotation; review…
mihow Jun 9, 2026
b52ab69
fix(ui): session list uses capture thumbnails instead of full-size or…
mihow Jun 9, 2026
576c93e
Merge remote-tracking branch 'origin/feat/thumbnail-source-images' in…
mihow Jun 9, 2026
287de34
Merge remote-tracking branch 'origin/main' into HEAD
mihow Jun 10, 2026
547e39d
fix(migrations): renumber thumbnail migrations after main merge
mihow Jun 10, 2026
d6e96ef
Merge branch 'feat/thumbnail-source-images' (migration renumber after…
mihow Jun 10, 2026
7899bec
fix(tests): use real taxon pk in role-permission test data
mihow Jun 10, 2026
88a65d6
Merge branch 'feat/thumbnail-source-images' (role-permission test dat…
mihow Jun 10, 2026
012efbc
fix(thumbnails): record requested spec width on thumbnail rows
mihow Jun 11, 2026
de5b21c
perf(thumbnails): trust the cached row without a storage HEAD per req…
mihow Jun 11, 2026
cb83de3
feat(thumbnails): encode thumbnails as progressive JPEGs
mihow Jun 11, 2026
de78202
perf(thumbnails): let browsers cache the thumbnail redirect
mihow Jun 11, 2026
ec5cb93
fix(thumbnails): regenerate when the cached row has no stored path
mihow Jun 11, 2026
4292ee4
Merge branch 'feat/thumbnail-source-images' (spec width, no storage H…
mihow Jun 11, 2026
56bc46a
refactor(thumbnails): drop width tolerance now that rows store the sp…
mihow Jun 11, 2026
ea4f0bd
test(thumbnails): drop progressive-JPEG encoding assertion
mihow Jun 11, 2026
61aa16c
refactor(thumbnails): trim review-narration comments to load-bearing …
mihow Jun 11, 2026
2afc0d7
Merge branch 'feat/thumbnail-source-images' (comment trim) into feat/…
mihow Jun 11, 2026
586ec18
refactor(thumbnails): trim docstrings and comments to load-bearing facts
mihow Jun 11, 2026
63d103f
refactor(thumbnails): trim remaining narration comments found in full…
mihow Jun 11, 2026
5b8d1d3
Merge branch 'feat/thumbnail-source-images' (comment trim round 2) in…
mihow Jun 11, 2026
15677a2
refactor(thumbnails): single-source the thumbnail validity check + en…
mihow Jun 16, 2026
28e8018
Merge remote-tracking branch 'origin/main' into feat/thumbnails-seria…
mihow Jun 16, 2026
f8a8e72
fix(thumbnails): fall back to route URLs when thumbnails unprefetched
mihow Jun 16, 2026
dd40bae
test(thumbnails): pin warm-list prefetch contract; prune redundant wi…
mihow Jun 16, 2026
17 changes: 6 additions & 11 deletions ami/main/api/serializers.py
@@ -1,6 +1,5 @@
 import datetime

-from django.conf import settings
 from django.db.models import QuerySet
 from guardian.shortcuts import get_perms
 from rest_framework import serializers
@@ -74,20 +73,16 @@ class Meta:


 class SourceImageThumbnailSerializer(DefaultSerializer):
+    """Adds a ``thumbnails`` field via :meth:`SourceImage.thumbnail_urls`.
+    Viewsets must apply :meth:`SourceImageQuerySet.with_thumbnails`.
+    """
+
     def __init__(self, *args, **kwargs):
         super().__init__(*args, **kwargs)
         self.fields["thumbnails"] = serializers.SerializerMethodField()

-    def get_thumbnails(self, obj: SourceImage) -> dict | None:
-        return {
-            label: reverse_with_params(
-                "sourceimagethumbnail-detail",
-                args=(obj.pk,),
-                request=self.context.get("request"),
-                params={"label": label},
-            )
-            for label in settings.THUMBNAILS["SIZES"]
-        }
+    def get_thumbnails(self, obj: SourceImage) -> dict[str, str]:
+        return obj.thumbnail_urls(request=self.context.get("request"))


 class SourceImageNestedSerializer(DefaultSerializer):
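Reviewer-side sketch of the contract the serializer now delegates to: for each configured label, a usable cached row yields a direct storage URL, and anything else yields the route URL that regenerates lazily. This is plain Python with no Django. `SIZES`, `Thumb`, `storage_url`, `route_url`, the widths, and the URL shapes are all hypothetical stand-ins, and validity is reduced to a path and width check (the real predicate in models.py also compares `last_modified`).

```python
from dataclasses import dataclass

# Hypothetical stand-in for settings.THUMBNAILS["SIZES"]; real widths are not shown in this diff.
SIZES = {"small": {"width": 240}, "medium": {"width": 1024}}


@dataclass
class Thumb:
    label: str
    path: str
    width: int


def storage_url(path: str) -> str:
    # Stand-in for default_storage.url(); the host is made up.
    return f"https://media.example.test/{path}"


def route_url(pk: int, label: str) -> str:
    # Stand-in for the reversed thumbnail route; the path shape is illustrative only.
    return f"/thumbnails/{pk}/?label={label}"


def thumbnail_urls(pk: int, thumbs: list[Thumb]) -> dict[str, str]:
    """Map every configured label to a warm (storage) or cold (route) URL."""
    by_label = {t.label: t for t in thumbs}
    out: dict[str, str] = {}
    for label, spec in SIZES.items():
        thumb = by_label.get(label)
        if thumb is not None and thumb.path and thumb.width == spec["width"]:
            out[label] = storage_url(thumb.path)  # warm: client skips the app server
        else:
            out[label] = route_url(pk, label)  # cold or stale: regenerate on first fetch
    return out
```

Every configured label always gets an entry, so clients never need to handle a missing key; only the kind of URL changes.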
17 changes: 9 additions & 8 deletions ami/main/api/views.py
@@ -451,9 +451,10 @@ def get_queryset(self) -> QuerySet:
                 Prefetch(
                     "captures",
                     queryset=SourceImage.objects.order_by("-size").select_related(
-                        "deployment",
-                        "deployment__data_source",
-                    )[:num_example_captures],
+                        "deployment", "deployment__data_source"
+                    )
+                    # Required by SourceImage.thumbnail_urls in the nested serializer.
+                    .with_thumbnails()[:num_example_captures],
                     to_attr="example_captures",
                 )
             )
@@ -632,11 +633,11 @@ def get_queryset(self) -> QuerySet:
         self.require_project = True
         project = self.get_active_project()

-        queryset = queryset.select_related(
-            "event",
-            "deployment",
-            "deployment__data_source",
-        ).order_by("timestamp")
+        queryset = (
+            queryset.select_related("event", "deployment", "deployment__data_source")
+            # Required by SourceImage.thumbnail_urls in SourceImageThumbnailSerializer.
+            .with_thumbnails().order_by("timestamp")
+        )

         if self.action == "list":
             # It's cumbersome to override the default list view, so customize the queryset here
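Reviewer-side sketch of why both querysets above must call `with_thumbnails()`: the model method checks the instance's prefetch cache and raises when the relation was not prefetched. Django keeps prefetched relations in `_prefetched_objects_cache`, keyed by relation name. `FakeCapture` and `thumbnail_labels` below are hypothetical stand-ins written without Django, so this illustrates only the guard, not the ORM behaviour.

```python
class FakeCapture:
    """Hypothetical stand-in for a SourceImage instance (no Django involved)."""

    def __init__(self, prefetched=None):
        # Django only creates this attribute once prefetch_related() has run;
        # the sketch mirrors that by leaving it unset when nothing was prefetched.
        if prefetched is not None:
            self._prefetched_objects_cache = prefetched

    def thumbnail_labels(self):
        cache = getattr(self, "_prefetched_objects_cache", {})
        if "thumbnails" not in cache:
            # Fail loudly instead of silently running one query per row.
            raise RuntimeError("thumbnails were not prefetched")
        return [thumb["label"] for thumb in cache["thumbnails"]]
```

An empty prefetch result still passes the guard, because the key is present even when the list is empty; only a missing prefetch raises.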
72 changes: 64 additions & 8 deletions ami/main/models.py
@@ -1309,7 +1309,8 @@ def first_capture(self):
         # Ideally this would be an annotated field, rather than an additional query.
         # raise NotImplementedError("This is added an annotated field, it should not be called directly.")
         # return SourceImage.objects.filter(event=self).order_by("timestamp").first().with_detections()
-        return SourceImage.objects.filter(event=self).order_by("timestamp").first()
+        # with_thumbnails() satisfies the thumbnail_urls prefetch contract for the nested serializer.
+        return SourceImage.objects.filter(event=self).order_by("timestamp").with_thumbnails().first()

     def summary_data(self):
         """
@@ -2092,6 +2093,12 @@ def with_was_processed(self):
         processed_exists = models.Exists(Detection.objects.filter(source_image_id=models.OuterRef("pk")))
         return self.annotate(was_processed=processed_exists)

+    def with_thumbnails(self):
+        """Prefetch ``thumbnails`` so :meth:`SourceImage.thumbnail_urls` decides
+        warm/cold in memory instead of firing a SELECT per row.
+        """
+        return self.prefetch_related("thumbnails")
+

 class SourceImageManager(models.Manager.from_queryset(SourceImageQuerySet)):
     pass
@@ -2247,7 +2254,7 @@ def get_was_processed(self, algorithm_key: str | None = None) -> bool:
         ``SourceImageQuerySet.with_was_processed()``). Falls back to a DB query otherwise.

         Do not call in bulk without the annotation — use ``with_was_processed()``
-        on the queryset instead to avoid N+1 queries.
+        on the queryset instead so each row does not trigger its own DB query.

         :param algorithm_key: If provided, only detections from this algorithm are checked.
                               The annotation does not filter by algorithm; per-algorithm
@@ -2399,6 +2406,60 @@ def get_custom_user_permissions(self, user) -> list[str]:
             custom_perms.add(Project.Permissions.RUN_SINGLE_IMAGE_JOB)
         return list(custom_perms)

+    def thumbnail_is_valid(self, spec: dict, thumb: "SourceImageThumbnail | None") -> bool:
+        """Whether ``thumb`` satisfies ``spec`` and need not be regenerated.
+
+        ``thumb.width`` stores the requested spec width (see the generator), so the
+        comparison is strict equality; legacy encoder-width rows read invalid and
+        self-heal on next generation. A None ``last_modified`` on either side means
+        "no signal of change" (matches ``NULL < x`` → ``False`` in SQL).
+        """
+        if thumb is None or not thumb.path or thumb.width != spec["width"]:
+            return False
+        source_changed = (
+            self.last_modified is not None
+            and thumb.last_modified is not None
+            and thumb.last_modified < self.last_modified
+        )
+        return not source_changed
+
+    def thumbnail_urls(self, request: Request | None = None) -> dict[str, str]:
+        """Per-label ``{label: url}`` for this capture's thumbnails.
+
+        Warm (cached row valid for the spec) → direct storage URL. Cold/stale →
+        route URL into the thumbnail viewset, which (re)generates lazily.
+
+        Requires prefetched ``thumbnails`` (use :meth:`SourceImageQuerySet.with_thumbnails`);
+        the guard below turns a forgotten prefetch into a loud error instead of a
+        silent per-row N+1.
+        """
+        # TODO: drop this guard once django-zen-queries enforces prefetch globally.
+        if "thumbnails" not in getattr(self, "_prefetched_objects_cache", {}):
+            raise RuntimeError(
+                "thumbnail_urls() requires prefetched thumbnails — call via SourceImageQuerySet.with_thumbnails()."
+            )
+
+        # Local import avoids a models ↔ serializers cycle at module load time.
+        from ami.base.serializers import reverse_with_params
+
+        thumbs: dict[str, "SourceImageThumbnail"] = {t.label: t for t in self.thumbnails.all()}
+
+        out: dict[str, str] = {}
+        for label, spec in settings.THUMBNAILS["SIZES"].items():
+            thumb = thumbs.get(label)
+            if self.thumbnail_is_valid(spec, thumb):
+                out[label] = default_storage.url(thumb.path)
+            else:
+                # Qualified ``api:`` namespace so this also resolves when ``request``
+                # is None (management commands, template tags).
+                out[label] = reverse_with_params(
+                    "api:sourceimagethumbnail-detail",
+                    args=(self.pk,),
+                    request=request,
+                    params={"label": label},
+                )
+        return out
+
     def find_or_generate_thumbnail_for_label(self, label):
         try:
             thumb = self.thumbnails.get(label=label)
@@ -2407,14 +2468,9 @@ def find_or_generate_thumbnail_for_label(self, label):
         size = settings.THUMBNAILS["SIZES"].get(label)
         prefix = settings.THUMBNAILS["STORAGE_PREFIX"]

-        # ``self.last_modified`` can be None on legacy rows synced before upload
-        # backfill — treat None as "no signal of change", not an error.
-        source_changed = (
-            self.last_modified is not None and thumb is not None and thumb.last_modified < self.last_modified
-        )
         # The row is trusted without a storage existence check; an orphan row (blob
         # deleted out of band) shows a broken image until the row is removed.
-        if not thumb or not thumb.path or thumb.width != size["width"] or source_changed:
+        if not self.thumbnail_is_valid(size, thumb):
             img = PIL.Image.open(BytesIO(fetch_image_content(self.public_url(raise_errors=True))))
             # JPEG only supports L, RGB, CMYK — convert other modes (e.g. RGBA PNGs)
             # or PIL raises ``OSError: cannot write mode <X> as JPEG``.
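Reviewer-side sketch of the validity predicate that the generator and the URL builder now share, written as a free function so its edge cases can be checked without a database. `Thumb`, `is_valid`, `T0`, and `LATER` are hypothetical names; the 240 width in the usage note echoes the example in the test docstring and is not necessarily the configured value.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Thumb:
    """Hypothetical stand-in for a SourceImageThumbnail row."""

    path: str
    width: int
    last_modified: Optional[datetime]


def is_valid(spec_width: int, source_modified: Optional[datetime], thumb: Optional[Thumb]) -> bool:
    # Missing row, empty path, or a width that differs from the spec: regenerate.
    if thumb is None or not thumb.path or thumb.width != spec_width:
        return False
    # A None timestamp on either side carries no signal of change, so the
    # row only counts as stale when both timestamps exist and the row is older.
    source_changed = (
        source_modified is not None
        and thumb.last_modified is not None
        and thumb.last_modified < source_modified
    )
    return not source_changed


T0 = datetime(2026, 6, 1)
LATER = T0 + timedelta(minutes=1)
```

With these stand-ins, `is_valid(240, LATER, Thumb("a.jpg", 240, T0))` is false because the source is newer than the row, while the same call with `source_modified=None` is true. That second case is the legacy-row behaviour the removed inline check handled only for the source side.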
133 changes: 133 additions & 0 deletions ami/main/tests.py
@@ -518,6 +518,139 @@ def test_thumbnail_delete_removes_storage_blob(self):

         self.assertFalse(default_storage.exists(path), "pre_delete signal must clean the storage blob")

+    def test_serializer_emits_storage_url_when_thumb_row_exists(self):
+        """Warm-path optimization: when a SourceImageThumbnail row exists with the
+        configured width, the serializer must emit ``default_storage.url(row.path)``
+        directly instead of the lazy-regen route URL. This is the whole point of
+        the direct-URLs change — every cached thumbnail must skip Django on the
+        warm path.
+        """
+        from django.core.files.storage import default_storage
+
+        # Pre-create a thumbnail row whose width matches THUMBNAILS['SIZES']['small'].
+        configured_width = settings.THUMBNAILS["SIZES"]["small"]["width"]
+        self.first_capture.thumbnails.create(
+            path="thumbnails/cached/abc.jpg", label="small", width=configured_width, height=180, size=42
+        )
+
+        response = self.client.get(f"/api/v2/captures/{self.first_capture.pk}/?project_id={self.project.pk}")
+        self.assertEqual(response.status_code, 200)
+        small_url = response.json()["thumbnails"]["small"]
+
+        # ``small`` should be the storage URL, not the route URL.
+        expected_storage_url = default_storage.url("thumbnails/cached/abc.jpg")
+        self.assertEqual(small_url, expected_storage_url)
+        # ``medium`` has no cached row → falls back to the route URL.
+        self.assertURLEqual(
+            response.json()["thumbnails"]["medium"], f"{self.base_url}{self.first_capture.pk}/?label=medium"
+        )
+
+    def test_serializer_falls_back_to_route_url_when_width_is_stale(self):
+        """If the cached row's width no longer matches the configured size (settings
+        changed since last gen), emit the route URL so the next browser fetch
+        triggers lazy regen via the redirect viewset.
+        """
+        configured_width = settings.THUMBNAILS["SIZES"]["small"]["width"]
+        self.first_capture.thumbnails.create(
+            path="thumbnails/stale/abc.jpg",
+            label="small",
+            width=configured_width + 100,  # mismatch — settings changed
+            height=180,
+            size=42,
+        )
+
+        response = self.client.get(f"/api/v2/captures/{self.first_capture.pk}/?project_id={self.project.pk}")
+        self.assertEqual(response.status_code, 200)
+        # Stale width → emit route URL so the redirect viewset can regenerate.
+        self.assertURLEqual(
+            response.json()["thumbnails"]["small"], f"{self.base_url}{self.first_capture.pk}/?label=small"
+        )
+
+    def test_serializer_treats_legacy_encoder_width_row_as_stale(self):
+        """Rows written before the spec-width fix stored PIL's rounded output
+        (e.g. 239 for a 240 spec). The warm-check compares against the spec
+        with strict equality, so such rows must fall back to the route URL —
+        the redirect viewset regenerates them once and they store the spec
+        width from then on.
+        """
+        configured_width = settings.THUMBNAILS["SIZES"]["small"]["width"]
+        # Stored width 1px below spec — what the generator used to record.
+        self.first_capture.thumbnails.create(
+            path="thumbnails/pilround/abc.jpg",
+            label="small",
+            width=configured_width - 1,
+            height=180,
+            size=42,
+        )
+        response = self.client.get(f"/api/v2/captures/{self.first_capture.pk}/?project_id={self.project.pk}")
+        self.assertEqual(response.status_code, 200)
+        # Legacy width → route URL so the next browser fetch self-heals the row.
+        self.assertURLEqual(
+            response.json()["thumbnails"]["small"], f"{self.base_url}{self.first_capture.pk}/?label=small"
+        )
+
+    def test_serializer_falls_back_to_route_url_when_source_changed(self):
+        """If the source image was re-uploaded after the cached row was
+        generated (``source.last_modified > row.last_modified``), the serializer
+        must emit the route URL so the next browser fetch regenerates via the
+        redirect viewset. Mirrors the predicate the generator already uses.
+
+        Guards https://github.com/RolnickLab/antenna/pull/1331#discussion_r3373715397.
+        """
+        configured_width = settings.THUMBNAILS["SIZES"]["small"]["width"]
+        # Row generated at T0. ``last_modified`` is ``auto_now_add`` so set
+        # ``self.first_capture.last_modified`` to a later timestamp via UPDATE
+        # below to flip the staleness predicate.
+        self.first_capture.thumbnails.create(
+            path="thumbnails/preupload/abc.jpg", label="small", width=configured_width, height=180, size=42
+        )
+        # Source bytes were re-uploaded after the cached row was written.
+        SourceImage.objects.filter(pk=self.first_capture.pk).update(
+            last_modified=timezone.now() + datetime.timedelta(minutes=1)
+        )
+
+        response = self.client.get(f"/api/v2/captures/{self.first_capture.pk}/?project_id={self.project.pk}")
+        self.assertEqual(response.status_code, 200)
+        # Source-changed → emit route URL so the redirect viewset regenerates.
+        self.assertURLEqual(
+            response.json()["thumbnails"]["small"], f"{self.base_url}{self.first_capture.pk}/?label=small"
+        )
+
+    def test_thumbnail_urls_model_method_emits_warm_and_route(self):
+        """``SourceImage.thumbnail_urls`` is the single source of truth for the
+        per-label URL contract: warm storage URL when a cached row exists at
+        the configured width, route URL otherwise.
+
+        Exercising the model method directly keeps the contract regression-tested
+        even if the serializer surface changes (e.g. a future template tag or
+        management command that needs the same dict).
+        """
+        from django.core.files.storage import default_storage
+
+        configured_width = settings.THUMBNAILS["SIZES"]["small"]["width"]
+        self.first_capture.thumbnails.create(
+            path="thumbnails/cached/direct.jpg", label="small", width=configured_width, height=180, size=42
+        )
+
+        # Re-fetch through ``with_thumbnails()`` to satisfy the prefetch contract
+        # (thumbnail_urls raises without prefetched thumbnails).
+        capture = SourceImage.objects.with_thumbnails().get(pk=self.first_capture.pk)
+        urls = capture.thumbnail_urls(request=None)
+
+        # Warm path → direct storage URL for the cached label.
+        self.assertEqual(urls["small"], default_storage.url("thumbnails/cached/direct.jpg"))
+        # Cold path → route URL for labels without a cached row. With ``request=None``
+        # the URL is path-only (no scheme/host) — matches DRF's reverse() contract.
+        self.assertIn(f"/api/v2/captures/thumbnails/{self.first_capture.pk}/", urls["medium"])
+        self.assertIn("label=medium", urls["medium"])
+
+    def test_thumbnail_urls_requires_prefetch(self):
+        """``thumbnail_urls`` fails loud without prefetched thumbnails rather than
+        silently firing a per-row SELECT (an N+1 in list contexts)."""
+        capture = SourceImage.objects.get(pk=self.first_capture.pk)
+        with self.assertRaises(RuntimeError):
+            capture.thumbnail_urls(request=None)
+
+

 class TestImageGrouping(TestCase):
     def setUp(self) -> None: