Enable async and distributed processing for the ML backend #910
Open: vanessavmac wants to merge 80 commits into main from 515-new-async-distributed-ml-backend
Changes from 78 commits
Commits (80)
- 8aad275 Set up customizable local processing service (vanessavmac)
- 61b45a4 Set up separate docker compose stack, rename ml backend services (vanessavmac)
- 4a03c7e WIP: README.md (vanessavmac)
- 09d7dfb Improve processing flow (vanessavmac)
- 996674e fix: tests and postgres connection (vanessavmac)
- ce973fc Update READMEs with minimal/example setups (vanessavmac)
- bf7178d fix: transformers fixed version (vanessavmac)
- 41efa42 Add tests (vanessavmac)
- 78babeb Typos, warn --> warnings (vanessavmac)
- 8d28d01 Add support for Darsa flat-bug (vanessavmac)
- bb22514 chore: Change the Pipeline class name to FlatBugDetectorPipeline to a… (mohamedelabbas1996)
- 1dbc5f0 Move README (vanessavmac)
- fe1a9f4 Address comment tasks (vanessavmac)
- 7747f3a Merge branch 'main' into 747-get-antenna-to-work-locally-on-laptops-f… (vanessavmac)
- 1978cbe Update README (vanessavmac)
- 82ac82d Pass in pipeline request config, properly cache models, simplifications (vanessavmac)
- 7d733f9 Pass in pipeline request config, properly cache models, simplifications (vanessavmac)
- 07d61d9 fix: update docker compose instructions & build path (mihow)
- d129029 feat: use ["insect"] for the default zero-shot class (mihow)
- 76ce2d8 feat: try to use faster version of zero-shot detector (mihow)
- 035b952 feat: use gpu if available (mihow)
- 1230386 fix: update minimal docker compose build path (vanessavmac)
- 45dbacf Add back crop_image_url (vanessavmac)
- 7361fb2 Support re-processing detections and skipping localizer (vanessavmac)
- 3f722c8 fix: correctly pass candidate labels for zero shot object detector (vanessavmac)
- 075a7ec Support re-processing detections and skipping localizer (vanessavmac)
- 85c676d fix: merge conflict (vanessavmac)
- cbd7ae0 fix: allow empty pipeline request config (vanessavmac)
- 7d15ffb fix: allow empty pipeline request config (vanessavmac)
- c2881b4 clean up (vanessavmac)
- 14396ba fix: ignore detection algorithm during reprocessing (vanessavmac)
- 6613366 remove flat bug (vanessavmac)
- 2cf0c0a feat: only use zero shot and HF classifier algorithms (vanessavmac)
- 1dbf3b1 clean up (vanessavmac)
- c82c076 Merge branch '747-get-antenna-to-work-locally-on-laptops-for-panama-t… (vanessavmac)
- fb874c4 Function for creating detection instances from requests (vanessavmac)
- f2ef5ff Add reprocessing to minimal app (vanessavmac)
- b6ce90f Merge branch 'main' into 706-support-for-reprocessing-detections-and-… (vanessavmac)
- 8fe8b1d Merge branch 'main' into 706-support-for-reprocessing-detections-and-… (vanessavmac)
- d0f4f26 Add re-processing test (vanessavmac)
- fc8470d Merge branch 'main' into 706-support-for-reprocessing-detections-and-… (mihow)
- 3d3b820 Fix requirements (vanessavmac)
- 5c7af56 Address review comments (vanessavmac)
- e7e579e Only open source image once (vanessavmac)
- cb74eac Merge branch 'main' into 706-support-for-reprocessing-detections-and-… (vanessavmac)
- ffea1aa Setup processing service celery workers; basic task queueing/processing (vanessavmac)
- 6cb852b Save results; update job progress (vanessavmac)
- d0380b9 Improvements to handle large batches (vanessavmac)
- 2594049 Merge branch 'main' of github.com:RolnickLab/antenna into 515-new-asy… (mihow)
- 57e6691 Add batch processing unit test; bulk db updates; fix duplicate logs; … (vanessavmac)
- f785dda Fix for "get() returned more than one AlgorithmCategoryMap" error (vanessavmac)
- 0a22d53 Allow synchronous (vanessavmac)
- d139734 Fix job progress if no images are submitted (vanessavmac)
- a83dd20 Subscribe antenna celeryworker to all pipeline queues; add more task … (vanessavmac)
- 0707433 Rename celery to antenna queue; only query ml task records created af… (vanessavmac)
- 0103b7e Merge branch 'main' into 515-new-async-distributed-ml-backend (vanessavmac)
- 3a3b881 Re-subscribe to queues before processing images; fix test issues (vanessavmac)
- 7c86612 Add missing migration; rename antenna celeryworker (vanessavmac)
- c016d47 Use transaction.on_commit with all async celery tasks (vanessavmac)
- fa510ed Test clean up (vanessavmac)
- 6da55a9 feat: isolate the CI / test compose stack from other containers (mihow)
- 652f47f feat: fix isoloated CI stack (rely on compose project name) (mihow)
- f3b588a fix: run migrations from celery start command, other fixes for tests (mihow)
- 5654ed0 fix: rabbitmq credentials for tests & local dev (mihow)
- 875d3cb draft: methods for inspecting celery tasks during tests (mihow)
- be351f8 feat: add health check; fix: rabbitmq credentials, minio ci set up (vanessavmac)
- e550531 draft: unit test changes (vanessavmac)
- 2fa57ef draft: some more unit test updates (working up to process_pipeline_re… (vanessavmac)
- 5c21be6 test fix: check the job status synchronously additional error logging (vanessavmac)
- a1e8fa3 test fix: check the job status synchronously, batchify the save_resul… (vanessavmac)
- 57fba22 feat: celerybeat task to prevent dangling ML jobs (vanessavmac)
- f8e374a Merge branch 'main' into 515-new-async-distributed-ml-backend (vanessavmac)
- cd593bc fix: migration conflicts (vanessavmac)
- bde423a Address copilot review (vanessavmac)
- a1238dc fix: passing test checks (vanessavmac)
- 03390e2 revoke dangling jobs (vanessavmac)
- 460f27c feat: move ML celeryworker to separate PR (vanessavmac)
- 6f87ca4 clean up unit test (vanessavmac)
- bd86042 Admin action to revoke ml tasks; clean up logs; add tests for stale m… (vanessavmac)
- b552d1e fix: increase dangling job timeout and timezone bug (vanessavmac)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| # Generated by Django 4.2.10 on 2025-08-10 22:17 | ||
| | ||
| import ami.ml.schemas | ||
| from django.db import migrations, models | ||
| import django.db.models.deletion | ||
| import django_pydantic_field.fields | ||
| | ||
| | ||
| class Migration(migrations.Migration): | ||
| dependencies = [ | ||
| ("main", "0060_alter_sourceimagecollection_method"), | ||
| ("jobs", "0018_alter_job_job_type_key"), | ||
| ] | ||
| | ||
| operations = [ | ||
| migrations.CreateModel( | ||
| name="MLTaskRecord", | ||
| fields=[ | ||
| ("id", models.BigAutoField(auto_created=True, primary_key=True, serialize=False, verbose_name="ID")), | ||
| ("created_at", models.DateTimeField(auto_now_add=True)), | ||
| ("updated_at", models.DateTimeField(auto_now=True)), | ||
| ("task_id", models.CharField(max_length=255)), | ||
| ( | ||
| "task_name", | ||
| models.CharField( | ||
| choices=[ | ||
| ("process_pipeline_request", "process_pipeline_request"), | ||
| ("save_results", "save_results"), | ||
| ], | ||
| default="process_pipeline_request", | ||
| max_length=255, | ||
| ), | ||
| ), | ||
| ( | ||
| "status", | ||
| models.CharField( | ||
| choices=[("STARTED", "STARTED"), ("SUCCESS", "SUCCESS"), ("FAIL", "FAIL")], | ||
| default="STARTED", | ||
| max_length=255, | ||
| ), | ||
| ), | ||
| ("raw_results", models.JSONField(blank=True, default=dict, null=True)), | ||
| ("raw_traceback", models.TextField(blank=True, null=True)), | ||
| ( | ||
| "pipeline_request", | ||
| django_pydantic_field.fields.PydanticSchemaField( | ||
| blank=True, config=None, null=True, schema=ami.ml.schemas.PipelineRequest | ||
| ), | ||
| ), | ||
| ( | ||
| "pipeline_response", | ||
| django_pydantic_field.fields.PydanticSchemaField( | ||
| blank=True, config=None, null=True, schema=ami.ml.schemas.PipelineResultsResponse | ||
| ), | ||
| ), | ||
| ("num_captures", models.IntegerField(default=0, help_text="Same as number of source_images")), | ||
| ("num_detections", models.IntegerField(default=0)), | ||
| ("num_classifications", models.IntegerField(default=0)), | ||
| ("subtask_id", models.CharField(blank=True, max_length=255, null=True)), | ||
| ( | ||
| "job", | ||
| models.ForeignKey( | ||
| on_delete=django.db.models.deletion.CASCADE, related_name="ml_task_records", to="jobs.job" | ||
| ), | ||
| ), | ||
| ("source_images", models.ManyToManyField(related_name="ml_task_records", to="main.sourceimage")), | ||
| ], | ||
| options={ | ||
| "abstract": False, | ||
| }, | ||
| ), | ||
| ] |
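The fields created by this migration suggest one record per queued task, with per-task counts that can be rolled up for a job. A minimal, framework-free sketch of such a roll-up follows. It assumes plain dataclasses in place of the Django model; the `TaskRecord` class and the `summarize` helper are hypothetical and are not part of the PR.

```python
from dataclasses import dataclass
from enum import Enum


class TaskStatus(str, Enum):
    # Mirrors the `status` choices declared in the migration above.
    STARTED = "STARTED"
    SUCCESS = "SUCCESS"
    FAIL = "FAIL"


@dataclass
class TaskRecord:
    # Hypothetical stand-in for MLTaskRecord; only a few fields are modelled.
    task_id: str
    task_name: str = "process_pipeline_request"
    status: TaskStatus = TaskStatus.STARTED
    num_captures: int = 0
    num_detections: int = 0
    num_classifications: int = 0


def summarize(records: list[TaskRecord]) -> dict[str, int]:
    """Aggregate counts over the records that finished successfully."""
    done = [r for r in records if r.status is TaskStatus.SUCCESS]
    return {
        "tasks_total": len(records),
        "tasks_succeeded": len(done),
        "captures": sum(r.num_captures for r in done),
        "detections": sum(r.num_detections for r in done),
        "classifications": sum(r.num_classifications for r in done),
    }
```

Counting only successful records is a design choice of this sketch; a real implementation might also want to report failed or still-running tasks separately.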
28 changes: 28 additions & 0 deletions
ami/jobs/migrations/0020_alter_job_logs_alter_job_progress.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| # Generated by Django 4.2.10 on 2025-09-04 10:42 | ||
| | ||
| import ami.jobs.models | ||
| from django.db import migrations | ||
| import django_pydantic_field.fields | ||
| | ||
| | ||
| class Migration(migrations.Migration): | ||
| dependencies = [ | ||
| ("jobs", "0019_mltaskrecord"), | ||
| | ||
||
| operations = [ | ||
| migrations.AlterField( | ||
| model_name="job", | ||
| name="logs", | ||
| field=django_pydantic_field.fields.PydanticSchemaField( | ||
| config=None, default=ami.jobs.models.JobLogs, schema=ami.jobs.models.JobLogs | ||
| ), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="job", | ||
| name="progress", | ||
| field=django_pydantic_field.fields.PydanticSchemaField( | ||
| config=None, default=ami.jobs.models.default_job_progress, schema=ami.jobs.models.JobProgress | ||
| ), | ||
| ), | ||
| ] |
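Both altered fields pass a callable as `default` (the `JobLogs` class and the `default_job_progress` function) rather than a pre-built instance. Django calls a callable default each time a new object is created, so every row starts from its own fresh value. The same idea can be sketched in plain Python with `default_factory`. The classes and stage names below are hypothetical stand-ins, not the real `ami.jobs.models` schemas.

```python
from dataclasses import dataclass, field


@dataclass
class Logs:
    # Hypothetical stand-in for a logs schema.
    stdout: list[str] = field(default_factory=list)


@dataclass
class Progress:
    # Hypothetical stand-in for a progress schema.
    stages: list[str] = field(default_factory=list)


def default_progress() -> Progress:
    """Factory that returns a new Progress object on every call."""
    # The stage names are invented for this example.
    return Progress(stages=["collect", "process", "results"])


@dataclass
class Job:
    # default_factory plays the role of Django's callable `default`:
    # it is invoked once per new Job, so instances never share state.
    logs: Logs = field(default_factory=Logs)
    progress: Progress = field(default_factory=default_progress)
```

Creating two `Job()` objects and appending to one job's `logs.stdout` leaves the other job's logs empty, which is the property a shared mutable default would break.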
30 changes: 30 additions & 0 deletions
ami/jobs/migrations/0021_remove_mltaskrecord_subtask_id_and_more.py
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| # Generated by Django 4.2.10 on 2025-10-16 19:31 | ||
| | ||
| from django.db import migrations, models | ||
| | ||
| | ||
| class Migration(migrations.Migration): | ||
| dependencies = [ | ||
| ("jobs", "0020_alter_job_logs_alter_job_progress"), | ||
| | ||
||
| operations = [ | ||
| migrations.RemoveField( | ||
| model_name="mltaskrecord", | ||
| name="subtask_id", | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="mltaskrecord", | ||
| name="status", | ||
| field=models.CharField( | ||
| choices=[("PENDING", "PENDING"), ("STARTED", "STARTED"), ("SUCCESS", "SUCCESS"), ("FAIL", "FAIL")], | ||
| default="STARTED", | ||
| max_length=255, | ||
| ), | ||
| ), | ||
| migrations.AlterField( | ||
| model_name="mltaskrecord", | ||
| name="task_id", | ||
| field=models.CharField(blank=True, max_length=255, null=True), | ||
| ), | ||
| ] |
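Adding `PENDING` and allowing a null `task_id` makes it possible to store a record before a task id exists. One way such a lifecycle could look is sketched below without Django or Celery. The transition table and the `advance` helper are assumptions for illustration, not code from the PR. The sketch also starts records in `PENDING`, whereas the migration leaves the field default at `STARTED`.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed legal transitions between the four status values in the migration.
ALLOWED = {
    "PENDING": {"STARTED", "FAIL"},
    "STARTED": {"SUCCESS", "FAIL"},
    "SUCCESS": set(),
    "FAIL": set(),
}


@dataclass
class TaskRecord:
    # Hypothetical stand-in for MLTaskRecord.
    status: str = "PENDING"
    task_id: Optional[str] = None  # unknown until the task is queued


def advance(record: TaskRecord, new_status: str, task_id: Optional[str] = None) -> TaskRecord:
    """Move a record to `new_status`, rejecting transitions not in ALLOWED."""
    if new_status not in ALLOWED[record.status]:
        raise ValueError(f"cannot go from {record.status} to {new_status}")
    if new_status == "STARTED":
        if task_id is None:
            raise ValueError("a task_id is required once the task has started")
        record.task_id = task_id
    record.status = new_status
    return record
```

Treating `SUCCESS` and `FAIL` as terminal states keeps a finished record from being reopened by a late or duplicated update.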
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,17 @@ | ||
| # Generated by Django 4.2.10 on 2025-10-17 01:27 | ||
| | ||
| from django.db import migrations, models | ||
| | ||
| | ||
| class Migration(migrations.Migration): | ||
| dependencies = [ | ||
| ("jobs", "0021_remove_mltaskrecord_subtask_id_and_more"), | ||
| | ||
||
| operations = [ | ||
| migrations.AddField( | ||
| model_name="job", | ||
| name="last_checked", | ||
| field=models.DateTimeField(blank=True, null=True), | ||
| ), | ||
| ] |
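A nullable `last_checked` timestamp is the kind of field a periodic task can use to spot jobs that have gone quiet. Below is a hedged sketch of such a check using timezone-aware datetimes. The `is_dangling` helper and the 24-hour timeout are hypothetical; the timeout the PR actually uses is not visible in this diff.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold chosen for illustration only.
DANGLING_TIMEOUT = timedelta(hours=24)


def is_dangling(
    last_checked: Optional[datetime],
    now: Optional[datetime] = None,
    timeout: timedelta = DANGLING_TIMEOUT,
) -> bool:
    """Return True when a job has not been checked within `timeout`."""
    if last_checked is None:
        # Never checked: treated here as not yet dangling (an assumption).
        return False
    if last_checked.tzinfo is None:
        # Mixing naive and aware datetimes is a common source of timezone bugs.
        raise ValueError("last_checked must be timezone-aware")
    now = now or datetime.now(timezone.utc)
    return now - last_checked > timeout
```

Rejecting naive datetimes up front turns a silent offset error into an explicit failure, which is easier to catch in tests.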
Critical: DATABASE_URL credentials and database name don't match the individual environment variables.
The `DATABASE_URL` contains different credentials and database name than the individual `POSTGRES_*` variables:

- `xekSryPnqczJXkOnTAeDmDyIapSRrGEE` but `POSTGRES_USER` (line 4) is `4JXkOnTAeDmDyIapSRrGEE`
- `POSTGRES_PASSWORD` (line 5)
- `ami` but `POSTGRES_DB` (line 3) is `ami-ci`

This will cause CI connection failures when code uses `DATABASE_URL` instead of individual variables.

Apply the suggested diff to fix the mismatches and use the recommended `postgresql://` scheme.
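A mismatch like the one described in this comment can be caught mechanically. The sketch below parses `DATABASE_URL` with the standard library and compares it against the `POSTGRES_*` variables. The `database_url_mismatches` helper and the sample values are hypothetical, and percent-encoded passwords are not decoded here.

```python
from urllib.parse import urlsplit


def database_url_mismatches(env: dict[str, str]) -> list[str]:
    """List the POSTGRES_* variables that disagree with DATABASE_URL."""
    url = urlsplit(env["DATABASE_URL"])
    from_url = {
        "POSTGRES_USER": url.username,
        "POSTGRES_PASSWORD": url.password,
        # The database name is the URL path without its leading slash.
        "POSTGRES_DB": url.path.lstrip("/"),
    }
    return [name for name, value in from_url.items() if env.get(name) != value]


# Placeholder values, not the credentials from the reviewed file.
example_env = {
    "DATABASE_URL": "postgresql://ci_user:ci_pass@postgres:5432/ami",
    "POSTGRES_USER": "ci_user",
    "POSTGRES_PASSWORD": "ci_pass",
    "POSTGRES_DB": "ami-ci",
}
```

With `example_env`, only the database name differs, so the helper reports `POSTGRES_DB`. Running a check like this in CI before tests start would surface the problem earlier than a failed connection.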