2 changes: 1 addition & 1 deletion .github/workflows/publish.yml
@@ -29,7 +29,7 @@ jobs:
       - name: Set __version__ and poetry version
         run: |
           TAG="$(git describe --tags --always | awk -F"-" '{if (NF>1) {print substr($1, 2)".post"$2} else {print substr($1, 2)}}')"
-          echo "__version__ = \"$TAG\"" > asu/__init__.py
+          sed "s/__version__.*/__version__ = \"$TAG\"/" -i asu/__init__.py
@aparcar (Member), Apr 20, 2025:

this change seems unrelated?

Contributor Author:

It's so the other contents of the init file are not deleted.
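
To illustrate the reply above: a minimal Python sketch of why the in-place substitution keeps the rest of `asu/__init__.py`, whereas the `echo ... >` redirect would replace the whole file with a single line. The helper name `set_version`, the `demo` function, and the temporary file are illustrative assumptions, not part of the PR; the regex simply mirrors the `sed` pattern `__version__.*`.

```python
import re
import tempfile
from pathlib import Path


def set_version(init_file: Path, tag: str) -> None:
    """Rewrite only the `__version__ ...` line and keep everything else."""
    text = init_file.read_text()
    # `.` does not match newlines, so the match stops at the end of the line,
    # just like the sed expression s/__version__.*/.../
    new_text = re.sub(r"__version__.*", lambda _m: f'__version__ = "{tag}"', text)
    init_file.write_text(new_text)


def demo() -> str:
    """Apply set_version to a throwaway copy of the init file (hypothetical data)."""
    with tempfile.TemporaryDirectory() as tmp:
        init_file = Path(tmp) / "__init__.py"
        init_file.write_text(
            '__version__ = "0.0.0"\n\nfrom .rq import GCWorker as GCWorker\n'
        )
        set_version(init_file, "1.2.3.post4")
        return init_file.read_text()


if __name__ == "__main__":
    print(demo())
```

With the redirect, the `from .rq import GCWorker as GCWorker` line added in this PR would be lost at publish time, which appears to be the reason for the change.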

           poetry version "$TAG"

       - name: Build and publish PyPi package
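
The `TAG` pipeline in the hunk above turns `git describe --tags --always` output into a version string: it drops the first character of the first `-`-separated field and, when more fields follow, appends `.post` plus the second field. A rough Python equivalent of the `awk` expression, as a sketch only; the function name `tag_to_version` and the sample inputs are hypothetical.

```python
def tag_to_version(describe: str) -> str:
    """Mimic: awk -F"-" '{if (NF>1) {print substr($1, 2)".post"$2} else {print substr($1, 2)}}'"""
    fields = describe.split("-")
    # substr($1, 2): everything after the first character (normally the "v" prefix)
    base = fields[0][1:]
    if len(fields) > 1:
        # $2 is the number of commits since the tag in `git describe` output
        return f"{base}.post{fields[1]}"
    return base


if __name__ == "__main__":
    print(tag_to_version("v1.2.3"))             # exactly on a tag
    print(tag_to_version("v1.2.3-4-gdeadbee"))  # four commits after the tag
```

Note that the first character is dropped unconditionally, so the expression assumes tags that start with a one-character prefix such as `v`.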
2 changes: 2 additions & 0 deletions asu/__init__.py
@@ -1 +1,3 @@
 __version__ = "0.0.0"
+
+from .rq import GCWorker as GCWorker
72 changes: 72 additions & 0 deletions asu/rq.py
@@ -0,0 +1,72 @@
+from re import compile
+from pathlib import Path
+from rq import Queue, Worker
+from rq.job import Job
+from podman import PodmanClient
+from shutil import rmtree
+
+from asu.config import settings
+from asu.util import log, get_podman
+
+REQUEST_HASH_LENGTH = 64
+store: Path = settings.public_path / "store"
+podman: PodmanClient = get_podman()
+
+
+class GCWorker(Worker):
+    """A Worker class that does periodic garbage collection on ASU's
+    public store directory. We tie into the standard `Worker` maintenance
+    sequence, so the period is controlled by the base class. You may change
+    the garbage collection frequency in podman-compose.yml by adding a
+    `--maintenance-interval` option to the startup command as follows (the
+    default is 600 seconds).
+
+    >>> command: rqworker ... --maintenance-interval 1800
+    """
+
+    hash_match = compile(f"^[0-9a-f]{{{REQUEST_HASH_LENGTH}}}$")
+
+    def clean_store(self) -> None:
+        """For performance testing, the store directory was mounted on a
+        slow external USB hard drive. A typical timing result showed ~1000
+        directories deleted per second on that test system. The synthetic
+        test directories were created containing 10 files in each.
+        File count dominated the timing, with file size being relatively
+        insignificant, likely due to `stat` calls being the bottleneck.
+        (Just for comparison, tests against store mounted on a fast SSD
+        were about twice as fast.)
+
+        >>> Cleaning /mnt/slow/public/store: deleted 5000/5000 builds
+        >>> Timing analysis for clean_store: 5.081s
+        """
+
+        deleted: int = 0
+        total: int = 0
+        dir: Path
+        queue: Queue
+        for dir in store.glob("*"):
+            if not dir.is_dir() or not self.hash_match.match(dir.name):
+                continue
+            total += 1
+            for queue in self.queues:
+                job: Job = queue.fetch_job(dir.name)
+                log.info(f" Found {dir.name = } {job = }")
+                if job is None:
+                    rmtree(dir)
+                    deleted += 1
+
+        log.info(f"Cleaning {store}: deleted {deleted}/{total} builds")
+
+    def clean_podman(self) -> None:
+        """Reclaim space from the various podman disk entities as they are orphaned."""
+        removed = podman.containers.prune()
+        log.info(f"Reclaimed {removed.get('SpaceReclaimed', 0):,d}B from containers")
+        removed = podman.images.prune()
+        log.info(f"Reclaimed {removed.get('SpaceReclaimed', 0):,d}B from images")
+        removed = podman.volumes.prune()
+        log.info(f"Reclaimed {removed.get('SpaceReclaimed', 0):,d}B from volumes")
+
+    def run_maintenance_tasks(self):
+        super().run_maintenance_tasks()
+        self.clean_store()
+        self.clean_podman()
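
A possible follow-up on `clean_store`, offered as a sketch rather than a description of what the PR does: as written, `rmtree(dir)` runs inside the per-queue loop, so a worker listening on more than one queue could try to delete the same directory twice, or delete a build that a later queue still knows about. The compose command in this PR starts `rqworker` without queue arguments, so this may never occur in practice. The standalone version below checks every lookup before deleting. The `lookups` callables stand in for `queue.fetch_job`, and the names `clean_store_sketch` and `demo` are hypothetical.

```python
import re
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Iterable, Optional

REQUEST_HASH_LENGTH = 64
HASH_MATCH = re.compile(f"^[0-9a-f]{{{REQUEST_HASH_LENGTH}}}$")

# Stand-in for `queue.fetch_job`: returns a job object, or None if unknown.
Lookup = Callable[[str], Optional[object]]


def clean_store_sketch(store: Path, lookups: Iterable[Lookup]) -> tuple[int, int]:
    """Delete hash-named build directories that no lookup knows about.

    Returns (deleted, total), matching the counters in the PR's log message.
    """
    lookups = list(lookups)
    deleted = total = 0
    # Materialise the listing first so deletions cannot disturb the iteration.
    for entry in list(store.iterdir()):
        if not entry.is_dir() or not HASH_MATCH.match(entry.name):
            continue  # ignore anything that is not a 64-hex-digit directory
        total += 1
        # Delete at most once, and only when every queue reports no job.
        if all(lookup(entry.name) is None for lookup in lookups):
            shutil.rmtree(entry)
            deleted += 1
    return deleted, total


def demo() -> tuple[int, int, list[str]]:
    """Run the sketch against a temporary store with synthetic directories."""
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp)
        live, stale = "a" * 64, "b" * 64
        for name in (live, stale, "not-a-hash"):
            (store / name).mkdir()
        queue_one: Lookup = {live: "job"}.get  # knows the live build
        queue_two: Lookup = {}.get             # knows nothing
        deleted, total = clean_store_sketch(store, [queue_two, queue_one])
        return deleted, total, sorted(p.name for p in store.iterdir())


if __name__ == "__main__":
    print(demo())
```

Because the function takes the store path and the lookups as arguments, it can be exercised against a temporary directory without Redis or podman, which may also help with the uncovered lines reported for this file.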

2 changes: 1 addition & 1 deletion podman-compose.yml
@@ -22,7 +22,7 @@ services:
       context: .
       dockerfile: Containerfile
     restart: unless-stopped
-    command: rqworker --logging_level INFO
+    command: rqworker --logging_level INFO --with-scheduler --worker-class asu.GCWorker
     env_file: .env
     environment:
       REDIS_URL: "redis://redis:6379/0"