Distributed locking for hosted agents
Hosted agents multiply quickly. If they can all see the same work, they can all do the same work. A lease-based lock gives each task one current owner without turning your system into a workflow platform.
Most hosted agent systems start with a simple loop: poll for work, choose a task, run a tool, write the result. That works until there are two workers. Then a GitHub issue, customer ticket, evaluation run, or deploy request can be picked up twice.
The fix does not have to be a large orchestrator. Often the missing primitive is just ownership: before an agent starts, it asks a shared service whether it can own the task for a bounded amount of time.
The lease model
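OctoStore exposes that primitive as HTTP locks. An agent sends POST /locks/:name/acquire with a TTL and optional metadata. The lock name should be a single stable path segment such as issue-1842, derived from the task itself so that every worker looking at the same task asks for the same lock. If the response status is acquired, this agent owns the work. If the response status is held, another worker owns it and this agent should move on to a different task.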
curl -X POST https://api.octostore.io/locks/issue-1842/acquire \
  -H "Authorization: Bearer ***" \
  -H "Content-Type: application/json" \
  -d '{
    "ttl_seconds": 120,
    "metadata": "runner=agent-7 repo=acme/app issue=1842"
  }'
Metadata is deliberately plain. Put the runner ID, task URL, trace ID, branch, model, or runbook hint where an operator can see it. Keep it small; the API validates metadata size.
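The same acquire call can be sketched in Python with only the standard library. This is a minimal sketch, not an official client: build_acquire_request and owns_task are hypothetical helper names, and the sketch assumes the acquired or held outcome arrives as a status field in a JSON response body, which the text above does not spell out.

```python
import json
import urllib.request

BASE_URL = "https://api.octostore.io"  # host taken from the curl example


def build_acquire_request(
    name: str, token: str, ttl_seconds: int, metadata: str = ""
) -> urllib.request.Request:
    """Build the POST /locks/:name/acquire request without sending it."""
    if not name or "/" in name:
        # The lock name is meant to be a single stable path segment.
        raise ValueError(f"lock name must be a single path segment: {name!r}")
    body = {"ttl_seconds": ttl_seconds}
    if metadata:
        body["metadata"] = metadata
    return urllib.request.Request(
        url=f"{BASE_URL}/locks/{name}/acquire",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def owns_task(response_body: dict) -> bool:
    """Assumption: the JSON body has a `status` of "acquired" or "held"."""
    return response_body.get("status") == "acquired"
```

To send it, pass the request to urllib.request.urlopen and decode the body with json.load. Note that urlopen raises HTTPError for 4xx and 5xx responses, so if the service signals a held lock with an error status code, that case needs its own handler.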
Renew while work is live
A lease is not a permanent claim. Long-running work should renew before expiry:
curl -X POST https://api.octostore.io/locks/issue-1842/renew \
  -H "Authorization: Bearer ***" \
  -H "Content-Type: application/json" \
  -d '{"lease_id":"...","ttl_seconds":120}'
# response shape
# {"lease_id":"...","expires_at":"..."}
Renewal gives you a heartbeat. If the worker is healthy, each renewal pushes the expiry forward and the lease stays alive. If the process crashes, the heartbeat stops and the lock becomes available again once the last TTL runs out.
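One way to run that heartbeat next to the real work is a small background thread. The sketch below is an illustration under stated assumptions: Heartbeat is a hypothetical class, and renew stands for any zero-argument callable that performs the renew request and raises if the renewal is rejected or cannot be sent.

```python
import threading
from typing import Callable


class Heartbeat:
    """Renew a lease on a background thread until stopped or renewal fails."""

    def __init__(self, renew: Callable[[], None], interval_seconds: float) -> None:
        self._renew = renew
        self._interval = interval_seconds
        self._stopped = threading.Event()
        self.lost = threading.Event()  # set when a renewal attempt raises
        self._thread = threading.Thread(target=self._loop, daemon=True)

    def start(self) -> "Heartbeat":
        self._thread.start()
        return self

    def stop(self) -> None:
        self._stopped.set()
        self._thread.join()

    def _loop(self) -> None:
        # Event.wait returns False on timeout and True once stop() is called,
        # so the loop renews every interval and exits promptly on stop.
        while not self._stopped.wait(self._interval):
            try:
                self._renew()
            except Exception:
                # Ownership can no longer be assumed; tell the worker and quit.
                self.lost.set()
                return
```

The worker can check heartbeat.lost.is_set() before each side effect and abandon the task if the lease is gone. The thread is a daemon so that a crashing process does not keep renewing, which is what lets expiry act as the recovery path.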
Release on completion, expire on failure
When work completes, release the lock with the lease ID. The release endpoint returns null on success.
curl -X POST https://api.octostore.io/locks/issue-1842/release \
  -H "Authorization: Bearer ***" \
  -H "Content-Type: application/json" \
  -d '{"lease_id":"..."}'
# null
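If the worker dies before release, expiry is the recovery path. The TTL is therefore the worst-case time a task can look owned after a crash, so pick the longest delay you can tolerate. During normal execution, renew at an interval well inside the TTL, so that a failed renewal attempt can be retried before the lease lapses.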
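In a Python worker, the release-on-completion half can be expressed as a context manager. This is a sketch under assumptions: owned is a hypothetical helper, and release stands for any zero-argument callable that sends the release request with the lease ID.

```python
import logging
from contextlib import contextmanager
from typing import Callable, Iterator

log = logging.getLogger("lease")


@contextmanager
def owned(release: Callable[[], None]) -> Iterator[None]:
    """Run the body while holding a lease, then release it.

    The release runs whether the body returns or raises. A hard crash
    (kill signal, lost machine) never reaches the finally block, and in
    that case the TTL frees the lock instead.
    """
    try:
        yield
    finally:
        try:
            release()
        except Exception:
            # A failed release is recoverable: the lease simply expires.
            log.warning("release failed; the lease will expire on its own", exc_info=True)
```

Releasing after a handled exception is a design choice: it lets another worker retry right away. If a failed task should stay parked for a while, skip the release on error and let the TTL run out.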
What a lock does not solve
A distributed lock prevents ordinary duplicate starts. It does not make external side effects magically transactional. A worker that stalls past its TTL can resume after another worker has acquired the same lock, so for a short window both may act as the owner. Hosted agents should still write idempotently, handle retries, and be careful with irreversible actions. Treat the lock as the coordination boundary, not as a substitute for safe application logic.
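As one illustration of writing idempotently, each side effect can carry a key derived from the task and the action, and be skipped when that key has already been recorded. The names idempotency_key and run_once are hypothetical, and the in-memory set below only stands in for a durable store with a uniqueness guarantee.

```python
import hashlib
from typing import Callable, MutableSet


def idempotency_key(task: str, action: str) -> str:
    """Derive a stable key so a repeated attempt maps to the same record."""
    return hashlib.sha256(f"{task}\n{action}".encode("utf-8")).hexdigest()


def run_once(
    done: MutableSet[str], task: str, action: str, effect: Callable[[], None]
) -> bool:
    """Run `effect` unless this (task, action) pair was already recorded.

    Returns True if the effect ran. The key is recorded only after the
    effect succeeds, so a crash in between can still repeat the effect:
    this narrows duplicates, it does not eliminate them.
    """
    key = idempotency_key(task, action)
    if key in done:
        return False
    effect()
    done.add(key)
    return True
```

Where the target system accepts an idempotency key of its own, passing the same derived key along with the request is usually the stronger option, since the deduplication then happens where the side effect lands.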
Use the smallest useful coordinator
For many hosted agent workloads, the right abstraction is not a scheduler or DAG engine. It is a lease: one visible owner, renewed while alive, released when done, expired when abandoned.