Coregit vs GitHub: April 2026 benchmarks
Performance matters for AI agents. An agent that commits code 50 times per session burns minutes on API latency alone — time it could spend reasoning. We ran a comprehensive benchmark comparing Coregit and GitHub across 9 common operations, then dug into the architecture to explain why the numbers look the way they do.
Methodology
All tests ran against private repositories with authentication, using the official CLI tools (cgt for Coregit, gh for GitHub). Each operation was measured 3 times; we report the median. Both services were tested from the same machine (US-West) on a stable network connection. Repos contained ~200 files of mixed content.
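The measurement loop itself is simple; a minimal sketch, where runOnce stands in for shelling out to cgt or gh (it is a placeholder, not the actual harness):

```typescript
// Report the median of an odd number of timing samples.
function median(samples: number[]): number {
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.floor(sorted.length / 2)];
}

// Run one operation `runs` times and return the median wall-clock time
// in milliseconds. `runOnce` is any async thunk (e.g. spawning the CLI).
async function medianLatencyMs(
  runOnce: () => Promise<void>,
  runs = 3
): Promise<number> {
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    await runOnce();
    samples.push(Date.now() - start);
  }
  return median(samples);
}
```

Using the median rather than the mean keeps a single slow outlier (a cold cache, a TCP retransmit) from skewing a 3-run sample.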
Results
| Operation | GitHub | Coregit | Ratio |
|---|---|---|---|
| Commit 1 file | 2,217 ms | 2,148 ms | ~Parity |
| Commit 5 files | 4,829 ms | 3,456 ms | Coregit 1.4x |
| Commit 10 files | 8,387 ms | 4,183 ms | Coregit 2.0x |
| Commit 100 files | 72,064 ms | 19,769 ms | Coregit 3.6x |
| Read file | 735 ms | 800 ms | GitHub 1.1x |
| List tree | 797 ms | 752 ms | Coregit 1.1x |
| List commits | 829 ms | 474 ms | Coregit 1.7x |
| Diff branches | 738 ms | 752 ms | ~Parity |
| Write throughput | 38 commits/hr | 15,000 commits/hr | Coregit 394x |
The headline numbers — 3.6x on multi-file commits, 394x on throughput — are real, but the story behind them is more interesting than the ratios.
Why single-file commits are at parity
A single-file commit on Coregit takes one API call. On GitHub, it takes five sequential calls (get ref → create blob → create tree → create commit → update ref). So why is the latency nearly identical?
Network round-trip time dominates. A single Coregit API call still needs to: authenticate the request (hash the API key, check KV cache), resolve the branch ref from R2, flatten the current tree, build a new tree, create the commit object, update the ref with compare-and-swap, and return. That's ~2 seconds of server-side work.
GitHub's five calls are individually fast (~400ms each) because they do less work per call, and decades of infrastructure optimization mean each one is served from hot caches.
The crossover happens at 5 files. At that point, GitHub's linear scaling (one additional ~400ms blob-creation call per file) starts losing to Coregit's sublinear scaling.
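A toy cost model makes the crossover concrete. The constants are the rough figures above (~400ms per GitHub call, ~2s of Coregit server-side work), not measured parameters, and the per-subtree cost is an illustrative guess:

```typescript
// Toy latency model for an n-file commit. GitHub pays ~4 fixed
// orchestration calls (get ref, create tree, create commit, update ref)
// plus one blob-creation call per file, all sequential at ~400ms each.
const GITHUB_CALL_MS = 400;

function githubCommitMs(files: number): number {
  return (4 + files) * GITHUB_CALL_MS;
}

// Coregit pays one call: ~2s of fixed server-side work plus a small
// (assumed) per-subtree build cost, since subtrees are built in parallel.
function coregitCommitMs(files: number, subtrees: number): number {
  return 2000 + subtrees * 150;
}
```

Under this model a 1-file commit is near parity (2,000ms vs 2,150ms), while a 5-file commit in two directories already favors Coregit (3,600ms vs 2,300ms), which matches the shape of the benchmark table.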
Why batch commits scale sublinearly
The 100-file commit is where the architecture difference becomes stark. GitHub: 72 seconds (105 sequential API calls). Coregit: 19.8 seconds (1 API call).
Here's what happens inside that single Coregit API call:
1. Tree flattening (cached). The current commit's tree is flattened into a Map<path, {sha, mode}>. This result is cached in KV, keyed by the commit SHA — since git commits are immutable, this cache never invalidates. On a second commit to the same branch, this step is a KV lookup (~5ms) instead of a tree traversal (~200ms).
2. Change application. All 100 file changes are applied to the in-memory tree map. Creates, edits, deletes, renames — all applied in a single pass.
3. Parallel subtree construction. Git trees are hierarchical: src/app.ts lives in a src/ subtree which lives in the root tree. Coregit builds these subtrees in parallel using Promise.all. If your 100 files span 10 directories, those 10 subtrees are built concurrently.
4. Parallel R2 writes. New blob objects are written to R2 in parallel batches of 20. The Durable Object hot layer (RepoHotDO) acknowledges writes in ~2ms and flushes to R2 asynchronously every 30 seconds.
5. Atomic ref update. The branch ref is updated via R2's onlyIf: { etagMatches } — a compare-and-swap operation that prevents concurrent commits from clobbering each other.
The result: Coregit's latency grows with the number of unique directory subtrees (usually far fewer than the files they contain), not with the number of files themselves.
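Steps 2 and 3 above can be sketched in a few lines. The function names and types here are illustrative, not Coregit's actual code:

```typescript
type Entry = { sha: string; mode: string };
type FlatTree = Map<string, Entry>; // path -> { sha, mode }, as in step 1

// Step 2: apply a whole batch of changes to the flattened tree in one pass.
function applyChanges(
  tree: FlatTree,
  changes: { path: string; sha?: string; delete?: boolean }[]
): FlatTree {
  const next = new Map(tree);
  for (const c of changes) {
    if (c.delete) next.delete(c.path);
    else next.set(c.path, { sha: c.sha!, mode: "100644" });
  }
  return next;
}

// Step 3 (setup): group entries by top-level directory so each subtree
// can then be built concurrently with Promise.all. Entries at the repo
// root go under the "" key.
function groupByDir(tree: FlatTree): Map<string, FlatTree> {
  const groups = new Map<string, FlatTree>();
  for (const [path, entry] of tree) {
    const dir = path.includes("/") ? path.split("/")[0] : "";
    if (!groups.has(dir)) groups.set(dir, new Map());
    groups.get(dir)!.set(path, entry);
  }
  return groups;
}
```

Because the batch is applied to an in-memory map, a rename is just a delete plus a set, and 100 changes cost one traversal rather than 100 API round trips.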
Fire-and-forget: why the response returns before the work is done
The 19.8-second number for 100 files is the time until the client gets a 201 response. But not all work is done at that point. Coregit separates what the client needs to wait for from what can happen after the response.
Synchronous (blocks the response):
- Auth validation
- Tree flattening and change application
- Blob/tree/commit object creation (written to Durable Object in ~2ms each)
- Branch ref update via CAS
Asynchronous (runs after the response is sent):
- R2 durability writes. When RepoHotDO is active, every putObject() call writes to the Durable Object (awaited, ~2ms) and fires off an R2 write in parallel without awaiting it. The R2 write completes 200-500ms later, but the response is already on the wire.
- Semantic indexing. Queue messages for Voyage AI embedding + Pinecone upsert are dispatched via ctx.waitUntil(). The Worker stays alive to send the messages, but the HTTP response is already returned.
- Code graph indexing. Tree-sitter AST parsing and PostgreSQL node/edge insertion, also queued rather than run inline.
- Usage tracking and billing. Database insert for usage_event plus an HTTP POST to the billing provider, both via ctx.waitUntil().
- Cache warming. KV writes for flattened trees, Edge Cache writes for immutable objects, and auth cache updates, all fire-and-forget with .catch(() => {}).
The Durable Object acts as a write-ahead buffer. RepoHotDO flushes accumulated objects to R2 every 30 seconds via a DO alarm. If the DO has more than 2,000 unflushed objects, it returns HTTP 507 and the caller falls back to direct R2 writes — a backpressure mechanism that prevents the buffer from growing unbounded.
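The buffer-plus-backpressure behavior can be sketched as a plain class. The 2,000-object threshold comes from the text above; the class and method names are illustrative:

```typescript
// Write-ahead buffer sketch: accept cheap in-memory writes up to a
// threshold, reject beyond it (the HTTP 507 case, so the caller falls
// back to direct R2 writes), and hand everything to a bulk writer on flush.
class WriteAheadBuffer {
  private pending = new Map<string, Uint8Array>();
  constructor(private maxUnflushed = 2000) {}

  // Returns false when the buffer is full -- the backpressure signal.
  put(key: string, value: Uint8Array): boolean {
    if (this.pending.size >= this.maxUnflushed) return false;
    this.pending.set(key, value);
    return true;
  }

  // In the real system this runs on a 30-second DO alarm; here it is
  // invoked on demand with an injected bulk writer (R2 in production).
  async flush(writeBatch: (objs: Map<string, Uint8Array>) => Promise<void>) {
    const batch = this.pending;
    this.pending = new Map();
    await writeBatch(batch);
  }

  get unflushed(): number {
    return this.pending.size;
  }
}
```

Swapping the pending map before awaiting the bulk write means new put() calls during a flush land in a fresh buffer instead of racing the batch being persisted.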
With the Zero-Wait Protocol (X-Session-Id), writes are even more deferred: objects stay exclusively in SessionDO storage until the session is closed or times out after 30 minutes. No R2 writes at all during the session. This means a burst of 50 commits from an agent session can complete with near-zero R2 latency — all durability is batched into a single flush at session close.
The practical effect: an agent that commits, reads the result, and commits again pays ~2ms per object write instead of ~200ms. Over a 50-commit session, that's the difference between 10 seconds and 3+ minutes of cumulative write latency.
Why write throughput is 394x higher
This is the most misunderstood number, so let's break it down.
GitHub's constraint isn't a single rate limit — it's a combination of factors:
- The primary rate limit is 5,000 requests/hour for authenticated users
- But write operations (creating blobs, trees, commits) hit secondary rate limits that kick in much earlier
- Creating a commit from 100 files consumes 105 of those requests
- With exponential backoff on 403/429 responses, sustained throughput drops to ~38 commits/hour for bulk operations
Coregit's design point is different:
- Rate limits are per-API-key: 600 requests/minute, 15,000 requests/hour
- Per-organization: 2,000 requests/minute, 50,000 requests/hour
- A 100-file commit is 1 request, not 105
- Rate limiting runs on Durable Objects (sliding window, per-key), not on the database
Combined with the Zero-Wait Protocol (session-based auth that skips DB/KV lookups after the first call), sustained agent workloads hit the rate limit ceiling far less often.
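The per-key sliding-window check is a small piece of logic; a sketch with injected timestamps so it is testable (one instance per API key is exactly what running it inside a per-key Durable Object provides):

```typescript
// Sliding-window rate limiter: allow a request if fewer than `limit`
// requests landed in the trailing `windowMs` milliseconds.
class SlidingWindow {
  private timestamps: number[] = [];
  constructor(private limit: number, private windowMs: number) {}

  allow(nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    // Drop requests that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}

// Per the limits above: 600 requests per minute for a single API key.
const perKeyLimiter = new SlidingWindow(600, 60_000);
```

Unlike a fixed-window counter, the sliding window never admits a 2x burst across a window boundary, which matters for sustained agent traffic.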
Where GitHub wins (and why)
Single-file reads are slightly faster on GitHub (735ms vs 800ms). This is expected and not something we're trying to "fix."
GitHub has over a decade of caching investment. Their hot-path read infrastructure includes CDN edge caches, application-level caches, and highly optimized storage backends for popular repositories. A warm file read on GitHub is likely served from an edge cache without touching their storage layer at all.
Coregit's read path goes through a six-layer cache hierarchy:
| Layer | Latency | Hit condition |
|---|---|---|
| In-memory (per-request) | 0ms | Same object read twice in one request |
| RepoHotDO | ~2ms | Object written in last 30 seconds |
| KV | ~5ms | Object read in last 60 seconds |
| Edge Cache | <5ms | Immutable object cached at this colo (1-year TTL) |
| Hyperdrive | ~10ms | Database query cached |
| R2 | 50-200ms | Cold read from object storage |
For a warm read (Edge Cache hit), Coregit serves in <5ms. The 800ms benchmark number reflects a cold read — first access after repo creation, no cache priming. In sustained agent workloads where the same files are read repeatedly, Coregit's caching narrows the gap significantly.
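The read path through those layers is a first-hit-wins walk; a sketch where each layer is a stand-in async lookup (the real six layers are the ones in the table):

```typescript
// A cache layer is just a name plus an async lookup returning the object
// or null on miss.
type Layer = {
  name: string;
  get: (key: string) => Promise<Uint8Array | null>;
};

// Try layers in order (fastest first) and return the first hit, tagged
// with the layer that served it. Null means the object exists nowhere.
async function readThrough(
  layers: Layer[],
  key: string
): Promise<{ layer: string; value: Uint8Array } | null> {
  for (const layer of layers) {
    const value = await layer.get(key);
    if (value !== null) return { layer: layer.name, value };
  }
  return null;
}
```

A production version would also write the object back into the faster layers on a lower-layer hit, which is the cache-warming work deferred via fire-and-forget above.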
The hidden cost: developer time
The numbers above measure API latency. But there's a cost that doesn't show up in benchmarks: developer time integrating the API.
Committing files via GitHub's API requires understanding Git's object model (blobs, trees, commits, refs) and orchestrating multiple sequential calls with proper error handling. The octokit SDK helps, but the complexity is inherent to the multi-step protocol.
Coregit's commit endpoint is one call with a JSON body. Error handling is one check. Retry logic is one retry. The cognitive overhead difference is significant, especially when you're building an agent that needs to handle commits as a routine operation.
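To make "one call with a JSON body" concrete, here is a sketch of what such a request could look like. The route and field names are assumptions for illustration, not Coregit's documented schema; only the single-request structure is the point:

```typescript
// Hypothetical single-call commit payload. One POST carries the branch,
// the message, and every file change; compare with GitHub's
// get ref -> create blobs -> create tree -> create commit -> update ref.
type FileChange = { path: string; content?: string; delete?: boolean };

function buildCommitRequest(
  branch: string,
  message: string,
  files: FileChange[]
) {
  return {
    method: "POST" as const,
    path: "/v1/repos/acme/demo/commits", // hypothetical route
    body: { branch, message, files },
  };
}
```

One request means one status check, one retry path, and one place for error handling, regardless of whether the commit touches 1 file or 100.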
What's next
We're working on three improvements that will shift these numbers:
Pack-based reads. Currently, large tree traversals read objects individually. A packfile-aware read path would batch these into fewer R2 requests, improving cold-read latency for deep directory trees.
Embedding-aware indexing. Semantic search indexing currently happens asynchronously after commits. We're exploring inline indexing for small commits (<10 files) to make search results immediately available.
Regional R2 placement. R2 is currently single-region. As Cloudflare rolls out regional R2, we'll place repo data closer to the agent's primary location.
Full methodology
See our detailed benchmark documentation for reproducible test scripts and raw data.