-
-
Notifications
You must be signed in to change notification settings - Fork 185
Description
Summary
First, thank you for building and maintaining Pact Broker! It's been an excellent tool for our contract testing infrastructure.
We've identified a significant performance optimization opportunity that would benefit all Pact Broker users as their installations grow. When publishing contracts, a verification query can take 13+ seconds due to a missing composite index on the branch_versions table. Adding the index branch_versions(version_id, branch_name) reduces execution time to ~7ms - a ~2,000x improvement.
This issue exists in Pact Broker 2.110.0 and is still present in the recent 2.117.0 release (tested locally).
Impact
- Severity: HIGH - Can cause timeouts and errors during contract publication
- Frequency: Triggered on every contract publication when querying latest verifications
- Affected query: Latest verification lookup with main branch filtering
- Scale: Affects installations as they grow beyond 50-100k branch_versions records
Data Volumetry Example
Here's an example of the scale at which this issue becomes critical (from a production Pact Broker after ~16 months of use):
| Metric | Volume |
|---|---|
| Pacticipants | ~60 services |
| Versions | ~190k versions (~400 versions/day) |
| Branch Versions | ~190k records across ~3,600 unique branches |
| Pact Publications | ~170k publications |
| Verifications | ~60k verification records |
Critical finding: With the branch_versions table having ~190k rows and no compound index on (version_id, branch_name), the query processes and filters over 330 million row comparisons for each execution.
Growth trajectory: Data naturally accumulates as:
- New services adopt contract testing
- CI/CD pipelines generate new versions daily
- Feature branches multiply
- Even with cleanup policies, active data keeps growing
Important: The volumes above exist despite running regular cleanup with this retention policy:
{
"keep_version_selectors": [
{"latest": true, "branch": true},
{"latest": true, "max_age": 30},
{"branch": "main", "max_age": 90},
{"branch": "master", "max_age": 90},
{"branch": "production", "max_age": 90},
{"latest": true, "tag": true},
{"max_age": 15}
]
}This retention policy keeps:
- All latest branch versions (no age limit)
- Latest versions for 30 days
- Main/master/production branches for 90 days
- Other versions for 15 days only
Even with regular cleanup, the table accumulated ~190k records over 16 months. Installations with less frequent cleanup will experience this issue sooner. The index is essential regardless of cleanup strategy.
The performance degradation is exponential as the branch_versions table grows, making this index increasingly critical over time.
Root Cause
The query is generated by the from_provider_main_branch method in lib/pact_broker/domain/verification.rb:38:
# TODO optimise this
def from_provider_main_branch
providers_join = {
Sequel[:verifications][:provider_id] => Sequel[:providers][:id]
}
branch_versions_join = {
Sequel[:verifications][:provider_version_id] => Sequel[:branch_versions][:version_id],
Sequel[:providers][:main_branch] => Sequel[:branch_versions][:branch_name]
}
join(:pacticipants, providers_join, { table_alias: :providers })
.join(:branch_versions, branch_versions_join)
endNote: The code includes a # TODO optimise this comment - this index addresses that optimization need!
This method is used when loading latest_verification_from_main_branch associations in lib/pact_broker/pacts/pact_version.rb. The resulting SQL joins latest_verification_id_for_pact_version_and_provider_version, pacticipants, and branch_versions to find the latest verification on the provider's main branch:
SELECT * FROM "verifications"
WHERE ("id" IN (
SELECT max("verification_id")
FROM "latest_verification_id_for_pact_version_and_provider_version"
INNER JOIN "pacticipants" AS "providers"
ON ("providers"."id" = "latest_verification_id_for_pact_version_and_provider_version"."provider_id")
INNER JOIN "branch_versions"
ON (("latest_verification_id_for_pact_version_and_provider_version"."provider_version_id" = "branch_versions"."version_id")
AND ("providers"."main_branch" = "branch_versions"."branch_name"))
WHERE ("pact_version_id" = $PACT_VERSION_ID)
))
LIMIT 1Without the index, PostgreSQL performs a Nested Loop join that processes and filters 330,089,377 rows:
-> Nested Loop (cost=47.79..4498.60 rows=585 width=4) (actual time=102.018..13530.468 rows=863 loops=1)
Output: latest_verification_id_for_pact_version_and_provider_version.verification_id
Join Filter: (latest_verification_id_for_pact_version_and_provider_version.provider_version_id = branch_versions.version_id)
Rows Removed by Join Filter: 330089377
-> Memoize (cost=0.43..36.68 rows=62 width=18) (actual time=0.000..1.242 rows=82112 loops=4020)
Output: branch_versions.version_id, branch_versions.branch_name
-> Index Scan using branch_versions_branch_name_index on branch_versions
Index Cond: (branch_versions.branch_name = providers.main_branch)
Planning Time: 1.153 ms
Execution Time: 13531.478 ms
This catastrophic row explosion occurs because:
- The query retrieves ~4,000 provider versions for a given pact
- For each provider version, PostgreSQL scans branch_versions using only the
branch_nameindex - Without a compound index on
(version_id, branch_name), it retrieves ~82k rows per loop (all rows for that branch) - Then applies a Join Filter on
provider_version_id = version_id, removing ~330M rows - Result: 4,000 loops × 82k rows = ~330 million comparisons filtered in memory
Solution
Add a composite index on branch_versions(version_id, branch_name):
CREATE INDEX branch_versions_version_id_branch_name_idx
ON branch_versions(version_id, branch_name);With the index, PostgreSQL uses an Index Only Scan:
-> Nested Loop (cost=0.71..3357.70 rows=4075 width=22) (actual time=0.038..5.821 rows=4020 loops=1)
Output: latest_verification_id_for_pact_version_and_provider_version.verification_id, ...
-> Index Only Scan using branch_versions_version_id_branch_name_idx on branch_versions
Index Cond: (branch_versions.version_id = latest_verification_id_for_pact_version_and_provider_version.provider_version_id)
Heap Fetches: 0
Buffers: shared hit=12061
Planning Time: 2.396 ms
Execution Time: 6.518 ms
Performance Comparison
| Metric | Before (No Index) | After (With Index) | Improvement |
|---|---|---|---|
| Execution Time | ~13.5s | ~6.5ms | ~2,000x faster |
| Planning Time | ~1.2ms | ~2.4ms | Slightly slower (acceptable) |
| Rows Removed by Filter | ~330M | 0 | 100% elimination |
| Actual Rows per Loop | ~82k (4k loops) | 1 (4k loops) | ~82,000x reduction |
| Heap Fetches | N/A (heap scan) | 0 | Index-only scan |
| Join Strategy | Nested Loop + Join Filter | Index Only Scan | Optimal |
Evidence
Query plan analysis shows:
- Before: Bitmap Heap Scan → Nested Loop → massive row filtering
- After: Index Scan → Index Only Scan → direct lookups
The join condition (provider_version_id = version_id) AND (main_branch = branch_name) is executed ~4,000 times (once per provider version), making the composite index critical for performance.
Proposed Fix
I have submitted a pull request with the database migration to add this index. The migration follows the existing pattern (see db/migrations/20230216_add_branch_heads_branch_version_id_index.rb):
File: db/migrations/YYYYMMDD_add_branch_versions_version_id_branch_name_index.rb
require_relative "migration_helper"
include PactBroker::MigrationHelper
Sequel.migration do
up do
if !mysql?
alter_table(:branch_versions) do
add_index([:version_id, :branch_name], name: "branch_versions_version_id_branch_name_idx")
end
end
end
down do
if !mysql?
alter_table(:branch_versions) do
drop_index([:version_id, :branch_name], name: "branch_versions_version_id_branch_name_idx")
end
end
end
endNotes:
- The migration follows the existing pattern of skipping MySQL (which has different foreign key index behavior)
- For production deployments, consider using
CREATE INDEX CONCURRENTLY(PostgreSQL) to avoid table locks - The index name follows the existing naming convention used in other migrations
Pull Request: I have submitted a PR with this migration.
Why This Matters for All Users
Even smaller Pact Broker installations will experience this issue as they grow:
- Startup phase (months 0-6): Negligible impact with <10k branch_versions
- Growth phase (months 6-12): Noticeable slowdowns (1-3s) with ~50k records
- Scale phase (12+ months): Critical timeouts (10+ seconds) with ~100k+ records
Based on our experience, the critical phase can be reached at ~16 months with ~190k records. The index ensures Pact Broker scales gracefully as adoption grows.
Environment Details
- Pact Broker versions affected: 2.110.0, 2.117.0 (tested)
- Database: PostgreSQL 15.12
- Docker image:
pactfoundation/pact-broker:2.120.0-pactbroker2.110.0
References
- Query appears in logs when consumers publish contracts
- Full EXPLAIN ANALYZE output available if needed for analysis
Query Plan Analysis
The performance improvement comes from eliminating massive in-memory row filtering:
Before index (~13.5s execution):
- Uses
Memoizenode with cache, but retrieves ~82k rows per cached entry - Applies
Join Filterremoving ~330M rows in memory - Total buffers: ~2,500 shared hits
After index (~6.5ms execution):
- Uses direct
Index Only Scanon compound index - No join filter needed - index satisfies both join conditions
- Heap Fetches: 0 (true index-only scan)
- Total buffers: ~15k shared hits
The improvement comes from eliminating CPU-intensive row filtering rather than reducing I/O.