# Performance Testing with Grafana k6

This document outlines the comprehensive performance testing setup for Fast-Crawl using Grafana k6, including local development workflows and CI/CD integration with strict timeout safeguards.
## Overview

Fast-Crawl includes a robust performance testing suite built with Grafana k6 that validates the performance and reliability of all API endpoints. The testing framework emphasizes safety first with strict timeout controls to prevent infinite runs and resource exhaustion.
## Key Features

- ⏱️ **Timeout Safety**: All tests are limited to ≤15 seconds duration using explicit `--vus` and `--duration` flags
- 🔄 **CI/CD Integration**: Automated performance testing in GitHub Actions on every pull request
- 🐳 **Local Docker Builds**: Tests run against actual PR code changes, not registry images
- 📊 **Comprehensive Metrics**: Response times, error rates, and custom performance indicators
- 🛡️ **CI-Compatible**: Graceful handling of external service dependencies in CI environments
## Test Coverage

The performance testing suite covers all API endpoints with optimized configurations:

| Endpoint | Duration | Virtual Users | Focus Area |
|---|---|---|---|
| `/v1/health` | 10s | 10 VUs | Health check responsiveness |
| `/v1/search` | 15s | 5 VUs | Search aggregation performance |
| `/v1/scrap` | 12s | 3 VUs | Content scraping efficiency |
| `/v1/results` | 10s | 8 VUs | Results processing speed |
| Combined | 15s | 5 VUs | Mixed workload simulation |
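The table above maps cleanly onto explicit `k6 run` invocations. As a minimal sketch (the `endpointConfigs` map and `k6Command` helper are illustrative, not part of the repository), the per-endpoint settings can be turned into the matching command lines:

```javascript
// Hypothetical helper: derives explicit-timeout k6 commands from the table above.
const endpointConfigs = {
  health:   { script: 'k6/health-test.js',   duration: '10s', vus: 10 },
  search:   { script: 'k6/search-test.js',   duration: '15s', vus: 5 },
  scrap:    { script: 'k6/scrap-test.js',    duration: '12s', vus: 3 },
  results:  { script: 'k6/results-test.js',  duration: '10s', vus: 8 },
  combined: { script: 'k6/combined-test.js', duration: '15s', vus: 5 },
};

// Build the command line for one endpoint, always including the safety flags.
function k6Command(name) {
  const { script, duration, vus } = endpointConfigs[name];
  return `k6 run --vus ${vus} --duration ${duration} ${script}`;
}

console.log(k6Command('health'));
// → k6 run --vus 10 --duration 10s k6/health-test.js
```

Keeping the configuration in one place like this makes it hard to accidentally launch a run without the `--vus`/`--duration` safeguards.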
## Local Development

### Prerequisites

- **Install k6**: Follow the official installation guide
- **Start Fast-Crawl**: Ensure the server is running locally:

  ```bash
  bun run start
  ```
### Running Tests

The project includes convenient npm scripts for running performance tests:

```bash
# Run all endpoints in a combined test (recommended)
npm run load-test

# Run individual endpoint tests
npm run load-test:health   # Health endpoint only
npm run load-test:search   # Search endpoint only
npm run load-test:scrap    # Scraping endpoint only
npm run load-test:results  # Results endpoint only
```
### Custom Test Execution

For advanced testing scenarios, you can run k6 directly with custom parameters:

```bash
# Custom virtual users and duration (max 15s)
k6 run --vus 8 --duration 12s k6/health-test.js

# Different base URL (for testing different environments)
BASE_URL=http://localhost:8080 k6 run --vus 5 --duration 10s k6/combined-test.js

# Disable summary for CI-style output
k6 run --vus 5 --duration 10s --no-summary k6/search-test.js
```
## GitHub Actions Workflow

The performance testing workflow (`performance-test.yml`) runs automatically on:

- Push to `main`/`develop` branches
- Pull requests to `main`
- Manual trigger (`workflow_dispatch`)
### Workflow Steps

- **Code Checkout**: Retrieves the latest code from the PR/branch
- **Local Docker Build**: Builds the Docker image from current code using `docker build -t fast-crawl:test .`
- **Service Startup**: Starts the containerized application with test environment variables
- **k6 Setup**: Installs Grafana k6 using `grafana/setup-k6-action@v1`
- **Health Check**: Waits for the service to be ready before running tests
- **Load Testing**: Executes all test suites with proper timeout flags
- **Cleanup**: Stops and removes test containers
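The health-check step above is essentially a bounded retry loop: probe the service, sleep, give up after a fixed number of attempts. A Node-runnable sketch of that logic (hypothetical; the actual workflow step likely shells out to `curl`):

```javascript
// Hypothetical sketch of the readiness wait. `probe` is any async function
// returning true once the service answers; the loop is bounded, never infinite.
async function waitForReady(probe, { attempts = 30, delayMs = 1000 } = {}) {
  for (let i = 0; i < attempts; i++) {
    if (await probe()) return i + 1; // number of attempts used
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`service not ready after ${attempts} attempts`);
}

// Example with a fake probe that succeeds on the third try:
let calls = 0;
waitForReady(async () => ++calls >= 3, { delayMs: 10 }).then((used) =>
  console.log(used) // → 3
);
```

With the defaults above (30 attempts, 1s apart) this matches the 30-second startup budget mentioned in the troubleshooting section.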
### Why Local Docker Builds?

The workflow builds Docker images locally instead of pulling from registries to ensure:

- **Testing Actual Changes**: Performance tests validate the exact code in the PR
- **No Registry Dependencies**: Eliminates failures due to missing or outdated images
- **Consistency**: Same build process across development and CI environments
## Safety Features & Timeout Handling

### Mandatory Timeout Protection

All k6 commands include explicit timeout parameters to prevent infinite execution:

```bash
# ✅ SAFE: Explicit duration and virtual users
k6 run --vus 10 --duration 10s k6/health-test.js

# ❌ UNSAFE: Could run indefinitely
k6 run k6/health-test.js
```
### Configuration Safeguards

Each test script includes built-in safety configurations:

```javascript
export const options = {
  duration: '10s', // Maximum test duration
  vus: 10,         // Virtual user count
  thresholds: {
    http_req_duration: ['p(95)<500'], // Performance thresholds
    http_req_failed: ['rate<0.1'],    // Error rate limits
  },
};
```
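The 15-second cap could also be enforced programmatically. The guard below is a hypothetical helper (not part of the repository) that rejects any options object missing a duration or exceeding the cap, assuming the simple `<n>s` duration form used throughout this document:

```javascript
// Hypothetical guard: reject k6 options that exceed the project's 15s cap.
const MAX_DURATION_SECONDS = 15;

function parseDurationSeconds(duration) {
  // Supports only the simple '<n>s' form used in this document.
  const match = /^(\d+)s$/.exec(duration ?? '');
  return match ? Number(match[1]) : null;
}

function validateOptions(options) {
  const seconds = parseDurationSeconds(options.duration);
  if (seconds === null) {
    throw new Error('options.duration must be set, e.g. "10s"');
  }
  if (seconds > MAX_DURATION_SECONDS) {
    throw new Error(`duration ${options.duration} exceeds ${MAX_DURATION_SECONDS}s cap`);
  }
  return true;
}

console.log(validateOptions({ duration: '10s', vus: 10 })); // → true
```

A check like this could run in a pre-commit hook or CI lint step so an unsafe options object never reaches a real run.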
### Resource Management

- **Virtual User Limits**: Optimized per endpoint complexity (3-10 VUs)
- **Sleep Delays**: Built-in delays prevent overwhelming the API
- **Memory Limits**: Container resource constraints in CI environment
## Performance Thresholds

Each endpoint has specific performance criteria that must be met:

### Health Endpoint (`/v1/health`)

- **Response Time**: 95% of requests < 200ms
- **Error Rate**: < 10%
- **Availability**: Should always respond successfully

### Search Endpoint (`/v1/search`)

- **Response Time**: 95% of requests < 2.5s
- **Error Rate**: < 80% (CI-compatible due to external API dependencies)
- **Throughput**: Handle concurrent search aggregation

### Scrap Endpoint (`/v1/scrap`)

- **Response Time**: 95% of requests < 4s
- **Error Rate**: < 30% (Playwright browser automation)
- **Resource Usage**: Efficient content extraction

### Results Endpoint (`/v1/results`)

- **Response Time**: 95% of requests < 800ms
- **Error Rate**: < 10%
- **Processing Speed**: Fast results manipulation
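These criteria translate directly into k6 `thresholds` expressions (`p(95)<N` in milliseconds, `rate<R` as a fraction). As a sketch, the hypothetical `criteria` map and `toThresholds` helper below show the translation; the names are illustrative, not taken from the repository:

```javascript
// Hypothetical helper: turns the per-endpoint criteria above into
// the k6 `thresholds` syntax used in the test scripts.
const criteria = {
  health:  { p95Ms: 200,  maxErrorRate: 0.1 },
  search:  { p95Ms: 2500, maxErrorRate: 0.8 },
  scrap:   { p95Ms: 4000, maxErrorRate: 0.3 },
  results: { p95Ms: 800,  maxErrorRate: 0.1 },
};

function toThresholds({ p95Ms, maxErrorRate }) {
  return {
    http_req_duration: [`p(95)<${p95Ms}`], // response-time budget
    http_req_failed: [`rate<${maxErrorRate}`], // error-rate budget
  };
}

console.log(toThresholds(criteria.search));
// → { http_req_duration: [ 'p(95)<2500' ], http_req_failed: [ 'rate<0.8' ] }
```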
## CI Environment Considerations

### External Service Dependencies

Some endpoints depend on external services (Google, Bing) that may not be available in CI:

- **Error Rate Tolerance**: Higher error rate thresholds (80%) for search/scrap endpoints
- **Environment Variables**: `DISABLE_EXTERNAL_APIS=true` for testing
- **Graceful Degradation**: Tests validate API structure even when external calls fail
### Firewall Restrictions

The CI environment may block certain domains:

- k6 telemetry endpoints (`stats.grafana.org`) are blocked
- Runs can set `--no-usage-report` (or `K6_NO_USAGE_REPORT=true`) to skip the usage report k6 sends to `stats.grafana.org`; the `--no-summary` flag only suppresses the end-of-test summary output
- Local testing remains unaffected
## Test Scripts Architecture

### Individual Test Files

Each endpoint has a dedicated test file with optimized configurations:

```
k6/
├── health-test.js    # Health endpoint (10s, 10 VUs)
├── search-test.js    # Search endpoint (15s, 5 VUs)
├── scrap-test.js     # Scraping endpoint (12s, 3 VUs)
├── results-test.js   # Results endpoint (10s, 8 VUs)
├── combined-test.js  # Mixed workload (15s, 5 VUs)
└── README.md         # Test-specific documentation
```
### Custom Metrics

Tests include custom metrics for detailed performance insights:

```javascript
import http from 'k6/http';
import { Trend } from 'k6/metrics';

const customMetric = new Trend('endpoint_response_time');

export default function () {
  // k6 requires an absolute URL here
  const response = http.get('http://localhost:3000/v1/endpoint');
  customMetric.add(response.timings.duration);
}
```
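A `Trend` collects every sample added to it and reports aggregate statistics (min, max, average, percentiles) in the end-of-test summary. To make the percentile math concrete, here is a tiny Node-runnable stand-in; it is illustrative only and not k6's actual implementation:

```javascript
// Illustrative only: a minimal stand-in for what a k6 Trend metric computes.
class MiniTrend {
  constructor(name) {
    this.name = name;
    this.samples = [];
  }
  add(value) {
    this.samples.push(value);
  }
  percentile(p) {
    // Nearest-rank percentile over all collected samples.
    const sorted = [...this.samples].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }
}

const trend = new MiniTrend('endpoint_response_time');
[120, 180, 450, 90, 300].forEach((ms) => trend.add(ms));
console.log(trend.percentile(95)); // → 450
```

This is the statistic that a threshold like `p(95)<500` is evaluated against.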
## Troubleshooting

### Common Issues

**Tests timeout in CI**

```bash
# Solution: Verify duration flags are set correctly
k6 run --vus 5 --duration 10s script.js
```

**High error rates in CI**

- Expected for endpoints with external dependencies
- Check if `DISABLE_EXTERNAL_APIS=true` is set
- Verify thresholds are appropriate for the CI environment

**Docker build failures**

```bash
# Debug locally:
docker build -t fast-crawl:test .
docker run -d -p 3000:3000 fast-crawl:test
```

**Service not ready**

- The workflow includes a 30-second timeout for service startup
- Check container logs: `docker logs fast-crawl-test`
### Debugging Performance Issues

Run tests locally with verbose output:

```bash
k6 run --vus 1 --duration 5s --http-debug k6/search-test.js
```

Check application logs during test execution:

```bash
docker logs -f fast-crawl-test
```

Monitor resource usage:

```bash
docker stats fast-crawl-test
```
## Customizing Tests

### Adding New Endpoints

- Create a new test file in the `k6/` directory
- Follow the existing patterns for timeout safety
- Add an npm script in `package.json`
- Include it in the GitHub Actions workflow
- Update this documentation
### Modifying Thresholds

Update the `options.thresholds` object in test files:

```javascript
export const options = {
  thresholds: {
    http_req_duration: ['p(95)<1000'], // 95% under 1s
    http_req_failed: ['rate<0.05'],    // 5% error rate
    custom_metric: ['avg<500'],        // Custom threshold
  },
};
```
### Environment-Specific Configuration

Use environment variables for different testing scenarios:

```javascript
const BASE_URL = __ENV.BASE_URL || 'http://localhost:3000';
const ERROR_THRESHOLD = __ENV.CI ? 0.8 : 0.1; // Higher tolerance in CI
```
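The same fallback pattern works outside k6: k6 scripts read `__ENV`, while plain Node reads `process.env`. A Node-runnable sketch of the resolution logic, with a hypothetical `resolveConfig` helper:

```javascript
// Node-runnable analog of the k6 pattern above. `env` stands in for
// k6's `__ENV` (or Node's `process.env`); the fallback logic is identical.
function resolveConfig(env) {
  return {
    baseUrl: env.BASE_URL || 'http://localhost:3000',
    errorThreshold: env.CI ? 0.8 : 0.1, // looser error budget when CI is set
  };
}

console.log(resolveConfig({}));
// → { baseUrl: 'http://localhost:3000', errorThreshold: 0.1 }
console.log(resolveConfig({ BASE_URL: 'http://localhost:8080', CI: 'true' }));
// → { baseUrl: 'http://localhost:8080', errorThreshold: 0.8 }
```

Note that environment variables are always strings, so `CI: 'false'` would still be truthy; compare against an explicit value if that matters.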
## Best Practices

- **Always Use Timeouts**: Include the `--duration` flag in all k6 commands
- **Optimize Virtual Users**: Balance load testing with resource constraints
- **Monitor External Dependencies**: Account for third-party service availability
- **Test Realistic Scenarios**: Use representative data and request patterns
- **Document Changes**: Update thresholds and configurations as the API evolves
## Related Documentation

- k6 Test Scripts README - Detailed test script documentation
- v1 API Reference - API endpoint specifications
- GitHub Actions Workflow - CI configuration