title: "Database Optimization Guide" description: "Comprehensive guide for optimizing PostgreSQL database performance in Navius applications, including indexing, query optimization, and schema design" category: "Guides" tags: ["database", "postgresql", "optimization", "performance", "indexing", "queries", "schema"] last_updated: "April 5, 2025" version: "1.0"
Database Optimization Guide
Overview
This guide provides comprehensive strategies for optimizing PostgreSQL database performance in Navius applications. Database performance is critical for application responsiveness and scalability, as database operations often represent the most significant bottleneck in web applications.
Database Design Principles
Schema Design
- Normalize with purpose - Follow normalization principles but prioritize query performance
- Choose appropriate data types - Use the most efficient data types for each column
- Limit column width - Use varchar with appropriate length limits instead of unlimited text fields
- Consider table partitioning - For very large tables (millions of rows)
Index Design
- Primary keys - Always define explicit primary keys
- Foreign keys - Index all foreign key columns
- Compound indexes - Create for commonly queried column combinations
- Cover indexes - Include additional columns to create covering indexes
- Partial indexes - Use for filtered queries on large tables
-- Example: Compound index for a commonly used query pattern
CREATE INDEX idx_users_email_status ON users (email, status);
-- Example: Covering index to avoid table lookups
CREATE INDEX idx_posts_author_created_title ON posts (author_id, created_at) INCLUDE (title);
-- Example: Partial index for active users
CREATE INDEX idx_active_users ON users (email, last_login) WHERE status = 'active';
Query Optimization
Query Analysis
Use EXPLAIN ANALYZE
to understand query execution plans:
EXPLAIN ANALYZE SELECT * FROM users
WHERE email LIKE 'user%' AND status = 'active'
ORDER BY created_at DESC LIMIT 10;
Look for these issues in query plans:
- Sequential scans on large tables
- High cost operations
- Unused indexes
- Poor join performance
Common Optimizations
1. Avoid SELECT *
-- Instead of this
SELECT * FROM users WHERE id = 1;
-- Do this
SELECT id, email, name, created_at FROM users WHERE id = 1;
2. Use Parameterized Queries
3. Batch Operations
4. Use Appropriate WHERE Clauses
-- Instead of functions on indexed columns
SELECT * FROM users WHERE LOWER(email) = 'user@example.com';
-- Do this
SELECT * FROM users WHERE email = 'user@example.com';
Connection Management
Connection Pooling
Configure your connection pool appropriately:
Guidelines for sizing:
- Measure maximum concurrent database operations during peak load
- Consider PostgreSQL's
max_connections
setting (usually 100-300) - Monitor connection usage over time
Transaction Management
- Keep transactions as short as possible
- Don't perform I/O or network operations within transactions
- Use appropriate isolation levels
- Consider using read-only transactions for queries
Advanced Optimization Techniques
PostgreSQL Configuration
Key PostgreSQL settings to tune:
# Memory settings
shared_buffers = 25% of system RAM (up to 8GB)
work_mem = 32-64MB
maintenance_work_mem = 256MB
# Checkpoint settings
checkpoint_timeout = 15min
checkpoint_completion_target = 0.9
# Planner settings
random_page_cost = 1.1 (for SSD storage)
effective_cache_size = 75% of system RAM
Materialized Views
For expensive reports or analytics queries:
CREATE MATERIALIZED VIEW user_stats AS
SELECT
date_trunc('day', created_at) as day,
count(*) as new_users,
avg(extract(epoch from now() - last_login)) as avg_days_since_login
FROM users
GROUP BY date_trunc('day', created_at);
-- Refresh the view:
REFRESH MATERIALIZED VIEW user_stats;
Database Monitoring
Monitor these metrics:
- Query execution times
- Index usage statistics
- Cache hit ratios
- Lock contention
- Deadlocks
Tools to use:
Implementing Repository Pattern in Navius
The Repository pattern helps maintain clean database access and makes queries easier to optimize:
Performance Testing Database Queries
Benchmarking Strategies
- Isolated Query Testing - Test queries independently from application
- Mock Production Data - Use production-sized datasets
- Concurrent Load Testing - Test under simultaneous connections
- EXPLAIN ANALYZE - Measure execution plan costs
- Cache Warmup/Cooldown - Test with both hot and cold cache scenarios
Testing with Criterion
Common Database Performance Issues
N+1 Query Problem
Missing Indexes
Signs of missing indexes:
- Sequential scans on large tables
- Slow filtering operations
- Slow ORDER BY or GROUP BY clauses
Oversized Queries
- Fetching unnecessary columns
- Not using LIMIT with large result sets
- Not using pagination
- Using subqueries when joins would be more efficient
Database Migration Strategies
When migrating or updating schemas:
-
Create indexes concurrently
CREATE INDEX CONCURRENTLY idx_users_status ON users (status);
-
Perform updates in batches
-
Use temporary tables for complex migrations
CREATE TEMPORARY TABLE temp_users AS SELECT * FROM users WHERE false; INSERT INTO temp_users SELECT * FROM users WHERE <condition>; -- Perform operations on temp_users -- Finally update or insert back to users
Case Study: Optimizing a High-Traffic User Service
Initial State
- Average query time: 150ms
- Database CPU: 85% utilization
- Cache hit ratio: 45%
- Frequent timeouts during peak traffic
Optimization Steps
- Added compound indexes on commonly queried fields
- Implemented result caching for frequent queries
- Optimized schema removing unused columns
- Implemented connection pooling with optimal settings
- Added database replicas for read operations
Results
- Average query time: 15ms (10x improvement)
- Database CPU: 40% utilization
- Cache hit ratio: 80%
- Zero timeouts during peak traffic