PostgreSQL 18 Asynchronous I/O: A Complete Guide
PostgreSQL 18 introduces asynchronous I/O support for read operations, marking a significant shift in how the database handles disk access. This feature addresses a longstanding limitation where PostgreSQL relied entirely on operating system readahead mechanisms that couldn't predict database-specific access patterns effectively.
In previous versions, PostgreSQL issued read requests one at a time, waiting for each to complete before moving to the next. The operating system attempted to anticipate which data blocks might be needed, but without insight into query execution plans or table structures, these predictions often missed their target.
Asynchronous I/O changes this by allowing PostgreSQL to submit multiple read requests concurrently. The database now controls readahead decisions based on query plans, anticipating data needs with much greater accuracy. In environments with high storage latency—particularly cloud platforms with network-attached storage—this can deliver performance improvements of 2-3x for read-intensive workloads.
PostgreSQL 18's asynchronous I/O currently supports sequential scans, bitmap heap scans, and maintenance operations like VACUUM. These operations benefit most because they access data in predictable patterns that the database can anticipate effectively.
This article guides you through configuring and testing PostgreSQL 18's asynchronous I/O on your own system. You'll learn how to measure its impact, understand when it helps, and configure it appropriately for your hardware and workload characteristics.
Prerequisites
Before proceeding with this article, ensure you have PostgreSQL 18 installed on a test system where you can safely experiment:
psql --version
psql (PostgreSQL) 18.0 (Ubuntu 18.0-1.pgdg24.04+3)
This guide assumes familiarity with basic PostgreSQL administration including editing configuration files, running SQL queries, and restarting the database server. You'll also need sufficient disk space to create test tables larger than your available memory, since asynchronous I/O benefits manifest primarily when queries read from disk rather than cache.
Getting started with PostgreSQL 18 asynchronous I/O
Start by creating a test database and examining your current configuration. This establishes a baseline before enabling any new features.
Create a dedicated test database:
sudo -u postgres createdb aio_test
Connect to the database:
sudo -u postgres psql aio_test
Check your current I/O method:
SHOW io_method;
io_method
-----------
worker
(1 row)
PostgreSQL 18 defaults to the worker
I/O method, which uses dedicated background processes for asynchronous I/O. This represents a significant improvement over PostgreSQL 17's synchronous approach.
Check the other key I/O parameters:
SHOW effective_io_concurrency;
SHOW maintenance_io_concurrency;
SHOW io_workers;
effective_io_concurrency
--------------------------
16
(1 row)
maintenance_io_concurrency
----------------------------
16
(1 row)
io_workers
------------
3
(1 row)
The default of 16 for effective_io_concurrency
increased from 1 in PostgreSQL 17, reflecting asynchronous I/O's capabilities. You'll learn how to tune these parameters for your specific hardware shortly.
Understanding the I/O methods
PostgreSQL 18 provides three I/O methods, controlled by the io_method
parameter. Understanding each helps you choose appropriately for your environment.
The worker method
The worker
method (the default) uses dedicated background processes to handle I/O operations. When a backend process needs data, it submits requests to these worker processes rather than blocking while waiting for disk access.
Check how many I/O workers are running on your system. Open a new terminal window (separate from your PostgreSQL session) and run:
ps aux | grep "io worker" | grep -v grep
postgres 890 0.0 0.1 231192 5824 ? Ss 08:44 0:00 postgres: 18/main: io worker 2
postgres 891 0.0 0.3 231324 15168 ? Ss 08:44 0:00 postgres: 18/main: io worker 0
postgres 892 0.0 0.1 231324 6848 ? Ss 08:44 0:00 postgres: 18/main: io worker 1
The three worker processes (labeled io worker 0
, io worker 1
, and io worker 2
) correspond to io_workers = 3
in your configuration. These processes remain active throughout the database's runtime, handling I/O requests from all connections and databases. Unlike regular backend processes that serve individual client connections, I/O workers are shared resources that process read requests submitted by any backend.
The io_uring method
The io_uring
method uses Linux's modern I/O interface (kernel 5.1+) for even more efficient asynchronous operations. This method eliminates the need for worker processes by interacting directly with the kernel through shared ring buffers.
Check if your kernel supports io_uring:
uname -r
6.8.0-85-generic
Your kernel version 6.8 is well above the minimum requirement of 5.1, so io_uring is available. However, your PostgreSQL binary must also be compiled with --with-liburing
support.
First, find the path to your PostgreSQL binary. You can get this from the running postgres process:
ps aux | grep "bin/postgres" | grep -v grep | head -1
postgres 860 0.0 0.8 231192 34304 ? Ss 08:44 0:00 /usr/lib/postgresql/18/bin/postgres -D /var/lib/postgresql/18/main -c config_file=/etc/postgresql/18/main/postgresql.conf
The path /usr/lib/postgresql/18/bin/postgres
is your PostgreSQL binary location. Now check if it has liburing support:
ldd /usr/lib/postgresql/18/bin/postgres | grep uring
liburing.so.2 => /lib/x86_64-linux-gnu/liburing.so.2 (0x00007c1c0951b000)
The presence of liburing.so.2
confirms your PostgreSQL binary was compiled with io_uring
support. This means you can use io_method = 'io_uring'
for potentially better performance than the worker method. The io_uring method typically provides lower overhead since it interacts directly with the kernel rather than using intermediary worker processes.
If the command returns no output, your binary doesn't have io_uring
support, but the worker method will still provide good asynchronous I/O performance.
The sync method
The sync
method disables asynchronous I/O entirely, reverting to PostgreSQL 17's synchronous behavior. This exists primarily for troubleshooting and performance comparison, allowing you to measure the impact of asynchronous I/O by toggling it off.
Creating a test workload
To measure asynchronous I/O's impact accurately, you need a dataset large enough to force disk reads. Return to the terminal where your PostgreSQL session is running.
Start by checking PostgreSQL's buffer cache allocation:
SHOW shared_buffers;
shared_buffers
----------------
128MB
Create a test table significantly larger than your shared buffers:
CREATE TABLE io_benchmark (
id SERIAL PRIMARY KEY,
data TEXT,
value NUMERIC,
created_at TIMESTAMP DEFAULT NOW()
);
Populate it with sample data:
INSERT INTO io_benchmark (data, value)
SELECT
md5(random()::text),
random() * 1000
FROM generate_series(1, 10000000);
This takes a few minutes to complete. Once finished, update the table statistics:
ANALYZE io_benchmark;
Verify the table size:
SELECT pg_size_pretty(pg_total_relation_size('io_benchmark'));
pg_size_pretty
----------------
1029 MB
(1 row)
The table is roughly 8x larger than shared buffers, ensuring queries will read from disk.
Understanding when asynchronous I/O helps
Before diving into performance testing, it's crucial to understand that asynchronous I/O doesn't universally improve all workloads. The benefits concentrate in specific scenarios.
Where async I/O provides dramatic improvements:
- Cloud environments with network-attached storage (AWS EBS, Azure Managed Disks)
- Storage with high latency (1-5ms per read)
- Sequential scans of tables exceeding available memory
- Systems without parallel worker processes
Where benefits are modest or minimal:
- Fast local NVMe or SATA SSDs with sub-millisecond latency
- Systems already using parallel workers effectively
- Tables that fit in memory (shared buffers + OS cache)
- Small systems with limited CPU cores
So your results on local storage will likely differ, and that's perfectly normal.
Benchmarking synchronous I/O performance
Before enabling asynchronous I/O, establish a performance baseline using synchronous I/O. This requires properly clearing caches to ensure queries actually read from disk rather than memory.
Switch to your second terminal (not the PostgreSQL session) and check your system's available memory:
free -h
total used free shared buff/cache available
Mem: 3.7Gi 677Mi 156Mi 147Mi 3.3Gi 3.1Gi
Swap: 0B 0B 0B
Notice the buff/cache
column showing 3.3GB—this is Linux's page cache, which stores recently accessed file data in RAM. After running a query once, subsequent executions read partially from this cache rather than disk, skewing measurements.
To get accurate cold-cache measurements, you must clear the OS cache before each test. In your second terminal, create a helper script:
cat << 'EOF' | sudo tee /usr/local/bin/clear-pg-cache.sh
#!/bin/bash
systemctl stop postgresql@18-main
sync
echo 3 > /proc/sys/vm/drop_caches
systemctl start postgresql@18-main
sleep 2
EOF
sudo chmod +x /usr/local/bin/clear-pg-cache.sh
This script stops PostgreSQL, clears the OS cache, and restarts the database. Now establish your synchronous I/O baseline by temporarily disabling async I/O.
In your second terminal, clear the cache:
sudo /usr/local/bin/clear-pg-cache.sh
Once PostgreSQL restarts, connect to the database in your PostgreSQL session terminal:
sudo -u postgres psql aio_test
Now run the following commands to test synchronous I/O performance:
SET effective_io_concurrency = 0;
\timing on
SELECT COUNT(*), AVG(value) FROM io_benchmark WHERE value > 500;
count | avg
---------+----------------------
4999932 | 750.1352998950516530
(1 row)
Time: 2912.988 ms (00:02.913)
This represents a true cold cache read—2913ms with all data read from disk. Run the same query again to see warm cache performance:
SELECT COUNT(*), AVG(value) FROM io_benchmark WHERE value > 500;
count | avg
---------+----------------------
4999932 | 750.1352998950516530
(1 row)
Time: 1678.946 ms (00:01.679)
The warm cache run is ~42% faster (1679ms vs 2913ms) because the OS cached much of the data from the first run. This demonstrates why cache clearing matters—without it, you're not measuring disk I/O performance.
Get the full execution plan with a cold cache. First, exit your PostgreSQL session:
\q
Then clear the cache:
sudo /usr/local/bin/clear-pg-cache.sh
Once PostgreSQL restarts, reconnect to the database:
sudo -u postgres psql aio_test
Now run the following commands:
SET effective_io_concurrency = 0;
EXPLAIN (ANALYZE, BUFFERS, TIMING)
SELECT COUNT(*), AVG(value)
FROM io_benchmark
WHERE value > 500;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=167665.04..167665.05 rows=1 width=40) (actual time=2777.946..2789.924 rows=1.00 loops=1)
Buffers: shared read=104214
-> Gather (cost=167664.82..167665.03 rows=2 width=40) (actual time=2776.622..2789.865 rows=3.00 loops=1)
Workers Planned: 2
Workers Launched: 2
Buffers: shared read=104214
-> Partial Aggregate (cost=166664.82..166664.83 rows=1 width=40) (actual time=2729.100..2729.101 rows=1.00 loops=3)
Buffers: shared read=104214
-> Parallel Seq Scan on io_benchmark (cost=0.00..156297.33 rows=2073496 width=11) (actual time=38.503..2345.798 rows=1666644.00 loops=3)
Filter: (value > '500'::numeric)
Rows Removed by Filter: 1666689
Buffers: shared read=104214
Planning:
Buffers: shared hit=54 read=24
Planning Time: 14.020 ms
JIT:
Functions: 17
Options: Inlining false, Optimization false, Expressions true, Deforming true
Timing: Generation 5.506 ms (Deform 1.537 ms), Inlining 0.000 ms, Optimizatio
n 9.534 ms, Emission 99.525 ms, Total 114.565 ms
Execution Time: 3102.813 ms
The key baseline metrics:
- Cold cache execution time: 3103ms
- Buffers read from disk: : 104,214 blocks
- Parallel workers: 2 workers launched
PostgreSQL already uses parallel execution with 2 workers, providing some concurrency even with synchronous I/O. This is important context for interpreting your async I/O results.
Testing asynchronous I/O performance
Now enable async I/O and compare performance using the same cold-cache methodology. Exit the Postgres session with \q
and clear the cache:
sudo /usr/local/bin/clear-pg-cache.sh
Once PostgreSQL restarts, switch back to your first terminal and connect to the database:
sudo -u postgres psql aio_test
Enable async I/O and run the test query:
SET effective_io_concurrency = 32;
\timing on
SELECT COUNT(*), AVG(value) FROM io_benchmark WHERE value > 500;
count | avg
---------+----------------------
4999932 | 750.1352998950516530
(1 row)
Time: 2220.764 ms (00:02.221)
The execution time dropped from 2913ms to 2221ms, a 24% improvement. This is a real gain, but modest compared to the 2-3x improvements seen in cloud environments with higher-latency network storage. Your fast local SSD responds in microseconds, and parallel workers already provide some concurrency, leaving less room for async I/O to improve performance.
Testing the io_uring method
Earlier you confirmed that your system supports io_uring
, which can provide even better performance than the worker method by interacting directly with the kernel. To use io_uring
, you need to change the io_method
setting, which requires restarting PostgreSQL.
Exit your PostgreSQL session with \q
, then edit the configuration file:
sudo nano /etc/postgresql/18/main/postgresql.conf
Locate the existing io_method
line, which is usually commented out. Uncomment this line and change it to:
...
#io_combine_limit = 128kB # usually 1-128 blocks (depends on OS)
io_method = io_uring
#io_max_concurrency = -1 # Max number of IOs that one process
...
Save the file and restart PostgreSQL:
sudo systemctl restart postgresql@18-main
Verify the change took effect:
sudo -u postgres psql -c "SHOW io_method;"
io_method
-----------
io_uring
(1 row)
Now test performance with io_uring. Clear the cache:
sudo /usr/local/bin/clear-pg-cache.sh
Connect to the database and run the same test:
sudo -u postgres psql aio_test
SET effective_io_concurrency = 32;
\timing on
SELECT COUNT(*), AVG(value) FROM io_benchmark WHERE value > 500;
Record this time and compare it with both your synchronous baseline (2913ms) and the worker method (2221ms). The io_uring
method typically shows similar or slightly better performance than worker, though the difference may be small on local SSDs. The real advantage of io_uring is lower CPU overhead since it doesn't need separate worker processes.
Making asynchronous I/O settings permanent
To keep your asynchronous I/O settings permanently, you would save them in PostgreSQL’s main configuration file instead of changing them temporarily during a session.
After finishing your tests, you would close psql
with \q
and find where PostgreSQL’s configuration file is stored. It’s usually something like:
/etc/postgresql/18/main/postgresql.conf
That’s the file where permanent settings go. Inside it, you could have lines like this:
effective_io_concurrency = 32 # your optimal value from testing
maintenance_io_concurrency = 100 # higher for VACUUM, CREATE INDEX
....
# Asynchronous I/O configuration
io_method = 'worker' # or 'io_uring' if supported
io_workers = 3 # number of I/O worker processes
The maintenance_io_concurrency
line controls how many operations can run at once during maintenance tasks (like VACUUM or CREATE INDEX). It’s often set higher since those tasks usually happen when the system isn’t busy.
After saving changes, PostgreSQL would need a restart for them to take effect. You could then check the settings to confirm they were applied.
When asynchronous I/O doesn't help
Understanding scenarios where async I/O provides minimal benefit prevents unrealistic expectations and guides configuration decisions.
High buffer cache hit rates
PostgreSQL's shared buffer cache keeps frequently accessed data in memory. When queries find required data already cached, no disk I/O occurs, leaving async I/O with nothing to optimize.
Check your cache hit rate:
SELECT
sum(heap_blks_read) as disk_reads,
sum(heap_blks_hit) as cache_hits,
round(sum(heap_blks_hit)::numeric /
nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100, 2) as hit_ratio
FROM pg_statio_user_tables;
disk_reads | cache_hits | hit_ratio
------------+------------+-----------
8234 | 12456789 | 99.93
A hit ratio above 99% indicates most data access happens from memory. Async I/O won't significantly improve performance here because the workload doesn't generate enough disk I/O to benefit from parallelization.
CPU-bound queries
Complex analytical queries spending most execution time on sorting, aggregation, or joins won't improve from async I/O if the CPU represents the bottleneck rather than I/O.
If the plan shows minimal buffer reads but high execution times in sort and aggregate nodes, the query is CPU-bound. Async I/O optimization won't help since the database isn't waiting for storage.
Storage already saturated
Async I/O can only improve performance up to the storage system's physical limits. If your storage is already at 100% utilization with high service times, enabling async I/O won't increase throughput—it will simply reach the limit faster.
Monitor storage utilization:
iostat -x 2
Device r/s w/s rkB/s wkB/s await aqu-sz %util
sda 1234.5 89.2 123456.7 8912.3 8.45 12.45 100.0
If %util
consistently shows 100% with high await
(service time), you've hit a hardware bottleneck that software optimizations can't overcome.
Troubleshooting asynchronous I/O issues
Despite careful configuration, you may encounter situations where async I/O doesn't work as expected.
Verifying io_uring support
If you configured io_method = 'io_uring'
but PostgreSQL falls back to synchronous I/O, your system may lack necessary support.
Check kernel version:
uname -r
The io_uring
interface requires Linux kernel 5.1 or newer. Verify kernel support:
grep io_uring /proc/kallsyms | head -3
If this returns no results, your kernel doesn't support io_uring
. Either upgrade your kernel or use io_method = 'worker'
instead.
Verify PostgreSQL was compiled with liburing:
ldd /usr/lib/postgresql/18/bin/postgres | grep uring
liburing.so.2 => /lib/x86_64-linux-gnu/liburing.so.2 (0x00007f8b9c3d0000)
If this shows no liburing dependency, your binary wasn't compiled with --with-liburing
. Use the worker method or rebuild PostgreSQL with liburing support.
Memory pressure from high concurrency
Async I/O consumes kernel memory for I/O buffers and tracking structures. Systems with limited RAM may experience memory pressure when running high io_max_concurrency
values.
Monitor kernel memory usage:
cat /proc/meminfo | grep -E 'MemTotal|MemAvailable'
MemTotal: 3785876 kB
MemAvailable: 156789 kB
If available memory drops dangerously low after enabling async I/O, reduce concurrency settings or add system RAM. PostgreSQL shouldn't compromise overall system stability in pursuit of I/O performance.
Resolving worker process limits
When using io_method = 'worker'
, PostgreSQL spawns dedicated I/O worker processes at startup. If the server fails to start or logs show worker initialization failures, you may have hit system process limits.
Check the process limit:
ulimit -u
4096
PostgreSQL needs to spawn io_workers
processes in addition to regular backend processes. Increase system limits in /etc/security/limits.conf
if necessary:
postgres soft nproc 8192
postgres hard nproc 8192
Final thoughts
PostgreSQL 18’s asynchronous I/O is a major step forward in how the database handles disk reads. By allowing multiple read operations to run at once, it helps you reduce wait times and improve performance, especially if you use cloud storage or work with higher-latency systems.
You might not see large improvements if your system already uses fast SSDs, keeps most data in memory, or runs CPU-heavy queries. But if your database often reads from disk or handles large, read-intensive workloads, asynchronous I/O can give you noticeable performance gains with very little setup.
PostgreSQL 18 gives you more control over how data is read, making it easier to tune performance for your specific workload. For more details, check the official PostgreSQL 18 documentation.