
PostgreSQL vs MySQL

Stanley Ulili
Updated on October 3, 2025

Picking a database for your next project starts with a choice that affects everything: PostgreSQL or MySQL. Both handle relational data, both use SQL, and both power millions of applications worldwide. Yet they approach database design with fundamentally different priorities.

MySQL became popular by making databases approachable. It prioritized speed, simplicity, and getting your application running quickly. The database worked well for web applications that needed straightforward data storage without complex requirements.

PostgreSQL took a different path. It focused on data integrity, advanced features, and SQL standards compliance from the start. The database handles complex queries, sophisticated data types, and concurrent operations that would break simpler systems.

PostgreSQL 18, released in September 2025, widened the gap between these databases with asynchronous I/O, skip scans, temporal constraints, and other architectural improvements. Choosing between them now means deciding what matters more: MySQL's straightforward performance and wide hosting support, or PostgreSQL's expanding technical advantages.

Let's examine how they actually differ in practice.

What is PostgreSQL?

PostgreSQL is an object-relational database that emphasizes standards compliance and extensibility. Development began at UC Berkeley in 1986 as POSTGRES; the project gained SQL support as Postgres95 in the mid-1990s and was renamed PostgreSQL in 1996.

The database treats data integrity as non-negotiable. Foreign keys, check constraints, and transactional DDL prevent invalid data from entering your database. PostgreSQL's MVCC (Multi-Version Concurrency Control) lets multiple transactions read and write simultaneously without locking each other out.
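
As a quick illustration of transactional DDL, here's a minimal sketch (assuming a users table already exists) where a schema change rolls back along with the rest of the transaction:

-- Schema changes participate in transactions
BEGIN;
ALTER TABLE users ADD COLUMN middle_name VARCHAR(50);
-- Something else in the migration fails, so we abort
ROLLBACK;
-- The column was never added; the schema is unchanged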

PostgreSQL's extension system changes what you can do with a database. Extensions add full-text search, geographic data types, JSON operators, and even custom data types. You're not limited to what ships in the box.
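
Enabling an extension is a single statement; for example (pg_trgm and hstore ship with the contrib package, PostGIS needs a separate install):

-- Add capabilities without changing the server
CREATE EXTENSION IF NOT EXISTS pg_trgm;   -- trigram indexes for fuzzy text search
CREATE EXTENSION IF NOT EXISTS hstore;    -- key/value pairs as a column type
CREATE EXTENSION IF NOT EXISTS postgis;   -- geographic types (separate install required)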

What is MySQL?


MySQL is a relational database management system that prioritizes speed and ease of use. MySQL AB created it in 1995, and Oracle acquired it in 2010 through its purchase of Sun Microsystems. The database remains available under the GPL license, with commercial licenses available for proprietary use.

The database became the default choice for web applications during the LAMP stack era. Its straightforward setup, familiar SQL syntax, and good performance for read-heavy workloads made it accessible to developers just getting started with databases.

MySQL's storage engine architecture lets you choose different engines for different tables. InnoDB handles transactions and foreign keys, while MyISAM offers faster reads for tables that don't need transactions. This flexibility helps you tune individual tables for their specific access patterns.
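
As a rough sketch, the engine is chosen per table at creation time (the table names here are just for illustration):

-- Transactional table with foreign key support
CREATE TABLE orders (
    id INT AUTO_INCREMENT PRIMARY KEY,
    total DECIMAL(10,2)
) ENGINE=InnoDB;

-- Read-mostly lookup table that doesn't need transactions
CREATE TABLE country_codes (
    code CHAR(2) PRIMARY KEY,
    name VARCHAR(100)
) ENGINE=MyISAM;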

PostgreSQL vs MySQL: quick comparison

| Feature | PostgreSQL | MySQL |
| --- | --- | --- |
| First released | 1989 (as Postgres), 1996 (as PostgreSQL) | 1995 |
| License | PostgreSQL License (similar to BSD/MIT) | GPL v2 (community), commercial |
| ACID compliance | Full support, including DDL | Full support with InnoDB |
| MVCC | Yes, all transactions | Yes, InnoDB only |
| Window functions | Yes, extensive support | Yes, since 8.0 |
| CTEs (WITH queries) | Yes, including recursive | Yes, since 8.0 |
| JSON support | Native jsonb type with indexing | JSON type, limited indexing |
| Full-text search | Built-in with GIN/GiST indexes | FULLTEXT indexes (InnoDB and MyISAM) |
| Replication | Streaming, logical, synchronous options | Asynchronous, semi-synchronous, group |
| Partitioning | Declarative: range, list, hash | Range, list, hash, key |
| Performance focus | Complex queries, concurrent writes | Simple queries, read-heavy loads |
| Index types | B-tree, Hash, GiST, SP-GiST, GIN, BRIN | B-tree, Hash, R-tree (spatial), full-text |
| Skip scans | Yes (PostgreSQL 18+) | Limited (skip scan range optimization, 8.0.13+) |
| Temporal constraints | Yes (PostgreSQL 18+) | No |
| Virtual generated columns | Yes, default in PostgreSQL 18 | Yes, VIRTUAL (default) or STORED since 5.7 |
| Asynchronous I/O | Yes, general subsystem (PostgreSQL 18+) | InnoDB background AIO and read-ahead only |
| Stored procedures | Functions and procedures in multiple languages | Procedures and functions in SQL |
| Triggers | BEFORE, AFTER, INSTEAD OF | BEFORE, AFTER |
| Custom data types | Yes, full support | Limited |
| Community governance | Independent open-source community | Oracle Corporation |

Installation and initial setup

I set up both databases on a fresh Ubuntu server to compare the installation experience. Right from the start, the security models worked differently.

PostgreSQL requires a few steps:

 
# Install PostgreSQL
sudo apt update
sudo apt install postgresql postgresql-contrib

# PostgreSQL creates a 'postgres' user automatically
# Switch to it to access the database
sudo -i -u postgres
psql

# Create your application user
CREATE USER myapp WITH PASSWORD 'secure_password';
CREATE DATABASE myapp_db OWNER myapp;
GRANT ALL PRIVILEGES ON DATABASE myapp_db TO myapp;

PostgreSQL starts automatically after installation with peer authentication for local connections. This means local system users can connect to matching database users without passwords. You need to configure pg_hba.conf to allow network connections or password authentication.

MySQL installation asks for configuration upfront:

 
# Install MySQL
sudo apt update
sudo apt install mysql-server

# Run security script
sudo mysql_secure_installation

# This prompts you to:
# - Set root password
# - Remove anonymous users
# - Disable remote root login
# - Remove test database

MySQL binds to localhost by default and requires a root password. The security script walks you through hardening the installation. This guided setup gets you to a secure configuration faster, though it's less flexible than PostgreSQL's approach.

Creating your first database

After getting both databases installed, I created a simple blog database to see where the differences started appearing. Even this basic task revealed different design priorities.

PostgreSQL emphasizes precision:

 
-- Create a database
CREATE DATABASE blog
    ENCODING 'UTF8'
    LC_COLLATE 'en_US.UTF-8'
    LC_CTYPE 'en_US.UTF-8';

-- Connect to it
\c blog

-- Create a table with constraints
CREATE TABLE users (
    id SERIAL PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    username VARCHAR(50) NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
    CONSTRAINT valid_email CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')
);

-- Add a check constraint
ALTER TABLE users 
ADD CONSTRAINT username_length CHECK (LENGTH(username) >= 3);

PostgreSQL let me specify encoding and locale explicitly at database creation time (sensible defaults apply if you omit them). The TIMESTAMP WITH TIME ZONE type normalizes values to UTC and renders them in the session's timezone, preventing the common mistake of losing timezone context. Check constraints validated data at the database level, catching invalid emails before they entered the table.
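
For instance, an insert with a malformed email never makes it into the table:

INSERT INTO users (email, username) VALUES ('not-an-email', 'bob');
-- ERROR:  new row for relation "users" violates check constraint "valid_email"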

MySQL kept it simpler:

 
-- Create a database
CREATE DATABASE blog
    CHARACTER SET utf8mb4
    COLLATE utf8mb4_unicode_ci;

-- Use it
USE blog;

-- Create a table
CREATE TABLE users (
    id INT AUTO_INCREMENT PRIMARY KEY,
    email VARCHAR(255) NOT NULL UNIQUE,
    username VARCHAR(50) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_email (email)
) ENGINE=InnoDB;

-- MySQL doesn't support check constraints in older versions
-- Use application-level validation instead

MySQL's utf8mb4 character set handled emoji and special characters that the older utf8 (utf8mb3) couldn't. The plain TIMESTAMP type converts values to UTC for storage and back to the session timezone on retrieval, so no per-value timezone context is kept. I also noticed that CHECK constraints are only enforced from MySQL 8.0.16 onward, so older applications had to validate data at the application layer.
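
On MySQL 8.0.16 or later, the same kind of rule can be added and is actually enforced; a minimal sketch:

ALTER TABLE users
ADD CONSTRAINT username_length CHECK (CHAR_LENGTH(username) >= 3);

INSERT INTO users (email, username) VALUES ('a@example.com', 'ab');
-- Fails: check constraint 'username_length' is violated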

Virtual generated columns become the default

While building out my test schema, I needed a calculated field for the total price including tax. Both databases supported generated columns, but PostgreSQL 18 had just changed how they worked by default.

PostgreSQL 18 uses virtual generated columns by default:

 
-- Virtual generated column (PostgreSQL 18 default)
CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    price NUMERIC(10,2),
    tax_rate NUMERIC(3,2),
    total_price NUMERIC(10,2) GENERATED ALWAYS AS (price * (1 + tax_rate))
);
-- Value computed when selected, not stored

-- Explicitly use STORED if needed
CREATE TABLE products_stored (
    id SERIAL PRIMARY KEY,
    price NUMERIC(10,2),
    tax_rate NUMERIC(3,2),
    total_price NUMERIC(10,2) GENERATED ALWAYS AS (price * (1 + tax_rate)) STORED
);

The virtual default saved storage space and eliminated write overhead. For tables where I didn't always need the calculated value, this was perfect. When read performance mattered more, I could explicitly use STORED.
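
A quick way to see the behavior, using the products table above:

INSERT INTO products (price, tax_rate) VALUES (100.00, 0.10);

SELECT price, tax_rate, total_price FROM products;
-- total_price comes back as 110.00, computed at read time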

MySQL has offered both options since 5.7, with VIRTUAL as its default:

 
-- MySQL generated columns default to VIRTUAL; STORED is opt-in
CREATE TABLE products (
    id INT AUTO_INCREMENT PRIMARY KEY,
    price DECIMAL(10,2),
    tax_rate DECIMAL(3,2),
    total_price DECIMAL(10,2) AS (price * (1 + tax_rate)) VIRTUAL,
    total_price_stored DECIMAL(10,2) AS (price * (1 + tax_rate)) STORED
);
-- VIRTUAL values are computed on read; STORED values are written on INSERT/UPDATE

So MySQL has had this flexibility for years: virtual columns cost nothing to write, stored columns cost nothing to read, and you choose per column. The real change is on the PostgreSQL side, where versions before 18 only supported STORED generated columns and paid the storage and write overhead whether or not you needed the value on every read.

Skip scans unlock multicolumn indexes

My test application had an orders table that needed indexing on status, customer, and date. I created the same multicolumn index in both databases, then ran queries that only filtered on customer and date.

PostgreSQL 18 surprised me with skip scans:

 
-- Create a multicolumn index
CREATE INDEX idx_orders ON orders (status, customer_id, order_date);

-- Query without filtering first column
SELECT * FROM orders 
WHERE customer_id = 123 
  AND order_date > '2025-01-01';
-- PostgreSQL 18 uses skip scan on this index

PostgreSQL 18 used the index even though my query didn't filter on status. The skip scan feature let it iterate through distinct status values and dive into the index for matching customers. This eliminated a frustration I'd dealt with for years: creating duplicate indexes just to match different query patterns.
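
You can confirm this from the plan; the exact output depends on your data, but it should show an index scan on idx_orders rather than a sequential scan:

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM orders 
WHERE customer_id = 123 
  AND order_date > '2025-01-01';
-- Expect an Index Scan using idx_orders instead of a Seq Scan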

MySQL fell back to a table scan:

 
-- Same index structure
CREATE INDEX idx_orders ON orders (status, customer_id, order_date);

-- Query without first column filter
SELECT * FROM orders 
WHERE customer_id = 123 
  AND order_date > '2025-01-01';
-- In my test, MySQL didn't use this index
-- and fell back to a full table scan

MySQL does ship a skip scan range optimization (added in 8.0.13), but it only applies under narrow conditions and the optimizer didn't use it for this query. In practice the index still had to match my query's filter order, so I created a separate index on (customer_id, order_date), consuming more storage and slowing down writes that now had to maintain multiple indexes.
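
A sketch of that second index (the name is arbitrary):

-- Second index whose leading columns match the query
CREATE INDEX idx_orders_customer_date ON orders (customer_id, order_date);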

Temporal constraints for time-based data

That index discovery got me thinking about the booking system I'd built last year. It used PostgreSQL exclusion constraints to prevent overlapping room reservations, but the syntax always felt clunky. PostgreSQL 18 had just added proper temporal constraints.

The new syntax cleaned things up considerably:

 
-- WITHOUT OVERLAPS keys are backed by GiST; btree_gist covers the integer column
CREATE EXTENSION IF NOT EXISTS btree_gist;

-- Prevent overlapping bookings
CREATE TABLE room_bookings (
    room_id INTEGER,
    booking_period TSTZRANGE,
    PRIMARY KEY (room_id, booking_period WITHOUT OVERLAPS)
);

-- Foreign keys with temporal constraints
CREATE TABLE booking_payments (
    room_id INTEGER,
    payment_period TSTZRANGE,
    amount NUMERIC(10,2),
    FOREIGN KEY (room_id, PERIOD payment_period)
        REFERENCES room_bookings (room_id, PERIOD booking_period)
);

-- Overlapping insert fails automatically
INSERT INTO room_bookings VALUES (101, '[2025-01-15, 2025-01-20)');
INSERT INTO room_bookings VALUES (101, '[2025-01-18, 2025-01-22)');
-- The second insert fails: the ranges overlap for room 101

The temporal constraint integrated directly with primary keys and foreign keys. No more exclusion constraints with GiST indexes. The error messages were clearer too.
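
For comparison, this is roughly what the pre-18 approach looked like: an exclusion constraint backed by a GiST index and the btree_gist extension.

CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE room_bookings_old (
    room_id INTEGER,
    booking_period TSTZRANGE,
    EXCLUDE USING GIST (room_id WITH =, booking_period WITH &&)
);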

I tried implementing the same thing in MySQL:

 
-- No native temporal constraint support
CREATE TABLE room_bookings (
    room_id INT,
    check_in DATE,
    check_out DATE,
    PRIMARY KEY (room_id, check_in)
);

-- Must implement overlap check in trigger
DELIMITER $$
CREATE TRIGGER check_overlap BEFORE INSERT ON room_bookings
FOR EACH ROW
BEGIN
    IF EXISTS (
        SELECT 1 FROM room_bookings
        WHERE room_id = NEW.room_id
        AND check_in < NEW.check_out
        AND check_out > NEW.check_in
    ) THEN
        SIGNAL SQLSTATE '45000' 
        SET MESSAGE_TEXT = 'Booking overlap detected';
    END IF;
END$$
DELIMITER ;

The trigger worked, but required more code, didn't integrate with foreign keys, and could be bypassed if someone disabled triggers. I still needed application-level validation as a safety net.

UUID generation and indexing

Those temporal constraints reminded me of another project where I'd used UUIDs for distributed ID generation. The random UUIDs had caused terrible index fragmentation as the table grew. PostgreSQL 18 added UUID v7 specifically to fix this.

I tested the difference with a simple events table:

 
-- UUID v7 maintains temporal ordering
CREATE TABLE events (
    id UUID PRIMARY KEY DEFAULT uuidv7(),
    event_type VARCHAR(50),
    created_at TIMESTAMPTZ DEFAULT NOW()
);
-- UUIDs insert sequentially into the index
-- Reduces page splits and fragmentation

UUID v7 values contained timestamp information and sorted chronologically. New records appended to the index rather than scattering across pages. After inserting a million records, the index stayed compact and inserts remained fast.
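
A simple way to see the ordering, continuing with the events table above:

INSERT INTO events (event_type) VALUES ('signup'), ('login'), ('purchase');

SELECT id, event_type FROM events ORDER BY id;
-- Rows come back in roughly insertion order because each id embeds its creation time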

MySQL's UUID() generates time-based version-1 values, but their default byte order scatters inserts:

 
-- Version-1 UUIDs; the default byte order puts the fast-changing bits first
CREATE TABLE events (
    id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID())),
    event_type VARCHAR(50),
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Inserts scatter across the index
-- Causes page splits throughout the B-tree

MySQL's UUID() values fragmented the index when stored in their default byte order. After inserting the same million records, the index was significantly larger and inserts had slowed down. Passing the swap flag to UUID_TO_BIN() reorders the time components and gets close to sequential inserts, but there's no built-in UUID v7 equivalent.
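
If you're stuck with MySQL, that swap flag is worth knowing about; a minimal sketch (the table name is hypothetical):

-- UUID_TO_BIN(uuid, 1) moves the timestamp bits to the front,
-- so version-1 UUIDs insert in roughly chronological order
CREATE TABLE events_ordered (
    id BINARY(16) PRIMARY KEY DEFAULT (UUID_TO_BIN(UUID(), 1)),
    event_type VARCHAR(50)
);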

OLD and NEW in RETURNING clauses

While testing those UUID inserts, I wanted to track balance changes for an audit log. I'd always used triggers for this, but PostgreSQL 18's new RETURNING syntax made it trivial.

One query captured both values:

 
-- Access old and new values in UPDATE
UPDATE accounts 
SET balance = balance - 50 
WHERE account_id = 123
RETURNING OLD.balance AS previous_balance, NEW.balance AS current_balance;
-- Returns both values: 200, 150

-- Works with DELETE too
DELETE FROM user_sessions 
WHERE expires_at < NOW()
RETURNING OLD.user_id, OLD.session_id, OLD.expires_at;

This simplified audit logging significantly. One query replaced what used to require triggers writing to separate audit tables, or multiple queries wrapped in transactions to capture both states.
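
For example, a data-modifying CTE can write the change straight into an audit table in the same statement (audit_log here is a hypothetical table):

WITH changed AS (
    UPDATE accounts
    SET balance = balance - 50
    WHERE account_id = 123
    RETURNING account_id,
              OLD.balance AS old_balance,
              NEW.balance AS new_balance
)
INSERT INTO audit_log (account_id, old_balance, new_balance, changed_at)
SELECT account_id, old_balance, new_balance, NOW() FROM changed;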

MySQL had no RETURNING clause at all:

 
-- Must query before and after separately
SELECT balance INTO @old_balance 
FROM accounts WHERE account_id = 123;

UPDATE accounts 
SET balance = balance - 50 
WHERE account_id = 123;

SELECT balance INTO @new_balance 
FROM accounts WHERE account_id = 123;

I needed three separate queries wrapped in a transaction. This added complexity and created potential race conditions if another transaction modified the row between my queries.
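
To close that race you have to lock the row explicitly, something like this:

START TRANSACTION;

SELECT balance INTO @old_balance
FROM accounts WHERE account_id = 123
FOR UPDATE;                -- lock the row until commit

UPDATE accounts
SET balance = balance - 50
WHERE account_id = 123;

SELECT balance INTO @new_balance
FROM accounts WHERE account_id = 123;

COMMIT;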

Statistics preservation during upgrades

I recently upgraded my PostgreSQL 15 cluster to the newly released version 18. Previous major version upgrades had always come with a performance hit immediately after the upgrade finished. The query planner needed time to gather statistics, and applications ran slower until ANALYZE completed on all tables.

PostgreSQL 18 changed this completely:

 
# Upgrade preserves statistics by default
pg_upgrade \
  --old-datadir=/old/datadir \
  --new-datadir=/new/datadir \
  --old-bindir=/old/bin \
  --new-bindir=/new/bin

# Database immediately has accurate statistics

The query planner had accurate information immediately after the upgrade. My applications performed normally from the moment I switched over. This eliminated hours of waiting for ANALYZE to run on a multi-terabyte database.
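
A quick sanity check after the upgrade is to look at the planner statistics directly:

-- Statistics are present immediately, no ANALYZE needed first
SELECT tablename, attname, n_distinct
FROM pg_stats
WHERE schemaname = 'public'
LIMIT 5;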

When I upgraded MySQL 5.7 to 8.0 last year, the experience was different:

 
# Upgrade MySQL
sudo apt install mysql-server-8.0

# Statistics not preserved
# Must analyze all tables
mysqlcheck -u root -p --all-databases --analyze

Without preserved statistics, queries performed poorly until I analyzed every table. For my largest database, this took over six hours. Applications experienced slower query performance during this entire period.

JSON storage and indexing

With all these structural improvements in PostgreSQL, I wanted to test something more common: JSON storage. Modern applications use JSON everywhere, and how databases handle it reveals a lot about their design.

PostgreSQL treated JSON as a real data type:

 
-- JSONB with native indexing
CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    event_data JSONB NOT NULL,
    tags TEXT[] DEFAULT '{}'
);

-- Index specific JSON fields
CREATE INDEX idx_event_type ON events ((event_data->>'type'));

-- GIN index for containment queries
CREATE INDEX idx_event_data ON events USING GIN (event_data);

-- Efficient queries
SELECT * FROM events
WHERE event_data->>'type' = 'user_login'
  AND event_data @> '{"user": {"role": "admin"}}';

The JSONB type stored data in binary format that supported real indexing. I could index specific paths or use GIN indexes for containment queries. The @> operator checked if one JSON document contained another, all happening at the index level.
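
If containment is all you need, a jsonb_path_ops GIN index is smaller and faster for the @> operator:

-- Smaller GIN index that only supports @> containment queries
CREATE INDEX idx_event_data_path ON events USING GIN (event_data jsonb_path_ops);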

MySQL also has a native JSON type, but indexing works differently:

 
-- Native JSON type (binary format internally)
CREATE TABLE events (
    id INT AUTO_INCREMENT PRIMARY KEY,
    event_data JSON NOT NULL,
    tags JSON DEFAULT NULL
) ENGINE=InnoDB;

-- Create generated column for indexing
ALTER TABLE events
ADD COLUMN event_type VARCHAR(50) 
AS (event_data->>'$.type') STORED;

CREATE INDEX idx_event_type ON events(event_type);

-- Less efficient queries
SELECT * FROM events
WHERE JSON_EXTRACT(event_data, '$.type') = 'user_login'
  AND JSON_CONTAINS(event_data, '{"user": {"role": "admin"}}');

MySQL couldn't put a general-purpose index on the JSON column itself. I had to create generated columns for each field I wanted to index. The JSON_EXTRACT and JSON_CONTAINS functions were more verbose than PostgreSQL's operators, and there was no equivalent of a GIN index over the whole document for containment queries.
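
Later MySQL releases narrow the gap a little; as I understand it, MySQL 8.0.21+ lets you build a functional index with JSON_VALUE instead of declaring a separate generated column:

-- Functional index over a JSON field (MySQL 8.0.21+)
CREATE INDEX idx_event_type_func
ON events ((JSON_VALUE(event_data, '$.type' RETURNING CHAR(50))));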

Query planner improvements

Those JSON queries got me curious about the optimizer improvements in PostgreSQL 18. I ran EXPLAIN on several queries to see what was happening behind the scenes.

PostgreSQL 18 eliminated redundant operations automatically:

 
-- Self-join that's actually pointless
SELECT u1.name, u2.email 
FROM users u1 
JOIN users u2 ON u1.id = u2.id;
-- Optimizer eliminates redundant join
-- Multiple OR conditions
SELECT * FROM products 
WHERE category = 'electronics' 
   OR category = 'computers' 
   OR category = 'phones';
-- Converted to indexed array scan

The planner recognized these patterns and transformed them automatically. My self-join ran as a simple table scan. The OR conditions became an efficient array operation that used indexes properly.
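
Running EXPLAIN on the OR query makes the rewrite visible; with an index on category in place, the index condition shows up as an = ANY over an array rather than three separate branches:

EXPLAIN SELECT * FROM products 
WHERE category = 'electronics' 
   OR category = 'computers' 
   OR category = 'phones';
-- Index Cond: (category = ANY ('{electronics,computers,phones}'::text[]))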

MySQL executed these queries as written:

 
-- Self-join executes literally
SELECT u1.name, u2.email 
FROM users u1 
JOIN users u2 ON u1.id = u2.id;
-- Actually performs the join operation

-- OR conditions stay as OR
SELECT * FROM products 
WHERE category = 'electronics' 
   OR category = 'computers' 
   OR category = 'phones';
-- Separate index lookups for each OR

MySQL didn't optimize these patterns automatically. The self-join performed unnecessary work, and the OR conditions did separate index scans for each value.

Asynchronous I/O for large scans

Running those queries on large tables revealed another PostgreSQL 18 improvement I hadn't noticed initially. Sequential scans on my 50GB orders table ran noticeably faster than before the upgrade.

The asynchronous I/O subsystem explained why:

 
-- Sequential scan benefits from async I/O
SELECT * FROM large_orders 
WHERE order_date > '2024-01-01';

-- PostgreSQL 18 queues multiple page reads
-- OS optimizes disk access patterns

PostgreSQL 18 queued multiple page reads before waiting for results. The operating system could optimize these requests, and the database didn't sit idle waiting for each page. My sequential scans ran about 30% faster on spinning disks.
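
The behavior is configurable; as I understand it, PostgreSQL 18 exposes the new subsystem through a couple of settings:

SHOW io_method;     -- 'worker' by default; 'io_uring' or 'sync' are alternatives
SHOW io_workers;    -- how many background I/O workers handle queued reads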

MySQL used synchronous I/O:

 
-- Same sequential scan
SELECT * FROM large_orders 
WHERE order_date > '2024-01-01';

-- Reads one page, waits
-- Requests next page, waits

MySQL requested pages synchronously in the foreground. InnoDB's linear read-ahead can prefetch upcoming extents once it detects a sequential pattern, and background operations use native AIO, but there's no general asynchronous I/O path for query reads. On my test system with traditional hard drives, this made sequential scans noticeably slower. The difference was less pronounced on SSDs, but still measurable.
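
What MySQL does have is tunable read-ahead and native AIO for background work; you can inspect those settings, though they don't change the synchronous foreground behavior:

SHOW VARIABLES LIKE 'innodb_read_ahead_threshold';  -- when linear read-ahead kicks in
SHOW VARIABLES LIKE 'innodb_use_native_aio';        -- async I/O for background operations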

Final thoughts

After testing both databases thoroughly with PostgreSQL 18's new features, the differences became clear. PostgreSQL handled complex scenarios better across the board. Skip scans made multicolumn indexes useful in more situations. Temporal constraints simplified time-based data. Virtual generated columns saved storage. UUID v7 prevented index fragmentation. OLD/NEW in RETURNING eliminated trigger complexity. Async I/O sped up large scans.

PostgreSQL 18's improvements widened the capability gap between these databases. For new projects where you control the hosting environment, PostgreSQL offers more sophisticated features that become valuable as applications grow. MySQL remains solid for straightforward applications where hosting compatibility and team familiarity outweigh advanced features.
