pg_dump Compression Methods and Levels

Craig Younkins
Nov 13, 2024

As of Postgres 16, pg_dump can accept a compression method and level. Which method is best? What level should I choose?

I ran some tests on an AWS t3a.large instance. Postgres 16.4 was installed on Ubuntu, and an example database was set up with pgbench --initialize --scale=100.
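As a rough sketch of that setup step, the snippet below prints the commands that would create and populate the test database. The database name `bench` is a hypothetical placeholder, not something from the original tests; the output can be piped to `sh` to actually run it against a local server.

```shell
#!/bin/sh
# Sketch of the test-database setup. "bench" is a hypothetical database name.
set -eu

DB_NAME="${DB_NAME:-bench}"
SCALE=100   # pgbench scale factor used in the article

# Print the setup commands instead of running them, so the sketch is safe
# to execute on a machine without a Postgres server.
setup_commands() {
  printf 'createdb %s\n' "$DB_NAME"
  printf 'pgbench --initialize --scale=%s %s\n' "$SCALE" "$DB_NAME"
}

setup_commands
```

At scale 100, pgbench fills pgbench_accounts with 10 million rows, which gives pg_dump a reasonable amount of data to compress.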

Then I ran a script that runs pg_dump 3 times for each compression method and level, using --format=custom. It covers a sample of levels; note that the level ranges and defaults are different for each method. zstd levels 19–22 were left out because they took too long.
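A benchmark loop along those lines might look like the sketch below. It is not the original script: the list of method:level pairs, the database name `bench`, and the output directory are all assumptions made for illustration. By default it only prints the commands; setting DRY_RUN=0 runs them and reports whole-second timings and file sizes.

```shell
#!/bin/sh
set -eu

DB_NAME="${DB_NAME:-bench}"   # hypothetical database name
OUT_DIR="${OUT_DIR:-/tmp}"    # hypothetical output directory
RUNS="1 2 3"                  # three runs per configuration, as in the article
DRY_RUN="${DRY_RUN:-1}"       # 1 = only print commands, 0 = run and time them

# Hypothetical sample of method:level pairs, not the article's exact list.
CONFIGS="none gzip:1 gzip:6 gzip:9 lz4:1 lz4:9 zstd:1 zstd:3 zstd:10 zstd:18"

# Map a configuration such as "zstd:3" to an output path.
outfile_for() {
  printf '%s/dump-%s.pgdump' "$OUT_DIR" "$(printf '%s' "$1" | tr ':' '-')"
}

for config in $CONFIGS; do
  outfile="$(outfile_for "$config")"
  for run in $RUNS; do
    if [ "$DRY_RUN" = "1" ]; then
      echo "pg_dump --format=custom --compress=$config --file=$outfile $DB_NAME"
    else
      start=$(date +%s)
      pg_dump --format=custom --compress="$config" --file="$outfile" "$DB_NAME"
      end=$(date +%s)
      echo "$config run=$run seconds=$((end - start)) bytes=$(wc -c < "$outfile")"
    fi
  done
done
```

Measuring peak memory would need an extra tool such as GNU time, which this sketch leaves out.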

The median results for each configuration:

Spreadsheet. An asterisk (*) indicates the default compression level for the method.
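For reference, taking the median of the runs for one configuration can be done with a small helper like the one below. The function name and the sample timings are made up for illustration and are not results from the tests.

```shell
#!/bin/sh
set -eu

# Print the median of the numeric arguments. With an even number of
# arguments this picks the lower of the two middle values.
median() {
  printf '%s\n' "$@" | LC_ALL=C sort -n | sed -n "$(( ($# + 1) / 2 ))p"
}

# Hypothetical timings in seconds for three runs of one configuration.
median 41.2 39.8 40.5
```

Using the median rather than the mean keeps a single slow run, for example one hit by a noisy neighbor on a burstable instance, from skewing the result.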

The fastest options were gzip:1 and gzip:3, which were even faster than no compression. Compression takes CPU time but can mean that less data has to be written to disk, and disks are slow.

The time to dump and compress was (mostly) proportional to the compression level, with lower compression levels taking less time. This provides a lever to trade off time for space, but it seemed to only significantly affect file size for zstd.

The biggest change in file size came from the choice of algorithm, with zstd far and away the best option, provided you avoid the lowest compression levels.

The zstd long=1 options don’t seem to decrease the file size, take as much time as the default long=0 or more, and significantly increase memory usage.
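To make the long option concrete, the sketch below builds the compression detail string in its keyword=value form and prints one pg_dump command for each setting. The helper name, database name, and output paths are hypothetical; the `level` and `long` keywords are the ones pg_dump 16 documents for its --compress detail string.

```shell
#!/bin/sh
set -eu

DB_NAME="${DB_NAME:-bench}"   # hypothetical database name
OUT_DIR="${OUT_DIR:-/tmp}"    # hypothetical output directory

# Build a zstd detail string from a level and a long flag (0 or 1).
zstd_spec() {
  printf 'zstd:level=%s,long=%s' "$1" "$2"
}

# Print one command per variant so the two dumps can be compared.
for long in 0 1; do
  echo "pg_dump --format=custom --compress=$(zstd_spec 3 "$long") --file=$OUT_DIR/zstd-long$long.pgdump $DB_NAME"
done
```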

The characteristics of your system may change how these numbers come out. The t3a.large test system has 2 vCPUs of AMD EPYC 7000 series processors. Faster CPUs make compression cheaper, which favors higher compression levels. The disk is a standard gp3 with 3000 IOPS and 125 MB/s throughput. Faster disks make writing cheaper, which favors lower compression levels.

If you need the very lowest memory usage or the very fastest dump, use gzip:1. If you need the very smallest file, use the highest zstd level you can tolerate. For most use cases, zstd (with the default level of 3) provides a great mix of small file size, low memory usage, and fast compression.
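Those recommendations can be written down as a small lookup, sketched below. The goal names and the function are invented for illustration, and the level 18 used for the smallest-file case is only a placeholder for "the highest zstd level you can tolerate".

```shell
#!/bin/sh
set -eu

# Map a goal to a --compress value, following the recommendations above.
# The second argument optionally overrides the zstd level for "smallest".
compress_for_goal() {
  case "$1" in
    fastest|lowest-memory) echo "gzip:1" ;;
    smallest)              echo "zstd:${2:-18}" ;;  # 18 is a placeholder
    balanced)              echo "zstd" ;;           # default level, 3
    *)                     echo "unknown goal: $1" >&2; return 1 ;;
  esac
}

# Print an example command for each goal ("bench" is a hypothetical database).
for goal in fastest smallest balanced; do
  echo "pg_dump --format=custom --compress=$(compress_for_goal "$goal") --file=/tmp/$goal.pgdump bench"
done
```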

On Postgres 15 and below, you can only use gzip but can adjust the level. --compress=1 is equivalent to --compress=gzip:1 and can be a good option to speed up the dump.
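A script that has to work across versions could pick the flag from the pg_dump major version, roughly as sketched below. Both helper functions are hypothetical, and the sketch assumes that a version 16 or newer pg_dump was built with zstd support, which is not guaranteed for every package.

```shell
#!/bin/sh
set -eu

# Extract the major version from `pg_dump --version` output,
# e.g. "pg_dump (PostgreSQL) 16.4" gives 16.
pg_major() {
  printf '%s\n' "$1" | sed -E 's/.*\(PostgreSQL\) ([0-9]+).*/\1/'
}

# Choose a compression flag: zstd on 16 and later, fast gzip before that.
compress_flag_for_major() {
  if [ "$1" -ge 16 ]; then
    echo "--compress=zstd"
  else
    echo "--compress=1"
  fi
}

# Only query the local pg_dump if it is installed.
if command -v pg_dump >/dev/null 2>&1; then
  major="$(pg_major "$(pg_dump --version)")"
  echo "pg_dump major version $major: $(compress_flag_for_major "$major")"
fi
```

The check is against the pg_dump client rather than the server, since it is the client that does the compressing.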
