February 2025 – Cloudy DATA Blog

Postgres Database Dump and Restore

Please find below link to detailed write up of my investigations of Pg_Dump and Pg_Restore command line tools for postgres.

Detailed description of using PG_Dump and PG_Restore

As ever performing database backups is absolutely vital on so many levels. Knowing where your data is, how to back it up and restore it with regular backups of the system of record is your best defence against data breach, data corruption, server failure or server room/building failure.

Learn backup and learn them well.

I also created a online tool to help create commands to be fed into pg_dump and pg_restore

PG_RESTORE – Parameters

These are the parameters the PG_RESTORE takes there is no strict required order for the parameters. However there are some best practices and dependencies to keep in mind I list these at the end of this post

Quick Command Generator here

General Options

-d, --dbname=DBNAME
Specifies the database to restore into. If omitted, pg_restore outputs SQL to stdout.

-h, --host=HOSTNAME
Specifies the database server hostname or IP. Defaults to a local connection.

-p, --port=PORT
Specifies the database server port. Defaults to 5432.

-U, --username=USERNAME
Specifies the username for authentication.

-f, --file=FILENAME
Specifies the ouput file or directory where the dump or restore should be saved or read from

-w, --no-password
Prevents password prompt. Used when authentication does not require a password.

-W, --password
Forces a password prompt. Useful when the password is not stored in .pgpass.

Example Command

pg_restore -h prod-server-01 -p 5432 -U username -d db003target "c:/dump/2025db003_backup.dump"

Format Options

-F, --format=c|d|t|p
Specifies the archive format:

c → Custom format (from pg_dump -Fc)

d → Directory format (from pg_dump -Fd)

t → Tar format (from pg_dump -Ft)

p → Plain SQL format (not applicable to pg_restore)

Restore Control Options

-a, –data-only
Restores only data (excludes schema). Useful when schema already exists.

-c, –clean
Drops database objects before recreating them. Prevents conflicts with existing objects.

-C, –create
Includes a CREATE DATABASE command. The database will be recreated before restoring.

-I, –index=NAME
Restores only a specific index. Useful for partial restores.

-j, –jobs=NUM
Uses multiple jobs (parallel restore). Speeds up restore for -Fd format dumps.

-L, –use-list=FILE
Restores only objects listed in FILE. The list can be generated using pg_restore -l.

-n, --schema=NAME
Restores only a specific schema. Useful for restoring part of a dump.

-P, --function=NAME(args)
Restores only a specific function.

-s, --schema-only
Restores only the schema (no data). Useful for setting up database structure.

-S, --superuser=USERNAME
Specifies a superuser for certain restore operations.

-t, --table=NAME
Restores only a specific table. Useful for partial restores.

-T, --trigger=NAME
Restores only a specific trigger.

Output Control Options

-e, --exit-on-error
Stops on the first error. Useful for debugging issues.

-v, --verbose
Provides detailed output. Helps with troubleshooting.

--no-owner
Does not restore ownership metadata. Useful when restoring as a different user.

--no-privileges, --no-acl
Does not restore GRANT/REVOKE statements.

--disable-triggers
Disables triggers on tables during restore. Useful for bulk data loading.

--if-exists
Includes IF EXISTS in DROP statements. Prevents errors if objects don’t exist.

--strict-names
Requires exact match of table/schema names. Prevents unintended restores.

Help & Version

-?
Displays help information.

--version
Shows the version of pg_restore.

General Rules of IMPLEMENTATION

These are the parameters the PG_DUMP takes there is no strict required order for the parameters. However there are some best practices and dependencies to keep in mind

Options Can Be Placed in Any Order

pg_dump -U postgres -h localhost -d mydb -F c -f backup.dump

pg_dump -F c -f backup.dump -U postgres -d mydb -h localhost

Both commands are valid.

Options That Take Arguments Must Be Followed by Their Values

✅ pg_dump -U postgres -h localhost -d mydb -F c -f backup.dump

❌ pg_dump -U -h localhost postgres -d mydb -F c -f backup.dump (Invalid because -U needs a value immediately after)

Multiple Parameters Can Be Combined (if they don’t need arguments)

✅ pg_dump -U postgres -v -F c -f backup.dump

✅ pg_dump -U postgres -F c -v -f backup.dump

❌ pg_dump -F -U postgres c -f backup.dump (Invalid because -F needs an argument immediately after)

For Readability, It’s Best to Keep Related Options Together

Authentication options (-U, -h, -p, -W) should be placed together.

Dump/restore format options (-F, -j) should be grouped logically.

Filtering options (-n, -t, -T) should be near each other.

Specific Cases to Watch Out For

1. pg_restore with -d (Database Restore)

If using pg_restore -d mydb, you must not use -C (create database) because the restore is running inside an existing database.

If using -C, omit -d and restore into the default postgres database first: pg_restore -C -d postgres backup.dump

2. pg_dump Parallel Jobs (-j)

-j (parallel jobs) only works with directory format (-F d).

Example: pg_dump -F d -j 4 -f my_backup_dir -U postgres -d mydb

If -j is used with a non-directory format, pg_dump will fail.

3. pg_restore and -j (Parallel Restore)

-j (parallel restore) is only valid for directory format dumps (-F d).

Example:pg_restore -j 4 -d mydb my_backup_dir

Using -j with -F c (custom format) does not work unless restoring only selected tables.

Best Practices

✅ Authentication options first (-U, -h, -p)
✅ Output control next (-F, -f)
✅ Content filtering later (-s, -a, -t, -n)
✅ Parallelization at the end (-j)

PG_DUMP – Parameters

These are the parameters the PG_DUMP takes there is no strict required order for the parameters. However there are some best practices and dependencies to keep in mind which is at the end.

Quick Command Generator here

General Options

-d, --dbname=DBNAME
Specifies the database to restore into. If omitted, pg_restore outputs SQL to stdout.

-h, --host=HOSTNAME
Specifies the database server hostname or IP. Defaults to a local connection.

-p, --port=PORT
Specifies the database server port. Defaults to 5432.

-U, --username=USERNAME
Specifies the username for authentication.

-f, --file=FILENAME
Specifies the ouput file or directory where the dump or restore should be saved or read from

-w, --no-password
Prevents password prompt. Used when authentication does not require a password.

-W, --password
Forces a password prompt. Useful when the password is not stored in .pgpass.

Example below

pg_dump -h prod-server-01 -p 5432 -U username -d db003 -F c -f "c:/dump/2025db003_backup.dump"

Format Options

-F, --format=c|d|t|p
Specifies the archive format:

c → Custom format (from pg_dump -Fc)

d → Directory format (from pg_dump -Fd)

t → Tar format (from pg_dump -Ft)

p → Plain SQL format (not applicable to pg_restore)

Restore Control Options

-a, --data-only
Restores only data (excludes schema). Useful when schema already exists.

-c, --clean
Drops database objects before recreating them. Prevents conflicts with existing objects.

-C, --create
Includes a CREATE DATABASE command. The database will be recreated before restoring.

-I, --index=NAME
Restores only a specific index. Useful for partial restores.

-j, --jobs=NUM
Uses multiple jobs (parallel restore). Speeds up restore for -Fd format dumps.

-L, --use-list=FILE
Restores only objects listed in FILE. The list can be generated using pg_restore -l.

-n, --schema=NAME
Restores only a specific schema. Useful for restoring part of a dump.

-P, --function=NAME(args)
Restores only a specific function.

-s, --schema-only
Restores only the schema (no data). Useful for setting up database structure.

-S, --superuser=USERNAME
Specifies a superuser for certain restore operations.

-t, --table=NAME
Restores only a specific table. Useful for partial restores.

-T, --trigger=NAME
Restores only a specific trigger.

Output Control Options

-e, --exit-on-error
Stops on the first error. Useful for debugging issues.

-v, --verbose
Provides detailed output. Helps with troubleshooting.

--no-owner
Does not restore ownership metadata. Useful when restoring as a different user.

--no-privileges, --no-acl
Does not restore GRANT/REVOKE statements.

--disable-triggers
Disables triggers on tables during restore. Useful for bulk data loading.

--if-exists
Includes IF EXISTS in DROP statements. Prevents errors if objects don’t exist.

--strict-names
Requires exact match of table/schema names. Prevents unintended restores.

Help & Version

-?
Displays help information.

--version
Shows the version of pg_restore.

General Rules of IMPLEMENTATION

These are the parameters the PG_DUMP takes there is no strict required order for the parameters. However there are some best practices and dependencies to keep in mind

Options Can Be Placed in Any Order

pg_dump -U postgres -h localhost -d mydb -F c -f backup.dump

pg_dump -F c -f backup.dump -U postgres -d mydb -h localhost

Both commands are valid.

Options That Take Arguments Must Be Followed by Their Values

✅ pg_dump -U postgres -h localhost -d mydb -F c -f backup.dump

❌ pg_dump -U -h localhost postgres -d mydb -F c -f backup.dump (Invalid because -U needs a value immediately after)

Multiple Parameters Can Be Combined (if they don’t need arguments)

✅ pg_dump -U postgres -v -F c -f backup.dump

✅ pg_dump -U postgres -F c -v -f backup.dump

❌ pg_dump -F -U postgres c -f backup.dump (Invalid because -F needs an argument immediately after)

For Readability, It’s Best to Keep Related Options Together

Authentication options (-U, -h, -p, -W) should be placed together.

Dump/restore format options (-F, -j) should be grouped logically.

Filtering options (-n, -t, -T) should be near each other.

Specific Cases to Watch Out For

1. pg_restore with -d (Database Restore)

If using pg_restore -d mydb, you must not use -C (create database) because the restore is running inside an existing database.

If using -C, omit -d and restore into the default postgres database first: pg_restore -C -d postgres backup.dump

2. pg_dump Parallel Jobs (-j)

-j (parallel jobs) only works with directory format (-F d).

Example: pg_dump -F d -j 4 -f my_backup_dir -U postgres -d mydb

If -j is used with a non-directory format, pg_dump will fail.

3. pg_restore and -j (Parallel Restore)

-j (parallel restore) is only valid for directory format dumps (-F d).

Example: pg_restore -j 4 -d mydb my_backup_dir

Using -j with -F c (custom format) does not work unless restoring only selected tables.

Best Practices

✅ Authentication options first (-U, -h, -p)
✅ Output control next (-F, -f)
✅ Content filtering later (-s, -a, -t, -n)
✅ Parallelization at the end (-j)

Postgres – Summary Check of Size No of Tables Views and Rows

It is a good idea to do a general reconcilliation before and after a database dump and restore to peform some sense check as to whether restoration has completed successfully.

SELECT datname AS database_name, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
ORDER BY pg_database_size(datname) DESC;

And here is a further breakdown of numbers in a database

DO $$
DECLARE
    rec         RECORD;
    rec_count   bigint;
    total_rows  bigint := 0;
    total_items int := 0;
BEGIN
    -- Create a temporary table to store individual counts
    CREATE TEMP TABLE temp_table_counts (
        table_name text,
        row_count  bigint
    ) ON COMMIT DROP;

    -- Loop through each user table and view (exclude system schemas)
    FOR rec IN
        SELECT table_schema, table_name, table_type
        FROM information_schema.tables
        WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
          AND table_type IN ('BASE TABLE', 'VIEW')
        ORDER BY table_name
    LOOP
        BEGIN
            -- Dynamically count the rows for the current table/view
            EXECUTE format('SELECT count(*) FROM %I.%I', rec.table_schema, rec.table_name)
            INTO rec_count;
        EXCEPTION WHEN OTHERS THEN
            -- In case of error (for example, if the view is not countable), set to NULL
            rec_count := NULL;
        END;

        -- Insert the result into our temporary table.
        INSERT INTO temp_table_counts(table_name, row_count)
        VALUES (rec.table_schema || '.' || rec.table_name, rec_count);

        -- Accumulate totals (if count succeeded)
        IF rec_count IS NOT NULL THEN
            total_rows := total_rows + rec_count;
        END IF;
        total_items := total_items + 1;
    END LOOP;

    -- Return the per-table results
    RAISE NOTICE 'Table Name | Row Count';
    FOR rec IN 
        SELECT table_name, row_count 
        FROM temp_table_counts 
        ORDER BY table_name
    LOOP
        RAISE NOTICE '% | %', rec.table_name, rec.row_count;
    END LOOP;

    -- Return the summary totals
    RAISE NOTICE '--------------------------------------';
    RAISE NOTICE 'Total Rows: %, Total Tables/Views: %', total_rows, total_items;
END $$;

And on running this is the kind of result it should return

SQL Server / Postgres – Pattern for Distinct List / Combo Box source

This primarily refers to variable text fields but in all of the CRUD systems I’ve ever designed every single page has contained fields that use lookup dialogs. Their sheer ubiquity makes their design and operation particularly important significantly impacting an applications feel to users.

In principle sourcing the values for a lookup appears to be a very simple choice.

Hard coded through the user interface (e.g Y/N)

Referenced to a separate “lookup” table

As you design more systems you start to see further important options

Referenced to distinct values sourced from a self referencing query of the table

Whether to store values as foreign keys

Whether to enforce validation at the database level.

Some general rules I follow;

1)If the application allows users to bulk import data into a table consider using referencing to a distinct values list sourced from a self referencing query of the table.
2)Almost NEVER alias values / use foreign keys in tables that will receive imported data but often better not to use it even if there is no import facility.
3)The one exception to rule 2 can be customer account codes – account codes for clients can be incredibly important and often link to a table that specifies say a company name and address – things that may change in time. Users really need to be trained to understand the derivation of foreign keys to use systems or the codes will generally break import routines or break referencing and require correction. Account codes tend to be a weird case of a derived surrogate where the value is obviously derived from the lengthened value – e.g AMZN for Amazon or MSFT for Microsoft in NASDAQ listings. This works for NASDAQ because of its ubiquity and because NASDAQ has educated its users be prepared that you will need to educate your users if you use this pattern. For this reason my general rule is only use foreign key aliases for Account codes.
4)If you have a field or table that will receive values which may be the subject of a lookup – don’t be too harsh with the field validation in the database data definition and be more permissive with the field variable type and length than you might expect.

And here is suggested SQL that can be used to identify distinct values in a field.

SELECT DISTINCT account_code FROM table_name;