TBD: diagram

1. Create replication slot

This way, you preserve WAL (new changes) from being deleted. This would also create a snapshot of a current state. Snapshot can be exported, or reused in current transaction.

Create new replication slot to preserve future changes

2. Import existing data.

In replication only changes are streamed. Current state needs to be backuped manually.

Copy existing data; AKA backup

Physical replication:

There is a mechanism to make a backup of a running server, called “base backup”. You need to specify snapshot name, to avoid data races. Base backup is tar archive of current pg_data folder, that can be used as replica initial data.

Logical replication:

Base backup would not work here, but we have a snapshot, that allows us to query DB, using that exact state. So we fetch all the data in old fashioned way, using COPY foo TO or SELECT FROM foo command.

COPY is prefered, since it’s faster, and is available in simple protocol.

3. Subscribe and Stream changes. Protocol

After START_REPLICATION SLOT command server sends all updates. Format depends of slot. Physical is raw WAL, while Logical depends of decoding plugin, specified it this command

Receive and apply changes from WAL stream

pgoutput for example, uses same encoding as regular query results. Since 14.0 binary encoded rows are also awailable.

4. Always Commit changes

After data was successfuly processed and is no longer needed, update WAL positions via Standby status update message. This would unlock garbage collection of old log files.

5. Resume streaming

After restart (regular, crash, powerloss…)

If slot IS NOT lost:

Goto P3 and continue streaming from last commited position, no additional actions are required.

If slot IS lost:

Clean the data Goto P1, and start from scratch