Real-time database journaling and backups
posted 2022.06.02 by Clark Wilkins

We just installed a backup and journaling system for the PostgreSQL database running behind Healthica.

Seven-day rotating backups

We decided to do full backups on a seven-day schedule, dumping the entire database at midnight, daily, to a GZip file using a standard Bash script. This gives us Mon.backup.gz, Tue.backup.gz, etc.

#!/bin/bash
pg_dump healthica > ~/healthica/`date +%a`.backup
gzip ~/healthica/`date +%a`.backup

Journaling database changes

While this would allow us to “roll back” to midnight, we would still lose anything that changed during the day, so an adjunct method was needed. What we did was rewrite the Node.js function that connects to the database, so it records all database queries that modify the data sequentially, and as they happen. This is our version of journaling.

const { Pool } = require( "pg" );
const _ = require( "lodash" );
const fs = require ( "fs" );
const moment = require ( "moment" );
const { recordError } = require( "../functions.js" );

const pool = new Pool(); // initializing a connection pool for the route that calls this

module.exports = {

// queryText is the SQL to execute
// apiTesting is a flag to not commit the changes (testing)

  async query( queryText, apiTesting ) {

    const client = await pool.connect();

    try {

      await client.query( "BEGIN; " );
      const result = await client.query( queryText );
      apiTesting ? await client.query( "ROLLBACK; " ) : await client.query( "COMMIT; " );
      return result;
// this is what the client gets back as a result of the queries

    } catch ( e ) {

// rollback before throwing exception
      await client.query( "ROLLBACK; " );
// and record the error with a custom function
      recordError( queryText, 3 );

    } finally {

      client.release();

      if ( !apiTesting ) {

        let statements = _.split( queryText, ';' );
        let logStatements = '';
        statements.map( theStatement => {

// break queryText into an array of statement lines
          theStatement = theStatement.trim();
// if the statement modifies data, we are going to log it
          _.startsWith( theStatement, 'DELETE' ) ? logStatements += theStatement + ";\n" : null;
          _.startsWith( theStatement, 'INSERT' ) ? logStatements += theStatement + ";\n" : null;
          _.startsWith( theStatement, 'UPDATE' ) ? logStatements += theStatement + ";\n" : null;

        } );

// set up the path to the destination file
        let logfile = process.env.HEALTHICA_LOGS + moment().format( "YYYY.MM.DD" );

        if ( logStatements ) {

          logStatements = "\n/* " + moment().format( "HH.mm.ss" ) + " */\n" + logStatements;

// append the statements we kept to the current log file
          (async () => {

            const fsp = require('fs').promises;
            await fsp.appendFile( logfile, logStatements           );

        })();

      }

    }

  }

}

The result is live-journaling to a file that can be “played back” against the latest backup from midnight, and bring us right back to the desired current state. Here's a sample of the journal file logs/2022.06.03 (with one value redacted for our own security):

/* 08.11.48 */

INSERT INTO requests ( description, expires, id, listed, manufacturer, modality, name, notes, owner, sku, type ) VALUES ( '', 1655478708, '64cdc834-031e-43a6-9f7b-54e43f001087', 1654269108, 1, 1, '"PHANTOM SPHERICAL D240, BLUE =G="', null, 'REDACTED', '10496685', 2 ) RETURNING *;

If we want to recover from last night, we do the following:

  1. Unzip and restore the latest database backup.
  2. Navigate to the logs directory and psql healthica < 2022.06.03

That's how it works. Full credit for the appendFile code belongs to "vivek agarwal" in this post.