
PostgreSQL Schema Design Patterns for Web Applications

A practical guide to PostgreSQL schema design for web applications, covering data types, primary keys, relationships, JSONB patterns, soft deletes, migrations, and a complete CMS schema example.


Overview

Your schema is the foundation of your application. Get it right and every query is clean, every migration is straightforward, and your data integrity survives even the most creative bugs in your application code. Get it wrong and you spend months working around bad column types, missing constraints, and join tables that should have been there from the start. This guide covers every schema design decision I make on day one of a new PostgreSQL-backed web application, from data type selection through migration strategy, with a complete content management system schema you can use as a starting point.

Prerequisites

  • PostgreSQL 14+ installed and running
  • Node.js 18+ with the pg driver
  • Basic SQL knowledge (CREATE TABLE, ALTER TABLE, SELECT, INSERT)
  • Familiarity with Express.js routing
  • node-pg-migrate for migration management

Install both packages:

npm install pg node-pg-migrate

Choosing Data Types Wisely

Every column deserves the right type. PostgreSQL has a richer type system than most developers realize, and picking the wrong type costs you storage, performance, and correctness.

text vs varchar

-- Do this
CREATE TABLE articles (
    title text NOT NULL,
    slug text NOT NULL
);

-- Not this
CREATE TABLE articles (
    title varchar(255) NOT NULL,
    slug varchar(100) NOT NULL
);

In PostgreSQL, text and varchar(n) have identical performance. There is no storage benefit to varchar(255). The only difference is that varchar(n) adds a length check. If you need a length constraint, use a CHECK constraint instead -- it gives you a better error message and you can change the limit without rewriting the column:

CREATE TABLE articles (
    title text NOT NULL CHECK (char_length(title) <= 500),
    slug text NOT NULL CHECK (char_length(slug) <= 200)
);

My rule: use text everywhere. Add CHECK constraints when business logic demands a maximum length.

integer vs bigint

-- For most tables: integer is fine (max ~2.1 billion)
CREATE TABLE categories (
    id serial PRIMARY KEY
);

-- For high-volume tables: use bigint (max ~9.2 quintillion)
CREATE TABLE events (
    id bigserial PRIMARY KEY
);

An integer is 4 bytes. A bigint is 8 bytes. For a table that will never exceed 2 billion rows -- categories, users, products -- integer saves space in the table, indexes, and foreign keys that reference it. For event logs, analytics, IoT data, or anything that grows unbounded, start with bigint. Changing from integer to bigint later requires an ALTER TABLE that rewrites the entire table and every index.

timestamptz vs timestamp

-- Always use this
CREATE TABLE articles (
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now()
);

-- Never use this
CREATE TABLE articles (
    created_at timestamp NOT NULL DEFAULT now(),
    updated_at timestamp NOT NULL DEFAULT now()
);

timestamp (without time zone) stores the literal value you give it. If your server is in UTC and someone inserts from a connection set to America/New_York, the value is silently wrong. timestamptz stores the value in UTC internally and converts on output based on the session timezone. Always use timestamptz. There is no performance difference and no storage difference -- both are 8 bytes.
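
To see the difference, run a quick experiment in a psql session (the timestamps and zone names here are only illustrative):

-- timestamptz interprets the input in the session time zone and stores UTC internally
SET timezone = 'America/New_York';
SELECT '2024-06-01 12:00:00'::timestamptz;     -- 2024-06-01 12:00:00-04 (16:00 UTC)

-- timestamp keeps the literal value with no time zone meaning attached
SELECT '2024-06-01 12:00:00'::timestamp;       -- 2024-06-01 12:00:00, zone unknown

-- the same timestamptz value renders correctly in any session time zone
SET timezone = 'UTC';
SELECT '2024-06-01 12:00:00-04'::timestamptz;  -- 2024-06-01 16:00:00+00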

UUID vs Serial

-- Serial: simple, sequential, compact
CREATE TABLE users (
    id serial PRIMARY KEY
);

-- UUID: globally unique, no sequence contention
CREATE TABLE users (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid()
);

I will cover this in depth in the next section.


Primary Key Strategies

This is one of the most debated decisions in schema design. I have shipped production systems with all three approaches, and each has its place.

Serial / Bigserial

CREATE TABLE users (
    id serial PRIMARY KEY,
    email text NOT NULL UNIQUE
);

Pros: compact (4 bytes), sequential (B-tree friendly for inserts), human-readable, great for joins. Cons: exposes row count and insertion order to anyone who can see the ID, and IDs can only be issued by the database, which makes client-side or multi-node ID generation awkward.

Use serial for internal tables that users never see the ID of, or for simple applications where sequential IDs are not a security concern.

UUID

CREATE TABLE users (
    id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
    email text NOT NULL UNIQUE
);

Pros: globally unique without coordination, safe to expose in URLs, no information leakage about row count or creation order. Cons: 16 bytes (4x a serial), random UUIDs cause B-tree page splits and worse insert performance, harder to debug by hand.

Use UUID when you need to generate IDs on the client side, in distributed systems, or when IDs appear in URLs and APIs. If B-tree fragmentation concerns you, consider time-ordered UUIDv7 identifiers, which you can generate in the application or with an extension; newer PostgreSQL releases are adding native generation support.

Short IDs (nanoid / hashid pattern)

CREATE TABLE articles (
    id serial PRIMARY KEY,
    short_id text NOT NULL UNIQUE DEFAULT generate_short_id()
);

This is my preferred hybrid approach. Keep a serial primary key for joins and internal use. Add a short_id column for URLs and public-facing identifiers. You get the join performance of integers and the URL safety of opaque IDs:

CREATE OR REPLACE FUNCTION generate_short_id() RETURNS text AS $$
DECLARE
    chars text := 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    result text := '';
    i integer;
BEGIN
    -- 8 characters from a 62-character alphabet (~218 trillion combinations).
    -- random() is not cryptographically secure, and the UNIQUE constraint on
    -- short_id catches the rare collision, so retry the insert if it fires.
    FOR i IN 1..8 LOOP
        result := result || substr(chars, floor(random() * length(chars) + 1)::integer, 1);
    END LOOP;
    RETURN result;
END;
$$ LANGUAGE plpgsql;

Foreign Key Constraints and Referential Integrity

Foreign keys are not optional. Every relationship in your schema should be enforced at the database level. Application-level checks are not sufficient -- a bug, a manual SQL query, or a migration script will eventually violate them.

CREATE TABLE articles (
    id serial PRIMARY KEY,
    author_id integer NOT NULL REFERENCES users(id) ON DELETE RESTRICT,
    category_id integer REFERENCES categories(id) ON DELETE SET NULL,
    title text NOT NULL
);

ON DELETE behaviors

  • NO ACTION / RESTRICT (the default is NO ACTION, which behaves the same unless deferred): prevent deleting a user who has articles
  • CASCADE: deleting a thread deletes all its replies
  • SET NULL: deleting a category sets its articles to uncategorized
  • SET DEFAULT: rare; sets the column to its default on delete

My rule of thumb: use RESTRICT for critical relationships (user -> content), CASCADE for parent-child ownership (order -> order_items), and SET NULL for optional categorization.

Always index your foreign key columns. PostgreSQL does not create indexes on foreign keys automatically (unlike MySQL):

CREATE INDEX idx_articles_author_id ON articles(author_id);
CREATE INDEX idx_articles_category_id ON articles(category_id);

Without these indexes, deleting a user triggers a sequential scan on the articles table to check for references.


Nullable Columns and Default Values

Be deliberate about nullability. A nullable column means "this value might not exist" -- not "I did not think about it."

CREATE TABLE users (
    id serial PRIMARY KEY,
    email text NOT NULL,               -- Required, always present
    display_name text NOT NULL,         -- Required
    bio text,                           -- Optional, might not be set
    avatar_url text,                    -- Optional
    email_verified_at timestamptz,      -- NULL means not verified
    created_at timestamptz NOT NULL DEFAULT now()
);

NULL is meaningful when it represents "unknown" or "not yet." email_verified_at being NULL means the email has not been verified -- that is a legitimate use. But if you find yourself checking for NULL in application code constantly, the column should probably be NOT NULL with a default.
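
A couple of queries that lean on this convention, using the users table above (the id value is just an example):

-- "Not yet verified" is simply a NULL check; no extra boolean column needed
SELECT count(*) FROM users WHERE email_verified_at IS NULL;

-- Verifying an email records both the fact and the time in one column
UPDATE users SET email_verified_at = now() WHERE id = 42;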

Common default patterns:

status text NOT NULL DEFAULT 'draft',
is_active boolean NOT NULL DEFAULT true,
view_count integer NOT NULL DEFAULT 0,
metadata jsonb NOT NULL DEFAULT '{}',
tags text[] NOT NULL DEFAULT '{}'

created_at / updated_at Patterns with Triggers

Every table should have created_at and updated_at. Here is the pattern I use on every project:

-- The trigger function (create once per database)
CREATE OR REPLACE FUNCTION update_updated_at()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Apply to each table
CREATE TABLE articles (
    id serial PRIMARY KEY,
    title text NOT NULL,
    content text NOT NULL,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now()
);

CREATE TRIGGER set_updated_at
    BEFORE UPDATE ON articles
    FOR EACH ROW
    EXECUTE FUNCTION update_updated_at();

This guarantees updated_at is always correct regardless of which application, script, or migration modifies the row. Never rely on your application code to set updated_at -- a direct SQL update will skip it.
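
With the trigger in place, any write path keeps the column honest (the title and id here are illustrative):

-- Even a hand-written UPDATE refreshes updated_at via the trigger
UPDATE articles SET title = 'Revised title' WHERE id = 42;

SELECT created_at, updated_at FROM articles WHERE id = 42;
-- updated_at now reflects the moment of that UPDATE; created_at is unchanged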


Soft Delete Patterns

Soft deletes are valuable when you need audit trails, undo functionality, or data recovery. The simplest approach is a deleted_at column:

CREATE TABLE articles (
    id serial PRIMARY KEY,
    title text NOT NULL,
    content text NOT NULL,
    deleted_at timestamptz,
    created_at timestamptz NOT NULL DEFAULT now(),
    updated_at timestamptz NOT NULL DEFAULT now()
);

-- Create a partial index for active records (the common query)
CREATE INDEX idx_articles_active ON articles(id)
    WHERE deleted_at IS NULL;

-- View for convenience
CREATE VIEW active_articles AS
    SELECT * FROM articles WHERE deleted_at IS NULL;

Soft delete:

UPDATE articles SET deleted_at = now() WHERE id = 42;

Restore:

UPDATE articles SET deleted_at = NULL WHERE id = 42;

Hard delete (when you really mean it):

DELETE FROM articles WHERE id = 42;

The partial index WHERE deleted_at IS NULL is critical. A full index keeps carrying entries for deleted rows forever, so lookups on active records wade through index entries they never need. The partial index covers only non-deleted rows, so it is smaller and faster, and PostgreSQL can use it for any query that repeats the deleted_at IS NULL predicate.

When NOT to soft delete

Do not soft delete everything. High-volume tables like logs, events, or session records should be hard deleted or partitioned. Soft deleting millions of rows bloats your table and indexes for no benefit.
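
For those high-volume cases, here is a sketch of what range partitioning looks like instead (the table and partition names are illustrative):

-- Partition events by month so retention is a DROP, not a mass DELETE
CREATE TABLE events (
    id bigserial,
    occurred_at timestamptz NOT NULL DEFAULT now(),
    payload jsonb NOT NULL DEFAULT '{}'
) PARTITION BY RANGE (occurred_at);

CREATE TABLE events_2024_06 PARTITION OF events
    FOR VALUES FROM ('2024-06-01') TO ('2024-07-01');

-- Expiring an old month (assuming its partition exists) is a cheap metadata operation
DROP TABLE events_2024_05;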


JSONB Columns for Flexible Attributes

JSONB is one of PostgreSQL's killer features. It gives you schema-flexible storage inside a schema-rigid relational database. The key is knowing when to use it.

When to use JSONB

  • User preferences: notification settings, UI customizations, feature flags
  • External data: API responses, webhook payloads, import records
  • Dynamic attributes: product attributes that vary by category (a shirt has size/color, a laptop has RAM/CPU)
  • Metadata: SEO fields, analytics tags, integration configs
CREATE TABLE products (
    id serial PRIMARY KEY,
    name text NOT NULL,
    category_id integer NOT NULL REFERENCES categories(id),
    price numeric(10, 2) NOT NULL,
    attributes jsonb NOT NULL DEFAULT '{}',
    created_at timestamptz NOT NULL DEFAULT now()
);

INSERT INTO products (name, category_id, price, attributes) VALUES
    ('Classic T-Shirt', 1, 29.99, '{"size": "L", "color": "navy", "material": "cotton"}'),
    ('MacBook Pro 14"', 2, 1999.00, '{"ram_gb": 16, "storage_gb": 512, "chip": "M3"}');

Query JSONB fields:

-- Exact match
SELECT * FROM products WHERE attributes->>'color' = 'navy';

-- Numeric comparison (cast required)
SELECT * FROM products WHERE (attributes->>'ram_gb')::integer >= 16;

-- Contains operator (uses GIN index)
SELECT * FROM products WHERE attributes @> '{"chip": "M3"}';

Index JSONB for performance:

-- GIN index for @> containment queries
CREATE INDEX idx_products_attributes ON products USING gin(attributes);

-- Expression index for a specific key you query often
CREATE INDEX idx_products_color ON products ((attributes->>'color'));

When NOT to use JSONB

If you query a field in WHERE, JOIN, or ORDER BY in most of your queries, it should be a proper column. JSONB querying is slower than column access, and expression indexes only cover the specific path you index. If your "flexible" schema is really just a fixed set of fields you are too lazy to create columns for, create the columns.
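
If you reach that point, promoting a key to a real column is straightforward. A sketch using the color key from the products example above (the new index name is arbitrary):

-- Add the column, backfill it from JSONB, then stop storing the key
ALTER TABLE products ADD COLUMN color text;
UPDATE products SET color = attributes->>'color';
UPDATE products SET attributes = attributes - 'color';
CREATE INDEX idx_products_color_column ON products(color);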


Enum Types vs Lookup Tables

PostgreSQL has a native ENUM type, and I almost never use it.

-- Enum approach (inflexible)
CREATE TYPE article_status AS ENUM ('draft', 'published', 'archived');

CREATE TABLE articles (
    id serial PRIMARY KEY,
    status article_status NOT NULL DEFAULT 'draft'
);

The problem: adding a new value requires ALTER TYPE article_status ADD VALUE 'review'. That cannot run inside a transaction in PostgreSQL versions before 12, and even in later versions the ordering is tricky. Removing a value is essentially impossible without recreating the type.

The check constraint approach (my preference for short lists)

CREATE TABLE articles (
    id serial PRIMARY KEY,
    status text NOT NULL DEFAULT 'draft'
        CHECK (status IN ('draft', 'review', 'published', 'archived'))
);

Adding a new status means dropping the old constraint and adding a new one, and both steps are transactional, so you can swap them atomically in a single migration. Adding the constraint validates existing rows, which is a quick scan on most tables.
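
For example, adding a scheduled status (the constraint name assumes PostgreSQL's default articles_status_check naming for the inline check above):

BEGIN;
ALTER TABLE articles DROP CONSTRAINT articles_status_check;
ALTER TABLE articles ADD CONSTRAINT articles_status_check
    CHECK (status IN ('draft', 'review', 'published', 'archived', 'scheduled'));
COMMIT;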

Lookup tables (for longer lists or when you need metadata)

CREATE TABLE statuses (
    id serial PRIMARY KEY,
    name text NOT NULL UNIQUE,
    display_name text NOT NULL,
    sort_order integer NOT NULL DEFAULT 0
);

INSERT INTO statuses (name, display_name, sort_order) VALUES
    ('draft', 'Draft', 1),
    ('review', 'In Review', 2),
    ('published', 'Published', 3),
    ('archived', 'Archived', 4);

CREATE TABLE articles (
    id serial PRIMARY KEY,
    status_id integer NOT NULL REFERENCES statuses(id) DEFAULT 1
);

Use a lookup table when you need to store additional metadata about each value (display names, sort order, icons, permissions), when the list changes frequently, or when non-technical users manage the values through an admin UI.
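
Reading with a lookup table costs one extra join, and the metadata comes along for free:

SELECT a.id, s.display_name AS status
FROM articles a
JOIN statuses s ON s.id = a.status_id
ORDER BY s.sort_order, a.id;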


One-to-Many and Many-to-Many Relationships

One-to-Many

The most common relationship. One user has many articles. The foreign key lives on the "many" side:

CREATE TABLE users (
    id serial PRIMARY KEY,
    email text NOT NULL UNIQUE
);

CREATE TABLE articles (
    id serial PRIMARY KEY,
    author_id integer NOT NULL REFERENCES users(id),
    title text NOT NULL
);

CREATE INDEX idx_articles_author ON articles(author_id);

Many-to-Many

Articles have tags. A tag belongs to many articles. This requires a junction table:

CREATE TABLE tags (
    id serial PRIMARY KEY,
    name text NOT NULL UNIQUE,
    slug text NOT NULL UNIQUE
);

CREATE TABLE article_tags (
    article_id integer NOT NULL REFERENCES articles(id) ON DELETE CASCADE,
    tag_id integer NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
    PRIMARY KEY (article_id, tag_id)
);

CREATE INDEX idx_article_tags_tag ON article_tags(tag_id);

The composite primary key (article_id, tag_id) prevents duplicates and serves as the index for lookups by article_id. You need a separate index on tag_id for reverse lookups ("all articles with this tag").
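
The reverse lookup that the tag_id index serves looks like this:

-- "All articles tagged postgresql" -- driven by idx_article_tags_tag
SELECT a.id, a.title
FROM tags t
JOIN article_tags at1 ON at1.tag_id = t.id
JOIN articles a ON a.id = at1.article_id
WHERE t.slug = 'postgresql';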

If the junction table carries its own data (when the tag was applied, who applied it), add columns:

CREATE TABLE article_tags (
    article_id integer NOT NULL REFERENCES articles(id) ON DELETE CASCADE,
    tag_id integer NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
    created_at timestamptz NOT NULL DEFAULT now(),
    created_by integer REFERENCES users(id),
    PRIMARY KEY (article_id, tag_id)
);

Composite Unique Constraints

Unique constraints enforce business rules at the database level. A single-column unique is simple (email text UNIQUE), but many business rules span multiple columns:

-- A user can only have one subscription per plan
CREATE TABLE subscriptions (
    id serial PRIMARY KEY,
    user_id integer NOT NULL REFERENCES users(id),
    plan_id integer NOT NULL REFERENCES plans(id),
    status text NOT NULL DEFAULT 'active',
    UNIQUE (user_id, plan_id)
);

-- A slug must be unique within its category
CREATE TABLE articles (
    id serial PRIMARY KEY,
    category_id integer NOT NULL REFERENCES categories(id),
    slug text NOT NULL,
    UNIQUE (category_id, slug)
);

For soft-delete tables, a standard unique constraint breaks because deleted records still occupy the unique space. Use a partial unique index:

-- Slug must be unique among active articles only
CREATE UNIQUE INDEX idx_articles_unique_slug
    ON articles(category_id, slug)
    WHERE deleted_at IS NULL;

Check Constraints for Data Validation

Check constraints are your last line of defense. Even if your application validates input, a check constraint catches bugs in migration scripts, admin tools, and direct SQL.

CREATE TABLE products (
    id serial PRIMARY KEY,
    name text NOT NULL CHECK (char_length(name) >= 1),
    price numeric(10, 2) NOT NULL CHECK (price >= 0),
    discount_pct numeric(5, 2) CHECK (discount_pct >= 0 AND discount_pct <= 100),
    sku text NOT NULL CHECK (sku ~ '^[A-Z]{2}-[0-9]{4,8}$'),
    status text NOT NULL CHECK (status IN ('active', 'discontinued', 'draft'))
);

The ~ operator uses a regular expression. The SKU constraint ensures it matches a pattern like AB-12345. These constraints make your data self-documenting -- anyone reading the schema knows the valid ranges.

Named constraints produce better error messages:

CREATE TABLE products (
    id serial PRIMARY KEY,
    price numeric(10, 2) NOT NULL,
    sale_price numeric(10, 2),
    CONSTRAINT price_must_be_positive CHECK (price >= 0),
    CONSTRAINT sale_below_regular CHECK (sale_price IS NULL OR sale_price < price)
);

When a constraint is violated:

ERROR:  new row for relation "products" violates check constraint "sale_below_regular"
DETAIL:  Failing row contains (1, 29.99, 39.99).

Table Inheritance vs JSONB for Polymorphism

When different entity types share some fields but have unique ones, you have two main options.

JSONB approach (my preference)

CREATE TABLE notifications (
    id serial PRIMARY KEY,
    user_id integer NOT NULL REFERENCES users(id),
    type text NOT NULL CHECK (type IN ('email', 'sms', 'push')),
    payload jsonb NOT NULL,
    sent_at timestamptz,
    created_at timestamptz NOT NULL DEFAULT now()
);

-- Email: {"to": "[email protected]", "subject": "Welcome", "html": "<p>...</p>"}
-- SMS: {"phone": "+1234567890", "body": "Your code is 123456"}
-- Push: {"device_token": "abc123", "title": "New message", "badge": 3}

This is simple, queryable, and does not require multiple tables or complex joins. The tradeoff is that you lose column-level constraints on the type-specific fields, but for most polymorphic scenarios that is acceptable.
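
If you want a minimal guardrail anyway, a check constraint can at least require the keys each type needs. A sketch (the constraint name and key lists are assumptions, not full JSON validation):

ALTER TABLE notifications ADD CONSTRAINT payload_has_required_keys CHECK (
    (type = 'email' AND payload ? 'to' AND payload ? 'subject')
    OR (type = 'sms'  AND payload ? 'phone' AND payload ? 'body')
    OR (type = 'push' AND payload ? 'device_token')
);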

Table inheritance (PostgreSQL-specific)

PostgreSQL supports table inheritance, but I do not recommend it for web applications. It has quirks with foreign keys, unique constraints do not span parent and child tables, and ORMs handle it poorly. If you need true polymorphism with strict typing, use separate tables with a shared base and a type discriminator.


Indexing Considerations During Schema Design

Design indexes alongside your schema, not as an afterthought. Think about the queries your application will run on day one.

-- Primary key: automatic index
-- UNIQUE constraint: automatic index
-- Foreign keys: add indexes manually

-- Common query patterns to index:
CREATE INDEX idx_articles_published_at ON articles(published_at DESC)
    WHERE deleted_at IS NULL AND status = 'published';

CREATE INDEX idx_articles_category ON articles(category_id, published_at DESC);

CREATE INDEX idx_users_email_lower ON users(lower(email));

The first index is a partial index -- it only indexes published, non-deleted articles. This is exactly the subset your homepage and category pages query. Smaller index, faster lookups.

The second is a composite index that supports queries like "all articles in a category, newest first."

The third is an expression index for case-insensitive email lookups. Your application should always lower() emails on insert and query.
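
For example, a login lookup that can use the expression index ($1 is the email parameter your application passes in):

-- Matches the expression index on lower(email)
SELECT id, display_name
FROM users
WHERE lower(email) = lower($1);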


Migration Strategies

Never run raw ALTER TABLE statements in production. Use a migration tool that tracks which migrations have run and supports rollbacks.

node-pg-migrate setup

npx node-pg-migrate create add-articles-table --migration-file-language js

This creates a timestamped file in migrations/:

// migrations/1706900000000_add-articles-table.js

exports.up = function(pgm) {
    pgm.createTable('articles', {
        id: 'id',  // shorthand for serial PRIMARY KEY
        short_id: {
            type: 'text',
            notNull: true,
            unique: true
        },
        author_id: {
            type: 'integer',
            notNull: true,
            references: 'users(id)',
            onDelete: 'RESTRICT'
        },
        category_id: {
            type: 'integer',
            references: 'categories(id)',
            onDelete: 'SET NULL'
        },
        title: {
            type: 'text',
            notNull: true
        },
        slug: {
            type: 'text',
            notNull: true
        },
        content: {
            type: 'text',
            notNull: true,
            default: ''
        },
        status: {
            type: 'text',
            notNull: true,
            default: 'draft',
            check: "status IN ('draft', 'review', 'published', 'archived')"
        },
        metadata: {
            type: 'jsonb',
            notNull: true,
            default: '{}'
        },
        published_at: { type: 'timestamptz' },
        deleted_at: { type: 'timestamptz' },
        created_at: {
            type: 'timestamptz',
            notNull: true,
            default: pgm.func('now()')
        },
        updated_at: {
            type: 'timestamptz',
            notNull: true,
            default: pgm.func('now()')
        }
    });

    // A UNIQUE constraint cannot take a WHERE clause, so use a partial unique
    // index to enforce slug uniqueness among non-deleted articles only
    pgm.createIndex('articles', ['category_id', 'slug'], {
        unique: true,
        where: 'deleted_at IS NULL',
        name: 'idx_articles_unique_slug'
    });

    pgm.createIndex('articles', 'author_id');
    pgm.createIndex('articles', 'category_id');
    pgm.createIndex('articles', ['published_at DESC'], {
        where: "deleted_at IS NULL AND status = 'published'",
        name: 'idx_articles_published_active'
    });
};

exports.down = function(pgm) {
    pgm.dropTable('articles');
};

Run migrations:

# Run all pending migrations
DATABASE_URL=postgres://user:pass@localhost:5432/myapp npx node-pg-migrate up

# Roll back the last migration
DATABASE_URL=postgres://user:pass@localhost:5432/myapp npx node-pg-migrate down

Migration best practices

  1. Every migration must have a down. Even if you never use it in production, it makes development cycling possible.
  2. One concern per migration. Do not create five tables in one migration file.
  3. Never modify a migration that has already run in production. Create a new migration to alter the table (see the sketch after this list).
  4. Use transactions. node-pg-migrate wraps each migration in a transaction by default.
  5. Test migrations on a copy of production data before deploying.
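
A follow-up migration that alters the articles table might look like this (the filename and column are illustrative):

// migrations/1707000000000_add-reading-time-to-articles.js

exports.up = function(pgm) {
    pgm.addColumn('articles', {
        reading_time_min: { type: 'integer' }
    });
};

exports.down = function(pgm) {
    pgm.dropColumn('articles', 'reading_time_min');
};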

Schema Naming Conventions

Consistency matters more than which convention you pick. Here is what I use:

  • Tables: lowercase, plural, snake_case (users, article_tags)
  • Columns: lowercase, snake_case (first_name, created_at)
  • Primary keys: id (users.id)
  • Foreign keys: {singular_table}_id (articles.author_id)
  • Indexes: idx_{table}_{columns} (idx_articles_author_id)
  • Unique constraints: unq_{table}_{columns} (unq_articles_slug)
  • Check constraints: descriptive name (price_must_be_positive)
  • Junction tables: {table1}_{table2}, alphabetical (article_tags)
  • Timestamps: {verb}_at (created_at, deleted_at, verified_at)
  • Booleans: is_{adjective} or has_{noun} (is_active, has_verified_email)

Complete Working Example: Content Management System

Here is a complete CMS schema using all the patterns discussed. This is production-ready and covers users, articles, categories, tags, and comments.

Schema Migration

// migrations/1706900000000_create-cms-schema.js

exports.up = function(pgm) {
    // Utility function for updated_at trigger
    pgm.sql(`
        CREATE OR REPLACE FUNCTION update_updated_at()
        RETURNS TRIGGER AS $$
        BEGIN
            NEW.updated_at = now();
            RETURN NEW;
        END;
        $$ LANGUAGE plpgsql;
    `);

    // Short ID generator
    pgm.sql(`
        CREATE OR REPLACE FUNCTION generate_short_id() RETURNS text AS $$
        DECLARE
            chars text := 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
            result text := '';
            i integer;
        BEGIN
            FOR i IN 1..8 LOOP
                result := result || substr(chars, floor(random() * length(chars) + 1)::integer, 1);
            END LOOP;
            RETURN result;
        END;
        $$ LANGUAGE plpgsql;
    `);

    // ── Users ──
    pgm.createTable('users', {
        id: 'id',
        short_id: { type: 'text', notNull: true, unique: true, default: pgm.func('generate_short_id()') },
        email: { type: 'text', notNull: true, unique: true },
        display_name: { type: 'text', notNull: true },
        password_hash: { type: 'text', notNull: true },
        bio: { type: 'text' },
        avatar_url: { type: 'text' },
        role: {
            type: 'text', notNull: true, default: 'author',
            check: "role IN ('admin', 'editor', 'author', 'reader')"
        },
        is_active: { type: 'boolean', notNull: true, default: true },
        email_verified_at: { type: 'timestamptz' },
        last_login_at: { type: 'timestamptz' },
        deleted_at: { type: 'timestamptz' },
        created_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') },
        updated_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') }
    });
    pgm.createIndex('users', 'lower(email)', { name: 'idx_users_email_lower' });
    pgm.sql(`CREATE TRIGGER set_users_updated_at BEFORE UPDATE ON users FOR EACH ROW EXECUTE FUNCTION update_updated_at();`);

    // ── Categories ──
    pgm.createTable('categories', {
        id: 'id',
        name: { type: 'text', notNull: true },
        slug: { type: 'text', notNull: true, unique: true },
        description: { type: 'text' },
        parent_id: { type: 'integer', references: 'categories(id)', onDelete: 'SET NULL' },
        sort_order: { type: 'integer', notNull: true, default: 0 },
        created_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') },
        updated_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') }
    });
    pgm.createIndex('categories', 'parent_id');
    pgm.sql(`CREATE TRIGGER set_categories_updated_at BEFORE UPDATE ON categories FOR EACH ROW EXECUTE FUNCTION update_updated_at();`);

    // ── Tags ──
    pgm.createTable('tags', {
        id: 'id',
        name: { type: 'text', notNull: true, unique: true },
        slug: { type: 'text', notNull: true, unique: true },
        created_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') }
    });

    // ── Articles ──
    pgm.createTable('articles', {
        id: 'id',
        short_id: { type: 'text', notNull: true, unique: true, default: pgm.func('generate_short_id()') },
        author_id: { type: 'integer', notNull: true, references: 'users(id)', onDelete: 'RESTRICT' },
        category_id: { type: 'integer', references: 'categories(id)', onDelete: 'SET NULL' },
        title: { type: 'text', notNull: true, check: 'char_length(title) >= 1' },
        slug: { type: 'text', notNull: true },
        excerpt: { type: 'text' },
        content: { type: 'text', notNull: true, default: '' },
        status: {
            type: 'text', notNull: true, default: 'draft',
            check: "status IN ('draft', 'review', 'published', 'archived')"
        },
        metadata: { type: 'jsonb', notNull: true, default: '{}' },
        view_count: { type: 'integer', notNull: true, default: 0 },
        published_at: { type: 'timestamptz' },
        deleted_at: { type: 'timestamptz' },
        created_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') },
        updated_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') }
    });

    pgm.createIndex('articles', 'author_id');
    pgm.createIndex('articles', 'category_id');
    pgm.createIndex('articles', ['published_at DESC'], {
        where: "deleted_at IS NULL AND status = 'published'",
        name: 'idx_articles_published_active'
    });
    pgm.sql(`CREATE UNIQUE INDEX idx_articles_unique_slug ON articles(category_id, slug) WHERE deleted_at IS NULL;`);
    pgm.createIndex('articles', 'metadata', { method: 'gin' });
    pgm.sql(`CREATE TRIGGER set_articles_updated_at BEFORE UPDATE ON articles FOR EACH ROW EXECUTE FUNCTION update_updated_at();`);

    // ── Article Tags (junction) ──
    pgm.createTable('article_tags', {
        article_id: { type: 'integer', notNull: true, references: 'articles(id)', onDelete: 'CASCADE' },
        tag_id: { type: 'integer', notNull: true, references: 'tags(id)', onDelete: 'CASCADE' }
    });
    pgm.addConstraint('article_tags', 'pk_article_tags', { primaryKey: ['article_id', 'tag_id'] });
    pgm.createIndex('article_tags', 'tag_id');

    // ── Comments ──
    pgm.createTable('comments', {
        id: 'id',
        article_id: { type: 'integer', notNull: true, references: 'articles(id)', onDelete: 'CASCADE' },
        author_id: { type: 'integer', notNull: true, references: 'users(id)', onDelete: 'CASCADE' },
        parent_id: { type: 'integer', references: 'comments(id)', onDelete: 'CASCADE' },
        body: { type: 'text', notNull: true, check: 'char_length(body) >= 1' },
        is_approved: { type: 'boolean', notNull: true, default: false },
        deleted_at: { type: 'timestamptz' },
        created_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') },
        updated_at: { type: 'timestamptz', notNull: true, default: pgm.func('now()') }
    });
    pgm.createIndex('comments', 'article_id');
    pgm.createIndex('comments', 'author_id');
    pgm.createIndex('comments', 'parent_id');
    pgm.sql(`CREATE TRIGGER set_comments_updated_at BEFORE UPDATE ON comments FOR EACH ROW EXECUTE FUNCTION update_updated_at();`);
};

exports.down = function(pgm) {
    pgm.dropTable('comments');
    pgm.dropTable('article_tags');
    pgm.dropTable('articles');
    pgm.dropTable('tags');
    pgm.dropTable('categories');
    pgm.dropTable('users');
    pgm.sql('DROP FUNCTION IF EXISTS update_updated_at();');
    pgm.sql('DROP FUNCTION IF EXISTS generate_short_id();');
};

Seed Data

// seeds/001-cms-seed.js

var { Pool } = require('pg');

var pool = new Pool({
    connectionString: process.env.DATABASE_URL
});

function seed() {
    var client;

    // Check out a single client so BEGIN/COMMIT wrap every statement in one
    // transaction; pool.query() may run each statement on a different connection.
    return pool.connect()
        .then(function(c) {
            client = c;
            return client.query('BEGIN');
        })
        .then(function() {
            return client.query(`
                INSERT INTO users (email, display_name, password_hash, role, email_verified_at)
                VALUES
                    ('[email protected]', 'Admin User', '$2b$10$placeholder_hash_1', 'admin', now()),
                    ('[email protected]', 'Shane', '$2b$10$placeholder_hash_2', 'author', now()),
                    ('[email protected]', 'Jane Smith', '$2b$10$placeholder_hash_3', 'editor', now())
                RETURNING id, email;
            `);
        })
        .then(function(result) {
            var users = result.rows;
            console.log('Created users:', users);

            return client.query(`
                INSERT INTO categories (name, slug, description, sort_order) VALUES
                    ('Engineering', 'engineering', 'Software engineering topics', 1),
                    ('Databases', 'databases', 'Database design and management', 2),
                    ('DevOps', 'devops', 'Infrastructure and deployment', 3),
                    ('AI & ML', 'ai-ml', 'Artificial intelligence and machine learning', 4)
                RETURNING id, slug;
            `);
        })
        .then(function(result) {
            var categories = result.rows;
            console.log('Created categories:', categories);

            return client.query(`
                INSERT INTO tags (name, slug) VALUES
                    ('postgresql', 'postgresql'),
                    ('nodejs', 'nodejs'),
                    ('docker', 'docker'),
                    ('api-design', 'api-design'),
                    ('performance', 'performance'),
                    ('security', 'security')
                RETURNING id, slug;
            `);
        })
        .then(function(result) {
            var tags = result.rows;
            console.log('Created tags:', tags);

            return client.query(`
                INSERT INTO articles (author_id, category_id, title, slug, excerpt, content, status, published_at, metadata)
                VALUES
                    (2, 2, 'PostgreSQL Schema Design Patterns', 'postgresql-schema-design',
                     'Practical schema patterns for web apps',
                     'Full article content here...',
                     'published', now(),
                     '{"reading_time_min": 12, "featured": true}'),
                    (2, 1, 'Building REST APIs with Express', 'rest-apis-express',
                     'A guide to RESTful API design',
                     'Full article content here...',
                     'published', now() - interval '7 days',
                     '{"reading_time_min": 8, "featured": false}'),
                    (3, 3, 'Docker Compose for Local Development', 'docker-compose-local',
                     'Simplify your local dev environment',
                     'Full article content here...',
                     'draft', NULL,
                     '{"reading_time_min": 10}')
                RETURNING id, title;
            `);
        })
        .then(function(result) {
            var articles = result.rows;
            console.log('Created articles:', articles);

            return client.query(`
                INSERT INTO article_tags (article_id, tag_id) VALUES
                    (1, 1), (1, 5),
                    (2, 2), (2, 4),
                    (3, 3);
            `);
        })
        .then(function() {
            return client.query(`
                INSERT INTO comments (article_id, author_id, body, is_approved) VALUES
                    (1, 3, 'Great article on schema design. The soft delete pattern is exactly what I needed.', true),
                    (1, 2, 'Thanks! I have been using this pattern for years.', true),
                    (2, 3, 'The error handling section was really helpful.', true);
            `);
        })
        .then(function() {
            return client.query('COMMIT');
        })
        .then(function() {
            console.log('Seed completed successfully');
            client.release();
            pool.end();
        })
        .catch(function(err) {
            console.error('Seed failed:', err);
            var rollback = client ? client.query('ROLLBACK') : Promise.resolve();
            return rollback.then(function() {
                if (client) {
                    client.release();
                }
                pool.end();
                process.exit(1);
            });
        });
}

seed();

Node.js Data Access Layer

// models/articleModel.js

var { Pool } = require('pg');

var pool = new Pool({
    connectionString: process.env.DATABASE_URL,
    max: 20,
    idleTimeoutMillis: 30000,
    connectionTimeoutMillis: 5000
});

var ArticleModel = {

    findPublished: function(options) {
        var page = options.page || 1;
        var limit = options.limit || 20;
        var offset = (page - 1) * limit;
        var categoryId = options.categoryId;
        var tagSlug = options.tagSlug;

        var params = [limit, offset];
        var conditions = ["a.deleted_at IS NULL", "a.status = 'published'"];
        var joins = '';
        var paramIndex = 3;

        if (categoryId) {
            conditions.push('a.category_id = $' + paramIndex);
            params.push(categoryId);
            paramIndex++;
        }

        if (tagSlug) {
            joins += ' INNER JOIN article_tags at2 ON at2.article_id = a.id';
            joins += ' INNER JOIN tags t2 ON t2.id = at2.tag_id AND t2.slug = $' + paramIndex;
            params.push(tagSlug);
            paramIndex++;
        }

        var sql = `
            SELECT
                a.id,
                a.short_id,
                a.title,
                a.slug,
                a.excerpt,
                a.status,
                a.view_count,
                a.metadata,
                a.published_at,
                a.created_at,
                u.display_name AS author_name,
                c.name AS category_name,
                c.slug AS category_slug,
                COALESCE(
                    (SELECT json_agg(json_build_object('name', t.name, 'slug', t.slug))
                     FROM article_tags at1
                     JOIN tags t ON t.id = at1.tag_id
                     WHERE at1.article_id = a.id),
                    '[]'
                ) AS tags
            FROM articles a
            INNER JOIN users u ON u.id = a.author_id
            LEFT JOIN categories c ON c.id = a.category_id
            ` + joins + `
            WHERE ` + conditions.join(' AND ') + `
            ORDER BY a.published_at DESC
            LIMIT $1 OFFSET $2
        `;

        return pool.query(sql, params).then(function(result) {
            return result.rows;
        });
    },

    findByShortId: function(shortId) {
        var sql = `
            SELECT
                a.*,
                u.display_name AS author_name,
                u.avatar_url AS author_avatar,
                c.name AS category_name,
                c.slug AS category_slug,
                COALESCE(
                    (SELECT json_agg(json_build_object('name', t.name, 'slug', t.slug))
                     FROM article_tags at1
                     JOIN tags t ON t.id = at1.tag_id
                     WHERE at1.article_id = a.id),
                    '[]'
                ) AS tags
            FROM articles a
            INNER JOIN users u ON u.id = a.author_id
            LEFT JOIN categories c ON c.id = a.category_id
            WHERE a.short_id = $1
              AND a.deleted_at IS NULL
        `;

        return pool.query(sql, [shortId]).then(function(result) {
            return result.rows[0] || null;
        });
    },

    create: function(data) {
        var sql = `
            INSERT INTO articles (author_id, category_id, title, slug, excerpt, content, status, metadata)
            VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
            RETURNING *
        `;

        var params = [
            data.authorId,
            data.categoryId || null,
            data.title,
            data.slug,
            data.excerpt || null,
            data.content,
            data.status || 'draft',
            JSON.stringify(data.metadata || {})
        ];

        return pool.query(sql, params).then(function(result) {
            return result.rows[0];
        });
    },

    update: function(id, data) {
        var fields = [];
        var params = [];
        var paramIndex = 1;

        var allowed = ['title', 'slug', 'excerpt', 'content', 'status', 'category_id'];
        allowed.forEach(function(field) {
            if (data[field] !== undefined) {
                fields.push(field + ' = $' + paramIndex);
                params.push(data[field]);
                paramIndex++;
            }
        });

        if (data.metadata) {
            fields.push('metadata = $' + paramIndex);
            params.push(JSON.stringify(data.metadata));
            paramIndex++;
        }

        if (data.status === 'published') {
            fields.push('published_at = COALESCE(published_at, now())');
        }

        if (fields.length === 0) {
            return Promise.resolve(null);
        }

        params.push(id);
        var sql = 'UPDATE articles SET ' + fields.join(', ') +
                  ' WHERE id = $' + paramIndex +
                  ' AND deleted_at IS NULL RETURNING *';

        return pool.query(sql, params).then(function(result) {
            return result.rows[0] || null;
        });
    },

    softDelete: function(id) {
        var sql = 'UPDATE articles SET deleted_at = now() WHERE id = $1 AND deleted_at IS NULL RETURNING id';
        return pool.query(sql, [id]).then(function(result) {
            return result.rowCount > 0;
        });
    },

    restore: function(id) {
        var sql = 'UPDATE articles SET deleted_at = NULL WHERE id = $1 AND deleted_at IS NOT NULL RETURNING id';
        return pool.query(sql, [id]).then(function(result) {
            return result.rowCount > 0;
        });
    },

    incrementViews: function(id) {
        var sql = 'UPDATE articles SET view_count = view_count + 1 WHERE id = $1';
        return pool.query(sql, [id]);
    },

    setTags: function(articleId, tagIds) {
        var client;
        return pool.connect().then(function(c) {
            client = c;
            return client.query('BEGIN');
        }).then(function() {
            return client.query('DELETE FROM article_tags WHERE article_id = $1', [articleId]);
        }).then(function() {
            if (tagIds.length === 0) {
                return Promise.resolve();
            }
            var values = tagIds.map(function(tagId, i) {
                return '($1, $' + (i + 2) + ')';
            }).join(', ');
            var params = [articleId].concat(tagIds);
            return client.query(
                'INSERT INTO article_tags (article_id, tag_id) VALUES ' + values,
                params
            );
        }).then(function() {
            return client.query('COMMIT');
        }).then(function() {
            client.release();
        }).catch(function(err) {
            return client.query('ROLLBACK').then(function() {
                client.release();
                throw err;
            });
        });
    }
};

module.exports = ArticleModel;

Express Route

// routes/articles.js

var express = require('express');
var router = express.Router();
var ArticleModel = require('../models/articleModel');

router.get('/', function(req, res, next) {
    var page = parseInt(req.query.page, 10) || 1;
    var categoryId = req.query.category ? parseInt(req.query.category, 10) : null;
    var tagSlug = req.query.tag || null;

    ArticleModel.findPublished({
        page: page,
        limit: 20,
        categoryId: categoryId,
        tagSlug: tagSlug
    })
    .then(function(articles) {
        res.json({ articles: articles, page: page });
    })
    .catch(function(err) {
        next(err);
    });
});

router.get('/:shortId', function(req, res, next) {
    ArticleModel.findByShortId(req.params.shortId)
        .then(function(article) {
            if (!article) {
                return res.status(404).json({ error: 'Article not found' });
            }
            // Fire-and-forget counter bump; swallow errors so a failed update
            // cannot surface as an unhandled rejection
            ArticleModel.incrementViews(article.id).catch(function() {});
            res.json({ article: article });
        })
        .catch(function(err) {
            next(err);
        });
});

router.post('/', function(req, res, next) {
    ArticleModel.create({
        authorId: req.body.author_id,
        categoryId: req.body.category_id,
        title: req.body.title,
        slug: req.body.slug,
        excerpt: req.body.excerpt,
        content: req.body.content,
        status: req.body.status,
        metadata: req.body.metadata
    })
    .then(function(article) {
        if (req.body.tag_ids && req.body.tag_ids.length > 0) {
            return ArticleModel.setTags(article.id, req.body.tag_ids)
                .then(function() { return article; });
        }
        return article;
    })
    .then(function(article) {
        res.status(201).json({ article: article });
    })
    .catch(function(err) {
        next(err);
    });
});

router.delete('/:id', function(req, res, next) {
    ArticleModel.softDelete(parseInt(req.params.id, 10))
        .then(function(deleted) {
            if (!deleted) {
                return res.status(404).json({ error: 'Article not found' });
            }
            res.json({ message: 'Article deleted' });
        })
        .catch(function(err) {
            next(err);
        });
});

module.exports = router;

Common Issues & Troubleshooting

1. Foreign Key Violation on Delete

ERROR:  update or delete on table "users" violates foreign key constraint
        "articles_author_id_fkey" on table "articles"
DETAIL:  Key (id)=(5) is still referenced from table "articles".

Cause: You are trying to delete a user who has authored articles, and the foreign key uses ON DELETE RESTRICT (or the default NO ACTION, which behaves the same way here).

Fix: Either delete or reassign the user's articles first, or change the foreign key to ON DELETE SET NULL if the business logic allows orphaned articles:

ALTER TABLE articles DROP CONSTRAINT articles_author_id_fkey;
ALTER TABLE articles ADD CONSTRAINT articles_author_id_fkey
    FOREIGN KEY (author_id) REFERENCES users(id) ON DELETE SET NULL;

2. Unique Constraint Violation from Soft-Deleted Rows

ERROR:  duplicate key value violates unique constraint "articles_slug_key"
DETAIL:  Key (slug)=(my-article-slug) already exists.

Cause: A soft-deleted article still occupies the unique slug. Your standard UNIQUE (slug) constraint does not distinguish between active and deleted rows.

Fix: Replace the plain unique constraint with a partial unique index:

ALTER TABLE articles DROP CONSTRAINT articles_slug_key;
CREATE UNIQUE INDEX idx_articles_unique_slug ON articles(slug) WHERE deleted_at IS NULL;

3. JSONB Operator Type Mismatch

ERROR:  operator does not exist: text > integer
LINE 1: ...LECT * FROM products WHERE attributes->>'ram_gb' > 16;
HINT:  No operator matches the given name and argument types.
      You might need to add explicit type casts.

Cause: The ->> operator always returns text. Comparing a text value against an integer fails.

Fix: Cast the extracted value:

SELECT * FROM products WHERE (attributes->>'ram_gb')::integer > 16;

Or use the -> operator with a typed comparison:

SELECT * FROM products WHERE (attributes->'ram_gb')::integer > 16;

4. Slow Deletes Due to Missing Foreign Key Indexes

QUERY PLAN
Seq Scan on articles  (cost=0.00..125432.00 rows=1 width=6)
  Filter: (author_id = 42)

Cause: When you delete a row from users, PostgreSQL must check all tables that reference it. Without an index on articles.author_id, it does a full sequential scan of the articles table.

Fix: Always create indexes on foreign key columns:

CREATE INDEX idx_articles_author_id ON articles(author_id);

5. Check Constraint Blocks Valid Data After Business Logic Changes

ERROR:  new row for relation "articles" violates check constraint "articles_status_check"
DETAIL:  Failing row contains (..., review, ...).

Cause: You added a new status value (review) in your application code but forgot to update the check constraint.

Fix: Drop and recreate the constraint in a migration:

ALTER TABLE articles DROP CONSTRAINT articles_status_check;
ALTER TABLE articles ADD CONSTRAINT articles_status_check
    CHECK (status IN ('draft', 'review', 'published', 'archived'));

6. Integer Overflow on High-Volume Tables

ERROR:  integer out of range

Cause: A serial (integer) column hit 2,147,483,647 on a high-volume table like events or logs.

Fix: Migrate to bigint. This requires a table rewrite:

ALTER TABLE events ALTER COLUMN id SET DATA TYPE bigint;

For large tables, this locks the table for the duration of the rewrite. Consider using pg_repack or creating a new table and migrating data incrementally in production.


Best Practices

  • Always use timestamptz, never timestamp. You will eventually have users, servers, or logs in different time zones. timestamptz handles this correctly. timestamp does not.

  • Index every foreign key column. PostgreSQL does not do this automatically. Missing FK indexes cause slow cascading deletes and slow joins that get worse as your data grows.

  • Use text over varchar(n). There is zero performance difference in PostgreSQL. If you need a length limit, use a CHECK constraint that you can modify without rewriting the column.

  • Design for soft deletes from the start on core entities. Adding deleted_at later means retrofitting every query with WHERE deleted_at IS NULL. Do it on day one for users, articles, and anything a user might want to "undo."

  • Put business rules in check constraints, not just application code. Your application is not the only thing that touches the database. Migration scripts, admin panels, data imports, and direct SQL all bypass your application validation. Check constraints catch everything.

  • Keep JSONB for truly flexible data. If you find yourself querying a JSONB field in every WHERE clause, it should be a column. JSONB is for attributes that vary by row or change shape over time -- not for avoiding schema changes.

  • One migration per schema change. Do not bundle five table creations into one migration. Each migration should be independently reversible and describe a single logical change.

  • Use partial indexes on tables with soft deletes. A partial index WHERE deleted_at IS NULL is smaller and faster than a full index that includes rows your queries never touch.

  • Test migrations on production-scale data before deploying. A migration that runs in 50ms on your dev database with 100 rows might lock the table for 20 minutes on production with 10 million rows. ALTER TABLE ... ADD COLUMN with a non-null default rewrites the entire table in PostgreSQL versions before 11.

  • Name your constraints. CONSTRAINT price_must_be_positive CHECK (price >= 0) gives you a clear error message. An unnamed constraint gives you articles_check which tells you nothing at 3 AM.

