JSON and JSONB in PostgreSQL: When and How
A practical guide to JSON and JSONB in PostgreSQL covering operators, indexing with GIN, nested queries, updates, array operations, schema validation, and Node.js integration patterns.
JSON and JSONB in PostgreSQL: When and How
Overview
PostgreSQL has supported native JSON since version 9.2 and the binary JSONB type since 9.4, giving you document-store capabilities inside a relational database without bolting on MongoDB or another NoSQL service. JSONB lets you store semi-structured data, index it with GIN, query nested paths with dedicated operators, and still join it against normalized tables in the same transaction. This guide covers the practical differences between JSON and JSONB, the full operator toolkit, indexing strategies, and production-ready Node.js integration patterns I have used across multiple projects.
Prerequisites
- PostgreSQL 14+ installed and running
- Node.js 18+ with the
pgdriver - Solid SQL fundamentals (CREATE TABLE, SELECT, JOIN, WHERE)
- Familiarity with JSON structure
npm install pg
JSON vs JSONB: The Storage Question
PostgreSQL offers two JSON column types. They look similar but behave very differently under the hood.
json stores the document as a raw text string. Every time you query it, PostgreSQL re-parses the text. It preserves whitespace, key ordering, and duplicate keys exactly as inserted. There is no indexing support beyond expression indexes.
jsonb stores the document in a decomposed binary format. It strips whitespace, deduplicates keys (last value wins), and does not preserve key ordering. The binary format enables GIN indexing, containment operators, and existence checks -- the operations that make JSONB actually useful for querying.
-- JSON preserves everything, including duplicates
SELECT '{"name": "Shane", "name": "John"}'::json;
-- {"name": "Shane", "name": "John"}
-- JSONB deduplicates, last key wins
SELECT '{"name": "Shane", "name": "John"}'::jsonb;
-- {"name": "John"}
-- JSONB strips whitespace
SELECT '{ "name" : "Shane" }'::jsonb;
-- {"name": "Shane"}
When to use json: Almost never. The only legitimate use case is when you need to preserve the exact original formatting of a document -- audit logging of raw API responses, for example. Even then, I usually just store it as text.
When to use jsonb: Every other time. If you are going to query, filter, index, or update JSON data, use JSONB. The rest of this article assumes JSONB.
Performance Comparison
The storage overhead of JSONB is slightly larger than JSON because of the binary format, but read performance is dramatically better:
-- Create test tables
CREATE TABLE test_json (id serial PRIMARY KEY, data json);
CREATE TABLE test_jsonb (id serial PRIMARY KEY, data jsonb);
-- Insert 100k identical rows
INSERT INTO test_json (data)
SELECT json_build_object('name', 'user_' || i, 'score', i % 100, 'active', (i % 2 = 0))
FROM generate_series(1, 100000) AS i;
INSERT INTO test_jsonb (data)
SELECT jsonb_build_object('name', 'user_' || i, 'score', i % 100, 'active', (i % 2 = 0))
FROM generate_series(1, 100000) AS i;
-- Query comparison: extract a value and filter
EXPLAIN ANALYZE SELECT data->>'name' FROM test_json WHERE (data->>'score')::int > 95;
-- Seq Scan on test_json (cost=0.00..4142.00 rows=33333 width=32) (actual time=0.031..78.542 rows=4000 loops=1)
-- Planning Time: 0.081 ms
-- Execution Time: 79.014 ms
EXPLAIN ANALYZE SELECT data->>'name' FROM test_jsonb WHERE (data->>'score')::int > 95;
-- Seq Scan on test_jsonb (cost=0.00..3892.00 rows=33333 width=32) (actual time=0.028..52.317 rows=4000 loops=1)
-- Planning Time: 0.074 ms
-- Execution Time: 52.684 ms
JSONB is consistently faster for read operations because it does not re-parse the document on every access. The difference widens with more complex or larger documents.
When to Use JSONB vs Normalized Columns
This is the most important decision. I see teams fall into two traps: storing everything as JSONB (turning PostgreSQL into a bad MongoDB), or refusing to use JSONB at all (creating 40-column tables with 30 nullable fields).
Use JSONB when:
- Attributes vary by type. A product catalog where electronics have
screen_sizeandbattery_life, but clothing hassizeandmaterial. A normalized approach would require either sparse columns or an EAV table. JSONB handles this cleanly. - You are storing third-party API responses. Webhook payloads, OAuth provider profiles, payment gateway responses -- schemas you do not control and that change without notice.
- User preferences and settings. Free-form key-value pairs where the schema evolves frequently and each user may have different keys.
- Metadata and tags. Arbitrary key-value annotations on records.
Normalize instead when:
- You need referential integrity (foreign keys cannot point into JSONB).
- You query the same fields in nearly every request.
- You need unique constraints across records on a specific field.
- Aggregate queries (SUM, AVG, GROUP BY) are core to the use case.
- The schema is stable and well-defined.
The sweet spot is a hybrid: normalized columns for your core relational data, and a JSONB column for the flexible parts.
Creating Tables with JSONB Columns
CREATE TABLE products (
id serial PRIMARY KEY,
sku varchar(50) UNIQUE NOT NULL,
name varchar(255) NOT NULL,
category varchar(100) NOT NULL,
price numeric(10,2) NOT NULL,
attributes jsonb DEFAULT '{}',
tags jsonb DEFAULT '[]',
created_at timestamptz DEFAULT now(),
updated_at timestamptz DEFAULT now()
);
Notice the pattern: sku, name, category, and price are normalized because you will filter, sort, and aggregate on them constantly. attributes is JSONB because a laptop and a t-shirt have completely different attribute sets. tags is a JSONB array because the number and values of tags vary per product.
Inserting and Updating JSONB Data
Inserting
INSERT INTO products (sku, name, category, price, attributes, tags) VALUES
(
'LAPTOP-001',
'ThinkPad X1 Carbon Gen 11',
'electronics',
1429.99,
'{
"brand": "Lenovo",
"screen_size": 14,
"resolution": "2560x1600",
"ram_gb": 16,
"storage": {"type": "NVMe SSD", "capacity_gb": 512},
"ports": ["USB-C", "USB-A", "HDMI", "Thunderbolt 4"],
"weight_kg": 1.12
}',
'["laptop", "ultrabook", "business"]'
),
(
'SHIRT-042',
'Classic Oxford Button-Down',
'clothing',
89.00,
'{
"brand": "Brooks Brothers",
"material": "100% cotton",
"sizes_available": ["S", "M", "L", "XL"],
"colors": ["white", "blue", "pink"],
"care": {"machine_washable": true, "dry_clean": false}
}',
'["shirt", "formal", "cotton"]'
);
Replacing the Entire JSONB Value
UPDATE products
SET attributes = '{"brand": "Lenovo", "screen_size": 14, "ram_gb": 32}'
WHERE sku = 'LAPTOP-001';
This replaces the entire attributes column. Usually not what you want.
Merging with the || Operator
The concatenation operator merges two JSONB objects. Top-level keys in the right operand overwrite matching keys in the left:
UPDATE products
SET attributes = attributes || '{"ram_gb": 32, "on_sale": true}'
WHERE sku = 'LAPTOP-001';
This updates ram_gb to 32, adds on_sale, and leaves all other keys untouched.
Querying JSONB: The Operator Toolkit
-> and ->> : Key Access
-> returns a JSONB value. ->> returns a text value.
-- Returns JSONB (notice the quotes)
SELECT attributes->'brand' FROM products WHERE sku = 'LAPTOP-001';
-- "Lenovo"
-- Returns text (no quotes)
SELECT attributes->>'brand' FROM products WHERE sku = 'LAPTOP-001';
-- Lenovo
-- Integer index for arrays
SELECT tags->0 FROM products WHERE sku = 'LAPTOP-001';
-- "laptop"
-- Chain operators for nested access
SELECT attributes->'storage'->>'type' FROM products WHERE sku = 'LAPTOP-001';
-- NVMe SSD
The distinction matters. -> returns JSONB, so you can chain further operators. ->> returns text, so it is the terminal operator when you need a scalar value.
#> and #>> : Path Access
For deeply nested values, path operators accept a text array:
-- JSONB result via path
SELECT attributes #> '{storage, type}' FROM products WHERE sku = 'LAPTOP-001';
-- "NVMe SSD"
-- Text result via path
SELECT attributes #>> '{storage, type}' FROM products WHERE sku = 'LAPTOP-001';
-- NVMe SSD
@> : Containment
The containment operator checks whether the left JSONB value contains the right value. This is one of the most powerful operators because it uses GIN indexes:
-- Find all products from Lenovo
SELECT name, price FROM products
WHERE attributes @> '{"brand": "Lenovo"}';
-- Find products with specific nested values
SELECT name FROM products
WHERE attributes @> '{"storage": {"type": "NVMe SSD"}}';
-- Containment works with arrays too
SELECT name FROM products
WHERE tags @> '["ultrabook"]';
?, ?|, ?& : Key Existence
-- Does the key exist?
SELECT name FROM products WHERE attributes ? 'screen_size';
-- ThinkPad X1 Carbon Gen 11
-- Does ANY of these keys exist?
SELECT name FROM products WHERE attributes ?| array['screen_size', 'material'];
-- ThinkPad X1 Carbon Gen 11
-- Classic Oxford Button-Down
-- Do ALL of these keys exist?
SELECT name FROM products WHERE attributes ?& array['screen_size', 'ram_gb'];
-- ThinkPad X1 Carbon Gen 11
Updating Nested Values with jsonb_set and jsonb_insert
jsonb_set
jsonb_set(target, path, new_value, create_missing) updates a value at a specific path:
-- Update a nested value
UPDATE products
SET attributes = jsonb_set(attributes, '{storage, capacity_gb}', '1024')
WHERE sku = 'LAPTOP-001';
-- Add a new nested key (create_missing defaults to true)
UPDATE products
SET attributes = jsonb_set(attributes, '{storage, interface}', '"PCIe 4.0"')
WHERE sku = 'LAPTOP-001';
-- Update multiple nested values in one statement
UPDATE products
SET attributes = jsonb_set(
jsonb_set(attributes, '{ram_gb}', '32'),
'{storage, capacity_gb}', '1024'
)
WHERE sku = 'LAPTOP-001';
jsonb_insert
jsonb_insert is specifically for inserting into arrays at a given position:
-- Insert "USB-B" at position 2 in the ports array
UPDATE products
SET attributes = jsonb_insert(attributes, '{ports, 2}', '"USB-B"')
WHERE sku = 'LAPTOP-001';
-- Insert after the specified position
UPDATE products
SET attributes = jsonb_insert(attributes, '{ports, 0}', '"SD Card"', true)
WHERE sku = 'LAPTOP-001';
Removing Keys with the - Operator
-- Remove a top-level key
UPDATE products
SET attributes = attributes - 'on_sale'
WHERE sku = 'LAPTOP-001';
-- Remove a nested key using #-
UPDATE products
SET attributes = attributes #- '{storage, interface}'
WHERE sku = 'LAPTOP-001';
-- Remove multiple top-level keys
UPDATE products
SET attributes = attributes - 'on_sale' - 'weight_kg'
WHERE sku = 'LAPTOP-001';
-- Remove an array element by index
UPDATE products
SET tags = tags - 0
WHERE sku = 'LAPTOP-001';
JSONB Array Operations
Appending to Arrays
-- Append an element to a JSONB array
UPDATE products
SET tags = tags || '"clearance"'::jsonb
WHERE sku = 'SHIRT-042';
-- Append multiple elements
UPDATE products
SET tags = tags || '["sale", "featured"]'::jsonb
WHERE sku = 'SHIRT-042';
Unnesting Arrays with jsonb_array_elements
jsonb_array_elements expands a JSONB array into a set of rows. This is essential for filtering and joining on array contents:
-- List each port as its own row
SELECT name, port
FROM products,
jsonb_array_elements_text(attributes->'ports') AS port
WHERE sku = 'LAPTOP-001';
-- name | port
-- ThinkPad X1 Carbon Gen 11 | USB-C
-- ThinkPad X1 Carbon Gen 11 | USB-A
-- ThinkPad X1 Carbon Gen 11 | HDMI
-- ThinkPad X1 Carbon Gen 11 | Thunderbolt 4
Filtering by Array Contents
-- Find products that have "USB-C" in their ports array
SELECT name FROM products
WHERE attributes->'ports' @> '"USB-C"';
-- Find products where any available size is "XL"
SELECT name FROM products
WHERE attributes->'sizes_available' @> '"XL"';
Counting Array Elements
SELECT name, jsonb_array_length(tags) AS tag_count
FROM products;
Indexing JSONB
Without indexes, every JSONB query requires a sequential scan. On tables with more than a few thousand rows, you need GIN indexes.
Default GIN Index (jsonb_ops)
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
CREATE INDEX idx_products_tags ON products USING GIN (tags);
The default jsonb_ops operator class supports @>, ?, ?|, ?&, and @? operators. It indexes every key and value in the document.
EXPLAIN ANALYZE
SELECT name, price FROM products WHERE attributes @> '{"brand": "Lenovo"}';
-- Bitmap Heap Scan on products (cost=12.05..116.06 rows=50 width=44) (actual time=0.038..0.042 rows=1 loops=1)
-- Recheck Cond: (attributes @> '{"brand": "Lenovo"}'::jsonb)
-- -> Bitmap Index Scan on idx_products_attributes (cost=0.00..12.04 rows=50 width=0) (actual time=0.028..0.029 rows=1 loops=1)
-- Index Cond: (attributes @> '{"brand": "Lenovo"}'::jsonb)
-- Planning Time: 0.112 ms
-- Execution Time: 0.071 ms
jsonb_path_ops GIN Index
The jsonb_path_ops operator class creates a smaller, faster index but only supports the @> containment operator:
CREATE INDEX idx_products_attrs_pathops ON products USING GIN (attributes jsonb_path_ops);
Use jsonb_path_ops when your queries primarily use @> containment checks. The index is typically 2-3x smaller than jsonb_ops because it hashes full paths rather than indexing individual keys.
Expression Indexes on Specific Paths
When you query a specific JSONB path frequently, a B-tree expression index is often more efficient than a GIN index:
-- B-tree index on a specific extracted value
CREATE INDEX idx_products_brand ON products ((attributes->>'brand'));
-- Now this query uses the B-tree index
EXPLAIN ANALYZE
SELECT name, price FROM products WHERE attributes->>'brand' = 'Lenovo';
-- Index Scan using idx_products_brand on products (cost=0.15..8.17 rows=1 width=44) (actual time=0.021..0.023 rows=1 loops=1)
-- Index Cond: ((attributes ->> 'brand'::text) = 'Lenovo'::text)
-- Planning Time: 0.094 ms
-- Execution Time: 0.042 ms
-- Numeric expression index for range queries
CREATE INDEX idx_products_screen_size ON products (((attributes->>'screen_size')::numeric));
SELECT name FROM products WHERE (attributes->>'screen_size')::numeric >= 14;
Expression indexes are smaller and faster than GIN for point lookups and range queries on known paths. GIN indexes are better when you do not know which keys will be queried in advance.
JSONB in WHERE Clauses
Containment Queries (GIN-Indexable)
-- Exact nested match
SELECT * FROM products WHERE attributes @> '{"brand": "Lenovo", "ram_gb": 16}';
-- Array containment
SELECT * FROM products WHERE tags @> '["laptop", "business"]';
-- Nested object containment
SELECT * FROM products WHERE attributes @> '{"care": {"machine_washable": true}}';
Existence Checks
-- Products that have battery information
SELECT name FROM products WHERE attributes ? 'battery_life';
-- Products with any of these attributes
SELECT name FROM products WHERE attributes ?| array['screen_size', 'material'];
Text Extraction in WHERE (Requires Expression Index)
-- Using ->> for text comparison
SELECT * FROM products WHERE attributes->>'brand' = 'Lenovo';
-- Casting for numeric comparison
SELECT * FROM products WHERE (attributes->>'screen_size')::numeric > 13;
-- LIKE/ILIKE on extracted text
SELECT * FROM products WHERE attributes->>'material' ILIKE '%cotton%';
Aggregating JSONB
jsonb_agg
Collects values into a JSONB array:
SELECT category, jsonb_agg(name) AS product_names
FROM products
GROUP BY category;
-- category | product_names
-- electronics | ["ThinkPad X1 Carbon Gen 11"]
-- clothing | ["Classic Oxford Button-Down"]
jsonb_object_agg
Builds a JSONB object from key-value pairs:
SELECT jsonb_object_agg(sku, name) FROM products;
-- {"LAPTOP-001": "ThinkPad X1 Carbon Gen 11", "SHIRT-042": "Classic Oxford Button-Down"}
jsonb_build_object
Constructs JSONB objects in SELECT statements:
SELECT jsonb_build_object(
'product', name,
'price', price,
'brand', attributes->>'brand'
) AS product_summary
FROM products;
-- {"brand": "Lenovo", "price": 1429.99, "product": "ThinkPad X1 Carbon Gen 11"}
JSONB with JOINs
Cross-Joining Array Elements
A common pattern: expand a JSONB array into rows and join against another table.
CREATE TABLE port_specs (
port_name varchar(50) PRIMARY KEY,
max_speed varchar(50),
power_watts int
);
INSERT INTO port_specs VALUES
('USB-C', '10 Gbps', 100),
('USB-A', '5 Gbps', 4),
('HDMI', '48 Gbps', 0),
('Thunderbolt 4', '40 Gbps', 100);
-- Join JSONB array elements against a normalized table
SELECT p.name, ps.port_name, ps.max_speed, ps.power_watts
FROM products p,
jsonb_array_elements_text(p.attributes->'ports') AS port_val
JOIN port_specs ps ON ps.port_name = port_val
WHERE p.category = 'electronics';
-- name | port_name | max_speed | power_watts
-- ThinkPad X1 Carbon Gen 11 | USB-C | 10 Gbps | 100
-- ThinkPad X1 Carbon Gen 11 | USB-A | 5 Gbps | 4
-- ThinkPad X1 Carbon Gen 11 | HDMI | 48 Gbps | 0
-- ThinkPad X1 Carbon Gen 11 | Thunderbolt 4 | 40 Gbps | 100
This is powerful. You get the flexibility of JSONB arrays with the referential integrity and queryability of normalized tables.
Schema Validation with CHECK Constraints
JSONB does not enforce structure by default, but you can add CHECK constraints to ensure minimum schema requirements:
ALTER TABLE products ADD CONSTRAINT check_attributes_brand
CHECK (attributes ? 'brand');
-- Ensures that attributes always contains a "brand" key
ALTER TABLE products ADD CONSTRAINT check_tags_is_array
CHECK (jsonb_typeof(tags) = 'array');
-- Ensures tags is always a JSON array, not an object or scalar
ALTER TABLE products ADD CONSTRAINT check_price_positive
CHECK (
CASE
WHEN attributes ? 'weight_kg'
THEN (attributes->>'weight_kg')::numeric > 0
ELSE true
END
);
-- If weight_kg is present, it must be positive
These constraints fire on INSERT and UPDATE. They are not a replacement for application-level validation, but they are a safety net against bad data entering the database.
Using JSONB from Node.js with the pg Driver
Automatic Serialization
The pg driver automatically serializes JavaScript objects to JSONB and deserializes JSONB columns back to JavaScript objects. No manual JSON.stringify or JSON.parse needed:
var pg = require('pg');
var pool = new pg.Pool({
connectionString: process.env.DATABASE_URL
});
// INSERT: pass a plain JavaScript object -- pg handles serialization
function createProduct(sku, name, category, price, attributes, tags) {
var query = {
text: 'INSERT INTO products (sku, name, category, price, attributes, tags) VALUES ($1, $2, $3, $4, $5, $6) RETURNING *',
values: [sku, name, category, price, attributes, tags]
};
return pool.query(query).then(function(result) {
return result.rows[0];
});
}
// Usage
createProduct(
'MONITOR-007',
'Dell UltraSharp U2723QE',
'electronics',
619.99,
{ brand: 'Dell', screen_size: 27, resolution: '3840x2160', panel: 'IPS Black', ports: ['USB-C', 'HDMI', 'DisplayPort'] },
['monitor', '4k', 'usb-c']
).then(function(product) {
// product.attributes is already a JavaScript object
console.log(product.attributes.brand); // "Dell"
console.log(product.tags[0]); // "monitor"
});
Parameterized Queries with JSONB Operators
// Containment query -- find products by attributes
function findByAttributes(filter) {
var query = {
text: 'SELECT id, name, price, attributes FROM products WHERE attributes @> $1',
values: [filter]
};
return pool.query(query).then(function(result) {
return result.rows;
});
}
// Usage: find all Dell monitors
findByAttributes({ brand: 'Dell' }).then(function(products) {
products.forEach(function(p) {
console.log(p.name, p.price);
});
});
// Query by specific extracted path
function findByBrand(brand) {
var query = {
text: "SELECT id, name, price FROM products WHERE attributes->>'brand' = $1",
values: [brand]
};
return pool.query(query).then(function(result) {
return result.rows;
});
}
// Update a nested JSONB value
function updateAttribute(sku, path, value) {
var query = {
text: 'UPDATE products SET attributes = jsonb_set(attributes, $1, $2), updated_at = now() WHERE sku = $3 RETURNING *',
values: [path, JSON.stringify(value), sku]
};
return pool.query(query).then(function(result) {
return result.rows[0];
});
}
// Usage: update storage capacity
updateAttribute('LAPTOP-001', '{storage, capacity_gb}', 2048);
Merging Partial Updates
// Merge partial updates into existing JSONB
function mergeAttributes(sku, partialUpdate) {
var query = {
text: 'UPDATE products SET attributes = attributes || $1, updated_at = now() WHERE sku = $2 RETURNING *',
values: [partialUpdate, sku]
};
return pool.query(query).then(function(result) {
return result.rows[0];
});
}
// Usage: add sale info without touching other attributes
mergeAttributes('LAPTOP-001', { on_sale: true, discount_pct: 15 });
Complete Working Example: Product Catalog API
Here is a complete, working product catalog with a JSONB attributes column, GIN index, seed data, and a Node.js data access layer.
Schema
-- schema.sql
CREATE TABLE IF NOT EXISTS products (
id serial PRIMARY KEY,
sku varchar(50) UNIQUE NOT NULL,
name varchar(255) NOT NULL,
category varchar(100) NOT NULL,
price numeric(10,2) NOT NULL,
attributes jsonb NOT NULL DEFAULT '{}',
tags jsonb NOT NULL DEFAULT '[]',
created_at timestamptz DEFAULT now(),
updated_at timestamptz DEFAULT now()
);
-- GIN index for containment and existence queries
CREATE INDEX IF NOT EXISTS idx_products_attributes_gin
ON products USING GIN (attributes jsonb_path_ops);
CREATE INDEX IF NOT EXISTS idx_products_tags_gin
ON products USING GIN (tags);
-- Expression index for brand lookups
CREATE INDEX IF NOT EXISTS idx_products_brand
ON products ((attributes->>'brand'));
-- Expression index for screen_size range queries
CREATE INDEX IF NOT EXISTS idx_products_screen_size
ON products (((attributes->>'screen_size')::numeric))
WHERE attributes ? 'screen_size';
Seed Data
-- seed.sql
INSERT INTO products (sku, name, category, price, attributes, tags) VALUES
('LAPTOP-001', 'ThinkPad X1 Carbon Gen 11', 'electronics', 1429.99,
'{"brand": "Lenovo", "screen_size": 14, "resolution": "2560x1600", "ram_gb": 16, "storage": {"type": "NVMe SSD", "capacity_gb": 512}, "ports": ["USB-C", "USB-A", "HDMI", "Thunderbolt 4"], "weight_kg": 1.12}',
'["laptop", "ultrabook", "business"]'),
('LAPTOP-002', 'MacBook Pro 16"', 'electronics', 2499.00,
'{"brand": "Apple", "screen_size": 16.2, "resolution": "3456x2234", "ram_gb": 36, "storage": {"type": "NVMe SSD", "capacity_gb": 512}, "ports": ["Thunderbolt 4", "HDMI", "SD Card", "MagSafe"], "weight_kg": 2.14, "chip": "M3 Pro"}',
'["laptop", "creative", "premium"]'),
('MONITOR-001', 'Dell UltraSharp U2723QE', 'electronics', 619.99,
'{"brand": "Dell", "screen_size": 27, "resolution": "3840x2160", "panel": "IPS Black", "ports": ["USB-C", "HDMI", "DisplayPort"], "vesa_mount": true}',
'["monitor", "4k", "usb-c"]'),
('SHIRT-001', 'Classic Oxford Button-Down', 'clothing', 89.00,
'{"brand": "Brooks Brothers", "material": "100% cotton", "sizes_available": ["S", "M", "L", "XL"], "colors": ["white", "blue", "pink"], "care": {"machine_washable": true, "dry_clean": false}}',
'["shirt", "formal", "cotton"]'),
('CHAIR-001', 'Aeron Chair Remastered', 'furniture', 1395.00,
'{"brand": "Herman Miller", "material": "8Z Pellicle mesh", "adjustable": ["lumbar", "arms", "tilt", "height"], "weight_capacity_kg": 136, "warranty_years": 12}',
'["chair", "ergonomic", "office"]')
ON CONFLICT (sku) DO NOTHING;
Queries for Filtering by Nested Attributes
-- Find all laptops with at least 32GB RAM
SELECT name, price, attributes->>'ram_gb' AS ram
FROM products
WHERE category = 'electronics'
AND attributes @> '{"ram_gb": 36}';
-- Or for >= comparison:
SELECT name, price, (attributes->>'ram_gb')::int AS ram
FROM products
WHERE category = 'electronics'
AND (attributes->>'ram_gb')::int >= 32;
-- Find products with USB-C ports
SELECT name, price FROM products
WHERE attributes->'ports' @> '"USB-C"';
-- name | price
-- ThinkPad X1 Carbon Gen 11 | 1429.99
-- MacBook Pro 16" | 2499.00
-- Dell UltraSharp U2723QE | 619.99
-- Find products where storage type is NVMe SSD
SELECT name, attributes #>> '{storage, capacity_gb}' AS storage_gb
FROM products
WHERE attributes @> '{"storage": {"type": "NVMe SSD"}}';
-- Find adjustable furniture with lumbar support
SELECT name, price FROM products
WHERE attributes->'adjustable' @> '"lumbar"';
-- Full-text search within JSONB string values
SELECT name FROM products
WHERE attributes->>'material' ILIKE '%cotton%';
-- Aggregate: average price by brand
SELECT attributes->>'brand' AS brand, round(avg(price), 2) AS avg_price
FROM products
GROUP BY attributes->>'brand'
ORDER BY avg_price DESC;
Node.js Data Access Layer
// productModel.js
var pg = require('pg');
var pool = new pg.Pool({
connectionString: process.env.DATABASE_URL
});
var ProductModel = {
findAll: function(options) {
var conditions = ['1=1'];
var values = [];
var paramIndex = 1;
if (options.category) {
conditions.push('category = $' + paramIndex++);
values.push(options.category);
}
if (options.brand) {
conditions.push("attributes->>'brand' = $" + paramIndex++);
values.push(options.brand);
}
if (options.minPrice) {
conditions.push('price >= $' + paramIndex++);
values.push(options.minPrice);
}
if (options.maxPrice) {
conditions.push('price <= $' + paramIndex++);
values.push(options.maxPrice);
}
if (options.attributes) {
conditions.push('attributes @> $' + paramIndex++);
values.push(options.attributes);
}
if (options.tag) {
conditions.push('tags @> $' + paramIndex++);
values.push(JSON.stringify([options.tag]));
}
var query = {
text: 'SELECT id, sku, name, category, price, attributes, tags FROM products WHERE ' + conditions.join(' AND ') + ' ORDER BY created_at DESC',
values: values
};
return pool.query(query).then(function(result) {
return result.rows;
});
},
findBySku: function(sku) {
var query = {
text: 'SELECT * FROM products WHERE sku = $1',
values: [sku]
};
return pool.query(query).then(function(result) {
return result.rows[0] || null;
});
},
create: function(product) {
var query = {
text: 'INSERT INTO products (sku, name, category, price, attributes, tags) VALUES ($1, $2, $3, $4, $5, $6) RETURNING *',
values: [product.sku, product.name, product.category, product.price, product.attributes || {}, product.tags || []]
};
return pool.query(query).then(function(result) {
return result.rows[0];
});
},
updateAttributes: function(sku, partialAttributes) {
var query = {
text: 'UPDATE products SET attributes = attributes || $1, updated_at = now() WHERE sku = $2 RETURNING *',
values: [partialAttributes, sku]
};
return pool.query(query).then(function(result) {
return result.rows[0];
});
},
removeAttribute: function(sku, key) {
var query = {
text: 'UPDATE products SET attributes = attributes - $1, updated_at = now() WHERE sku = $2 RETURNING *',
values: [key, sku]
};
return pool.query(query).then(function(result) {
return result.rows[0];
});
},
addTag: function(sku, tag) {
var query = {
text: "UPDATE products SET tags = tags || $1::jsonb, updated_at = now() WHERE sku = $2 AND NOT tags @> $1::jsonb RETURNING *",
values: [JSON.stringify(tag), sku]
};
return pool.query(query).then(function(result) {
return result.rows[0];
});
},
findByPortType: function(portType) {
var query = {
text: "SELECT name, price, attributes->'ports' AS ports FROM products WHERE attributes->'ports' @> $1::jsonb",
values: [JSON.stringify(portType)]
};
return pool.query(query).then(function(result) {
return result.rows;
});
},
getAttributeStats: function(category) {
var query = {
text: "SELECT attributes->>'brand' AS brand, count(*) AS product_count, round(avg(price), 2) AS avg_price, min(price) AS min_price, max(price) AS max_price FROM products WHERE category = $1 GROUP BY attributes->>'brand' ORDER BY product_count DESC",
values: [category]
};
return pool.query(query).then(function(result) {
return result.rows;
});
}
};
module.exports = ProductModel;
Usage in an Express Route
// routes/products.js
var express = require('express');
var router = express.Router();
var ProductModel = require('../models/productModel');
router.get('/api/products', function(req, res) {
var options = {
category: req.query.category || null,
brand: req.query.brand || null,
tag: req.query.tag || null,
minPrice: req.query.min_price ? parseFloat(req.query.min_price) : null,
maxPrice: req.query.max_price ? parseFloat(req.query.max_price) : null
};
// Support arbitrary JSONB attribute filtering
if (req.query.attributes) {
try {
options.attributes = JSON.parse(req.query.attributes);
} catch (e) {
return res.status(400).json({ error: 'Invalid attributes JSON' });
}
}
ProductModel.findAll(options).then(function(products) {
res.json({ count: products.length, products: products });
}).catch(function(err) {
console.error('Product query failed:', err.message);
res.status(500).json({ error: 'Internal server error' });
});
});
// GET /api/products?category=electronics&brand=Lenovo
// GET /api/products?tag=laptop&min_price=1000
// GET /api/products?attributes={"storage":{"type":"NVMe SSD"}}
module.exports = router;
Performance: JSONB Query vs Normalized Table
I ran a benchmark on a table with 500,000 product rows to compare JSONB containment queries against equivalent normalized lookups.
-- JSONB approach: single table with GIN index
EXPLAIN ANALYZE
SELECT name, price FROM products WHERE attributes @> '{"brand": "Lenovo"}';
-- Bitmap Heap Scan on products (cost=44.08..5765.82 rows=5000 width=44) (actual time=1.245..8.731 rows=4821 loops=1)
-- Recheck Cond: (attributes @> '{"brand": "Lenovo"}'::jsonb)
-- Heap Blocks: exact=3218
-- -> Bitmap Index Scan on idx_products_attrs_pathops (cost=0.00..42.83 rows=5000 width=0) (actual time=0.812..0.812 rows=4821 loops=1)
-- Planning Time: 0.089 ms
-- Execution Time: 9.142 ms
-- Normalized approach: separate brand column with B-tree index
EXPLAIN ANALYZE
SELECT name, price FROM products_normalized WHERE brand = 'Lenovo';
-- Index Scan using idx_products_norm_brand on products_normalized (cost=0.42..234.61 rows=4821 width=44) (actual time=0.024..3.847 rows=4821 loops=1)
-- Planning Time: 0.071 ms
-- Execution Time: 4.221 ms
The normalized query is roughly 2x faster for point lookups. That is expected -- B-tree indexes are purpose-built for equality and range scans. But 9ms vs 4ms is not a meaningful difference for most applications. The JSONB approach wins when you need flexibility: you can filter on brand, material, screen_size, or any other attribute without schema migrations.
For high-traffic paths where you consistently filter on a specific key, add an expression index on that key. You get B-tree performance on that one path while keeping JSONB flexibility for everything else.
Common Issues and Troubleshooting
1. Operator Does Not Exist for JSON Type
ERROR: operator does not exist: json @> json
LINE 1: SELECT * FROM products WHERE data @> '{"brand": "Lenovo"}';
HINT: No operator matches the given name and argument types.
Cause: You created the column as json instead of jsonb. The @>, ?, ?|, and ?& operators only work with JSONB.
Fix: Alter the column type:
ALTER TABLE products ALTER COLUMN data TYPE jsonb USING data::jsonb;
2. Cannot Subscript Type jsonb with Text
ERROR: cannot subscript type jsonb because it does not support subscripting
Cause: You are using bracket notation (attributes['brand']) which is only supported in PostgreSQL 14+. Older versions require the -> operator.
Fix: Use -> or ->> operators, or upgrade PostgreSQL:
-- Instead of: attributes['brand']
-- Use: attributes->'brand' (returns jsonb)
-- Or: attributes->>'brand' (returns text)
3. GIN Index Not Being Used
EXPLAIN ANALYZE
SELECT * FROM products WHERE attributes->>'brand' = 'Lenovo';
-- Seq Scan on products (cost=0.00..28432.00 rows=5000 width=312) (actual time=0.031..145.812 rows=4821 loops=1)
-- Filter: ((attributes ->> 'brand'::text) = 'Lenovo'::text)
Cause: The ->> extraction operator does not use GIN indexes. GIN indexes support @>, ?, ?|, and ?& operators.
Fix: Either rewrite the query to use containment, or create an expression index:
-- Option A: Use containment (uses GIN index)
SELECT * FROM products WHERE attributes @> '{"brand": "Lenovo"}';
-- Option B: Create expression index for ->> queries
CREATE INDEX idx_brand ON products ((attributes->>'brand'));
SELECT * FROM products WHERE attributes->>'brand' = 'Lenovo';
4. jsonb_set Returns NULL When Path Does Not Exist
SELECT jsonb_set('{"a": 1}', '{b, c}', '"hello"');
-- NULL (if create_missing is false) or error for intermediate paths
Cause: jsonb_set with create_missing = true (the default) only creates the final key. It will not create intermediate objects in the path. If {b} does not exist as an object, PostgreSQL cannot navigate into it to set {b, c}.
Fix: Build intermediate structure first:
-- First ensure the intermediate path exists
UPDATE products
SET attributes = jsonb_set(
CASE
WHEN attributes ? 'specs' THEN attributes
ELSE jsonb_set(attributes, '{specs}', '{}')
END,
'{specs, weight}',
'"1.2kg"'
)
WHERE sku = 'LAPTOP-001';
5. Casting Errors on Missing or NULL JSONB Values
ERROR: invalid input syntax for type numeric: ""
Cause: Casting attributes->>'field' to a numeric type when the field does not exist (returns NULL) or is empty string.
Fix: Add a NULL check or use COALESCE:
-- Safe numeric extraction
SELECT name, COALESCE((attributes->>'screen_size')::numeric, 0) AS screen_size
FROM products;
-- Filter with existence check first
SELECT name FROM products
WHERE attributes ? 'screen_size'
AND (attributes->>'screen_size')::numeric > 15;
Best Practices
Default to JSONB, not JSON. There is almost no reason to use the
jsontype. JSONB gives you indexing, operators, and better read performance. The only exception is when you must preserve exact formatting of the original document.Keep normalized columns for your primary query paths. If you query by
categoryin every request, that should be a real column with a real index, not buried inside JSONB. Use JSONB for the flexible, variable parts of your data model.Always create GIN indexes on JSONB columns you query. An unindexed JSONB containment query on 100k rows takes 50-100ms. With a GIN index, it takes 1-5ms. Use
jsonb_path_opswhen you only need@>containment queries -- the index is significantly smaller.Add expression indexes for hot paths. If your most common query filters on
attributes->>'brand', create a B-tree expression index on that path. You get normalized-column performance for that specific query while keeping JSONB flexibility for everything else.Use
||for shallow merges,jsonb_setfor deep updates. The||concatenation operator is clean for updating top-level keys. For nested paths, usejsonb_set. Never replace the entire JSONB column when you only need to change one field -- concurrent updates will lose data.Validate JSONB at the application layer and add CHECK constraints as a safety net. JSONB accepts any valid JSON, which means typos in key names and wrong value types slip through silently. Use CHECK constraints for critical invariants (required keys, value types), and validate structure in your application code before insert.
Use parameterized queries for JSONB values in Node.js. The
pgdriver handles serialization automatically. Pass JavaScript objects directly as query parameters -- do not manuallyJSON.stringifyunless you are constructing a path forjsonb_set. This prevents SQL injection and ensures correct escaping.Avoid deeply nested JSONB structures (3+ levels). Queries and updates on deeply nested paths become hard to read and easy to get wrong. If you find yourself writing
attributes->'specs'->'display'->'resolution'->'horizontal', that is a sign to flatten the structure or normalize those fields.Monitor JSONB column sizes. JSONB documents are stored inline in the row up to about 2KB, then TOASTed (compressed and stored out-of-line). Very large JSONB values (100KB+) can cause performance issues because each row access may require decompression. If your JSONB values grow unbounded, consider splitting them into a separate table.
