Hybrid Search: Combining Full-Text and Vector Search

Build hybrid search combining PostgreSQL full-text and pgvector semantic search with Reciprocal Rank Fusion in Node.js.

Hybrid search merges traditional keyword-based full-text search with vector-based semantic search to deliver results that are both precisely relevant and conceptually aware. Neither approach alone is sufficient for production search systems. Full-text search excels at exact term matching but misses synonyms and intent, while vector search captures meaning but can return results that lack the specific terms users expect. Combining them with a principled fusion strategy like Reciprocal Rank Fusion produces search quality that neither approach achieves independently.

Prerequisites

  • Node.js v18+ installed
  • PostgreSQL 14+ with the pgvector extension enabled
  • A working understanding of PostgreSQL full-text search concepts
  • Familiarity with embeddings and vector similarity (covered in earlier articles in this series)
  • An OpenAI API key or another embeddings provider
  • The pg and openai npm packages installed

Why Hybrid Search Beats Pure Approaches

I have built search systems that relied entirely on Elasticsearch full-text search and others that went all-in on vector similarity. Both approaches have failure modes that frustrate users.

Pure keyword search fails when a user searches for "how to fix memory leaks in node" but your best article is titled "Debugging V8 Heap Exhaustion in Server-Side JavaScript." There is zero term overlap, so the article never surfaces. Full-text search is also brittle with abbreviations, misspellings, and domain-specific jargon.

Pure vector search fails in the opposite direction. A user searches for "Express.js middleware error handling" and gets results about Koa error handling, Fastify error hooks, and general Node.js error patterns. Semantically these are close, but the user specifically wants Express.js. Vector search lacks the precision to enforce exact term presence.

Hybrid search solves both problems. The keyword component ensures that documents containing the user's exact terms get a ranking boost, while the semantic component surfaces conceptually relevant documents that the keyword engine would miss entirely. In every system where I have deployed this pattern, search satisfaction metrics improved by 20-40% compared to either approach alone.

PostgreSQL Full-Text Search Fundamentals

PostgreSQL ships with a powerful full-text search engine that most developers underutilize. The core primitives are tsvector (a sorted list of lexemes derived from a document) and tsquery (a structured search query).

Creating a Search-Ready Table

CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT NOT NULL,
    category TEXT,
    created_at TIMESTAMP DEFAULT NOW(),
    search_vector tsvector GENERATED ALWAYS AS (
        setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
        setweight(to_tsvector('english', coalesce(body, '')), 'B')
    ) STORED,
    embedding vector(1536)
);

CREATE INDEX idx_documents_search ON documents USING GIN (search_vector);
CREATE INDEX idx_documents_embedding ON documents USING ivfflat (embedding vector_cosine_ops)
    WITH (lists = 100);

The setweight function assigns importance levels A through D to different fields. Title matches weighted as 'A' will score higher than body matches weighted as 'B'. The GENERATED ALWAYS AS ... STORED column automatically maintains the search vector whenever the title or body changes, eliminating the need for manual triggers.
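Because search_vector is a stored generated column, inserts only need to supply the source text and the embedding. A minimal sketch from Node.js, assuming a pg pool and an embedding array from your provider:

// Sketch: insert a document. search_vector is generated automatically from
// title and body; only the embedding has to be supplied explicitly.
function insertDocument(pool, doc, embedding) {
    return pool.query(
        'INSERT INTO documents (title, body, category, embedding) ' +
        'VALUES ($1, $2, $3, $4::vector) RETURNING id',
        [doc.title, doc.body, doc.category, JSON.stringify(embedding)]
    ).then(function(result) {
        return result.rows[0].id;
    });
}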

Understanding ts_rank

PostgreSQL's ts_rank function scores how well a document matches a query. It considers term frequency, proximity, and the weights assigned to different document sections.

SELECT id, title,
    ts_rank(search_vector, plainto_tsquery('english', 'memory management')) AS rank
FROM documents
WHERE search_vector @@ plainto_tsquery('english', 'memory management')
ORDER BY rank DESC
LIMIT 20;

The @@ operator filters to only matching documents, and ts_rank provides a numeric score for ordering. This is the keyword half of our hybrid system.

pgvector Semantic Search Recap

With the pgvector extension, PostgreSQL stores and queries high-dimensional vectors natively. You generate an embedding for the user's query, then find documents whose embeddings are closest in vector space.

SELECT id, title,
    1 - (embedding <=> $1::vector) AS similarity
FROM documents
ORDER BY embedding <=> $1::vector
LIMIT 20;

The <=> operator computes cosine distance. Subtracting from 1 converts it to a similarity score where higher is better. This is the semantic half of our hybrid system.

Reciprocal Rank Fusion (RRF)

Reciprocal Rank Fusion is a rank aggregation method that combines result lists from multiple retrieval systems without requiring score normalization. This is critical because ts_rank scores and cosine similarity scores are on completely different scales. Trying to normalize and combine them directly is fragile and error-prone.

RRF works on a simple principle: a document's fused score is the sum of 1 / (k + rank) over all retrieval systems, where rank is the document's position in each system's result list and k is a constant (typically 60) that dampens the outsized influence of top-ranked positions.

The formula:

RRF_score(d) = sum( 1 / (k + rank_i(d)) ) for each retrieval system i

If a document ranks 3rd in keyword search and 7th in vector search with k=60:

RRF_score = 1/(60+3) + 1/(60+7) = 0.01587 + 0.01493 = 0.03080

Documents appearing in both result lists naturally score higher than documents appearing in only one, which is exactly the behavior we want.
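To make the mechanics concrete, here is a minimal sketch of RRF fusion in plain JavaScript, fusing two ranked lists of document IDs in memory. It is illustrative only; the production path later in this article performs the same fusion inside PostgreSQL.

// Minimal RRF sketch: fuse two ranked lists of document IDs.
// keywordIds and vectorIds are arrays ordered best-first; k defaults to 60.
function reciprocalRankFusion(keywordIds, vectorIds, k) {
    k = k || 60;
    var scores = {};

    function accumulate(ids) {
        ids.forEach(function(id, index) {
            var rank = index + 1; // ranks are 1-based
            scores[id] = (scores[id] || 0) + 1 / (k + rank);
        });
    }

    accumulate(keywordIds);
    accumulate(vectorIds);

    return Object.keys(scores)
        .map(function(id) { return { id: id, score: scores[id] }; })
        .sort(function(a, b) { return b.score - a.score; });
}

// A document ranked 3rd by keywords and 7th by vectors scores
// 1/63 + 1/67 ≈ 0.03080, matching the worked example above.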

Implementing Hybrid Search in a Single PostgreSQL Query

The most efficient approach runs both searches and fuses results in a single database round trip using Common Table Expressions (CTEs).

WITH keyword_results AS (
    SELECT id, title, body, category, created_at,
        ROW_NUMBER() OVER (ORDER BY ts_rank(search_vector, query) DESC) AS kw_rank
    FROM documents, plainto_tsquery('english', $1) AS query
    WHERE search_vector @@ query
    ORDER BY kw_rank
    LIMIT 50
),
vector_results AS (
    SELECT id, title, body, category, created_at,
        ROW_NUMBER() OVER (ORDER BY embedding <=> $2::vector) AS vec_rank
    FROM documents
    ORDER BY embedding <=> $2::vector
    LIMIT 50
),
fused AS (
    SELECT
        COALESCE(k.id, v.id) AS id,
        COALESCE(k.title, v.title) AS title,
        COALESCE(k.body, v.body) AS body,
        COALESCE(k.category, v.category) AS category,
        COALESCE(k.created_at, v.created_at) AS created_at,
        COALESCE(1.0 / (60 + k.kw_rank), 0) AS kw_score,
        COALESCE(1.0 / (60 + v.vec_rank), 0) AS vec_score,
        COALESCE(1.0 / (60 + k.kw_rank), 0) * $3 +
        COALESCE(1.0 / (60 + v.vec_rank), 0) * $4 AS hybrid_score
    FROM keyword_results k
    FULL OUTER JOIN vector_results v ON k.id = v.id
)
SELECT id, title, body, category, created_at,
    kw_score, vec_score, hybrid_score
FROM fused
ORDER BY hybrid_score DESC
LIMIT $5;

Parameters $3 and $4 are the keyword and semantic weights respectively. Setting them to 1.0 and 1.0 gives equal weight. Setting $3 to 1.5 and $4 to 1.0 biases toward keyword matches. The FULL OUTER JOIN ensures documents appearing in only one result set are still included.
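Assuming the CTE query above is stored in a string constant such as HYBRID_QUERY_SQL (referenced again in the edge-case handling below), executing it from Node.js looks roughly like this sketch; pool and queryEmbedding come from your own setup:

// Sketch: run the fused CTE query with pg. HYBRID_QUERY_SQL is assumed to
// hold the SQL above; the embedding is serialized into pgvector's '[...]' form.
function runHybridQuery(pool, searchText, queryEmbedding) {
    var params = [
        searchText,                      // $1: full-text query
        JSON.stringify(queryEmbedding),  // $2: query embedding as a vector literal
        1.0,                             // $3: keyword weight
        1.0,                             // $4: semantic weight
        20                               // $5: result limit
    ];
    return pool.query(HYBRID_QUERY_SQL, params).then(function(result) {
        return result.rows;
    });
}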

Weighted Scoring: Tuning Keyword vs Semantic Weight

The optimal weight ratio depends on your content and users. In my experience, the following guidelines work as a starting point:

  • Technical documentation: keyword weight 1.5, semantic weight 1.0. Users often search for exact function names, error codes, and configuration keys.
  • Blog posts and tutorials: keyword weight 1.0, semantic weight 1.3. Users describe problems in natural language that may not match article terminology.
  • Product catalogs: keyword weight 1.2, semantic weight 1.0. Product names and model numbers matter, but "comfortable office chair" should also work.
  • Support tickets: keyword weight 0.8, semantic weight 1.5. Users describe symptoms in varied language, so semantic matching dominates.

These weights should be validated through measurement, which we cover in the search quality section below.
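One convenient way to apply these starting points is a small preset map keyed by content type. The preset names below are hypothetical; the numbers are simply the starting values from the list above.

// Hypothetical weight presets derived from the guidelines above.
var WEIGHT_PRESETS = {
    docs:    { keywordWeight: 1.5, semanticWeight: 1.0 },
    blog:    { keywordWeight: 1.0, semanticWeight: 1.3 },
    catalog: { keywordWeight: 1.2, semanticWeight: 1.0 },
    support: { keywordWeight: 0.8, semanticWeight: 1.5 }
};

function weightsFor(contentType) {
    return WEIGHT_PRESETS[contentType] || { keywordWeight: 1.0, semanticWeight: 1.0 };
}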

Handling Edge Cases

No Keyword Matches

When no documents match the keyword query, the keyword_results CTE returns zero rows. The FULL OUTER JOIN ensures vector results still appear, with kw_score defaulting to 0 via COALESCE. This graceful degradation means the system falls back to pure semantic search automatically.

No Vector Matches

This case is rare because vector search always returns results ordered by distance. However, if the embedding service fails and you cannot generate a query embedding, you need a fallback.

function buildHybridQuery(searchText, queryEmbedding, options) {
    var keywordWeight = options.keywordWeight || 1.0;
    var semanticWeight = options.semanticWeight || 1.0;
    var limit = options.limit || 20;

    if (!queryEmbedding) {
        // Fall back to keyword-only search
        return {
            text: 'SELECT id, title, body, category, created_at, ' +
                  'ts_rank(search_vector, plainto_tsquery(\'english\', $1)) AS score ' +
                  'FROM documents WHERE search_vector @@ plainto_tsquery(\'english\', $1) ' +
                  'ORDER BY score DESC LIMIT $2',
            values: [searchText, limit]
        };
    }

    if (!searchText || searchText.trim() === '') {
        // Fall back to vector-only search
        return {
            text: 'SELECT id, title, body, category, created_at, ' +
                  '1 - (embedding <=> $1::vector) AS score ' +
                  'FROM documents ORDER BY embedding <=> $1::vector LIMIT $2',
            values: [JSON.stringify(queryEmbedding), limit]
        };
    }

    // Full hybrid query
    return {
        text: HYBRID_QUERY_SQL,
        values: [searchText, JSON.stringify(queryEmbedding),
                 keywordWeight, semanticWeight, limit]
    };
}

Empty or Very Short Queries

Single-character or empty queries should be rejected before hitting the database. Queries of one or two characters rarely produce meaningful full-text search results and put unnecessary load on the GIN index.
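A minimal guard in front of the database call is enough; the complete example later enforces the same check in the route handler.

// Reject queries that are too short to be useful before touching the database.
var MIN_QUERY_LENGTH = 2;

function isSearchableQuery(rawQuery) {
    if (!rawQuery) return false;
    return rawQuery.trim().length >= MIN_QUERY_LENGTH;
}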

Query Preprocessing

Effective hybrid search requires preprocessing the query differently for each search pathway.

Keyword Path: Expanding and Cleaning

var ABBREVIATIONS = {
    'js': 'javascript',
    'ts': 'typescript',
    'db': 'database',
    'api': 'application programming interface',
    'k8s': 'kubernetes',
    'pg': 'postgresql',
    'auth': 'authentication',
    'env': 'environment'
};

function preprocessKeywordQuery(rawQuery) {
    var query = rawQuery.toLowerCase().trim();

    // Expand known abbreviations while keeping the original
    var words = query.split(/\s+/);
    var expanded = words.map(function(word) {
        if (ABBREVIATIONS[word]) {
            return word + ' ' + ABBREVIATIONS[word];
        }
        return word;
    });

    return expanded.join(' ');
}

For the keyword path, expanding abbreviations helps PostgreSQL's text search find more matches. The plainto_tsquery function handles basic stemming, but it does not know that "js" means "javascript."

Semantic Path: Embedding the Query

var OpenAI = require('openai');
var openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

function getQueryEmbedding(query) {
    return openai.embeddings.create({
        model: 'text-embedding-3-small',
        input: query
    }).then(function(response) {
        return response.data[0].embedding;
    });
}

For the semantic path, the raw query is usually fine because embedding models handle abbreviations, typos, and natural language well. However, very short queries (one or two words) produce better embeddings if you expand them with a brief context prompt like "Search query about software engineering: " prepended to the query text.
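A sketch of that expansion, where both the two-word threshold and the prompt text are arbitrary starting points to tune for your domain:

// Prepend a short context prompt to very short queries before embedding.
// The threshold (two words) and prompt text are arbitrary starting points.
var SHORT_QUERY_CONTEXT = 'Search query about software engineering: ';

function prepareEmbeddingInput(rawQuery) {
    var query = rawQuery.trim();
    var wordCount = query.split(/\s+/).length;
    if (wordCount <= 2) {
        return SHORT_QUERY_CONTEXT + query;
    }
    return query;
}

// Usage: getQueryEmbedding(prepareEmbeddingInput(userQuery))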

Faceted Hybrid Search

Real search systems need filtering. Users want to search within a specific category, date range, or content type. Adding facets to hybrid search requires applying the same filters to both CTEs.

WITH keyword_results AS (
    SELECT id, title, body, category, created_at,
        ROW_NUMBER() OVER (ORDER BY ts_rank(search_vector, query) DESC) AS kw_rank
    FROM documents, plainto_tsquery('english', $1) AS query
    WHERE search_vector @@ query
        AND ($6::text IS NULL OR category = $6)
        AND ($7::timestamp IS NULL OR created_at >= $7)
    ORDER BY kw_rank
    LIMIT 50
),
vector_results AS (
    SELECT id, title, body, category, created_at,
        ROW_NUMBER() OVER (ORDER BY embedding <=> $2::vector) AS vec_rank
    FROM documents
    WHERE ($6::text IS NULL OR category = $6)
        AND ($7::timestamp IS NULL OR created_at >= $7)
    ORDER BY embedding <=> $2::vector
    LIMIT 50
),
-- ... rest of the fusion query remains the same

The $6::text IS NULL OR category = $6 pattern lets you optionally filter by category. When $6 is NULL, the condition is always true and the filter is effectively disabled.
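From Node.js, disabling a facet is just a matter of passing null for that parameter. A rough sketch, assuming FACETED_HYBRID_SQL holds the faceted query above with $6 as the category and $7 as the earliest created_at:

// Sketch: pass null to disable a facet. FACETED_HYBRID_SQL is assumed to hold
// the faceted CTE query above.
function facetedHybridSearch(pool, searchText, embedding, facets) {
    facets = facets || {};
    return pool.query(FACETED_HYBRID_SQL, [
        searchText,
        JSON.stringify(embedding),
        1.0,                          // $3: keyword weight
        1.0,                          // $4: semantic weight
        20,                           // $5: result limit
        facets.category || null,      // $6: category filter (null = disabled)
        facets.since || null          // $7: created_at cutoff (null = disabled)
    ]);
}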

Boosting Recent Results

For content that has a temporal relevance component, you can add a recency boost to the hybrid score. This prevents stale content from dominating results simply because it has accumulated more keyword matches or happened to embed well.

-- In the fused CTE, add a recency factor
COALESCE(1.0 / (60 + k.kw_rank), 0) * $3 +
COALESCE(1.0 / (60 + v.vec_rank), 0) * $4 +
(1.0 / (1 + EXTRACT(EPOCH FROM NOW() - COALESCE(k.created_at, v.created_at)) / 86400.0)) * $5
    AS hybrid_score

Parameter $5 controls the recency weight. A value of 0.1 gives a gentle recency nudge, while 1.0 aggressively favors new content. The formula produces a factor that decays from 1.0 toward zero as content ages, measured in days: a document published seven days ago contributes 1/(1 + 7) = 0.125 times the recency weight.

A/B Testing Search Strategies

Before committing to hybrid search, run a controlled experiment comparing three variants.

function getSearchVariant(userId) {
    // Deterministic assignment based on user ID
    var hash = 0;
    for (var i = 0; i < userId.length; i++) {
        hash = ((hash << 5) - hash) + userId.charCodeAt(i);
        hash = hash & hash; // Convert to 32-bit integer
    }
    var bucket = Math.abs(hash) % 3;

    if (bucket === 0) return 'keyword';
    if (bucket === 1) return 'semantic';
    return 'hybrid';
}

function executeSearch(userId, query, queryEmbedding) {
    var variant = getSearchVariant(userId);

    var searchPromise;
    if (variant === 'keyword') {
        searchPromise = keywordSearch(query);
    } else if (variant === 'semantic') {
        searchPromise = semanticSearch(queryEmbedding);
    } else {
        searchPromise = hybridSearch(query, queryEmbedding);
    }

    return searchPromise.then(function(results) {
        logSearchEvent(userId, variant, query, results);
        return { results: results, variant: variant };
    });
}

function logSearchEvent(userId, variant, query, results) {
    var event = {
        userId: userId,
        variant: variant,
        query: query,
        resultCount: results.length,
        topResultIds: results.slice(0, 5).map(function(r) { return r.id; }),
        timestamp: new Date().toISOString()
    };

    // Log to your analytics system
    pool.query(
        'INSERT INTO search_events (user_id, variant, query, result_count, top_results, created_at) ' +
        'VALUES ($1, $2, $3, $4, $5, $6)',
        [event.userId, event.variant, event.query,
         event.resultCount, JSON.stringify(event.topResultIds), event.timestamp]
    );
}

Track click-through rates and time-to-click for each variant. After collecting a statistically significant sample (typically 1000+ searches per variant), compare the metrics. In my experience, hybrid consistently wins, but the margin varies by domain.

Measuring Search Quality

Quantitative search quality measurement requires two things: a set of test queries with known relevant documents (relevance judgments) and metrics that capture ranking quality.

Normalized Discounted Cumulative Gain (nDCG)

nDCG measures how well your system ranks relevant documents at the top of results. It accounts for graded relevance (some documents are more relevant than others) and applies a logarithmic discount to positions further down the results list.

function calculateDCG(relevanceScores) {
    var dcg = 0;
    for (var i = 0; i < relevanceScores.length; i++) {
        dcg += relevanceScores[i] / Math.log2(i + 2);
    }
    return dcg;
}

function calculateNDCG(resultIds, relevanceJudgments) {
    // Get relevance scores in result order
    var relevanceScores = resultIds.map(function(id) {
        return relevanceJudgments[id] || 0;
    });

    var dcg = calculateDCG(relevanceScores);

    // Calculate ideal DCG (best possible ranking)
    var idealScores = Object.values(relevanceJudgments)
        .sort(function(a, b) { return b - a; })
        .slice(0, resultIds.length);
    var idcg = calculateDCG(idealScores);

    if (idcg === 0) return 0;
    return dcg / idcg;
}

Mean Reciprocal Rank (MRR)

MRR is simpler. It measures the reciprocal of the rank position of the first relevant result. If the first relevant document is at position 3, the reciprocal rank is 1/3.

function calculateMRR(queries, searchFunction) {
    var reciprocalRanks = [];

    return Promise.all(queries.map(function(q) {
        return searchFunction(q.query).then(function(results) {
            for (var i = 0; i < results.length; i++) {
                if (q.relevantIds.indexOf(results[i].id) !== -1) {
                    reciprocalRanks.push(1.0 / (i + 1));
                    return;
                }
            }
            reciprocalRanks.push(0);
        });
    })).then(function() {
        var sum = reciprocalRanks.reduce(function(a, b) { return a + b; }, 0);
        return sum / reciprocalRanks.length;
    });
}

Building Relevance Judgments

Create a test set of 50-100 queries with manually labeled relevant documents. Store them in a table:

CREATE TABLE relevance_judgments (
    query TEXT NOT NULL,
    document_id INTEGER REFERENCES documents(id),
    relevance INTEGER NOT NULL CHECK (relevance BETWEEN 0 AND 3),
    PRIMARY KEY (query, document_id)
);
-- relevance: 0 = irrelevant, 1 = marginally relevant, 2 = relevant, 3 = highly relevant

Run your evaluation suite whenever you change weights, add preprocessing steps, or modify the fusion strategy.
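The judgments table maps naturally onto the structures the metric functions above expect. A sketch, assuming a pg pool:

// Sketch: load relevance judgments for one query into the shape used by
// calculateNDCG and calculateMRR above.
function loadJudgments(pool, queryText) {
    return pool.query(
        'SELECT document_id, relevance FROM relevance_judgments WHERE query = $1',
        [queryText]
    ).then(function(result) {
        var judgments = {};   // { documentId: gradedRelevance }
        var relevantIds = []; // documents with relevance >= 1, for MRR
        result.rows.forEach(function(row) {
            judgments[row.document_id] = row.relevance;
            if (row.relevance > 0) relevantIds.push(row.document_id);
        });
        return { query: queryText, judgments: judgments, relevantIds: relevantIds };
    });
}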

Building a Search Quality Feedback Loop

Implicit feedback from user behavior is more scalable than manual relevance judgments. Track which results users click, how long they spend on the page, and whether they return to search for the same query.

function trackResultClick(searchId, documentId, position) {
    return pool.query(
        'INSERT INTO search_clicks (search_event_id, document_id, position, clicked_at) ' +
        'VALUES ($1, $2, $3, NOW())',
        [searchId, documentId, position]
    );
}

function analyzeSearchQuality(daysBack) {
    var query = 'SELECT se.variant, ' +
        'COUNT(DISTINCT se.id) AS total_searches, ' +
        'COUNT(DISTINCT sc.search_event_id) AS searches_with_clicks, ' +
        'ROUND(COUNT(DISTINCT sc.search_event_id)::numeric / COUNT(DISTINCT se.id) * 100, 2) AS ctr, ' +
        'ROUND(AVG(sc.position)::numeric, 2) AS avg_click_position ' +
        'FROM search_events se ' +
        'LEFT JOIN search_clicks sc ON se.id = sc.search_event_id ' +
        'WHERE se.created_at > NOW() - make_interval(days => $1::int) ' +
        'GROUP BY se.variant';

    return pool.query(query, [daysBack]);
}

A lower average click position means users are finding what they need at the top of results. A higher click-through rate means users are finding results worth clicking. Both metrics should improve with hybrid search compared to either pure approach.

Performance Optimization for Hybrid Queries

Hybrid queries are inherently more expensive than single-mode queries because they run two retrievals. Here are the techniques I use to keep response times under 100ms.

Limit the CTE Window

Retrieve only 50 candidates from each pathway instead of scanning the entire table. The RRF fusion only needs the top-N from each source to produce good results. Going beyond 50 provides diminishing returns.

Use Partial Indexes

If most queries filter by category, create partial GIN indexes:

CREATE INDEX idx_docs_search_ai ON documents USING GIN (search_vector)
    WHERE category = 'ai-integration';
CREATE INDEX idx_docs_search_backend ON documents USING GIN (search_vector)
    WHERE category = 'backend';

Cache Embeddings

Query embedding generation adds 100-300ms of latency. Cache embeddings for repeated queries using a simple LRU cache:

var LRU = require('lru-cache');
var embeddingCache = new LRU({ max: 1000, ttl: 1000 * 60 * 60 });

function getCachedEmbedding(query) {
    var normalizedQuery = query.toLowerCase().trim();
    var cached = embeddingCache.get(normalizedQuery);
    if (cached) {
        return Promise.resolve(cached);
    }

    return getQueryEmbedding(normalizedQuery).then(function(embedding) {
        embeddingCache.set(normalizedQuery, embedding);
        return embedding;
    });
}

Use IVFFlat Index Parameters

Tune the probes parameter at query time for the vector search. More probes means better recall but slower queries.

SET ivfflat.probes = 10;  -- Default is 1, 10 is a good balance

For production, set this in your connection setup so every query benefits:

pool.on('connect', function(client) {
    client.query('SET ivfflat.probes = 10');
});

Complete Working Example

Here is a complete Node.js module that implements hybrid search with all the patterns discussed above.

var pg = require('pg');
var OpenAI = require('openai');
var LRU = require('lru-cache');

var pool = new pg.Pool({
    connectionString: process.env.DATABASE_URL
});

var openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

var embeddingCache = new LRU({ max: 1000, ttl: 1000 * 60 * 60 });

var HYBRID_SQL = [
    'WITH keyword_results AS (',
    '    SELECT id, title, body, category, created_at,',
    '        ROW_NUMBER() OVER (ORDER BY ts_rank(search_vector, query) DESC) AS kw_rank',
    '    FROM documents, plainto_tsquery(\'english\', $1) AS query',
    '    WHERE search_vector @@ query',
    '        AND ($6::text IS NULL OR category = $6)',
    '    ORDER BY kw_rank',
    '    LIMIT 50',
    '),',
    'vector_results AS (',
    '    SELECT id, title, body, category, created_at,',
    '        ROW_NUMBER() OVER (ORDER BY embedding <=> $2::vector) AS vec_rank',
    '    FROM documents',
    '    WHERE ($6::text IS NULL OR category = $6)',
    '    ORDER BY embedding <=> $2::vector',
    '    LIMIT 50',
    '),',
    'fused AS (',
    '    SELECT',
    '        COALESCE(k.id, v.id) AS id,',
    '        COALESCE(k.title, v.title) AS title,',
    '        COALESCE(k.body, v.body) AS body,',
    '        COALESCE(k.category, v.category) AS category,',
    '        COALESCE(k.created_at, v.created_at) AS created_at,',
    '        COALESCE(1.0 / (60 + k.kw_rank), 0) AS kw_rrf,',
    '        COALESCE(1.0 / (60 + v.vec_rank), 0) AS vec_rrf,',
    '        COALESCE(1.0 / (60 + k.kw_rank), 0) * $3 +',
    '        COALESCE(1.0 / (60 + v.vec_rank), 0) * $4 +',
    '        (1.0 / (1 + EXTRACT(EPOCH FROM NOW() - COALESCE(k.created_at, v.created_at)) / 86400.0)) * $5',
    '            AS hybrid_score',
    '    FROM keyword_results k',
    '    FULL OUTER JOIN vector_results v ON k.id = v.id',
    ')',
    'SELECT id, title, body, category, created_at,',
    '    ROUND(kw_rrf::numeric, 6) AS keyword_score,',
    '    ROUND(vec_rrf::numeric, 6) AS semantic_score,',
    '    ROUND(hybrid_score::numeric, 6) AS score',
    'FROM fused',
    'ORDER BY hybrid_score DESC',
    'LIMIT $7'
].join('\n');

var ABBREVIATIONS = {
    'js': 'javascript',
    'ts': 'typescript',
    'db': 'database',
    'api': 'application programming interface',
    'k8s': 'kubernetes',
    'pg': 'postgresql',
    'auth': 'authentication',
    'env': 'environment',
    'config': 'configuration',
    'dev': 'development',
    'prod': 'production'
};

function preprocessKeyword(rawQuery) {
    var query = rawQuery.toLowerCase().trim();
    var words = query.split(/\s+/);
    var expanded = words.map(function(word) {
        if (ABBREVIATIONS[word]) {
            return word + ' ' + ABBREVIATIONS[word];
        }
        return word;
    });
    return expanded.join(' ');
}

function getEmbedding(query) {
    var normalized = query.toLowerCase().trim();
    var cached = embeddingCache.get(normalized);
    if (cached) {
        return Promise.resolve(cached);
    }

    return openai.embeddings.create({
        model: 'text-embedding-3-small',
        input: normalized
    }).then(function(response) {
        var embedding = response.data[0].embedding;
        embeddingCache.set(normalized, embedding);
        return embedding;
    });
}

function hybridSearch(query, options) {
    options = options || {};
    var keywordWeight = options.keywordWeight || 1.0;
    var semanticWeight = options.semanticWeight || 1.0;
    var recencyWeight = options.recencyWeight || 0.0;
    var category = options.category || null;
    var limit = options.limit || 20;

    if (!query || query.trim().length < 2) {
        return Promise.reject(new Error('Query must be at least 2 characters'));
    }

    var keywordQuery = preprocessKeyword(query);

    return getEmbedding(query).then(function(embedding) {
        return pool.query(HYBRID_SQL, [
            keywordQuery,
            JSON.stringify(embedding),
            keywordWeight,
            semanticWeight,
            recencyWeight,
            category,
            limit
        ]);
    }).then(function(result) {
        return result.rows.map(function(row) {
            return {
                id: row.id,
                title: row.title,
                body: row.body,
                category: row.category,
                createdAt: row.created_at,
                scores: {
                    keyword: parseFloat(row.keyword_score),
                    semantic: parseFloat(row.semantic_score),
                    combined: parseFloat(row.score)
                }
            };
        });
    }).catch(function(err) {
        // Fall back to keyword-only if embedding fails
        console.error('Embedding failed, falling back to keyword search:', err.message);
        return keywordOnlySearch(keywordQuery, category, limit);
    });
}

function keywordOnlySearch(query, category, limit) {
    var sql = 'SELECT id, title, body, category, created_at, ' +
              'ts_rank(search_vector, plainto_tsquery(\'english\', $1)) AS score ' +
              'FROM documents ' +
              'WHERE search_vector @@ plainto_tsquery(\'english\', $1) ' +
              'AND ($2::text IS NULL OR category = $2) ' +
              'ORDER BY score DESC LIMIT $3';

    return pool.query(sql, [query, category, limit]).then(function(result) {
        return result.rows.map(function(row) {
            return {
                id: row.id,
                title: row.title,
                body: row.body,
                category: row.category,
                createdAt: row.created_at,
                scores: {
                    keyword: parseFloat(row.score),
                    semantic: 0,
                    combined: parseFloat(row.score)
                }
            };
        });
    });
}

// Search quality evaluation
function evaluateSearch(testQueries, searchFn) {
    var results = {
        ndcg: [],
        mrr: []
    };

    return Promise.all(testQueries.map(function(tq) {
        return searchFn(tq.query, {}).then(function(searchResults) {
            var resultIds = searchResults.map(function(r) { return r.id; });

            // Calculate nDCG
            var relevanceScores = resultIds.map(function(id) {
                return tq.judgments[id] || 0;
            });
            var dcg = calculateDCG(relevanceScores);
            var idealScores = Object.values(tq.judgments)
                .sort(function(a, b) { return b - a; })
                .slice(0, resultIds.length);
            var idcg = calculateDCG(idealScores);
            results.ndcg.push(idcg === 0 ? 0 : dcg / idcg);

            // Calculate MRR
            for (var i = 0; i < resultIds.length; i++) {
                if (tq.relevantIds.indexOf(resultIds[i]) !== -1) {
                    results.mrr.push(1.0 / (i + 1));
                    return;
                }
            }
            results.mrr.push(0);
        });
    })).then(function() {
        var avgNDCG = results.ndcg.reduce(function(a, b) { return a + b; }, 0) / results.ndcg.length;
        var avgMRR = results.mrr.reduce(function(a, b) { return a + b; }, 0) / results.mrr.length;
        return { ndcg: avgNDCG, mrr: avgMRR };
    });
}

function calculateDCG(scores) {
    var dcg = 0;
    for (var i = 0; i < scores.length; i++) {
        dcg += scores[i] / Math.log2(i + 2);
    }
    return dcg;
}

// Express route handler
function searchRoute(req, res) {
    var query = req.query.q;
    var category = req.query.category || null;
    var keywordWeight = parseFloat(req.query.kw) || 1.0;
    var semanticWeight = parseFloat(req.query.sw) || 1.0;
    var limit = parseInt(req.query.limit) || 20;

    if (!query || query.trim().length < 2) {
        return res.status(400).json({ error: 'Query must be at least 2 characters' });
    }

    hybridSearch(query, {
        keywordWeight: keywordWeight,
        semanticWeight: semanticWeight,
        category: category,
        limit: Math.min(limit, 100)
    }).then(function(results) {
        res.json({
            query: query,
            count: results.length,
            results: results
        });
    }).catch(function(err) {
        console.error('Search failed:', err);
        res.status(500).json({ error: 'Search failed' });
    });
}

module.exports = {
    hybridSearch: hybridSearch,
    searchRoute: searchRoute,
    evaluateSearch: evaluateSearch
};

Wiring Up the Express Route

var express = require('express');
var router = express.Router();
var search = require('./hybridSearch');

router.get('/api/search', search.searchRoute);

module.exports = router;

Usage

# Basic search
curl "http://localhost:8080/api/search?q=memory+management+nodejs"

# Search with category filter
curl "http://localhost:8080/api/search?q=error+handling&category=backend"

# Search with custom weights (favor keyword matches)
curl "http://localhost:8080/api/search?q=Express.js+middleware&kw=1.5&sw=1.0"

Common Issues and Troubleshooting

1. "function plainto_tsquery(unknown, unknown) does not exist"

This happens when PostgreSQL cannot determine the text search configuration to use. Always specify the configuration explicitly:

-- Wrong
plainto_tsquery($1)

-- Correct
plainto_tsquery('english', $1)

2. "operator does not exist: tsvector @@ text"

You are passing a raw string where a tsquery is expected. Wrap the search term with plainto_tsquery or to_tsquery:

// Wrong
pool.query("SELECT * FROM docs WHERE search_vector @@ $1", [searchTerm]);

// Correct
pool.query("SELECT * FROM docs WHERE search_vector @@ plainto_tsquery('english', $1)", [searchTerm]);

3. "could not access file 'vector': No such file or directory"

The pgvector extension is not installed on your PostgreSQL server. Install it first:

# Ubuntu/Debian
sudo apt install postgresql-16-pgvector

# macOS with Homebrew
brew install pgvector

Then enable it in your database:

CREATE EXTENSION IF NOT EXISTS vector;

4. "different vector dimensions 1536 and 768"

Your query embedding was generated with a different model than the stored document embeddings. All embeddings in a system must use the same model. If you switch from text-embedding-ada-002 (1536 dimensions) to text-embedding-3-small with a reduced dimension count, you must re-embed all documents.

// Check your embedding dimensions
getEmbedding("test query").then(function(embedding) {
    console.log("Query embedding dimensions:", embedding.length);
});

// Compare with stored dimensions
pool.query("SELECT vector_dims(embedding) FROM documents LIMIT 1")
    .then(function(result) {
        console.log("Stored embedding dimensions:", result.rows[0].vector_dims);
    });

5. "syntax error in tsquery" when using special characters

User queries with characters like &, |, !, or : break to_tsquery. Use plainto_tsquery or websearch_to_tsquery instead, which handle raw user input safely:

// Breaks with input like "node.js & express"
pool.query("... WHERE search_vector @@ to_tsquery('english', $1)", [userInput]);

// Safe with any user input
pool.query("... WHERE search_vector @@ plainto_tsquery('english', $1)", [userInput]);

// Even better: supports quoted phrases and minus operator
pool.query("... WHERE search_vector @@ websearch_to_tsquery('english', $1)", [userInput]);

6. Hybrid query returns results but scores are all zero for one column

This typically means the FULL OUTER JOIN is working but one of the CTEs returned zero rows. Check that your keyword CTE is actually finding matches by running the keyword search independently. If the search text has no matches in the GIN index, all kw_rrf scores will be zero and you are effectively running pure vector search.
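A quick way to check is to run the keyword half on its own with the same search text; a small diagnostic sketch:

// Diagnostic sketch: run the keyword pathway alone to confirm it matches anything.
function debugKeywordMatches(pool, searchText) {
    return pool.query(
        "SELECT count(*) AS matches FROM documents " +
        "WHERE search_vector @@ plainto_tsquery('english', $1)",
        [searchText]
    ).then(function(result) {
        console.log('Keyword matches for "' + searchText + '":', result.rows[0].matches);
        return parseInt(result.rows[0].matches, 10);
    });
}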

Best Practices

  • Always specify the text search configuration ('english') in to_tsvector and plainto_tsquery calls. Relying on default_text_search_config leads to inconsistent behavior across environments.

  • Use websearch_to_tsquery instead of plainto_tsquery for user-facing search. It supports quoted phrases ("exact match") and exclusion (-unwanted) without the syntax error risk of raw to_tsquery. Available in PostgreSQL 11+.

  • Set the RRF k parameter to 60 as a default. The original RRF paper uses this value, and in practice it works well for most document collections. Only adjust if you have strong empirical evidence from your own evaluation metrics.

  • Retrieve 50 candidates from each pathway before fusion. Going below 30 risks missing relevant documents that rank moderately in both systems. Going above 100 wastes computation with minimal quality improvement.

  • Cache query embeddings aggressively. The same query text always produces the same embedding. An LRU cache with a one-hour TTL eliminates 40-60% of embedding API calls in typical search workloads.

  • Log both keyword and semantic scores alongside the fused score for every search result. When debugging why a result ranks unexpectedly, you need to see which pathway contributed to or detracted from its position.

  • Run evaluation metrics weekly against your relevance judgment test set. Weight tuning is not a one-time activity. As your content corpus grows and user behavior shifts, optimal weights change.

  • Degrade gracefully when the embedding service is unavailable. A keyword-only fallback is far better than a 500 error. Users can still find content with exact terms while you resolve the embedding service issue.

  • Use GENERATED ALWAYS AS ... STORED columns for the tsvector instead of triggers. Stored generated columns are maintained automatically by PostgreSQL and eliminate an entire class of bugs where the search vector gets out of sync with the source text.

  • Monitor query latency separately for each search pathway. If the vector search CTE is consistently slower, consider increasing your IVFFlat lists parameter or switching to HNSW indexes for better query performance at the cost of higher memory usage.
