
MongoDB Document Modeling for Node.js Applications

A practical guide to MongoDB document modeling covering embedding vs referencing, schema design patterns, Mongoose schemas, and real-world data modeling for Node.js applications.


Overview

The biggest mistake Node.js developers make with MongoDB is treating it like a relational database with JSON syntax. They normalize everything into separate collections, join them with $lookup, and wonder why their queries are slow. The second biggest mistake is the opposite -- stuffing everything into a single document until they hit the 16MB document size limit and their writes grind to a halt.

Good document modeling sits between these extremes. It requires you to think about how your application reads and writes data before you design your schema. After ten years of building production Node.js applications on MongoDB, I can tell you that the data model is the single most important architectural decision you will make. A bad model cannot be fixed with indexes, caching, or hardware. A good model makes your application fast almost by accident.

This article covers the fundamentals of document modeling, the major design patterns, Mongoose schema design for Node.js, and a complete e-commerce data model you can use as a starting point for real projects.

Document Model Fundamentals

MongoDB stores data as BSON documents -- binary JSON with additional types like ObjectId, Date, Decimal128, and Binary. Unlike relational databases where you design tables and normalize data to eliminate redundancy, MongoDB encourages you to model data based on how your application accesses it.

The core decision in document modeling is embedding vs. referencing.

Embedding

Embedding means storing related data inside the parent document as subdocuments or arrays.

// Embedded address inside a user document
{
  _id: ObjectId("65a1b2c3d4e5f6a7b8c9d0e1"),
  name: "Jane Smith",
  email: "[email protected]",
  addresses: [
    {
      label: "Home",
      street: "742 Evergreen Terrace",
      city: "Springfield",
      state: "IL",
      zip: "62704"
    },
    {
      label: "Work",
      street: "100 Industrial Way",
      city: "Springfield",
      state: "IL",
      zip: "62701"
    }
  ]
}

Embedding gives you atomic reads -- a single query returns everything you need. There are no joins, no round trips, no consistency issues. The data lives together because it is accessed together.

Referencing

Referencing means storing the _id of a related document and looking it up separately.

// User document with reference to orders
{
  _id: ObjectId("65a1b2c3d4e5f6a7b8c9d0e1"),
  name: "Jane Smith",
  email: "[email protected]"
}

// Order document referencing the user
{
  _id: ObjectId("75b2c3d4e5f6a7b8c9d0e1f2"),
  userId: ObjectId("65a1b2c3d4e5f6a7b8c9d0e1"),
  items: [...],
  total: 149.99,
  createdAt: ISODate("2026-02-10T14:30:00Z")
}

Referencing keeps documents small, avoids duplication, and handles unbounded or large related datasets. The trade-off is that you need multiple queries or $lookup aggregations to reconstruct the full picture.
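When you do need the full picture in one round trip, a $lookup aggregation joins the referenced documents on the server. A minimal sketch against the user and order documents above (the "users" and "orders" collection names are assumptions):

```javascript
// Join a user's orders back onto the user document in one round trip.
var pipeline = [
  { $match: { email: "[email protected]" } },
  {
    $lookup: {
      from: "orders",          // collection holding the referencing documents
      localField: "_id",       // user _id
      foreignField: "userId",  // reference field on each order
      as: "orders"             // output array attached to the user
    }
  },
  { $project: { name: 1, email: 1, "orders.total": 1, "orders.createdAt": 1 } }
];

// Usage with the native driver (sketch):
// db.collection("users").aggregate(pipeline).toArray();
```

Remember that $lookup runs on every matched document; it is a tool for occasional reconstruction, not a substitute for modeling around your common reads.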

When to Embed vs. Reference

This decision depends on four factors:

1. Access patterns. If you always read the parent and child data together, embed. If you frequently access the child data independently, reference.

2. Data size. MongoDB documents have a 16MB limit. If the embedded data could grow large, reference it. A user with 5 addresses is fine embedded. A user with 50,000 order line items is not.

3. Update frequency. If the child data changes independently and frequently, referencing avoids updating the parent document on every change. If the child data is relatively stable once written, embedding is simpler.

4. Cardinality. One-to-few relationships (a user and their addresses) are ideal for embedding. One-to-many relationships (a blog post and its comments) depend on how many "many" means. One-to-millions relationships (a logging system and its events) must be referenced.

Here is a decision framework I use on every project:

| Relationship | Cardinality | Access Pattern | Recommendation |
| --- | --- | --- | --- |
| User → Addresses | 1:few (< 10) | Always read together | Embed |
| Blog Post → Comments | 1:many (< 1000) | Sometimes read separately | Embed or reference |
| Author → Blog Posts | 1:many (unbounded) | Frequently read independently | Reference |
| Product → Categories | Many:many | Queried from both sides | Reference arrays |
| Order → Line Items | 1:few (< 100) | Always read together | Embed |

Relationship Patterns

One-to-One: Always Embed

One-to-one relationships should almost always be embedded. There is no reason to create a separate collection for data that maps 1:1 to a parent.

// Good: embed the profile
{
  _id: ObjectId("..."),
  email: "[email protected]",
  profile: {
    bio: "Full-stack developer",
    website: "https://janesmith.dev",
    avatar: "/images/jane.jpg"
  }
}

One-to-Many: It Depends

For bounded one-to-many relationships where the "many" side is small and accessed with the parent, embed.

// Product with embedded variants (bounded, always accessed together)
{
  _id: ObjectId("..."),
  name: "Classic T-Shirt",
  variants: [
    { sku: "TS-BLK-S", color: "Black", size: "S", price: 29.99, stock: 45 },
    { sku: "TS-BLK-M", color: "Black", size: "M", price: 29.99, stock: 62 },
    { sku: "TS-WHT-S", color: "White", size: "S", price: 29.99, stock: 30 }
  ]
}

For unbounded one-to-many relationships, reference from the child back to the parent.

// Order references the user (unbounded -- users can have thousands of orders)
{
  _id: ObjectId("..."),
  userId: ObjectId("65a1b2c3d4e5f6a7b8c9d0e1"),
  total: 89.97,
  status: "shipped"
}

Many-to-Many: Reference Arrays

Many-to-many relationships use arrays of references on one or both sides, depending on which direction you query most often.

// Product with category references
{
  _id: ObjectId("..."),
  name: "Wireless Mouse",
  categoryIds: [
    ObjectId("aaa111..."),
    ObjectId("bbb222...")
  ]
}

// Category with product references (if you query both directions)
{
  _id: ObjectId("aaa111..."),
  name: "Electronics",
  productIds: [ObjectId("..."), ObjectId("...")]
}

Keep the reference array on the side with lower cardinality. A product belongs to 3 categories (manageable array). A category contains 10,000 products (too large for an array -- query products by categoryIds instead).
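Both query directions can then run against the array on the low-cardinality side. A sketch using the categoryIds field from the example above (helper names are illustrative):

```javascript
// Both query directions run against the reference array on the product,
// so neither document needs a huge array.
function productsInCategory(categoryId) {
  // Served by a multikey index: db.products.createIndex({ categoryIds: 1 })
  return { categoryIds: categoryId };
}

function categoriesOfProduct(product) {
  // Served by the default _id index on categories
  return { _id: { $in: product.categoryIds } };
}

// Usage (sketch):
// db.products.find(productsInCategory(electronicsId))
// db.categories.find(categoriesOfProduct(mouseDoc))
```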

Denormalization Trade-offs

Denormalization means duplicating data across documents to avoid lookups. It is the most powerful and most dangerous tool in MongoDB modeling.

// Order with denormalized product info
{
  _id: ObjectId("..."),
  userId: ObjectId("..."),
  items: [
    {
      productId: ObjectId("..."),
      name: "Classic T-Shirt",      // denormalized from product
      price: 29.99,                 // denormalized -- snapshot at purchase time
      quantity: 2
    }
  ]
}

This is correct for orders because the price and name at the time of purchase should never change, even if the product is later updated. But denormalizing data that changes frequently (like a user's display name across thousands of comment documents) creates an update nightmare.

Rules for denormalization:

  1. Denormalize data that is read far more often than it is written.
  2. Denormalize data that represents a snapshot in time (prices, names at time of transaction).
  3. Do not denormalize data that changes frequently unless you have a reliable update mechanism.
  4. Accept eventual consistency if you denormalize mutable data.
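If you do denormalize mutable data (rule 3), the reliable update mechanism is usually a fan-out update keyed on the source document's id. A sketch, assuming comment documents carry a denormalized authorName (collection and field names are illustrative):

```javascript
// Fan-out update for denormalized mutable data: when a user is renamed,
// rewrite every copied authorName keyed on the source user's id.
function renameAuthorUpdate(userId, newName) {
  return {
    filter: { authorId: userId },
    update: { $set: { authorName: newName } }
  };
}

// Usage (sketch) -- readers may briefly see the old name while this runs,
// which is the eventual consistency rule 4 tells you to accept:
// var op = renameAuthorUpdate(user._id, user.name);
// db.collection("comments").updateMany(op.filter, op.update);
```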

Mongoose Schema Design

Mongoose is the standard ODM for MongoDB in Node.js. It adds schema validation, middleware, virtuals, and query helpers on top of the native MongoDB driver.

Schema Types and Validation

var mongoose = require("mongoose");
var Schema = mongoose.Schema;

var productSchema = new Schema({
  name: {
    type: String,
    required: [true, "Product name is required"],
    trim: true,
    maxlength: [200, "Product name cannot exceed 200 characters"]
  },
  slug: {
    type: String,
    unique: true,
    lowercase: true,
    index: true
  },
  price: {
    type: Number,
    required: true,
    min: [0, "Price cannot be negative"]
  },
  category: {
    type: String,
    enum: {
      values: ["electronics", "clothing", "books", "home", "sports"],
      message: "{VALUE} is not a valid category"
    }
  },
  tags: [String],
  isActive: {
    type: Boolean,
    default: true
  },
  metadata: {
    type: Map,
    of: String
  },
  createdAt: {
    type: Date,
    default: Date.now
  }
});

Virtuals

Virtuals are computed properties that do not get persisted to MongoDB. Use them for derived values.

productSchema.virtual("priceFormatted").get(function() {
  return "$" + this.price.toFixed(2);
});

productSchema.virtual("reviews", {
  ref: "Review",
  localField: "_id",
  foreignField: "productId"
});

productSchema.set("toJSON", { virtuals: true });
productSchema.set("toObject", { virtuals: true });

Instance Methods and Statics

// Instance method -- operates on a single document
productSchema.methods.applyDiscount = function(percent) {
  this.price = this.price * (1 - percent / 100);
  return this.save();
};

// Static method -- operates on the model/collection
productSchema.statics.findByCategory = function(category, callback) {
  return this.find({ category: category, isActive: true })
    .sort({ createdAt: -1 })
    .exec(callback);
};

productSchema.statics.findActive = function() {
  return this.find({ isActive: true });
};

Middleware Hooks

// Pre-save: generate slug from name
productSchema.pre("save", function(next) {
  if (this.isModified("name")) {
    this.slug = this.name
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-|-$/g, "");
  }
  next();
});

// Post-save: log or trigger side effects
productSchema.post("save", function(doc) {
  console.log("Product saved:", doc.name);
});

// Pre-find: automatically exclude inactive products
productSchema.pre("find", function() {
  this.where({ isActive: true });
});

Schema Versioning for Migrations

MongoDB does not enforce a schema, so documents with different structures can coexist in the same collection. Use a schemaVersion field to handle migrations gracefully.

var userSchema = new Schema({
  schemaVersion: { type: Number, default: 2 },
  name: String,         // v1: was "fullName"
  email: String,
  preferences: {        // v2: added preferences object
    theme: { type: String, default: "light" },
    notifications: { type: Boolean, default: true }
  }
});

// Middleware to migrate old documents on read
userSchema.post("init", function(doc) {
  if (!doc.schemaVersion || doc.schemaVersion < 2) {
    // Migrate v1 → v2
    if (doc.fullName && !doc.name) {
      doc.name = doc.fullName;
    }
    if (!doc.preferences) {
      doc.preferences = { theme: "light", notifications: true };
    }
    doc.schemaVersion = 2;
    doc.save();
  }
});

This "lazy migration" pattern upgrades documents as they are accessed, avoiding a single massive migration script that saturates the database with writes.
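When you eventually want to retire the v1 code path, the same transformation can run eagerly as a batched bulkWrite instead of on read. A sketch under the v1/v2 shapes above:

```javascript
// Eager batch migration: apply the same v1 -> v2 transformation as the
// post-init hook, but as bulkWrite operations over a page of documents.
function migrateV1Batch(docs) {
  return docs.map(function(doc) {
    return {
      updateOne: {
        // $ne matches docs where schemaVersion is old or missing entirely
        filter: { _id: doc._id, schemaVersion: { $ne: 2 } },
        update: {
          $set: {
            name: doc.name || doc.fullName,
            preferences: doc.preferences || { theme: "light", notifications: true },
            schemaVersion: 2
          },
          $unset: { fullName: "" }   // drop the v1 field
        }
      }
    };
  });
}

// Usage (sketch):
// User.find({ schemaVersion: { $ne: 2 } }).limit(1000).lean()
//   .then(function(docs) { return User.bulkWrite(migrateV1Batch(docs)); });
```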

Advanced Patterns

Polymorphic Pattern (Discriminators)

Discriminators let you store different document types in the same collection with shared and type-specific fields.

var eventSchema = new Schema({
  timestamp: { type: Date, default: Date.now },
  userId: { type: Schema.Types.ObjectId, ref: "User" }
}, { discriminatorKey: "eventType" });

var Event = mongoose.model("Event", eventSchema);

var clickEventSchema = new Schema({
  url: String,
  element: String,
  position: { x: Number, y: Number }
});

var purchaseEventSchema = new Schema({
  orderId: { type: Schema.Types.ObjectId, ref: "Order" },
  amount: Number,
  currency: String
});

var ClickEvent = Event.discriminator("ClickEvent", clickEventSchema);
var PurchaseEvent = Event.discriminator("PurchaseEvent", purchaseEventSchema);

// Usage
var click = new ClickEvent({
  userId: someUserId,
  url: "/products/wireless-mouse",
  element: "button.add-to-cart",
  position: { x: 450, y: 320 }
});

Bucket Pattern for Time-Series Data

Instead of one document per measurement, group measurements into time-based buckets. This reduces document count and index size dramatically.

var sensorBucketSchema = new Schema({
  sensorId: String,
  date: Date,                    // bucket start (e.g., one hour)
  count: Number,                 // number of readings in this bucket
  sum: Number,                   // pre-aggregated sum
  avg: Number,                   // pre-aggregated average
  min: Number,
  max: Number,
  readings: [{
    timestamp: Date,
    value: Number
  }]
});

// Insert a reading into the current bucket
sensorBucketSchema.statics.addReading = function(sensorId, value, callback) {
  var now = new Date();
  var bucketStart = new Date(now.getFullYear(), now.getMonth(), now.getDate(), now.getHours());

  return this.findOneAndUpdate(
    { sensorId: sensorId, date: bucketStart },
    {
      $push: { readings: { timestamp: now, value: value } },
      $inc: { count: 1, sum: value },
      $min: { min: value },
      $max: { max: value }
    },
    { upsert: true, new: true }
  ).exec(callback);
};

Attribute Pattern for Variable Fields

When documents have many optional or variable fields that you need to index, the attribute pattern consolidates them into an indexable array.

// Instead of this (requires an index per attribute):
{ color: "red", size: "large", material: "cotton", weight: "200g" }

// Use this (one compound index on attributes.key + attributes.value):
{
  name: "Classic T-Shirt",
  attributes: [
    { key: "color", value: "red" },
    { key: "size", value: "large" },
    { key: "material", value: "cotton" },
    { key: "weight", value: "200g" }
  ]
}

// Query: find products where color is red
// db.products.find({ attributes: { $elemMatch: { key: "color", value: "red" } } })
// Single index: { "attributes.key": 1, "attributes.value": 1 }

Computed Pattern for Pre-Aggregated Data

Pre-compute expensive calculations and store them as fields that update on writes rather than recomputing on every read.

var productSchema = new Schema({
  name: String,
  price: Number,
  stats: {
    reviewCount: { type: Number, default: 0 },
    averageRating: { type: Number, default: 0 },
    totalRevenue: { type: Number, default: 0 },
    unitsSold: { type: Number, default: 0 }
  }
});

// Update computed stats when a review is added
productSchema.statics.addReview = function(productId, rating, callback) {
  return this.findByIdAndUpdate(productId, {
    $inc: { "stats.reviewCount": 1 },
    $set: { "stats.averageRating": rating }  // simplified; real impl uses aggregation
  }).exec(callback);
};
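The comment above notes the average is simplified. One way to make it exact without a separate aggregate pass is an aggregation-pipeline update (MongoDB 4.2+) that keeps a running sum; the stats.ratingSum field is an assumed addition to the schema:

```javascript
// Incremental recompute with a pipeline update (MongoDB 4.2+). A running
// stats.ratingSum makes the average exact without re-reading all reviews.
function addReviewUpdate(rating) {
  return [
    {
      $set: {
        "stats.reviewCount": { $add: ["$stats.reviewCount", 1] },
        "stats.ratingSum": { $add: [{ $ifNull: ["$stats.ratingSum", 0] }, rating] }
      }
    },
    {
      $set: {
        "stats.averageRating": {
          $round: [{ $divide: ["$stats.ratingSum", "$stats.reviewCount"] }, 1]
        }
      }
    }
  ];
}

// Usage (sketch):
// Product.findByIdAndUpdate(productId, addReviewUpdate(5), { new: true });
```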

Outlier Pattern

When most documents have small arrays but a few outliers have massive arrays, use an overflow mechanism.

var bookSchema = new Schema({
  title: String,
  reviews: [{
    userId: Schema.Types.ObjectId,
    rating: Number,
    text: String
  }],
  hasOverflow: { type: Boolean, default: false }
});

// When reviews exceed threshold, overflow to a separate collection
bookSchema.pre("save", function(next) {
  var self = this;
  if (this.reviews.length <= 500) return next();
  this.hasOverflow = true;
  var excess = this.reviews.splice(500);  // keep the first 500 embedded
  // Assumes a ReviewOverflow model ({ bookId, userId, rating, text }) is registered
  mongoose.model("ReviewOverflow").insertMany(
    excess.map(function(r) {
      return { bookId: self._id, userId: r.userId, rating: r.rating, text: r.text };
    })
  ).then(function() { next(); }, next);
});

Tree Structures

MongoDB supports several tree patterns. The materialized path pattern is the most practical for most use cases.

var categorySchema = new Schema({
  name: String,
  path: String,          // materialized path: ",electronics,computers,laptops,"
  parentId: { type: Schema.Types.ObjectId, ref: "Category", default: null },
  depth: Number
});

categorySchema.index({ path: 1 });

// Find all descendants of "electronics"
categorySchema.statics.findDescendants = function(categoryPath, callback) {
  var regex = new RegExp("^" + categoryPath);
  return this.find({ path: regex }).exec(callback);
};

// Find all ancestors
categorySchema.statics.findAncestors = function(path, callback) {
  var parts = path.split(",").filter(Boolean);
  var ancestorPaths = [];
  for (var i = 1; i < parts.length; i++) {
    ancestorPaths.push("," + parts.slice(0, i).join(",") + ",");
  }
  return this.find({ path: { $in: ancestorPaths } }).exec(callback);
};

Anti-Patterns to Avoid

1. Massive arrays. Any array that can grow without bound will eventually hit the 16MB document limit or cause performance problems long before that. If an array can exceed a few hundred elements, reference instead.

2. Deep nesting. Documents nested more than 3-4 levels deep become difficult to query and update. MongoDB's dot notation ("level1.level2.level3.field") gets unwieldy, and the positional $ update operator can only address one array level, so updating elements of nested arrays requires arrayFilters or application-side rewrites.

3. Unbounded growth. Documents that grow over time (chat messages appended to a room document, logs appended to a session document) cause write amplification: WiredTiger rewrites the entire document on every update, so each append to a large document gets more expensive as the document grows.

4. Over-normalization. If you find yourself doing 5+ lookups to render a single page, your data is too normalized. MongoDB is not a relational database. Embrace some redundancy.

5. Using ObjectId strings instead of ObjectId objects. Storing "65a1b2c3d4e5f6a7b8c9d0e1" as a string instead of ObjectId("65a1b2c3d4e5f6a7b8c9d0e1") wastes space and breaks population in Mongoose.
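For anti-pattern 1, when you still want recent items embedded, a common middle ground is a capped array: $push with $each and $slice keeps only the newest N elements, while the full history lives in a referenced collection. A sketch (field names are illustrative):

```javascript
// Capped embedded array: keep only the newest 100 comments embedded;
// older ones are dropped from the parent document.
function pushRecentComment(comment) {
  return {
    $push: {
      recentComments: {
        $each: [comment],
        $slice: -100   // negative slice keeps the last 100 elements
      }
    }
  };
}

// Usage (sketch):
// Post.updateOne({ _id: postId }, pushRecentComment({ userId: uid, text: "..." }));
```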

Complete E-Commerce Data Model

Here is a production-ready e-commerce model with Products, Orders, Users, and Reviews.

// models/ecommerce.js
var mongoose = require("mongoose");
var Schema = mongoose.Schema;

// ============================================================
// ADDRESS SCHEMA (reusable subdocument)
// ============================================================
var addressSchema = new Schema({
  label: { type: String, enum: ["home", "work", "shipping", "billing"] },
  street: { type: String, required: true },
  city: { type: String, required: true },
  state: { type: String, required: true },
  zip: { type: String, required: true },
  country: { type: String, default: "US" },
  isDefault: { type: Boolean, default: false }
}, { _id: true });

// ============================================================
// USER SCHEMA
// ============================================================
var userSchema = new Schema({
  email: {
    type: String,
    required: true,
    unique: true,
    lowercase: true,
    trim: true
  },
  name: {
    first: { type: String, required: true, trim: true },
    last: { type: String, required: true, trim: true }
  },
  passwordHash: { type: String, required: true },
  role: {
    type: String,
    enum: ["customer", "admin", "vendor"],
    default: "customer"
  },
  addresses: [addressSchema],
  wishlist: [{ type: Schema.Types.ObjectId, ref: "Product" }],
  isActive: { type: Boolean, default: true },
  lastLoginAt: Date
}, { timestamps: true });

userSchema.virtual("fullName").get(function() {
  return this.name.first + " " + this.name.last;
});

userSchema.methods.getDefaultAddress = function(label) {
  var addr = this.addresses.find(function(a) {
    return a.label === label && a.isDefault;
  });
  return addr || this.addresses[0];
};

userSchema.statics.findByEmail = function(email) {
  return this.findOne({ email: email.toLowerCase().trim() });
};

userSchema.index({ "name.last": 1, "name.first": 1 });
userSchema.set("toJSON", {
  virtuals: true,
  transform: function(doc, ret) {
    delete ret.passwordHash;
    return ret;
  }
});

var User = mongoose.model("User", userSchema);

// ============================================================
// PRODUCT SCHEMA (with embedded variants)
// ============================================================
var variantSchema = new Schema({
  sku: { type: String, required: true, unique: true },
  color: String,
  size: String,
  price: { type: Number, required: true, min: 0 },
  compareAtPrice: Number,
  stock: { type: Number, default: 0, min: 0 },
  weight: Number,
  images: [String]
}, { _id: true });

var productSchema = new Schema({
  name: {
    type: String,
    required: true,
    trim: true,
    maxlength: 200
  },
  slug: { type: String, unique: true, index: true },
  description: String,
  shortDescription: { type: String, maxlength: 500 },
  category: {
    type: String,
    required: true,
    index: true
  },
  brand: String,
  variants: {
    type: [variantSchema],
    validate: {
      validator: function(v) { return v.length > 0; },
      message: "Product must have at least one variant"
    }
  },
  tags: [{ type: String, index: true }],
  images: [String],
  stats: {
    reviewCount: { type: Number, default: 0 },
    averageRating: { type: Number, default: 0 },
    totalSold: { type: Number, default: 0 }
  },
  isActive: { type: Boolean, default: true },
  publishedAt: Date
}, { timestamps: true });

productSchema.virtual("reviews", {
  ref: "Review",
  localField: "_id",
  foreignField: "productId"
});

productSchema.virtual("priceRange").get(function() {
  if (!this.variants || this.variants.length === 0) return null;
  var prices = this.variants.map(function(v) { return v.price; });
  var min = Math.min.apply(null, prices);
  var max = Math.max.apply(null, prices);
  if (min === max) return "$" + min.toFixed(2);
  return "$" + min.toFixed(2) + " - $" + max.toFixed(2);
});

productSchema.pre("save", function(next) {
  if (this.isModified("name")) {
    this.slug = this.name
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-")
      .replace(/^-|-$/g, "");
  }
  next();
});

productSchema.statics.search = function(query, options) {
  var filter = { isActive: true };
  if (query) {
    filter.$text = { $search: query };
  }
  if (options && options.category) {
    filter.category = options.category;
  }
  if (options && options.minPrice !== undefined) {
    filter["variants.price"] = { $gte: options.minPrice };
  }
  if (options && options.maxPrice !== undefined) {
    filter["variants.price"] = filter["variants.price"] || {};
    filter["variants.price"].$lte = options.maxPrice;
  }
  var page = (options && options.page) || 1;
  var limit = (options && options.limit) || 20;
  return this.find(filter)
    .sort(query ? { score: { $meta: "textScore" } } : { createdAt: -1 })
    .skip((page - 1) * limit)
    .limit(limit);
};

productSchema.index({ name: "text", description: "text", tags: "text" });
productSchema.set("toJSON", { virtuals: true });

var Product = mongoose.model("Product", productSchema);

// ============================================================
// REVIEW SCHEMA (references both product and user)
// ============================================================
var reviewSchema = new Schema({
  productId: {
    type: Schema.Types.ObjectId,
    ref: "Product",
    required: true,
    index: true
  },
  userId: {
    type: Schema.Types.ObjectId,
    ref: "User",
    required: true
  },
  rating: {
    type: Number,
    required: true,
    min: 1,
    max: 5
  },
  title: { type: String, maxlength: 200 },
  body: { type: String, maxlength: 5000 },
  isVerifiedPurchase: { type: Boolean, default: false },
  helpfulVotes: { type: Number, default: 0 },
  images: [String]
}, { timestamps: true });

// Update product stats after saving a review
reviewSchema.post("save", function(doc) {
  var Review = mongoose.model("Review");
  Review.aggregate([
    { $match: { productId: doc.productId } },
    {
      $group: {
        _id: "$productId",
        averageRating: { $avg: "$rating" },
        reviewCount: { $sum: 1 }
      }
    }
  ]).then(function(results) {
    if (results.length > 0) {
      return Product.findByIdAndUpdate(doc.productId, {
        "stats.averageRating": Math.round(results[0].averageRating * 10) / 10,
        "stats.reviewCount": results[0].reviewCount
      });
    }
  }).catch(function(err) {
    console.error("Failed to update product stats:", err);
  });
});

// Prevent duplicate reviews
reviewSchema.index({ productId: 1, userId: 1 }, { unique: true });

var Review = mongoose.model("Review", reviewSchema);

// ============================================================
// ORDER SCHEMA (with denormalized product info)
// ============================================================
var orderItemSchema = new Schema({
  productId: { type: Schema.Types.ObjectId, ref: "Product", required: true },
  variantId: { type: Schema.Types.ObjectId },
  sku: String,
  name: String,               // denormalized snapshot
  color: String,
  size: String,
  price: { type: Number, required: true },   // price at time of purchase
  quantity: { type: Number, required: true, min: 1 },
  image: String
}, { _id: true });

orderItemSchema.virtual("lineTotal").get(function() {
  return this.price * this.quantity;
});

var orderSchema = new Schema({
  orderNumber: { type: String, unique: true, index: true },
  userId: {
    type: Schema.Types.ObjectId,
    ref: "User",
    required: true,
    index: true
  },
  items: {
    type: [orderItemSchema],
    validate: {
      validator: function(v) { return v.length > 0; },
      message: "Order must have at least one item"
    }
  },
  shippingAddress: addressSchema,
  billingAddress: addressSchema,
  subtotal: { type: Number, required: true },
  tax: { type: Number, default: 0 },
  shippingCost: { type: Number, default: 0 },
  discount: { type: Number, default: 0 },
  total: { type: Number, required: true },
  status: {
    type: String,
    enum: ["pending", "confirmed", "processing", "shipped", "delivered", "cancelled", "refunded"],
    default: "pending",
    index: true
  },
  paymentMethod: {
    type: String,
    enum: ["credit_card", "paypal", "bank_transfer"]
  },
  paymentId: String,
  trackingNumber: String,
  notes: String,
  statusHistory: [{
    status: String,
    changedAt: { type: Date, default: Date.now },
    changedBy: Schema.Types.ObjectId,
    note: String
  }]
}, { timestamps: true });

// Generate order number before save
orderSchema.pre("save", function(next) {
  if (this.isNew && !this.orderNumber) {
    var timestamp = Date.now().toString(36).toUpperCase();
    var random = Math.random().toString(36).substring(2, 6).toUpperCase();
    this.orderNumber = "ORD-" + timestamp + "-" + random;
  }
  next();
});

// Record status changes
orderSchema.pre("save", function(next) {
  if (this.isModified("status") && !this.isNew) {
    this.statusHistory.push({
      status: this.status,
      changedAt: new Date()
    });
  }
  next();
});

// Update product sales stats after order confirmation
orderSchema.post("save", function(doc) {
  if (doc.status === "confirmed") {
    doc.items.forEach(function(item) {
      Product.findByIdAndUpdate(item.productId, {
        $inc: { "stats.totalSold": item.quantity }
      }).exec();
    });
  }
});

orderSchema.statics.findByUser = function(userId, options) {
  var page = (options && options.page) || 1;
  var limit = (options && options.limit) || 10;
  return this.find({ userId: userId })
    .sort({ createdAt: -1 })
    .skip((page - 1) * limit)
    .limit(limit);
};

orderSchema.set("toJSON", { virtuals: true });

var Order = mongoose.model("Order", orderSchema);

// ============================================================
// USAGE EXAMPLES
// ============================================================

// Create a product with variants
function createProduct(callback) {
  var product = new Product({
    name: "Wireless Bluetooth Headphones",
    description: "Premium noise-canceling headphones with 30-hour battery life.",
    shortDescription: "Noise-canceling wireless headphones",
    category: "electronics",
    brand: "AudioPro",
    tags: ["headphones", "wireless", "noise-canceling", "bluetooth"],
    variants: [
      { sku: "HP-BLK-STD", color: "Black", price: 149.99, stock: 200 },
      { sku: "HP-WHT-STD", color: "White", price: 149.99, stock: 150 },
      { sku: "HP-BLK-PRO", color: "Black", price: 199.99, stock: 75 }
    ],
    images: ["/images/headphones-black.jpg", "/images/headphones-white.jpg"]
  });
  product.save(callback);
}

// Place an order with denormalized product data
function placeOrder(userId, cartItems, shippingAddress, callback) {
  var productIds = cartItems.map(function(item) { return item.productId; });

  Product.find({ _id: { $in: productIds } }).then(function(products) {
    var productMap = {};
    products.forEach(function(p) { productMap[p._id.toString()] = p; });

    var items = cartItems.map(function(cartItem) {
      var product = productMap[cartItem.productId];
      var variant = product.variants.id(cartItem.variantId);
      return {
        productId: product._id,
        variantId: variant._id,
        sku: variant.sku,
        name: product.name,              // snapshot
        color: variant.color,
        size: variant.size,
        price: variant.price,            // snapshot at purchase time
        quantity: cartItem.quantity,
        image: product.images[0]
      };
    });

    var subtotal = items.reduce(function(sum, item) {
      return sum + (item.price * item.quantity);
    }, 0);
    var tax = subtotal * 0.08;
    var shippingCost = subtotal > 99 ? 0 : 9.99;

    var order = new Order({
      userId: userId,
      items: items,
      shippingAddress: shippingAddress,
      subtotal: Math.round(subtotal * 100) / 100,
      tax: Math.round(tax * 100) / 100,
      shippingCost: shippingCost,
      total: Math.round((subtotal + tax + shippingCost) * 100) / 100,
      status: "pending"
    });

    return order.save();
  }).then(function(order) {
    callback(null, order);
  }).catch(function(err) {
    callback(err);
  });
}

// Fetch a product with populated reviews
function getProductWithReviews(slug, callback) {
  Product.findOne({ slug: slug })
    .populate({
      path: "reviews",
      options: { sort: { createdAt: -1 }, limit: 20 },
      populate: {
        path: "userId",
        select: "name"
      }
    })
    .exec(callback);
}

// Get user order history
function getUserOrders(userId, page, callback) {
  Order.findByUser(userId, { page: page })
    .populate("userId", "name email")
    .exec(callback);
}

module.exports = {
  User: User,
  Product: Product,
  Review: Review,
  Order: Order,
  createProduct: createProduct,
  placeOrder: placeOrder,
  getProductWithReviews: getProductWithReviews,
  getUserOrders: getUserOrders
};

Common Issues and Troubleshooting

1. Population returning null. If populate() returns null for referenced documents, check that the referenced field stores actual ObjectId values, not strings. Run typeof doc.userId in a post-find hook -- if it returns "string", the references were saved as strings. Fix by re-saving those fields as new mongoose.Types.ObjectId(stringValue).

2. Duplicate key errors on upsert. When using findOneAndUpdate with upsert: true under high concurrency, two requests can both find no matching document and both attempt an insert. Add retry logic or use a unique index with a try/catch that retries on error code 11000.

3. Slow queries on large embedded arrays. If you query or sort by fields inside large arrays without an index, MongoDB must scan every array element in every document. Add a multikey index on the array field, or consider moving the array items to a separate collection. Run explain("executionStats") to verify your index is being used.

4. Document size growing unexpectedly. Use Object.bsonsize(doc) in the mongo shell, or BSON.calculateObjectSize(doc.toObject()) from the bson package in Node.js, to monitor document sizes; db.collection.stats() reports the collection-wide average. If average document size exceeds 1MB, you likely have an unbounded array that needs to be refactored into a separate collection.

5. Schema validation not running on update. By default, Mongoose only runs validation on save(), not on findOneAndUpdate(). Pass { runValidators: true } as an option to update operations, or set mongoose.set("runValidators", true) globally.
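For issue 2, the retry can be a small wrapper around the upsert that re-runs on error code 11000 -- on the second attempt the document exists, so the update path is taken. A sketch:

```javascript
// Retry an upsert that lost the insert race. Under concurrency, two
// findOneAndUpdate({ upsert: true }) calls can both miss the match and both
// insert; the loser fails with duplicate key error code 11000.
function upsertWithRetry(attempt, retriesLeft) {
  return attempt().catch(function(err) {
    if (err.code === 11000 && retriesLeft > 0) {
      // The document exists now, so the retry takes the update path.
      return upsertWithRetry(attempt, retriesLeft - 1);
    }
    throw err;
  });
}

// Usage (sketch):
// upsertWithRetry(function() {
//   return Model.findOneAndUpdate(filter, update, { upsert: true, new: true });
// }, 3);
```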

Best Practices

  1. Model for your queries, not your entities. Start by listing your application's read and write operations. Design documents that serve the most common queries with a single read.

  2. Keep frequently accessed documents under 1KB. The working set (documents accessed regularly) should fit in RAM. Smaller documents mean more of them fit in the WiredTiger cache.

  3. Index early and measure. Create indexes for every query pattern you use in production. Use explain() to verify they are being used. A missing index on a collection with 100K+ documents will cause full collection scans.

  4. Use projection to limit returned fields. Do not fetch entire documents when you only need two fields. Use .select("name price") in Mongoose to reduce network transfer and memory usage.

  5. Prefer $inc and $push over read-modify-write. Atomic update operators are faster and avoid race conditions. Instead of reading a document, modifying it in JavaScript, and saving it back, use Model.updateOne({ _id: id }, { $inc: { count: 1 } }).

  6. Set maxTimeMS on queries. Prevent slow queries from holding connections open indefinitely. Add .maxTimeMS(5000) to queries in production, and monitor for MaxTimeMSExpired errors in your logs.

  7. Design for deletion. Add a deletedAt field for soft deletes rather than actually removing documents. This preserves referential integrity and lets you recover from accidental deletions. Use a pre-find middleware to filter out soft-deleted documents by default.

  8. Test your model with realistic data volumes. A schema that works with 100 documents may fail at 100,000. Seed your development database with production-scale data and run your queries against it before shipping.
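A minimal sketch of practice 7: the filter and marker as plain helpers, with the Mongoose wiring left as comments since it depends on your schema (helper names are illustrative):

```javascript
// Soft-delete helpers: mark instead of remove, and filter by default.
function notDeleted(filter) {
  // Merge the caller's filter with "never soft-deleted".
  // { deletedAt: null } matches both null values and missing fields.
  return Object.assign({ deletedAt: null }, filter || {});
}

function markDeleted(doc) {
  doc.deletedAt = new Date();
  return doc;
}

// Mongoose wiring (sketch):
// schema.add({ deletedAt: { type: Date, default: null } });
// schema.pre(/^find/, function() { this.where(notDeleted()); });
```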
