Git Submodules and Subtrees: When to Use Each
A practical comparison of Git submodules and subtrees for managing shared code, vendored dependencies, and multi-repo architectures with real-world patterns.
Every project eventually needs to share code across repositories. A utility library used by three services. A configuration repo that multiple deployments reference. A vendored dependency you need to patch. Git provides two mechanisms for this: submodules and subtrees. They solve the same problem differently, and choosing wrong creates ongoing friction.
I have used both extensively. Submodules are better when you need independent version control of the shared code. Subtrees are better when you want the shared code fully embedded in your project. This guide covers both approaches with the real workflows — not just setup, but the daily operations that determine whether each approach is practical.
Prerequisites
- Git installed (v2.20+ for recent submodule improvements)
- Two or more repositories to connect
- Understanding of Git basics (clone, commit, push, pull)
- Terminal access
The Core Problem
You have a shared library that multiple projects use:
project-api/
  src/
  package.json
project-web/
  src/
  package.json
shared-utils/          # Used by both projects
  src/
    validation.js
    formatting.js
  package.json
Options:
- Copy the code — works until you need to sync changes. Then it becomes a maintenance nightmare.
- Publish as a package — good for stable libraries, overhead for rapidly changing code.
- Git submodule — includes the shared repo by reference.
- Git subtree — includes the shared repo by copying its history.
Git Submodules
Submodules embed one Git repository inside another as a pointer. The parent repo stores a reference to a specific commit in the child repo.
Adding a Submodule
cd project-api/
git submodule add https://github.com/myorg/shared-utils.git libs/shared
git commit -m "chore: add shared-utils as submodule"
This creates:
- libs/shared/: the cloned submodule repository
- .gitmodules: configuration file tracking the submodule
# .gitmodules
[submodule "libs/shared"]
path = libs/shared
url = https://github.com/myorg/shared-utils.git
The parent repo does not store the submodule's files; it stores a commit hash. Because git submodule add stages its changes, running git diff --cached before you commit shows the new entry as:
+Subproject commit abc1234def5678...
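The same pointer is visible in the tree itself: git ls-tree HEAD lists a submodule as an entry with mode 160000 and type commit (a "gitlink") instead of a blob or tree. The sketch below parses such lines; the function name parseTreeEntry and the object IDs in the sample are made up for illustration.

```javascript
// Parse one line of `git ls-tree HEAD` output.
// Line format: "<mode> <type> <object>\t<path>"
function parseTreeEntry(line) {
  var tab = line.indexOf("\t");
  if (tab === -1) return null;
  var meta = line.slice(0, tab).split(" ");
  if (meta.length !== 3) return null;
  return {
    mode: meta[0],
    type: meta[1],
    object: meta[2],
    path: line.slice(tab + 1),
    // Mode 160000 marks a gitlink: the tree stores only a commit ID.
    isSubmodule: meta[0] === "160000" && meta[1] === "commit"
  };
}

// Sample lines with made-up object IDs.
var sampleTree = [
  "100644 blob 1111111111111111111111111111111111111111\tpackage.json",
  "160000 commit 2222222222222222222222222222222222222222\tlibs/shared"
];

sampleTree.forEach(function (line) {
  var entry = parseTreeEntry(line);
  var label = entry.isSubmodule
    ? "submodule pinned at " + entry.object.slice(0, 7)
    : entry.type;
  console.log(entry.path + ": " + label);
});
```

Only the 40-character commit ID lives in the parent's tree, which is why diffs of the parent show a changed hash rather than changed files.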
Cloning a Project with Submodules
# Clone and initialize submodules in one command
git clone --recurse-submodules https://github.com/myorg/project-api.git
# Or clone first, then initialize
git clone https://github.com/myorg/project-api.git
cd project-api
git submodule init
git submodule update
Updating Submodules
Pull the latest changes from the submodule's remote:
# Update a specific submodule
cd libs/shared
git checkout main
git pull origin main
cd ../..
git add libs/shared
git commit -m "chore: update shared-utils to latest"
# Or update all submodules at once
git submodule update --remote --merge
git add .
git commit -m "chore: update all submodules"
Pinning a Submodule to a Specific Version
cd libs/shared
git checkout v2.1.0 # Tag, branch, or commit hash
cd ../..
git add libs/shared
git commit -m "chore: pin shared-utils to v2.1.0"
The parent repo now records that specific commit. Other developers who run git submodule update will get exactly that version.
Working Inside a Submodule
You can edit submodule code directly:
cd libs/shared
# git submodule update leaves a detached HEAD, so switch to a branch first
git checkout main
# Make changes
git add src/validation.js
git commit -m "fix: handle empty string in email validation"
git push origin main
# Return to parent and update the reference
cd ../..
git add libs/shared
git commit -m "chore: update shared-utils with validation fix"
git push
Configuring Submodule Behavior
# Branch that git submodule update --remote follows (the parent still records a commit)
git config -f .gitmodules submodule.libs/shared.branch main
# Shallow clone submodules (faster, less disk space)
git config -f .gitmodules submodule.libs/shared.shallow true
# Update strategy
git config -f .gitmodules submodule.libs/shared.update merge
# Options: checkout (default), merge, rebase, none
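These settings all end up as plain key = value lines under the submodule's section in .gitmodules, so tooling can read them without calling Git. The following is a minimal sketch under that assumption; parseGitmodules is a hypothetical helper, not a full git-config parser (it ignores quoting and escape rules inside values).

```javascript
// Minimal reader for .gitmodules-style text.
// Returns an object keyed by submodule name.
function parseGitmodules(text) {
  var modules = {};
  var current = null;
  text.split("\n").forEach(function (raw) {
    var line = raw.trim();
    if (!line || line[0] === "#" || line[0] === ";") return;
    if (line[0] === "[") {
      var section = line.match(/^\[submodule "(.+)"\]$/);
      // Any other section type is skipped until the next submodule header.
      current = section ? (modules[section[1]] = {}) : null;
      return;
    }
    var pair = line.match(/^([A-Za-z][A-Za-z0-9-]*)\s*=\s*(.*)$/);
    if (pair && current) current[pair[1].toLowerCase()] = pair[2];
  });
  return modules;
}

// What .gitmodules would contain after the commands above.
var exampleGitmodules = [
  '[submodule "libs/shared"]',
  "\tpath = libs/shared",
  "\turl = https://github.com/myorg/shared-utils.git",
  "\tbranch = main",
  "\tshallow = true",
  "\tupdate = merge"
].join("\n");

var sharedConfig = parseGitmodules(exampleGitmodules)["libs/shared"];
console.log(sharedConfig.branch + " / " + sharedConfig.update);
```

Because .gitmodules is committed, every clone picks these values up; settings in .git/config stay local to one checkout.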
Removing a Submodule
Removing a submodule requires several steps:
# Unregister the submodule: clears its entry in .git/config and empties the directory
git submodule deinit -f libs/shared
# Remove it from the index and working tree; this also drops its section from .gitmodules
git rm -f libs/shared
# Delete the cached clone Git keeps for the submodule
rm -rf .git/modules/libs/shared
# Commit the removal
git commit -m "chore: remove shared-utils submodule"
Git Subtrees
Subtrees copy the shared repository's content and history directly into your project. There is no pointer, no separate clone — the files exist in your repo as regular files.
Adding a Subtree
cd project-api/
# Add a remote for the shared repo
git remote add shared-utils https://github.com/myorg/shared-utils.git
# Add the subtree
git subtree add --prefix=libs/shared shared-utils main --squash
The --squash option compresses the shared repo's history into a single commit. Without it, the entire history of the shared repo is merged into your project's history.
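With --squash, the only record of which upstream commit you imported is the squash commit's message, which git subtree ends with git-subtree-dir: and git-subtree-split: lines. A small sketch for reading them back follows; parseSquashMessage is a hypothetical helper, and the sample message is hand-written in that shape with a made-up hash.

```javascript
// Extract the subtree prefix and upstream commit from a squash commit message.
function parseSquashMessage(message) {
  var dir = message.match(/^git-subtree-dir: (.+)$/m);
  var split = message.match(/^git-subtree-split: ([0-9a-f]+)$/m);
  if (!dir || !split) return null;
  return { prefix: dir[1], upstreamCommit: split[1] };
}

// Hand-written sample; the hash is not a real commit.
var sampleSquashMessage = [
  "Squashed 'libs/shared/' content from commit abc1234",
  "",
  "git-subtree-dir: libs/shared",
  "git-subtree-split: abc1234abc1234abc1234abc1234abc1234abc12"
].join("\n");

var info = parseSquashMessage(sampleSquashMessage);
console.log(info.prefix + " is at upstream commit " + info.upstreamCommit.slice(0, 7));
```

The message text could come from something like git log --format=%B on the squash commit. This is the subtree counterpart to reading a submodule's pinned hash.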
Pulling Updates from Upstream
git subtree pull --prefix=libs/shared shared-utils main --squash
This pulls the latest changes from the shared repo and merges them into your project.
Pushing Changes Back to Upstream
If you edit files in the subtree directory, you can push those changes back:
git subtree push --prefix=libs/shared shared-utils main
Git extracts the commits that touched libs/shared/ and pushes them to the shared repo.
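As a mental model of that extraction (not how git subtree is actually implemented), think of it as a filter over your history that keeps commits touching the prefix and rewrites their paths so the prefix becomes the root. The commit objects below are a hypothetical simplified shape used only for illustration.

```javascript
// Keep only commits that touch files under `prefix`, and rewrite the
// paths so the prefix becomes the root of the extracted history.
function splitByPrefix(commits, prefix) {
  var dir = prefix.replace(/\/+$/, "") + "/";
  var result = [];
  commits.forEach(function (commit) {
    var files = commit.files
      .filter(function (file) { return file.indexOf(dir) === 0; })
      .map(function (file) { return file.slice(dir.length); });
    if (files.length > 0) {
      result.push({ message: commit.message, files: files });
    }
  });
  return result;
}

// Hypothetical parent-repo history.
var parentHistory = [
  { message: "feat: add orders endpoint", files: ["src/orders.js"] },
  { message: "fix: handle empty string in email validation",
    files: ["libs/shared/src/validation.js"] },
  { message: "chore: bump deps",
    files: ["package.json", "libs/shared/package.json"] }
];

splitByPrefix(parentHistory, "libs/shared").forEach(function (commit) {
  console.log(commit.message + " -> " + commit.files.join(", "));
});
```

Note the third commit: it mixes parent and shared changes, so upstream receives a commit whose message ("bump deps") may not describe what actually changed there. Keeping subtree edits in their own commits avoids that.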
Splitting a Directory into a Subtree
Extract existing code into a separate repository:
# Create a branch with only the subtree's history
git subtree split --prefix=src/shared --branch shared-split
# Push to a new repository
git remote add shared-new https://github.com/myorg/shared-utils-new.git
git push shared-new shared-split:main
Subtree with No Squash
# Add without squash — full history preserved
git subtree add --prefix=libs/shared shared-utils main
# This merges the entire commit history of shared-utils
# into your project's log. Useful for tracing changes,
# but makes your log noisy.
Submodules vs. Subtrees: Comparison
How They Store Code
Submodule:
project-api/
  .gitmodules          # Pointer to external repo
  libs/shared/         # Separate Git repo (nested .git)
    .git               # Own history, own remote
Subtree:
project-api/
  libs/shared/         # Regular files in your repo
    validation.js      # Committed directly
    formatting.js      # Part of your history
Workflow Comparison
| Operation | Submodules | Subtrees |
|---|---|---|
| Clone | Need --recurse-submodules | Normal clone works |
| Pull updates | git submodule update --remote | git subtree pull |
| Push changes | Push submodule separately | git subtree push |
| See shared code in diffs | No (just a hash) | Yes (regular files) |
| Offline access | Need to init submodules | Always available |
| CI/CD complexity | Need submodule init step | No extra steps |
| Pin exact version | Natural (commit hash) | Manual (squash commits) |
| Repo size | Small (pointers only) | Larger (full copy) |
| History | Separate | Merged |
Decision Framework
Use submodules when:
- The shared code is a large, independently versioned project
- You need to pin exact versions across consumer projects
- Multiple teams own different repos and need clear boundaries
- The shared code changes frequently and independently
- You want to keep your repo size small
Use subtrees when:
- You want everything in one repo with no external dependencies
- CI/CD should work without special submodule setup
- You rarely push changes back to the shared repo
- Developers should see shared code in normal diffs and searches
- You are vendoring a third-party dependency for patching
- You want offline access to all code without extra steps
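If it helps to make the checklist explicit, the two lists can be treated as a simple tally. This is only a rough sketch: the answer keys are hypothetical names for the criteria above, every criterion gets equal weight, and a real decision will usually weigh some of them more heavily than others.

```javascript
// Tally the criteria from the two lists; each answer is a boolean.
function suggestMechanism(answers) {
  var submoduleKeys = [
    "independentlyVersioned", "needExactPinning", "separateTeamOwnership",
    "changesFrequently", "keepRepoSmall"
  ];
  var subtreeKeys = [
    "wantSingleRepo", "wantPlainCi", "rarelyPushUpstream",
    "wantSharedCodeInDiffs", "vendoringForPatches", "wantOfflineAccess"
  ];
  function count(keys) {
    return keys.filter(function (key) { return answers[key] === true; }).length;
  }
  var submoduleScore = count(submoduleKeys);
  var subtreeScore = count(subtreeKeys);
  if (submoduleScore === subtreeScore) return "either";
  return submoduleScore > subtreeScore ? "submodule" : "subtree";
}

console.log(suggestMechanism({
  vendoringForPatches: true,
  wantPlainCi: true,
  needExactPinning: true
}));
```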
Common Patterns
Pattern 1: Shared Configuration
# Submodule approach — config updates independently
git submodule add https://github.com/myorg/eslint-config.git config/eslint
// .eslintrc.json
{
  "extends": "./config/eslint/index.js"
}
Pattern 2: Vendored Dependency
# Subtree approach — vendor a library you need to patch
git subtree add --prefix=vendor/express https://github.com/expressjs/express.git 4.18.2 --squash
# Make your patches
git add vendor/express/lib/router/index.js
git commit -m "fix: patch Express router for custom error handling"
Pattern 3: Shared Libraries in a Multi-Repo Architecture
# Submodule approach — multiple services share a validation library
cd service-users/
git submodule add https://github.com/myorg/validation.git libs/validation
cd ../service-orders/
git submodule add https://github.com/myorg/validation.git libs/validation
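# Pin both services to the same version and record the pin in each parent
cd ../service-users/libs/validation && git checkout v1.5.0
cd ../.. && git add libs/validation && git commit -m "chore: pin validation to v1.5.0"
cd ../service-orders/libs/validation && git checkout v1.5.0
cd ../.. && git add libs/validation && git commit -m "chore: pin validation to v1.5.0"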
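Keeping several services on the same pin is easy to get wrong over time, so a drift check can be useful. The sketch below assumes you have already collected the commit each service records for libs/validation (for example with git rev-parse HEAD:libs/validation run in each repo); findVersionDrift and the sample hashes are hypothetical.

```javascript
// `pins` maps a service name to the commit its libs/validation gitlink records.
// Groups services by commit and reports whether they all agree.
function findVersionDrift(pins) {
  var byCommit = {};
  Object.keys(pins).forEach(function (service) {
    var commit = pins[service];
    if (!byCommit[commit]) byCommit[commit] = [];
    byCommit[commit].push(service);
  });
  return {
    inSync: Object.keys(byCommit).length <= 1,
    byCommit: byCommit
  };
}

// Made-up commit IDs for illustration.
var report = findVersionDrift({
  "service-users": "aaaaaaa",
  "service-orders": "aaaaaaa",
  "service-billing": "bbbbbbb"
});

if (!report.inSync) {
  Object.keys(report.byCommit).forEach(function (commit) {
    console.log(commit + ": " + report.byCommit[commit].join(", "));
  });
}
```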
Pattern 4: Documentation in a Separate Repo
# Subtree approach — docs live in their own repo but build with the project
git subtree add --prefix=docs https://github.com/myorg/project-docs.git main --squash
# Build docs alongside code
npm run build-docs # Reads from docs/ directory
Automation Scripts
Submodule Helper Script
// scripts/submodule-status.js
var childProcess = require("child_process");

function run(cmd) {
  return childProcess.execSync(cmd, { encoding: "utf-8" }).trim();
}

function getSubmoduleStatus() {
  var output = run("git submodule status");
  if (!output) {
    console.log("No submodules found.");
    return;
  }
  var lines = output.split("\n");
  lines.forEach(function(line) {
    var trimmed = line.trim();
    var parts = trimmed.split(" ");
    // Status lines may start with "+", "-" or "U" before the hash
    var hash = parts[0].replace(/^[+\-U]/, "");
    var path = parts[1];
    var tag = parts[2] || "";
    var prefix = trimmed[0];
    var status = "up to date";
    if (prefix === "+") status = "MODIFIED (needs commit in parent)";
    if (prefix === "-") status = "NOT INITIALIZED (run git submodule update)";
    if (prefix === "U") status = "MERGE CONFLICT";
    console.log(path + ": " + status);
    console.log(" Commit: " + hash.substring(0, 8) + " " + tag);
    // Check for upstream updates.
    // Assumes the submodule tracks origin/main and has been fetched recently.
    try {
      var localHead = run("git -C " + path + " rev-parse HEAD");
      var remoteHead = run("git -C " + path + " rev-parse origin/main");
      if (localHead !== remoteHead) {
        console.log(" Updates available from upstream");
      }
    } catch (err) {
      // Submodule not initialized, or origin/main not fetched
    }
    console.log("");
  });
}

getSubmoduleStatus();
Subtree Update Script
// scripts/subtree-update.js
var childProcess = require("child_process");

var subtrees = [
  { prefix: "libs/shared", remote: "shared-utils", branch: "main" },
  { prefix: "libs/config", remote: "eslint-config", branch: "main" }
];

function run(cmd) {
  console.log("$ " + cmd);
  try {
    var output = childProcess.execSync(cmd, {
      encoding: "utf-8",
      stdio: "pipe"
    });
    if (output.trim()) console.log(output.trim());
    return true;
  } catch (err) {
    console.error("Failed: " + err.message);
    return false;
  }
}

subtrees.forEach(function(subtree) {
  console.log("\nUpdating " + subtree.prefix + "...");
  // Fetch latest
  run("git fetch " + subtree.remote);
  // Pull updates
  var success = run(
    "git subtree pull --prefix=" + subtree.prefix +
    " " + subtree.remote + " " + subtree.branch + " --squash"
  );
  if (success) {
    console.log(subtree.prefix + " updated successfully.");
  } else {
    console.log(subtree.prefix + " may need manual merge resolution.");
  }
});
Complete Working Example: Multi-Repo Project with Submodules
# Create the shared library
mkdir shared-utils && cd shared-utils
git init
mkdir src
cat > src/validation.js << 'SCRIPT'
var validator = {};

validator.isEmail = function(email) {
  var pattern = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
  return pattern.test(email);
};

validator.isNotEmpty = function(value) {
  return value !== null && value !== undefined && String(value).trim().length > 0;
};

validator.isInRange = function(value, min, max) {
  var num = Number(value);
  return !isNaN(num) && num >= min && num <= max;
};

module.exports = validator;
SCRIPT
git add . && git commit -m "feat: add validation utilities"
git tag v1.0.0
# Create the API project
cd ..
mkdir project-api && cd project-api
git init
# Recent Git releases refuse local-path submodule clones by default,
# so allow the file transport for this one command
git -c protocol.file.allow=always submodule add ../shared-utils libs/shared
cat > app.js << 'SCRIPT'
var validator = require("./libs/shared/src/validation");

function handleRequest(data) {
  if (!validator.isEmail(data.email)) {
    return { error: "Invalid email address" };
  }
  if (!validator.isNotEmpty(data.name)) {
    return { error: "Name is required" };
  }
  return { success: true };
}

module.exports = { handleRequest: handleRequest };
SCRIPT
git add . && git commit -m "feat: add API with shared validation"
Common Issues and Troubleshooting
Submodule directory is empty after clone
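You cloned without --recurse-submodules, so the submodule directories exist but were never populated.
Fix: Run git submodule update --init --recursive in the project root (the same as git submodule init followed by git submodule update, plus any nested submodules). Or re-clone with git clone --recurse-submodules <url>.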
Subtree merge conflicts during pull
The same file was modified in both the parent project and the shared repo.
Fix: Resolve conflicts normally with git mergetool or by editing the conflicted files. The conflict markers show your version and the upstream version. After resolving, git add the files and git commit.
Submodule shows "modified content" in git status but you did not change anything
The submodule has untracked files or the checked-out commit differs from what the parent expects.
Fix: Go into the submodule directory and check its status. Run git submodule update to reset it to the expected commit. Add ignore = dirty to .gitmodules if you want to suppress noise from untracked files in the submodule.
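To tell which of those causes applies without entering the submodule, git status --porcelain=v2 includes a four-character submodule field of the form S<c><m><u>: C if the commit changed, M if tracked files are modified, U if there are untracked files, and a dot otherwise. The sketch below decodes that field for ordinary changed entries; describeSubmoduleState is a hypothetical helper and the sample line uses made-up hashes.

```javascript
// Decode the submodule field of a `git status --porcelain=v2` line.
// Ordinary changed entries look like:
//   1 <XY> <sub> <mH> <mI> <mW> <hH> <hI> <path>
function describeSubmoduleState(line) {
  var fields = line.split(" ");
  if (fields.length < 9 || fields[0] !== "1") return null;
  var sub = fields[2];
  if (sub.charAt(0) !== "S") return null; // "N..." means not a submodule
  return {
    path: fields.slice(8).join(" "),
    commitChanged: sub.charAt(1) === "C",
    trackedChanges: sub.charAt(2) === "M",
    untrackedFiles: sub.charAt(3) === "U"
  };
}

// Sample line with made-up object IDs: a submodule with untracked files only.
var sampleStatusLine =
  "1 .M S..U 160000 160000 160000 " +
  "1111111111111111111111111111111111111111 " +
  "1111111111111111111111111111111111111111 libs/shared";

var state = describeSubmoduleState(sampleStatusLine);
if (state.commitChanged) {
  console.log(state.path + ": run git submodule update, or commit the new pointer");
} else if (state.trackedChanges || state.untrackedFiles) {
  console.log(state.path + ": working tree is dirty (build output, editor files?)");
}
```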
CI/CD pipeline fails because submodule is not initialized
The CI system clones the repo without submodules.
Fix: Add a submodule init step to your CI configuration:
# GitHub Actions
steps:
  - uses: actions/checkout@v4
    with:
      submodules: recursive
# GitLab CI
variables:
  GIT_SUBMODULE_STRATEGY: recursive
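On CI systems without a built-in submodule option, a guard step can make the failure obvious instead of letting tests run against empty directories. In git submodule status output, a line starting with "-" marks a submodule that is not initialized. The sketch below works on that output as a string; findUninitialized is a hypothetical helper and the sample hashes are made up.

```javascript
// Return the paths of submodules that `git submodule status` reports as
// not initialized. Pass the raw output: the first character of each line
// is significant, so do not trim it beforehand.
function findUninitialized(statusOutput) {
  return statusOutput
    .split("\n")
    .filter(function (line) { return line.charAt(0) === "-"; })
    .map(function (line) {
      // "-<hash> <path>": drop the marker, then the hash and the space.
      return line.slice(1).replace(/^\S+ /, "");
    });
}

// Sample output with made-up hashes: one submodule in sync, one missing.
var sampleStatus = [
  " 1111111111111111111111111111111111111111 libs/shared (v1.0.0)",
  "-2222222222222222222222222222222222222222 libs/config"
].join("\n");

var missing = findUninitialized(sampleStatus);
if (missing.length > 0) {
  console.error("Uninitialized submodules: " + missing.join(", "));
  // A real CI script would exit non-zero here, e.g. process.exit(1).
}
```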
Subtree push is extremely slow
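git subtree push scans the entire history to extract relevant commits every time it runs.
Fix: Run git subtree split yourself to create a branch, then push that branch. The split still has to walk the history, but you can inspect or reuse the resulting branch, and adding --rejoin to the split records a merge that lets later splits skip commits they have already processed: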
git subtree split --prefix=libs/shared --branch subtree-push
git push shared-utils subtree-push:main
git branch -D subtree-push
Best Practices
- Document your submodule/subtree strategy in the README. New developers need to know that submodules exist and how to initialize them. Without documentation, they will clone and wonder why directories are empty.
- Use --recurse-submodules in clone commands. Make it a habit or alias it: git config --global alias.cloner 'clone --recurse-submodules'.
- Pin submodules to tags, not branches. Tags are immutable references. Pointing a submodule at main means it could break unexpectedly when someone pushes to the shared repo.
- Squash subtree merges. Without --squash, the shared repo's entire commit history floods your project log. Squash keeps your history clean.
- Automate submodule updates in CI. Your CI pipeline should fail clearly if submodules are not initialized, not silently skip tests because code is missing.
- Prefer subtrees for vendored code. When you fork a library to patch it, subtrees keep everything self-contained. No external dependency at clone time.
- Prefer submodules for large shared code. If the shared repo is 500MB, a subtree copies all of it into every consuming project. Submodules keep a lightweight pointer.