Indexes are a critical aspect of database design in MongoDB. They play a significant role in optimizing query performance by allowing the database to quickly locate and access the data without scanning the entire collection. This chapter covers everything you need to know about indexing and performance in MongoDB, from the basics to advanced concepts, including practical examples and best practices.
An index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space. In MongoDB, indexes are created on collections to enhance query efficiency. Without indexes, MongoDB performs a collection scan, i.e., it scans every document in a collection to select those documents that match the query statement.
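To make the difference between a collection scan and an index lookup concrete, here is a minimal in-memory sketch in plain JavaScript. It does not call MongoDB at all: the students array, collectionScan, buildIndex, and indexLookup are hypothetical names made up for illustration, and a Map stands in for the B-tree structure that MongoDB actually uses.

```javascript
// Hypothetical in-memory "collection"; this is not a MongoDB API.
const students = [
  { _id: 1, name: "Alice", age: 24 },
  { _id: 2, name: "Bob", age: 22 },
  { _id: 3, name: "Charlie", age: 23 },
];

// Collection scan: examine every document and keep the matches.
function collectionScan(docs, field, value) {
  let examined = 0;
  const matches = [];
  for (const doc of docs) {
    examined += 1;
    if (doc[field] === value) matches.push(doc);
  }
  return { matches, examined };
}

// Toy index: maps each field value to the documents that hold it.
// (A Map is a simplification of a real B-tree index.)
function buildIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    if (!index.has(doc[field])) index.set(doc[field], []);
    index.get(doc[field]).push(doc);
  }
  return index;
}

// Index lookup: jump straight to the matching documents.
function indexLookup(index, value) {
  const matches = index.get(value) ?? [];
  return { matches, examined: matches.length };
}

const scanResult = collectionScan(students, "name", "Bob");
const nameIndex = buildIndex(students, "name");
const indexResult = indexLookup(nameIndex, "Bob");

console.log(scanResult.examined, indexResult.examined); // prints: 3 1
```

The scan touches all three documents to find one match, while the lookup touches only the match itself. The price is the extra structure (nameIndex) that has to be built, stored, and kept up to date.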
A single field index is created on a single field of a document. It improves the performance of queries that select documents based on the value of this single field.
db.collection.createIndex({ field: 1 })
Explanation: db.collection.createIndex({ field: 1 }) creates an ascending index on the field named field.
// Output
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
A compound index is created on multiple fields of a document. It supports queries that match on multiple fields.
db.collection.createIndex({ field1: 1, field2: -1 })
db.collection.createIndex({ field1: 1, field2: -1 }) creates a single compound index whose entries are sorted by field1 in ascending order and then by field2 in descending order.
// Output
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
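The ordering inside a compound index can be pictured with a short plain-JavaScript sketch. It only sorts a hypothetical in-memory array the way a { field1: 1, field2: -1 } key pattern orders its entries; the docs array and the compareKeys function are illustrative assumptions, not MongoDB APIs.

```javascript
// Hypothetical documents used only for illustration.
const docs = [
  { _id: 1, field1: "b", field2: 10 },
  { _id: 2, field1: "a", field2: 5 },
  { _id: 3, field1: "a", field2: 9 },
];

// Compare two entries: field1 ascending first, then field2 descending.
function compareKeys(x, y) {
  if (x.field1 !== y.field1) return x.field1 < y.field1 ? -1 : 1;
  return y.field2 - x.field2;
}

// Build the sorted list of index entries (key values plus a document pointer).
const entries = docs
  .map((d) => ({ field1: d.field1, field2: d.field2, _id: d._id }))
  .sort(compareKeys);

const order = entries.map((e) => e._id);
console.log(order); // prints: [ 3, 2, 1 ]
```

Because entries are grouped by field1 first, a query that filters only on field1 can still use this index, whereas a query that filters only on field2 generally cannot use it efficiently.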
A multikey index is created on an array field in documents. MongoDB creates a separate index entry for each element in the array.
db.collection.createIndex({ arrayField: 1 })
db.collection.createIndex({ arrayField: 1 }) creates an ascending index with an entry for each element of the array arrayField.
A text index supports text search queries on string content.
db.collection.createIndex({ field: "text" })
db.collection.createIndex({ field: "text" }) creates a text index on field.
Geospatial indexes support queries that calculate geometries on a 2D plane or a sphere.
db.collection.createIndex({ location: "2dsphere" })
db.collection.createIndex({ location: "2dsphere" }) creates a 2dsphere index on the location field, which is used for queries that calculate geometries on an Earth-like sphere.
A hashed index is used for sharding and supports equality queries.
db.collection.createIndex({ field: "hashed" })
db.collection.createIndex({ field: "hashed" }) creates a hashed index on field, which is used for sharding and equality queries.
A wildcard index supports indexing the entire content of documents in a collection or specific fields that match a pattern.
db.collection.createIndex({ "$**": 1 })
db.collection.createIndex({ "$**": 1 }) creates a wildcard index on all fields of the documents in the collection.
Indexes can be created using the createIndex method. Here’s a basic example of creating a single field index:
db.students.createIndex({ name: 1 })
db.students.createIndex({ name: 1 }) creates an ascending index on the name field of the students collection.
To view the indexes on a collection, use the getIndexes method.
db.students.getIndexes()
// output
[
{
"v" : 2,
"key" : { "_id" : 1 },
"name" : "_id_"
},
{
"v" : 2,
"key" : { "name" : 1 },
"name" : "name_1"
}
]
To drop an index, use the dropIndex method.
db.students.dropIndex("name_1")
// output
{ "nIndexesWas" : 2, "ok" : 1 }
A unique index ensures that the indexed fields do not store duplicate values.
db.students.createIndex({ student_id: 1 }, { unique: true })
db.students.createIndex({ student_id: 1 }, { unique: true }) creates a unique index on the student_id field.
A sparse index only indexes documents that contain the indexed field.
db.students.createIndex({ email: 1 }, { sparse: true })
db.students.createIndex({ email: 1 }, { sparse: true }) creates a sparse index on the email field.
A TTL (Time to Live) index is used to automatically remove documents after a certain period.
db.logs.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 })
db.logs.createIndex({ createdAt: 1 }, { expireAfterSeconds: 3600 }) creates a TTL index on the createdAt field. Documents expire 3600 seconds (1 hour) after their createdAt value; a background task performs the deletion, so removal may lag slightly behind the expiry time.
Indexes improve query performance by allowing MongoDB to quickly locate and access the data.
db.students.find({ name: "John Doe" })
With an index on the name field, MongoDB can quickly locate documents with the name “John Doe”.
While indexes improve read performance, they can affect write performance. Each time a document is inserted or updated, the indexes must also be updated, which adds overhead.
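The write overhead can be illustrated with a small plain-JavaScript sketch. It is an in-memory toy, not MongoDB code: insertWithIndexes and the Map-based indexes are hypothetical, and the sketch simply counts how many index structures must be touched for one insert.

```javascript
// Hypothetical in-memory collection and two toy indexes (field name -> Map).
const collection = [];
const indexes = { name: new Map(), age: new Map() };

// Insert a document and update every index; return the number of index writes.
function insertWithIndexes(coll, idx, doc) {
  coll.push(doc);
  let indexWrites = 0;
  for (const [field, index] of Object.entries(idx)) {
    const key = doc[field] ?? null; // a missing field is indexed as null in this toy
    if (!index.has(key)) index.set(key, []);
    index.get(key).push(doc._id);
    indexWrites += 1;
  }
  return indexWrites;
}

const writes = insertWithIndexes(collection, indexes, { _id: 1, name: "Alice", age: 24 });
console.log(writes); // prints: 2
```

One document insert caused two additional index writes here. Every extra index on a collection adds another such write, which is why unused indexes are worth dropping.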
Index cardinality refers to the number of distinct values in the indexed field. Indexes on high-cardinality fields generally provide better performance, because each value matches fewer documents.
Index selectivity is a measure of how well an index can narrow down the result set. Highly selective indexes are more efficient.
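As a rough illustration of cardinality and selectivity, the following plain-JavaScript sketch computes both for a hypothetical in-memory data set. The docs array and the helper names cardinality and matchFraction are assumptions made for this example; MongoDB does not expose functions with these names.

```javascript
// Hypothetical documents: status has few distinct values, email has many.
const docs = [
  { _id: 1, status: "active", email: "a@example.com" },
  { _id: 2, status: "active", email: "b@example.com" },
  { _id: 3, status: "active", email: "c@example.com" },
  { _id: 4, status: "inactive", email: "d@example.com" },
];

// Cardinality: number of distinct values for a field.
function cardinality(items, field) {
  return new Set(items.map((d) => d[field])).size;
}

// Fraction of documents that an equality match on field = value returns.
// A smaller fraction means the predicate is more selective.
function matchFraction(items, field, value) {
  return items.filter((d) => d[field] === value).length / items.length;
}

console.log(cardinality(docs, "status"), cardinality(docs, "email")); // prints: 2 4
console.log(matchFraction(docs, "status", "active")); // prints: 0.75
console.log(matchFraction(docs, "email", "a@example.com")); // prints: 0.25
```

An index on email narrows this toy data set to a quarter of the documents, while an index on status still leaves three quarters to examine, so the email index is the more selective one.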
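A covering index is an index that includes all the fields required by a query, both those in the filter and those returned by the projection. This allows MongoDB to return the results using only the index without accessing the documents. Because _id is returned by default, the projection must exclude it unless _id is part of the index.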
db.students.createIndex({ name: 1, age: 1 })
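Whether a query can be covered by this index comes down to a simple field check, sketched below in plain JavaScript. The isCovered function is a hypothetical helper, not a MongoDB API, and it is deliberately simplified: it looks only at top-level field names and ignores details such as multikey indexes and embedded documents.

```javascript
// Simplified check: can the index alone answer this filter + projection?
function isCovered(keyPattern, filter, projection) {
  const indexed = new Set(Object.keys(keyPattern));
  const filterFields = Object.keys(filter);
  const returned = Object.keys(projection).filter((f) => projection[f] === 1);

  // _id is returned by default unless the projection sets _id: 0.
  const idReturned = projection._id !== 0;
  if (idReturned && !indexed.has("_id")) return false;

  return [...filterFields, ...returned].every((f) => indexed.has(f));
}

const keyPattern = { name: 1, age: 1 };

// Corresponds to: db.students.find({ name: "Alice" }, { name: 1, age: 1, _id: 0 })
console.log(isCovered(keyPattern, { name: "Alice" }, { name: 1, age: 1, _id: 0 })); // prints: true

// _id is not excluded, so the documents would have to be fetched.
console.log(isCovered(keyPattern, { name: "Alice" }, { name: 1, age: 1 })); // prints: false

// major is not in the index.
console.log(isCovered(keyPattern, { name: "Alice" }, { major: 1, _id: 0 })); // prints: false
```

In a real deployment, the explain output is the authoritative check: a covered query reports zero documents examined.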
Index intersection occurs when MongoDB uses multiple indexes to satisfy a query.
db.students.createIndex({ name: 1 })
db.students.createIndex({ age: 1 })
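Conceptually, index intersection means collecting the matching document ids from each index and keeping only the ids present in both. The plain-JavaScript sketch below shows that idea on hypothetical in-memory data; idsFor and intersect are illustrative helpers, and whether MongoDB actually chooses an intersection plan for a given query is up to its query planner.

```javascript
// Hypothetical documents; _id 4 is added so the two predicates differ.
const students = [
  { _id: 1, name: "Alice", age: 24 },
  { _id: 2, name: "Bob", age: 22 },
  { _id: 3, name: "Charlie", age: 23 },
  { _id: 4, name: "Alice", age: 30 },
];

// Ids that a single-field index lookup would return for field = value.
function idsFor(docs, field, value) {
  return new Set(docs.filter((d) => d[field] === value).map((d) => d._id));
}

// Keep only the ids found in both sets.
function intersect(a, b) {
  return [...a].filter((id) => b.has(id));
}

const byName = idsFor(students, "name", "Alice"); // ids 1 and 4
const byAge = idsFor(students, "age", 24); // id 1
const result = intersect(byName, byAge);
console.log(result); // prints: [ 1 ]
```

A single compound index on both fields usually answers such a query more directly, which is why the compound index shown earlier is often preferred for frequent multi-field queries.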
Indexes can be created on array fields, allowing queries to efficiently search for documents containing specific array elements.
db.students.createIndex({ subjects: 1 })
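The way an index on an array field expands into one entry per element can be sketched in plain JavaScript. The sample documents and the multikeyEntries helper are hypothetical and only model the idea; they are not how MongoDB stores its index internally.

```javascript
// Hypothetical documents with an array field.
const students = [
  { _id: 1, subjects: ["math", "physics"] },
  { _id: 2, subjects: ["math"] },
];

// Produce one (key, _id) entry per array element, sorted by key and then _id.
function multikeyEntries(docs, field) {
  const entries = [];
  for (const doc of docs) {
    const value = doc[field];
    const keys = Array.isArray(value) ? value : [value];
    for (const key of keys) entries.push({ key, _id: doc._id });
  }
  return entries.sort((a, b) =>
    a.key < b.key ? -1 : a.key > b.key ? 1 : a._id - b._id
  );
}

const entries = multikeyEntries(students, "subjects");
console.log(entries.map((e) => `${e.key}:${e._id}`)); // prints: [ 'math:1', 'math:2', 'physics:1' ]
```

Two documents produce three index entries here, which shows why large arrays can make a multikey index considerably bigger than the collection's document count suggests.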
The explain method provides insight into how MongoDB executes a query, including information about index usage.
db.students.find({ name: "John Doe" }).explain("executionStats")
// Output
{
...
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 2,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
...
}
}
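When reading executionStats, a useful habit is to compare how many documents were examined with how many were returned. The plain-JavaScript sketch below does that for an object shaped like the output above. summarizeExecutionStats is a hypothetical helper and its "likely collection scan" rule is only a rough heuristic; the second input is invented for contrast.

```javascript
// Summarize the three counters that appear in executionStats.
function summarizeExecutionStats(stats) {
  const { nReturned, totalKeysExamined, totalDocsExamined } = stats;
  return {
    // Documents examined per document returned; close to 1 is ideal.
    docsPerResult: nReturned === 0 ? totalDocsExamined : totalDocsExamined / nReturned,
    // Heuristic: documents were read but no index keys were.
    likelyCollectionScan: totalKeysExamined === 0 && totalDocsExamined > 0,
  };
}

// Values taken from the sample output above.
const indexed = summarizeExecutionStats({
  nReturned: 1,
  totalKeysExamined: 1,
  totalDocsExamined: 1,
});

// Hypothetical values for the same query without an index on name.
const unindexed = summarizeExecutionStats({
  nReturned: 1,
  totalKeysExamined: 0,
  totalDocsExamined: 1000,
});

console.log(indexed); // prints: { docsPerResult: 1, likelyCollectionScan: false }
console.log(unindexed); // prints: { docsPerResult: 1000, likelyCollectionScan: true }
```

A high docsPerResult value suggests that the query examines far more documents than it returns and may benefit from a better index.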
The MongoDB profiler collects detailed information about database operations. It helps in identifying slow queries and operations that may benefit from indexing.
db.setProfilingLevel(2)
db.setProfilingLevel(2) enables the profiler to capture all operations.
To view the profiler output:
db.system.profile.find().pretty()
MongoDB provides several monitoring tools, such as mongostat and mongotop, to monitor database performance.
mongostat: Provides an overview of database operations.
mongotop: Displays the amount of time the database spends reading and writing data.
Consider a students collection with the following documents:
{
"_id": 1,
"name": "Alice",
"age": 24,
"major": "Computer Science"
},
{
"_id": 2,
"name": "Bob",
"age": 22,
"major": "Mathematics"
},
{
"_id": 3,
"name": "Charlie",
"age": 23,
"major": "Physics"
}
Suppose you frequently run queries like:
db.students.find({ name: "Alice", age: 24 })
Creating a compound index on name and age can significantly speed up this query.
db.students.createIndex({ name: 1, age: 1 })
db.students.find({ name: "Alice", age: 24 }).explain("executionStats")
// Output
{
"queryPlanner" : {
"plannerVersion" : 1,
"namespace" : "test.students",
"indexFilterSet" : false,
"parsedQuery" : {
"$and" : [
{ "name" : { "$eq" : "Alice" } },
{ "age" : { "$eq" : 24 } }
]
},
"winningPlan" : {
"stage" : "FETCH",
"inputStage" : {
"stage" : "IXSCAN",
"keyPattern" : { "name" : 1, "age" : 1 },
"indexName" : "name_1_age_1",
"isMultiKey" : false,
"multiKeyPaths" : { "name" : [], "age" : [] },
"direction" : "forward",
"indexBounds" : {
"name" : [ "[\"Alice\", \"Alice\"]" ],
"age" : [ "[24, 24]" ]
}
}
},
"rejectedPlans" : []
},
"executionStats" : {
"executionSuccess" : true,
"nReturned" : 1,
"executionTimeMillis" : 1,
"totalKeysExamined" : 1,
"totalDocsExamined" : 1,
"executionStages" : {
"stage" : "FETCH",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needFetch" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 0,
"docsExamined" : 1,
"alreadyHasObj" : 0,
"inputStage" : {
"stage" : "IXSCAN",
"nReturned" : 1,
"executionTimeMillisEstimate" : 0,
"works" : 2,
"advanced" : 1,
"needTime" : 0,
"needFetch" : 0,
"saveState" : 0,
"restoreState" : 0,
"isEOF" : 0,
"keyPattern" : { "name" : 1, "age" : 1 },
"indexName" : "name_1_age_1",
"isMultiKey" : false,
"multiKeyPaths" : { "name" : [], "age" : [] },
"direction" : "forward",
"indexBounds" : {
"name" : [ "[\"Alice\", \"Alice\"]" ],
"age" : [ "[24, 24]" ]
},
"keysExamined" : 1,
"dupsTested" : 0,
"dupsDropped" : 0,
"seenInvalidated" : 0,
"matchTested" : 1
}
}
}
}
winningPlan shows that MongoDB uses the compound index name_1_age_1 to perform an index scan (IXSCAN), which is followed by a fetch (FETCH) operation.
executionStats indicates that only one key and one document were examined, resulting in efficient query performance.
Consider an articles collection with the following documents:
{
"_id": 1,
"title": "Introduction to MongoDB",
"content": "MongoDB is a NoSQL database..."
},
{
"_id": 2,
"title": "Advanced MongoDB Indexing",
"content": "Indexing in MongoDB is a powerful feature..."
}
Suppose you want to enable text search on the content field.
db.articles.createIndex({ content: "text" })
db.articles.find({ $text: { $search: "MongoDB" } })
// Output
[
{
"_id": 1,
"title": "Introduction to MongoDB",
"content": "MongoDB is a NoSQL database..."
},
{
"_id": 2,
"title": "Advanced MongoDB Indexing",
"content": "Indexing in MongoDB is a powerful feature..."
}
]
The text index on the content field lets MongoDB search the text and return the matching documents.
Consider a places collection with documents representing locations:
{
"_id": 1,
"name": "Central Park",
"location": { "type": "Point", "coordinates": [-73.9654, 40.7829] }
},
{
"_id": 2,
"name": "Times Square",
"location": { "type": "Point", "coordinates": [-73.9851, 40.7580] }
}
To perform geospatial queries, you need to create a geospatial index.
db.places.createIndex({ location: "2dsphere" })
db.places.find({
location: {
$near: {
$geometry: { type: "Point", coordinates: [-73.9851, 40.7580] },
$maxDistance: 5000
}
}
})
// Output
[
{
"_id": 2,
"name": "Times Square",
"location": { "type": "Point", "coordinates": [-73.9851, 40.7580] }
},
{
"_id": 1,
"name": "Central Park",
"location": { "type": "Point", "coordinates": [-73.9654, 40.7829] }
}
]
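To see why both places fall inside the 5000-meter radius, the distance between the two points can be estimated with the haversine formula. This plain-JavaScript sketch is only an approximation for illustration: haversineMeters is a hypothetical helper, the Earth radius constant is an assumed mean value, and MongoDB's own spherical calculation may give a slightly different figure.

```javascript
// Assumed mean Earth radius in meters (an approximation).
const EARTH_RADIUS_METERS = 6371000;

// Great-circle distance between two [longitude, latitude] pairs, in meters.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

// Coordinates from the documents above, in GeoJSON order: [longitude, latitude].
const timesSquare = [-73.9851, 40.758];
const centralPark = [-73.9654, 40.7829];

const distance = haversineMeters(timesSquare, centralPark);
console.log(Math.round(distance)); // roughly 3.2 km, expressed in meters
console.log(distance <= 5000); // prints: true
```

Times Square is the query point itself, so its distance is zero, and Central Park lies a little over three kilometers away. Both fall within $maxDistance, and $near returns them ordered from nearest to farthest.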
The $near query operator finds documents with locations within 5000 meters of the specified coordinates and returns them sorted from nearest to farthest.
The 2dsphere index on the location field allows MongoDB to efficiently perform this proximity search.
Indexes are fundamental for optimizing query performance in MongoDB. Understanding the different types of indexes and how to effectively create and manage them is crucial for building efficient and scalable applications. By following best practices and regularly monitoring your database, you can ensure that your indexing strategy continues to meet the evolving needs of your application.
Happy coding! ❤️