The Aggregation Framework in MongoDB is a powerful and flexible tool that enables advanced data processing and analysis by chaining stages to transform documents in various ways.
Aggregation is the process of processing data records and returning computed results. MongoDB’s Aggregation Framework allows for complex data manipulation by utilizing a series of transformation stages, making it suitable for tasks like analytics and reporting.
The Aggregation Pipeline consists of multiple stages. Each stage performs a specific operation on documents, passing the transformed documents to the next stage.
db.collection.aggregate([
{ stage1 },
{ stage2 },
...
])
$match: Filters documents by specified criteria.$project: Shapes the documents by including or excluding fields.$group: Groups documents by a specified field, performing operations on grouped data.$sort: Orders documents based on specified fields.$limit: Limits the number of documents in the output.Example – Filtering and Grouping:
db.sales.aggregate([
{ $match: { status: "completed" } },
{ $group: { _id: "$customer", totalAmount: { $sum: "$amount" } } }
])
Output Explanation: This pipeline filters sales to include only completed orders, then groups by customer and calculates the total amount for each customer.
$lookup Stage for Joins$lookup allows joining data from multiple collections, similar to SQL joins.
db.orders.aggregate([
{
$lookup: {
from: "customers",
localField: "customerId",
foreignField: "_id",
as: "customerData"
}
}
])
Output Explanation: This query joins orders with customers using the customerId field, adding customer details to each order.
$unwind Stage for Flattening Arrays$unwind breaks an array field in a document into separate documents for each element.
db.orders.aggregate([
{ $unwind: "$items" }
])
Output Explanation: If an order document has an array field items, each item in the array is output as a separate document.
Operators like $sum, $multiply, $divide, $cond, and $ifNull allow mathematical calculations and conditional logic.
db.sales.aggregate([
{
$project: {
revenue: { $multiply: ["$price", "$quantity"] },
isHighValue: { $cond: { if: { $gte: ["$revenue", 1000] }, then: true, else: false } }
}
}
])
Explanation: This pipeline calculates revenue for each sale and adds a boolean isHighValue field based on a revenue threshold.
Date operators like $year, $month, $dayOfWeek, and $dateToString are helpful for extracting and formatting date fields.
db.orders.aggregate([
{ $project: { year: { $year: "$orderDate" }, month: { $month: "$orderDate" } } }
])
Explanation: Extracts year and month from the orderDate field, making it easier to perform time-based analysis.
$push and $addToSet for Arrays$push: Adds values to an array.$addToSet: Adds unique values only.
db.orders.aggregate([
{
$group: {
_id: "$customerId",
orders: { $push: "$_id" },
uniqueProducts: { $addToSet: "$product" }
}
}
])
Explanation: Groups orders by customer, collecting order IDs in an array and ensuring unique products.
$filter to Conditionally Include Array Elements$filter applies conditions to include specific elements in an array.
db.products.aggregate([
{
$project: {
highRatedReviews: {
$filter: {
input: "$reviews",
as: "review",
cond: { $gte: ["$$review.rating", 4] }
}
}
}
}
])
Explanation: This extracts only reviews with ratings of 4 or higher.
$cond for Conditional LogicThe $cond operator is a conditional expression similar to an “if-else” statement.
db.orders.aggregate([
{
$project: {
customerType: {
$cond: { if: { $gte: ["$totalAmount", 500] }, then: "Premium", else: "Standard" }
}
}
}
])
Explanation: Classifies customers as “Premium” or “Standard” based on totalAmount.
$switch$switch provides a multi-branch conditional similar to a “switch-case” statement.
db.orders.aggregate([
{
$project: {
category: {
$switch: {
branches: [
{ case: { $gte: ["$totalAmount", 1000] }, then: "High" },
{ case: { $gte: ["$totalAmount", 500] }, then: "Medium" }
],
default: "Low"
}
}
}
}
])
Explanation: Categorizes orders as “High,” “Medium,” or “Low” based on totalAmount.
MongoDB supports full-text search with the $text operator.
db.products.aggregate([
{ $match: { $text: { $search: "laptop" } } },
{ $project: { name: 1, score: { $meta: "textScore" } } },
{ $sort: { score: -1 } }
])
Explanation: This pipeline finds products containing the word “laptop” and sorts them by relevance.
Calculates monthly sales totals with grouping and date operators.
db.sales.aggregate([
{ $match: { year: 2024 } },
{ $group: { _id: { month: { $month: "$saleDate" } }, totalSales: { $sum: "$amount" } } },
{ $sort: { "_id.month": 1 } }
])
Explanation: This pipeline provides total sales per month for 2024, sorted by month.
Segments customers based on total spending using $group and $project.
db.sales.aggregate([
{ $group: { _id: "$customerId", totalSpent: { $sum: "$amount" } } },
{
$project: {
customerType: {
$cond: { if: { $gte: ["$totalSpent", 1000] }, then: "VIP", else: "Regular" }
}
}
}
])
$match at the beginning to reduce data volume.$project: Limit fields to save memory.$lookup: Ensure indexes exist on fields used in joins.$merge for Complex Pipelines: Store intermediate results for reusability.
The expressive features of MongoDB’s Aggregation Framework unlock powerful capabilities for data analysis and transformation. From basic filtering and grouping to advanced expressions, conditional logic, and array manipulation, the aggregation framework allows you to perform complex operations directly within the database. Happy Coding!❤️
