An Introduction to Multikey Indexes with Examples


After understanding single-field and compound indexes, now is the time to learn about multikey indexes. When a field has an array value, an index key is generated for each of its array elements. These indexes are referred as multikey indexes. They can be used with arrays that have scalar values as well as those with nested documents.

To generate a multikey index, you have to use the standard db.collection.createIndex() method. Such indexes are automatically generated by MongoDB whenever it senses an indexed field specified as an array. Hence, there is no need for explicit definition of a multikey index.

Examples

While working with a standard array, let’s suppose we have a collection “student”.

{ _id: 21, name: “Adam”, marks: [ 80, 50, 90 ] }

To build an index on the “marks” field, write the following query.

db.student.createIndex( { marks: 1 } )

As the marks field is an array, thus this index is an example of a multikey index. All of the keys (80, 50, and 90) in its elements point to the same document.

To create a multikey index for array fields having embedded documents, let’s suppose we have a collection “products”.

{

_id: 5,

name: “tshirt”,

details: [

{ size: “large”, type: “polo”, stock:50 },

{ size: “small”, type: “crew neck”, stock:40 },

{ size: “medium”, type: “v neck”, stock: 60 }

]

}

{

_id: 6,

name: “pants”,

details: [

{ size: “large”, type: “cargo”, stock: 35 },

{ size: “small”, type: “jeans”, stock: 65 },

{ size: “medium”, type: “harem”, stock: 10 },

{ size: “large”, type: “cotton”, stock: 10 }

]

}

{

_id: 7,

name: “jacket”,

details: [

{ size: “large”, type: “bomber”, stock: 35 },

{ size: “medium”, type: “leather”, stock: 25 },

{ size: “medium”, type: “parka”, stock: 45 }

]

}

We can build a multikey index with the details.size and details.stock fields.

db.products.createIndex( { “details.size”: 1, “details.stock”: 1 } )

This index is now good to go against queries which have only a single field of “details.size” as well as queries having the both of the indexed fields. As such, these types of queries would benefit from the index.

db.products.find( { “details.size”: “medium” } )

db.products.find( { “details.size”: “small”, “details.stock”: { $gt: 10 } } )

Bounds in Multikey Index

Bounds represent the limits of an index .i.e. how much it needs to scan for searching a query’s results. If there are more than a single predicate with an index, then MongoDB integrates them through compounding or intersection. So what do we mean by intersection and compounding?

Intersection

Bounds intersection point towards the presence of “AND” (logical conjunction) for bounds. For example, if there are two bounds [ 4, Infinity]  and [ – Infinity, 8 ], then intersection bounds process [[4, 8 ]] When $elemMatch operation is used, then MongoDB applies intersection on multikey index bounds.

 

What Is $elemMatch?

Before moving forward, let’s understand the use of $elemMatch first.

$elemMatch is used for matching documents in array field where at the bare minimum, atleast one of the element is matched with the query. Bear in mind, that the operator does not work with $text and $where operators. For a basic example, consider a “student” collection.

{ _id: 8, marks: [ 72, 75, 78 ] }

{ _id: 9, marks: [ 65, 78, 79 ] }

The following query only processes a match with documents in which the “marks” array has at least a single element which is less than 75 and greater or equal to 70.

db.student.find(

{ marks: { $elemMatch: { $gte: 70, $lt: 75 } } }

In response, the result set is comprised of the following output.

{ “_id” : 8, “marks” : [72, 75, 78 ] }

Despite the fact that both 75 and 78 do not conform to the conditions but because 72 had a matched, hence the $elemMatch selected it.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

For instance, we have a collection student which has a field “name” and an array field “marks”.

{ _id: 4, name: “ABC”, marks: [ 1, 10 ] }

{ _id: 5, name: “XYZ”, marks: [ 5, 4 ] }

To build a multikey index with the “marks” array.

db.student.createIndex( { marks: 1 } )

Now, the following query makes use of $elemMatch which means that the array must have at least one element which fulfils the condition of both predicates.

db.student.find( { marks : { $elemMatch: { $gte: 4, $lte: 8 } } } )

Computing the predicates one by one:

  • For the first predicate, the bounds are equal to or greater than 4 [ [ 4, Infinity ] ].
  • For the second predicate, the bounds are equal to or less than 8 [ [ – Infinity, 8 ] ].

Since $elemMatch is used here, therefore MongoDB can apply intersection on the bounds and integrate it like the following.

marks: [ [ 4, 8 ] ]

On the other hand, if the $elemMatch is not used, then MongoDB applies intersection on the multikey index bounds. For instance, check the following query.

db.student.find( { marks : { $gte: 4, $lte: 8 } } )

The query processes the marks array for atleast a single element which is equal to or greater than 4 “AND” atleast a single element which is equal to or less than 8. However, it is not necessary for a single element to conform to the requirements of both predicates, hence MongoDB does not apply intersection on the bounds and uses either [[4, Infinity]] or [[-Infinity, 8]].

Compounding

Compounding bounds means the use of bounds with compound index. For example, if there a compound index { x: 1, y:1 } which has a bound on the x field  [[4, Infinity] ] and a bound on the field y [[-Infinity, 8]]. By applying compounding,

{ x: [ [ 4, Infinity ] ], y: [ [ -Infinity, 8 ] ] }

Sometimes, MongoDB is unable to apply compounding on the given bounds. For such scenarios, it uses the bound on the leading field which in our example is x: [ [4, Infinity] ].

Suppose in our example indexing is applied on multiple fields, where one of the fields is an array. For instance, we have the collection student which stores the “name” and “marks” field.

{ _id: 10, name: “Adam”, marks: [ 1, 10 ] }

{ _id: 11, name: “William”, marks: [ 5, 4 ] }

Build a compound index with the “name” and the “marks” field.

db.student.createIndex( { name: 1, marks: 1 } )

In the following query, there is a condition which applies on both of the indexed keys.

db.student.find( { name: “William”, marks: { $gte: 4 } } )

Computing these predicates step-by-step.

  • For the “name” field, the bounds for the “William” predicate are the following [ [ “William”, “William” ] ].
  • For the “marks” field, the bounds for { $gte: 4 } predicate are [ [ 4, Infinity ] ].

MongoDB can apply compounding on both of these bounds.

{ name: [ [ “William”, “William” ] ], marks: [ [ 4, Infinity ] ] }

 

 

 

 

 

 

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s