Introduction to REST with Examples – Part 2


In the previous post, we talked about what REST APIs are and discussed a few examples, using cURL for our requests. So far, we have established that a request is composed of four parts: endpoint, method, headers, and data. We have already explained the endpoint and the method; now let’s go over the headers, the data, and some more relevant information on the subject.

Headers

Headers provide information to both the client and the server. They serve a wide range of purposes, such as describing the body content or carrying authentication details. HTTP headers are property-value pairs separated by a colon. For instance, the following header tells the server to expect JSON content.

"Content-Type: application/json"

With cURL (covered in the last post), you can send HTTP headers with the -H or --header option. For instance, to send the header above to the GitHub API, you can write the following.

curl -H "Content-Type: application/json" https://api.github.com

To see all the headers you sent, add the -v or --verbose option to the request, for example by appending -v to the command above.

Keep in mind that in your result, “*” indicates cURL’s additional information, “<” indicates the response headers and “>” indicates the request headers.

The Data (Body)

Let’s come to the final component of a request, the data, also known as the body or the message. It contains the information that you want to send to the server. To send data with cURL, use the -d or --data option in the form -d property=value.

To send multiple fields, add one -d option per field.

It is also possible to break a request into several lines for better readability, ending each line with a backslash. Once you learn how to spin up (start) a server, you can create your own API and test it with any data. If you are not interested in spinning up a server, you can go to Requestbin.com and hit “create endpoint”. In response, you get a URL which can be used for testing requests, so you have to generate your own request bin to follow along. Keep in mind that these request bins have a lifespan of 48 hours. Now you can send data to your request bin by using the following.

curl -X POST https://requestb.in/1ix963n1 \
  -d name=adam \
  -d age=28

cURL sends this data the same way a web page sends form fields. To send JSON data instead, set the “Content-Type” header to “application/json”, like this.

curl -X POST https://requestb.in/1ix963n1 \
  -H "Content-Type: application/json" \
  -d '{
  "name": "adam",
  "age": "28"
}'

And with this, the anatomy of a request is complete.
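To see all four parts of a request in one place, here is a minimal Python sketch that builds (but does not send) the same JSON request with the standard library. The URL is the expired request bin from the cURL example, and the field names are taken from the earlier form example; both are placeholders rather than a working endpoint.

```python
import json
import urllib.request

# Placeholder endpoint reused from the cURL example; the request is only built here.
URL = "https://requestb.in/1ix963n1"

# Serialising a dict with json.dumps guarantees a syntactically valid JSON body.
payload = {"name": "adam", "age": "28"}
body = json.dumps(payload).encode("utf-8")

request = urllib.request.Request(
    URL,                                            # endpoint
    data=body,                                      # data (body)
    method="POST",                                  # method
    headers={"Content-Type": "application/json"},   # header
)

print(request.get_method())
print(request.data.decode("utf-8"))
# Calling urllib.request.urlopen(request) would actually send it.
```

Building the body from a dictionary avoids hand-written JSON mistakes such as a missing comma between fields.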

Authentication

When you send a POST request to the GitHub API, the response says “Requires authentication”. What does this mean exactly?

Developers put authorization measures in place so that specific actions are only performed by the right parties; this prevents a malicious third party from impersonating a legitimate user. PUT, PATCH, DELETE, and POST requests change the database, so developers have to design some sort of authentication mechanism for them. What about the GET request? It also needs authentication, but only in some cases.

On the web, authentication is mainly performed in two ways. Firstly, there is the generic username/password authentication, known as basic authentication. Secondly, authentication is done with a secret token. The second method includes OAuth, which lets users authenticate through Google, Facebook, and other social media platforms. For username/password authentication with cURL, use the -u option in the form -u "username:password".

You can test this authentication yourself. Afterward, the previous “Requires authentication” response changes to “Problems parsing JSON”. The reason is that so far, you have not sent any data. Since it is a POST request, sending data is a must.
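It may help to see what basic authentication actually puts on the wire. The sketch below builds the Authorization header value that HTTP Basic authentication uses: the username and password joined by a colon and Base64-encoded. The function name and the credentials are placeholders chosen for illustration.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the Authorization header value for HTTP Basic authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Placeholder credentials, not real ones.
print("Authorization:", basic_auth_header("username", "password"))
```

Base64 is an encoding, not encryption, so basic authentication should only be used over HTTPS.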

HTTP Error Messages and Status Codes

Messages like “Problems parsing JSON” or “Requires authentication” are HTTP error messages. They appear whenever a request has an issue. HTTP status codes let you learn the status of a response instantly. These codes range from the 100s to the 500s.

  • The success of your request is signified by 200+.
  • The redirection of the request to any URL is signified by the 300+.
  • If the client causes an error, then the code is 400+.
  • If the server causes an error, then the code is 500+.

To debug a response’s status, you can use the head (-I) or verbose (-v) options. For instance, if you add “-I” to a POST request and do not provide the username/password details, you can get a 401 status code. When your request is flawed, due to either incorrect or missing data, a 400 status code appears.
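The ranges above can be turned into a small helper that a client might use when deciding how to react to a response. This is only a sketch; the function name and the category labels are illustrative choices.

```python
def status_class(code: int) -> str:
    """Map an HTTP status code to the broad category its first digit signals."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid HTTP status code: {code}")
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes[code // 100]


for code in (200, 301, 400, 401, 500):
    print(code, status_class(code))
```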

Versions of APIs

Developers upgrade their APIs time and again; it is a never-ending process. When too many modifications are required, the developers should consider creating a new version. When this occurs, your application may break, because you wrote your code against the previous version of the API while your requests now point to the new one.

In order to perform a request for a certain version of the API, there are two methods. Depending on your API’s structure, you can choose any of them.

  • Use endpoint.
  • Use the request header.

Twitter follows the first strategy. For instance, a website can follow it in this way:

https://api.abc.com/1.1/account/settings.json

On the other hand, GitHub takes advantage of the second method. For instance, consider the following, where the request header specifies version 4 of the API.

curl https://api.abc.com -H Accept:application/abc.v4+json
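The two versioning strategies can be compared side by side in a short Python sketch. It reuses the placeholder host api.abc.com from the examples above; the helper names are hypothetical, and nothing is sent over the network.

```python
import urllib.request

BASE_URL = "https://api.abc.com"  # placeholder host from the examples above


def versioned_by_endpoint(version: str, path: str) -> str:
    """Strategy 1: the version is part of the URL."""
    return f"{BASE_URL}/{version}/{path}"


def versioned_by_header(vendor: str, version: int) -> urllib.request.Request:
    """Strategy 2: the version travels in the Accept request header."""
    accept = f"application/{vendor}.v{version}+json"
    return urllib.request.Request(BASE_URL, headers={"Accept": accept})


print(versioned_by_endpoint("1.1", "account/settings.json"))
print(versioned_by_header("abc", 4).get_header("Accept"))
```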

 

 

Are Microservices the Right Fit For You?


The term microservices was originally coined in 2011. Since then, it has been on the radar of modern development organizations. In the following years, this software architecture has gained traction in various IT circles. According to one survey, around 36 percent of enterprises were using microservices, while 26 percent were considering adopting them in the future.

So, why exactly should you use microservices in your company? There has to be something unique and more rewarding in them to compel you to leave your traditional architecture behind. Consider the following reasons to decide for yourself.

Enhance Resilience

Microservices help you decouple and decentralize your complete application into multiple services. These services are distinct because they operate independently of each other. In a conventional monolithic architecture, a code failure in one function can disrupt others; with microservices, there is little to no chance that the failure of a single service affects another. Moreover, even if you have to maintain code across multiple systems, your users will not notice.

More Scalability

In a monolithic architecture, when developers have to scale a single function, they have to tweak and adjust other functions as well. Perhaps one of the biggest advantages of microservices is the scalability they bring to the table. Since all the services in a microservices architecture are separate, it is possible to scale one service or function without having to scale up the complete application. You can deploy critical business services on different servers to improve the performance and availability of your application while your other services remain unaffected.

Right Tool for the Right Task

Microservices ensure that you are not pigeonholed by a single vendor. They give your projects greater flexibility: rather than trying to make things work with a single tool, you can look for the right tool for your requirements. Each of your services can use its own framework, programming language, technology stack, or ancillary services. Despite this heterogeneity, the services can still communicate and connect easily.

Promotion of Services

In microservices, there is no need to rewrite and adjust the complete codebase if you have to change or incorporate a new feature in your application. This is because microservices are ‘loosely coupled’. Therefore, you only have to modify a single service if it is required. The strategy to code your project in smaller increments can help you to test and deploy them independently. In this way, you can promote your services and application quickly, as soon as you complete one service after another.

Maintenance and Debugging

Microservices help you test and debug applications easily. Working with smaller modules through continuous testing and delivery means that you can create applications free from bugs and errors, thereby improving the reliability and quality of your projects.

Better ROI

With microservices, your resource optimization improves. They allow different teams to work on independent services. As a result, the time needed to deploy is reduced. Development time also decreases significantly, and you achieve greater reusability for your project. The decoupling of services also means that you do not have to spend much on high-priced machines; standard x86 machines are enough. The efficiency which you get from microservices can minimize infrastructure costs along with downtime.

Continuous Delivery

While working with a monolithic architecture, dedicated teams are needed to code discrete modules like the front-end, back-end, database, and other parts of the application. On the other hand, microservices allow project managers to bring in cross-functional teams who manage the application lifecycle through an entirely continuous delivery model. When testing, operations, and development teams work on a single service at the same time, debugging and testing become quicker and easier. This strategy helps you develop, test, and deploy your code ‘continuously’. Moreover, you do not always have to write new code from scratch; instead, you can build on existing libraries.

Considerations before Deciding to Use Microservices

If you have decided to use a microservices-based architecture, then review the following considerations.

The State of Your Business

To begin with, you have to consider whether your business is big enough to warrant your IT team working on complex projects independently. If it is not, then it is better to avoid microservices.

Assess the Deployment of Components

Analyze the components and functions of your software. If your project deploys two or more components which are completely separate from each other in terms of business processes and capabilities, then it is a wise option to use microservices.

Decide if Your Team Is Skilled for the Project

The use of microservices allows project managers to use smaller development teams that are highly skilled in their respective areas. As a result, it helps to quickly build new functionalities and release them.

Before you adopt the microservices architecture, you have to make sure that your team members are well positioned to operate with continuous integration and deployment. Similarly, you have to see if they can work in a DevOps culture and are experienced enough to work with microservices. If they are not ready yet, you can focus on building a group that is able to fulfill your requirements for working with a microservices architecture. Alternatively, you can hire experienced individuals to make up a new team.

Define Realistic Roadmap

Exponential scaling is often seen as the key to success. However, while it is important for businesses to be agile, not every business needs to scale. If you feel that the added complexity cannot help you much, then it is better to avoid a microservices architecture. Set some realistic goals about how your business is going to operate in the future to decide whether adopting a microservices architecture can benefit you.

How Has Google Improved Its Data Center Management Through Artificial Intelligence


Historically, the staff at data centers adjusted the settings of the cooling systems to save energy costs. Times have changed, and this is the sweet age of AI where intelligent systems are on guard 24/7 and automatically adjust these settings to save costs.

Last year, a tornado watch prompted Google’s AI system to take control of its cooling plant in a data center and it modified the system settings. The staff at Google was initially perplexed because the changes did not make sense at the time. However, after a closer inspection, the AI system was found to be taking a course of action that reduced the energy consumption.

Rises and falls in temperature, humidity levels, and atmospheric pressure drive changes in weather conditions, and they can stir up a storm. Google’s AI uses this weather data to adjust the cooling system accordingly.

Joe Kava, Google’s Vice President of data centers, revealed Google’s use of AI for data centers back in 2014. At that time, Kava explained that the company had designed a neural network to assess the data collected from its data centers and suggest a few strategies to improve their operation. These suggestions were later used as a recommendation engine.

Kava explained that they had a single solution which would provide them with recommendations and suggestions. Afterward, the qualified staff at Google would modify the settings of the pumps, heat exchangers, and chillers according to the AI-based recommendations. In the last four years, Google’s AI usage has evolved beyond Kava’s proposed vision.

Presently, Google is adopting a more aggressive approach. Instead of only dishing out recommendations for the human operators to act on, the new system adjusts the cooling settings itself. Jim Gao, a data engineer at Google, said that the previous system saved 20 percent in energy costs and that the newer update would save up to 40 percent in energy consumption.

Little Adjustments

The tornado watch is only a single real-world instance of Google’s powerful AI and its impact on energy savings, to an extent which was impossible with manual processes. At first glance, the minor adjustments made by the AI-enabled system might not seem like much. However, the individual savings add up to a huge total.

Kava explains that the attention to detail of the AI system makes it matchless. For instance, if the temperature in the surroundings of the data center goes from 60 degrees Fahrenheit to 64 degrees Fahrenheit while the wet-bulb temperature is unaffected, then a member of the data center staff would not think much about updating the settings of the cooling system. However, the AI-based system is not so negligent. Whether the difference is 4 degrees or 40 degrees, it keeps adjusting.

One interesting observation regarding the system was its noticeably improved performance during the launch of new data centers. Generally, new data centers are not efficient, as they are unable to get the most out of the available capacity.

From Semi to Full Automation

The transfer of critical tasks of the infrastructure to the AI system has its own implications and considerations.

With the increase of data and runtime, the AI system becomes more and more powerful and therefore, management also starts to have faith in the system, enough to give it some control. Kava explained that after some experimentation and results, slowly and gradually the semi-automated tools and equipment are replaced by fully automated tools and equipment.

Uniformity is the key to Google’s AI exploits; it is not possible to implement AI at such a massive scale without uniformity. However, each data center is designed differently, so a single AI system cannot be rolled out across all of them at the same time.

The cooling systems of the data centers are built for maximum efficiency given their geographical locations. Google has tasked its data engineering team with continuously looking for any possible techniques for energy savings.

Additionally, ML-based models are trained for their specific sites. The models have to be programmed to follow each site’s architecture. This process takes some time. However, Google is positive that this investment of time will pay off in the future.

The Fear of Automation

One major discussion point with this rapid AI automation and similar AI-based ventures is the future of “humans” or the replacement of the humans. Are the data center engineers from Google going to lose their jobs? This question contains one of mankind’s biggest fears regarding AI. As AI progresses, this uncertainty has crept into the minds of workers. However, Kava is not worried. Kava stated that Google still has staff at its disposal at data centers that is responsible for maintenance. While AI may have replaced some of their “chores”, the staff still has to perform corrective repairs and preventative maintenance.

Kava also shed some light on some of AI’s weaknesses. For instance, he explained that whenever the AI system finds itself in the midst of uncharted territory, it struggles to choose the best course of action. Therefore, it is unable to mimic the brilliance of humans in making astute observations. Kava concluded that it is recommended to use AI for cooling and other data center related tasks, though he cautioned that there must be some “human” presence to ensure that nothing goes amiss.

Final Thoughts

Google’s vision, planning, and execution of AI in its data centers are promising for other industries too. Gao’s model is believed to be applicable to manufacturing plants that have similar setups, such as cooling and heating systems. Similarly, a chemical plant or a petroleum refinery could take advantage of AI in the same way. The takeaway is that, in the end, such AI-based systems can be adopted by other companies to enhance their own systems.

The Growing Role of Artificial Intelligence in Data Centers


According to Infosys, more than 75 percent of IT experts view artificial intelligence as a permanent strategic priority which can assist them in innovating their organization’s structure. Infosys’ survey receives credibility from the fact that the AI systems are expected to receive investments worth $57 billion by the year 2021. That being said, the implementation of AI is a complex task which requires considerable time and decision-making to go smoothly. Today, AI has initiated the transformation of all the global industries.

One such industry is the data center industry, where AI is making its mark slowly and gradually. Data centers power the operations of organizations all around the world. Data volumes are increasing daily, putting more and more strain on the hardware and software setups in the organization. Consequently, managers are forced to introduce new servers and hardware equipment so their IT infrastructure becomes powerful enough to store and process data without any issue. Currently, most data centers are not able to maximize their output because they use legacy systems. So how is AI transforming data centers?

Energy Consumption

Energy consumption remains one of the most critical and dire issues in data centers. Bear in mind that as of now, about 6 percent of the world’s electricity is used by data centers. With the computing requirements climbing up day-by-day, it is fair to assume that the energy consumption of data centers will also increase.

On one hand, companies have to address the cost factor, and on the other hand, global warming is mounting pressure on organizations to do their part and act more ‘responsibly’ towards the environment. Particularly, the data center industry is one of those industries that are viewed negatively by the supporters of green energy.

Some data centers have attempted to address such issues by adopting renewable energy. However, there are qualms about its effectiveness for smaller setups. A few companies have turned to AI as the answer to these common problems.

AI is being used for real-time monitoring to reduce energy consumption. Moreover, AI is used for parallel and distributed computing to achieve a greater level of productivity. Some organizations have identified and resolved networking issues via AI. Similarly, there are those who adjust their heating and cooling mechanisms via AI. With artificial intelligence in place, there is no need for staff members to continuously manage mundane tasks such as setting the office temperature.

Security

Security is also one of the most pressing issues for data centers. Cybercriminals have particularly set their eyes on data centers. With the amount of sensitive data being stored in them, it is not surprising that hackers try to target these centers. For instance, if a cybercriminal group succeeds in a ransomware attack on a data center, then by just locking the servers, they can bring the entire organization to its knees. Dreading the losses due to downtime and reputational damage, the company may feel it has no option but to pay a ransom to save its data center from complete destruction. Unfortunately, ransom payment does not guarantee the return of data. While organizations are trying their best to put the most effective measures in place to restrict such attacks, they have found AI to be an underrated ally in their proactive action against cyber attacks.

Adding AI to the equation offers a greater level of flexibility and sophistication in protecting data and reduces the dependence of systems on manual intervention. Unlike humans, AI can be available 24/7 and may become the wall that ultimately safeguards you from a cyber attack. For instance, Darktrace, a British company, uses AI to establish a baseline of normal network behavior, so that cyber threats are identified on the basis of activity that deviates from it.

Data Center Staffing

AI is also offering a chance for organizations to reduce their staff shortages so they can assign their qualified personnel to the relevant areas. It is expected that with AI in the mix, the standard tech support responsibilities in the center would be handed over to AI-based systems.  These responsibilities would include automation of routine and mundane tasks like the following:

  • Resolving any incoming issue.
  • Working on the help desk support.
  • Provision of services and resources.

Additionally, AI would provide an edge by capturing new symptoms, events, and scenarios for the generation of a functional knowledge base to aid the external and internal stakeholders to learn from the past issues and avoid repeating the same mistakes in future.

However, there will be times when human intervention would be necessary. In such cases, a connection can be established with senior staff members who can fulfill the required task through their years of experience.

Predictive Analytics

With enhanced outage monitoring, AI is providing a major advantage to data centers. AI systems are able to detect and predict an impending outage. They can continuously track the performance of all the servers and assess storage operations such as disk utilization.

All of this has been made possible through contemporary predictive analytics tools which do not only increase reliability but also are fairly easy to use. Probably the biggest advantage of predictive analytics is that it supervises the workload through optimization, lessens the burden from systems, and distributes the workload more evenly among all the hardware tools.

This modern outlook of data centers is widely different from conventional data center practices. Traditionally, such troubleshooting was based completely on manual assistance, research, and computation; computers were merely a tool for executing the staff’s strategies. AI, on the other hand, positions itself as an independent player which can be seen more as a professional colleague than a tool.

Final Thoughts

As the management of data centers becomes tougher and more complex with the passing time, AI has been a welcome entry in the space as an IT technology. AI has improved the overall output without any notable compromise. It remains to be seen what more advancements arrive in data centers in the near future. For the time being, AI has done a marvelous job at managing data centers.

How Does a Decision Tree Function?


In the last post, we discussed how artificial neural networks are modeled on the human brain. There are other algorithms too which have been inspired by the real world. For instance, the decision tree algorithm in machine learning is modeled on the structure of a tree. Such an algorithm is used for decision analysis. It is also frequently used in data mining to derive meaningful results. So, how exactly does it work?

To understand a decision tree, let’s take an elementary example in which we have a dataset of the passengers of a ship. It is expected that a violent storm would cause the ship to get wrecked. Now the problem at hand is to predict whether a passenger survives based on their characteristics. Their attributes (also known as features) are mainly their sex, spch (any spouse or children with them), and age.

[Figure: decision tree for the passenger survival example]

As you can see, a decision tree is drawn upside down: instead of placing the root at the bottom, we present it at the top. The italicized text which shows our condition represents an internal node, which divides the tree into edges (branches). A branch which is not divided any further is referred to as a decision (leaf).

If you analyze the above example, you can see that all the relevant relations are easily viewable, which makes feature importance clear. This approach is also called “learning a decision tree from data”. The tree in our example is a classification tree, as it is used to classify a passenger as a survivor or a fatality. The other category is the regression tree, which is not too dissimilar, except that it deals with continuous values. Decision tree algorithms are broadly referred to as CART (Classification and Regression Trees).

The growth of a decision tree depends upon its features (attributes or characteristics) and the conditions which are used to divide the tree, along with a clear rule for when to stop growing the tree. Often, a tree grows to arbitrary depth, and some trimming is required for better results.

Recursive Binary Splitting

Recursive binary splitting uses a cost function to test all the features and the split points. For instance, in the above example, all features were analyzed at the root, and the training data was divided into groups for each candidate split. Our example has 3 features, which means we have 3 candidate splits. We then compute the cost of each split, that is, how much accuracy it costs us. The least costly split is chosen, which is the sex feature in our example. The algorithm is naturally recursive because the resulting groups can be subdivided by repeating the same process. Because it always picks the lowest-cost split available at each step, it falls into the category of greedy algorithms. This also means that the most effective classifier is the root node.

Split Cost

Let’s try to understand cost functions more closely for both classification and regression. In both cases, the cost function tries to find the most homogeneous branches, that is, branches whose groups have similar responses. This makes sense, because we can then be more confident about the path a test input will follow.

Regression: sum((y - prediction)^2)

For instance, consider a real estate problem that requires the prediction of house prices. In this case, the decision tree starts splitting by analyzing all the features in the training data. For each group, the mean of the responses of its training inputs is used as the prediction for that group. The function above is applied to all the data points to generate a cost for each candidate split. In the end, the split with the lowest cost is chosen.
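The regression cost can be sketched in a few lines of Python. The house data below is made up for illustration, and the function names are hypothetical; each group predicts the mean of its own prices, and the cost of a split is the summed squared error of both groups.

```python
def sse(values):
    """Sum of squared errors when the group's mean is used as the prediction."""
    if not values:
        return 0.0
    prediction = sum(values) / len(values)
    return sum((y - prediction) ** 2 for y in values)


def split_cost(rows, feature, threshold):
    """Cost of splitting rows into feature < threshold and feature >= threshold."""
    left = [price for features, price in rows if features[feature] < threshold]
    right = [price for features, price in rows if features[feature] >= threshold]
    return sse(left) + sse(right)


# Made-up training data: (features, price in thousands).
houses = [
    ({"area": 50}, 100.0),
    ({"area": 60}, 120.0),
    ({"area": 100}, 200.0),
    ({"area": 120}, 220.0),
]

for threshold in (55, 80):
    print(threshold, split_cost(houses, "area", threshold))
```

Splitting at an area of 80 separates the cheap houses from the expensive ones, so it costs far less than splitting at 55 and would be the one chosen.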

Classification: G = sum(pk * (1 - pk))

To determine the quality of a split, the Gini score is used, which measures how mixed the response classes are in the groups created by the split. In the above equation, pk is the proportion of inputs of class k in a particular group. Maximum purity is achieved when a group contains inputs of a single class. In such a scenario, pk is either 0 or 1 and G is 0. The worst purity occurs when a node has a 50-50 split of classes in a group. In binary classification, pk and G would both be 0.5 in such a scenario.
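The Gini score for a single group is short enough to write out directly. This is a minimal sketch of the formula above; the function name and the class labels are illustrative.

```python
from collections import Counter


def gini(labels):
    """Gini score of one group: sum over classes of pk * (1 - pk)."""
    total = len(labels)
    if total == 0:
        return 0.0
    proportions = [count / total for count in Counter(labels).values()]
    return sum(pk * (1 - pk) for pk in proportions)


pure_group = ["survived"] * 4
mixed_group = ["survived", "died", "survived", "died"]

print(gini(pure_group))   # a single class gives the best purity
print(gini(mixed_group))  # a 50-50 mix gives the worst purity for two classes
```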

Putting a Stop to Split

There is a point at which the splitting of the tree must be stopped. Generally, problems have many features, which leads to a large number of splits and, in turn, a large tree. This is undesirable because such trees tend to overfit. One strategy for stopping is to define a minimum number of training inputs that each leaf must hold. For instance, in the above example, we can require at least 15 passengers to reach a decision of survival or death, and reject any leaf that holds fewer than 15 passengers. Alternatively, you can define the max depth of the model. Max depth is the length of the longest path from the root to a leaf.
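To tie recursive binary splitting and the two stopping rules together, here is a compact pure-Python sketch. It is a teaching toy, not a production implementation: the passenger rows are made up, the parameter names max_depth and min_leaf are illustrative, and weighting each group's Gini score by its size is an assumption of this sketch rather than something stated above.

```python
from collections import Counter


def gini(labels):
    total = len(labels)
    return sum((c / total) * (1 - c / total) for c in Counter(labels).values()) if total else 0.0


def best_split(rows, min_leaf):
    """Greedy step: try every feature and threshold, keep the cheapest valid split."""
    best = None
    for feature in range(len(rows[0][0])):
        for threshold in sorted({x[feature] for x, _ in rows}):
            left = [r for r in rows if r[0][feature] < threshold]
            right = [r for r in rows if r[0][feature] >= threshold]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue  # stopping rule: every leaf needs enough training inputs
            cost = (len(left) * gini([y for _, y in left])
                    + len(right) * gini([y for _, y in right])) / len(rows)
            if best is None or cost < best[0]:
                best = (cost, feature, threshold, left, right)
    return best


def build(rows, depth=0, max_depth=2, min_leaf=1):
    labels = [y for _, y in rows]
    majority = Counter(labels).most_common(1)[0][0]
    # stopping rule: maximum depth reached, or the group is already pure
    if depth >= max_depth or gini(labels) == 0.0:
        return majority
    split = best_split(rows, min_leaf)
    if split is None:
        return majority
    _, feature, threshold, left, right = split
    return {
        "feature": feature,
        "threshold": threshold,
        "left": build(left, depth + 1, max_depth, min_leaf),
        "right": build(right, depth + 1, max_depth, min_leaf),
    }


def predict(tree, x):
    while isinstance(tree, dict):
        branch = "left" if x[tree["feature"]] < tree["threshold"] else "right"
        tree = tree[branch]
    return tree


# Made-up passengers: features are (sex, age) with sex 0 = male, 1 = female.
passengers = [
    ((0, 40), "died"), ((0, 35), "died"), ((0, 60), "died"), ((0, 8), "survived"),
    ((1, 30), "survived"), ((1, 45), "survived"), ((1, 50), "survived"),
]

tree = build(passengers, max_depth=2, min_leaf=1)
print(tree)
print(predict(tree, (0, 50)), predict(tree, (1, 70)))
```

On this toy data the root splits on the sex feature, mirroring the example in the text; lowering max_depth or raising min_leaf makes the tree stop earlier and return majority-class leaves.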

Pruning is used to enhance the performance of a decision tree. In pruning, any branch with low feature importance is removed, thereby reducing the tree’s complexity and boosting its predictive strength. Pruning can start from either the leaves or the root. In the simplest approach, pruning begins at the leaves: each node is replaced by the most popular class in that leaf, and the change is kept as long as it does not reduce accuracy. This strategy is called reduced error pruning.

Final Thoughts

The above-mentioned knowledge is enough to complete your initial understanding of a decision tree. You can begin its coding by using Python’s Scikit-Learn library.

What Is AIOps?


Recently I came across a very interesting topic: AIOps.

AIOps refers to Artificial Intelligence for IT operations. It involves the use of machine learning, big data analytics, and AI tools to automate the IT infrastructure of an organization.

In larger enterprises, the applications, systems, and services generate massive data volumes. With AIOps, organizations can utilize this data for monitoring their assets and examining their IT dependence more closely.

Capabilities

Ideally, an AIOps solution provides the following functionalities.

1- Automation for Routine Procedures

AIOps helps organizations integrate automation into daily routine procedures. This automation can handle requests from users or manage non-critical notifications from the system. For instance, with AIOps, a help desk system can respond appropriately to a request from a user all by itself. Similarly, AIOps tools can assess an alert from a system and evaluate whether it requires any action, without the need for a supervising authority.

2- Detection

AIOps can detect critical issues quicker and better than any manual strategy. When a familiar malware is detected on a non-critical system, the IT experts may try to eliminate it. In the meantime, they might miss an unusual activity or process on a critical system from a newly-arrived and sophisticated threat. As a consequence, the organization suffers a huge setback.

On the other hand, AIOps can make a difference through vulnerability prioritization, i.e., it immediately notifies the responsible team about a possible intrusion on the critical system, while on the non-critical system it can respond by running an anti-malware tool.
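The prioritization described above might look like the following JavaScript sketch. The alert fields (systemCritical, knownMalware) and the action names are hypothetical; a real AIOps platform would derive them from its own inventory and threat data.

```javascript
// Hypothetical alert shape: { id, system, systemCritical, knownMalware }.
function triage(alert) {
  if (alert.systemCritical) {
    // Anything unusual on a critical system goes straight to a human.
    return { action: "notify-security-team", priority: 1 };
  }
  if (alert.knownMalware) {
    // Familiar threats on non-critical systems can be handled automatically.
    return { action: "run-anti-malware", priority: 2 };
  }
  return { action: "log-for-review", priority: 3 };
}

// Made-up alerts for illustration.
const alerts = [
  { id: "a1", system: "test-vm", systemCritical: false, knownMalware: true },
  { id: "a2", system: "payments-db", systemCritical: true, knownMalware: false },
];

// Handle the most urgent alert first.
const plan = alerts
  .map((alert) => ({ id: alert.id, ...triage(alert) }))
  .sort((a, b) => a.priority - b.priority);

console.log(plan);
```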

3- Streamlining Interactions

Before AIOps came on the scene, teams had to share and work on information through meetings or by exchanging data manually. AIOps can streamline communication and coordination between teams and data center groups by showing each IT group only the data that is “relevant” to it. For this, the AIOps platform must be designed so that it can monitor and analyze which type of data to present to which team.

Technologies

AIOps combines several technologies: aggregation, algorithms, analytics, data output, visualization, machine learning, and automation and orchestration. All of these technologies are mature and integrated. So how do they work together?

Log files, helpdesk ticketing systems, and monitoring tools provide the data for AIOps. Big data tools then aggregate the data coming out of these systems and convert it into a more useful format. Next, analytics methods take the raw data and transform it into meaningful new information. In doing so, analytics eliminates “noise” and irrelevant data. It also searches for recurring patterns that can be used to detect and flag common issues.
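To make the idea of removing noise and spotting recurring patterns more concrete, here is a small JavaScript sketch. The log lines, the rule that only ERROR and WARN lines matter, and the minCount threshold are all assumptions for illustration.

```javascript
// Made-up log lines; real input would come from log files or monitoring tools.
const logLines = [
  "INFO heartbeat ok",
  "ERROR disk full on /var",
  "DEBUG cache hit",
  "ERROR disk full on /var",
  "WARN slow query 1200ms",
  "ERROR disk full on /var",
];

function recurringIssues(lines, minCount = 2) {
  const counts = new Map();
  for (const line of lines) {
    if (!/^(ERROR|WARN)\b/.test(line)) continue; // everything else is treated as noise
    counts.set(line, (counts.get(line) || 0) + 1);
  }
  // Keep only the messages that show up repeatedly.
  return [...counts]
    .filter(([, count]) => count >= minCount)
    .map(([message, count]) => ({ message, count }));
}

const issues = recurringIssues(logLines);
console.log(issues);
```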

However, analytics cannot run without proper algorithms. Algorithms help an AIOps solution respond with the most appropriate course of action. They are configured so that the IT staff can help the platform learn about decisions pertaining to application performance.

Algorithms are at the center of machine learning. The AIOps platform establishes a baseline of normal activity and behavior, and it keeps updating that baseline, adding new algorithms where needed, as new data arrives from the organization’s infrastructure.
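One simple way to picture such a baseline is a mean and standard deviation learned from past readings. The sketch below is a deliberately simplified stand-in for what a real platform would do; the CPU figures and the threshold k are made up.

```javascript
// Learn what "normal" looks like from past observations (mean and standard deviation).
function buildBaseline(history) {
  const mean = history.reduce((sum, v) => sum + v, 0) / history.length;
  const variance = history.reduce((sum, v) => sum + (v - mean) ** 2, 0) / history.length;
  return { mean, std: Math.sqrt(variance) };
}

// Flag a reading that lies more than k standard deviations away from the mean.
function isAnomalous(reading, baseline, k = 3) {
  return Math.abs(reading - baseline.mean) > k * baseline.std;
}

const cpuHistory = [40, 42, 38, 41, 39, 40]; // made-up CPU utilization percentages
let baseline = buildBaseline(cpuHistory);

console.log(isAnomalous(41, baseline)); // close to the usual values
console.log(isAnomalous(95, baseline)); // far outside them

// As new data arrives, the baseline is rebuilt so that "normal" keeps up with the system.
cpuHistory.push(43);
baseline = buildBaseline(cpuHistory);
```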

Automation ensures that an AIOps tool is quick to perform the required action or set of actions. Automated tools are triggered to act based on the output of the machine learning and analytics tools. For instance, a machine learning tool may establish that an application requires additional storage to function. This information is passed on to an automation tool, which then performs an action such as adding more storage.

Lastly, visualization is used in AIOps to help with decision making. It generates dashboards that are easy to use and read. These dashboards contain graphs, reports, and other visual elements that simplify different types of output. As a result, management is able to remain in the loop and make well-informed decisions.

How Has AIOps Proved to Be a Breakthrough?

Before the emergence of AIOps, organizations faced difficulties because their IT personnel spent much of their time on routine and basic tasks. AIOps proved to be a breakthrough by helping organizations focus on more critical issues. As a result, such platforms have saved a great deal of time. IT personnel now attempt to train and educate AIOps platforms to become familiar with the organization’s IT infrastructure.

Afterward, the platform continues to update and evolve by making use of machine learning and algorithms, as well as the “learned” history it has accumulated over time. As a result, AIOps platforms make excellent monitoring tools that have the “rationality” to perform many useful tasks.

Moreover, AIOps platforms examine causal relationships across various services, resources, data sources, and systems. Their machine learning capabilities run robust root cause analysis. As a result, troubleshooting of frequent issues is improved.

Furthermore, AIOps helps organizations increase collaboration among all departments. With the reports from visualization, team leaders are able to understand requirements and perform their duties with a renewed sense of direction.

The Other Side of the Coin

AIOps is extremely promising, but some analysts consider it to be unrefined. They argue that an AIOps platform is only as effective as its “training”, and that creating, implementing, and administering such a platform may be too time-consuming for many organizations.

Likewise, they argue that because an AIOps platform can perform a wide variety of tasks, it requires trust from organizations. Since AIOps tools work autonomously, they have to be trained so that they can adapt to the environment of their organization, collect data, reach the most logical conclusion, and allocate actions accordingly.

What Are Indexes in MongoDB?


When a query is run in MongoDB without a suitable index, the database performs a collection scan. Every document stored in the collection has to be scanned to find the ones that match. Obviously, this is a highly wasteful tactic, as checking each document results in inefficient utilization of resources.

To address this issue, MongoDB offers a feature known as indexes. An index acts as a filter that narrows the set of documents to be scanned, so queries can be executed more efficiently. Indexes are a special type of data structure.

An index stores a small part of a collection’s data: the values of a single field or of multiple fields. The entries of an index are kept ordered by the value of the indexed field.
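To see why keeping the values in order helps, consider the following plain JavaScript sketch. It does not talk to MongoDB at all: a sorted array stands in for the index (MongoDB itself uses a B-tree structure), and the collection is generated data. It only contrasts how many entries each approach has to examine.

```javascript
// Toy collection: 1,000 documents whose "marks" values are all distinct.
const docs = Array.from({ length: 1000 }, (_, i) => ({ _id: i, marks: (i * 7) % 1000 }));

// A collection scan checks documents one by one until it finds a match.
function collectionScan(collection, field, value) {
  let examined = 0;
  for (const doc of collection) {
    examined++;
    if (doc[field] === value) return { doc, examined };
  }
  return { doc: null, examined };
}

// The "index" keeps only the field value plus the document's position, sorted by value.
function buildIndex(collection, field) {
  return collection
    .map((doc, position) => ({ key: doc[field], position }))
    .sort((a, b) => a.key - b.key);
}

// Because the entries are ordered, a binary search can skip most of them.
function indexLookup(collection, index, value) {
  let low = 0;
  let high = index.length - 1;
  let examined = 0;
  while (low <= high) {
    const mid = Math.floor((low + high) / 2);
    examined++;
    if (index[mid].key === value) return { doc: collection[index[mid].position], examined };
    if (index[mid].key < value) low = mid + 1;
    else high = mid - 1;
  }
  return { doc: null, examined };
}

const marksIndex = buildIndex(docs, "marks");
const scanned = collectionScan(docs, "marks", 999);
const indexed = indexLookup(docs, marksIndex, 999);

console.log("scan examined:", scanned.examined, "index examined:", indexed.examined);
```

Both lookups return the same document, but the ordered structure needs only a handful of comparisons where the scan may have to walk through most of the collection.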

By default, MongoDB creates an index on the _id field whenever a collection is created. This index is unique, so it is not possible to insert multiple documents that carry the same _id value. Moreover, unlike other indexes, this index cannot be dropped.

How to Create an Index?

Open your Mongo Shell and use the db.collection.createIndex() method to create an index. Its general format is as follows.

db.collection.createIndex( <key and index type specification>, <options> )

To create our own index, let’s suppose we have a collection “employee” with a field for the employee name, “ename”. We can create a descending index on it.

db.employee.createIndex( { ename: -1 } )

Types

Indexes are classified in the following categories.

  • Single Field
  • Compound Index
  • Multikey
  • Text Indexes
  • 2dsphere Indexes
  • geoHaystack Indexes (deprecated in MongoDB 4.4 and removed in MongoDB 5.0)

 

Single Field Indexes

A single-field index is the simplest index of all. As the name suggests, it applies indexing on a single field. We begin our single-field example with a collection “student”. Now, this student collection carries documents like this:

{
  "_id": ObjectId("681b13b5bc3446894d86cd342"),
  "name": "Adam",
  "marks": 400,
  "address": { state: "TX", city: "Fort Worth" }
}

To generate an index on the “marks” field, we can write the following query.

db.student.createIndex( { marks: 1 } )

We have now created an index that is sorted in ascending order. The order is determined by the value given for the field: a value of “1” defines an index that arranges its entries in ascending order, while a value of “-1” defines one in descending order. This index can now support queries that involve “marks”. Some examples are:

db.student.find( { marks: 2 } )

db.student.find( { marks: { $gt: 5 } } )

In the second query, you might have noticed “$gt”. $gt is a MongoDB operator which translates to “greater than”. Similar operators are $gte (greater than or equal to), $lt (less than), and $lte (less than or equal to). In our upcoming examples, we are going to use these operators heavily. They are used for filtering documents by specifying limits.
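If you want to get a feel for how these operators filter documents without starting a database, the following plain JavaScript sketch mimics the four of them on an in-memory array. The matchesFilter helper and the student data are invented for this example; MongoDB’s real query engine is far more capable.

```javascript
// Minimal in-memory model of the four range operators.
const comparisonOperators = {
  $gt: (value, limit) => value > limit,
  $gte: (value, limit) => value >= limit,
  $lt: (value, limit) => value < limit,
  $lte: (value, limit) => value <= limit,
};

function matchesFilter(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    if (condition === null || typeof condition !== "object") {
      return doc[field] === condition; // plain equality, e.g. { marks: 400 }
    }
    return Object.entries(condition).every(([op, limit]) => {
      if (!(op in comparisonOperators)) throw new Error(`Unsupported operator: ${op}`);
      return comparisonOperators[op](doc[field], limit);
    });
  });
}

// Made-up documents.
const students = [
  { name: "Adam", marks: 400 },
  { name: "Bella", marks: 250 },
  { name: "Chris", marks: 500 },
];

const above300 = students.filter((s) => matchesFilter(s, { marks: { $gt: 300 } }));
const from250To400 = students.filter((s) => matchesFilter(s, { marks: { $gte: 250, $lte: 400 } }));

console.log(above300.map((s) => s.name));     // [ 'Adam', 'Chris' ]
console.log(from250To400.map((s) => s.name)); // [ 'Adam', 'Bella' ]
```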

It is possible to apply indexing on the embedded documents too. This indexing requires the use of dot notation for the embedded documents. Continuing our “student” example,

{
  "_id": ObjectId("681b13b5bc3446894d86cd342"),
  "marks": 500,
  "address": { state: "Virginia", city: "Fairfax" }
}

We can apply indexing on the address.state field.

db.student.createIndex( { "address.state": 1 } )

Whenever queries involving “address.state” are employed by the users, this index would support them. For instance,

db.student.find( { "address.state": "FL" } )

db.student.find( { "address.city": "Chicago", "address.state": "IL" } )
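Dot notation simply walks into the embedded document one key at a time. The plain JavaScript sketch below imitates that lookup on in-memory data; resolvePath and the sample documents are hypothetical and only meant to show the idea.

```javascript
// Resolve a dotted path such as "address.state" against a nested document.
function resolvePath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Made-up documents.
const studentDocs = [
  { name: "Adam", address: { state: "TX", city: "Fort Worth" } },
  { name: "Dina", address: { state: "FL", city: "Miami" } },
  { name: "Evan" }, // no address at all
];

const inFlorida = studentDocs.filter((d) => resolvePath(d, "address.state") === "FL");
console.log(inFlorida.map((d) => d.name)); // [ 'Dina' ]
```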

Likewise, it is possible to build indexes on the complete embedded document.

Suppose you have a collection “users” which contains the following data.

{
  "_id": ObjectId("681d15b5be344699d86cd567"),
  "gender": "male",
  "education": { high_school: "ABC School", college: "XYZ University" }
}

If you are familiar with MongoDB, then you know that the “education” field is what we call an “embedded document”. This document contains two fields: high_school and college. Now, to apply indexing on the complete document, we can write the following.

db.users.createIndex( { education: 1 } )

This index can be used by queries like the following.

db.users.find( { education: { high_school: "ABC School", college: "XYZ University" } } )
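One detail worth keeping in mind: an equality match on a whole embedded document is an exact match, and that includes the order of the fields. A query that lists college before high_school would not match a document stored with high_school first. The plain JavaScript sketch below imitates this comparison for flat embedded documents; embeddedDocEquals is a made-up helper, not a MongoDB function.

```javascript
// Whole-document equality: same fields, same values, same order.
// Simplification: only flat documents with primitive values are handled.
function embeddedDocEquals(stored, queried) {
  const storedEntries = Object.entries(stored);
  const queriedEntries = Object.entries(queried);
  return (
    storedEntries.length === queriedEntries.length &&
    storedEntries.every(
      ([key, value], i) => queriedEntries[i][0] === key && queriedEntries[i][1] === value
    )
  );
}

const storedEducation = { high_school: "ABC School", college: "XYZ University" };

const sameOrder = embeddedDocEquals(storedEducation, {
  high_school: "ABC School",
  college: "XYZ University",
});
const swappedOrder = embeddedDocEquals(storedEducation, {
  college: "XYZ University",
  high_school: "ABC School",
});

console.log(sameOrder, swappedOrder); // true false
```

When the field order should not matter, queries usually target the individual fields with dot notation (for example “education.college”) instead of comparing the whole document.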

Compound Indexes

So far, we have only used a single field for indexing. However, MongoDB also supports the usage of multiple fields in an index. Such indexes are referred to as compound indexes. Bear in mind that there can be no more than 32 fields in compound indexes. To generate such an index, you have to follow this format where ‘f’ refers to the field name and ‘t’ refers to the index type.

db.collection.createIndex( { <f1>: <t1>, <f2>: <t2>, … } )

Suppose we have a collection “items” which stores these details.

{
  "_id": ObjectId(…),
  "name": "mouse",
  "category": ["computer", "hardware"],
  "address": "3rd Street Store",
  "quantity": 80
}

A compound index can now be applied on the “name” and “quantity” fields.

db.items.createIndex( { "name": 1, "quantity": 1 } )

Bear in mind that the order of fields in a compound index is crucial. The index first sorts its entries by the “name” field. Then, within each “name” value, it sorts the entries by “quantity”.

Compound indexes not only support queries that match on all the index fields, but also queries that match on a prefix of the index fields. This means that the index works with queries on the “name” field alone as well as queries on both “name” and “quantity”. It does not, however, support queries on “quantity” alone. For instance,

db.items.find( { name: "mouse" } )

db.items.find( { name: "mouse", quantity: { $gt: 10 } } )
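The prefix rule can be written down as a tiny function. The JavaScript sketch below is a simplified model, not how MongoDB’s query planner is actually implemented: given the ordered keys of an index and the fields a query filters on, it returns the leading index keys the query can make use of. An empty result means the index does not help.

```javascript
// indexKeys: field names in the order they were given to createIndex.
// queryFields: field names the query filters on.
function usablePrefix(indexKeys, queryFields) {
  const prefix = [];
  for (const key of indexKeys) {
    if (!queryFields.includes(key)) break; // a gap ends the usable prefix
    prefix.push(key);
  }
  return prefix;
}

const itemsIndexKeys = ["name", "quantity"];

console.log(usablePrefix(itemsIndexKeys, ["name"]));             // [ 'name' ]
console.log(usablePrefix(itemsIndexKeys, ["name", "quantity"])); // [ 'name', 'quantity' ]
console.log(usablePrefix(itemsIndexKeys, ["quantity"]));         // []
```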

So far, we have specified an ascending or descending order whenever we created an index. For single-field indexes the direction rarely matters, because MongoDB can traverse the index in either direction. For compound indexes, however, the direction of each key determines which sort operations the index can support. For example, suppose we have a collection “records” whose documents contain the fields “item” and “date”, and we create the following index.

db.records.createIndex( { "item": 1, "date": -1 } )

This index supports a sort that arranges “item” in ascending order and then applies a descending order on the “date” values:

db.records.find().sort( { item: 1, date: -1 } )

It also supports the exact reverse, i.e., a descending order on “item” and an ascending order on “date”:

db.records.find().sort( { item: -1, date: 1 } )

However, the point to note is that this index cannot support a sort that applies ascending order on both fields; MongoDB has to sort such results in memory instead. An example is the following.

db.records.find().sort( { item: 1, date: 1 } )
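The rule behind these sort examples is that a compound index can serve a sort whose directions either all match the index or are all reversed. The JavaScript sketch below checks exactly that. It is a simplified model under one stated assumption: only sorts that name every index key are considered, although MongoDB can also use an index for sorts on a prefix of its keys.

```javascript
// Returns true when the sort follows the index key order and its directions
// either all match the index or are all the exact opposite.
function indexSupportsSort(indexSpec, sortSpec) {
  const indexEntries = Object.entries(indexSpec);
  const sortEntries = Object.entries(sortSpec);

  // Simplification: only full-key sorts are considered here.
  if (sortEntries.length !== indexEntries.length) return false;

  const sameFields = sortEntries.every(([field], i) => indexEntries[i][0] === field);
  if (!sameFields) return false;

  const forward = sortEntries.every(([, dir], i) => dir === indexEntries[i][1]);
  const backward = sortEntries.every(([, dir], i) => dir === -indexEntries[i][1]);
  return forward || backward;
}

const recordsIndex = { item: 1, date: -1 };

console.log(indexSupportsSort(recordsIndex, { item: 1, date: -1 })); // true
console.log(indexSupportsSort(recordsIndex, { item: -1, date: 1 })); // true
console.log(indexSupportsSort(recordsIndex, { item: 1, date: 1 }));  // false
```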