How Has Google Improved Its Data Center Management Through Artificial Intelligence


Historically, the staff at data centers adjusted the settings of the cooling systems to save energy costs. Times have changed, and this is the sweet age of AI where intelligent systems are on guard 24/7 and automatically adjust these settings to save costs.

Last year, a tornado watch prompted Google’s AI system to take control of its cooling plant in a data center and it modified the system settings. The staff at Google was initially perplexed because the changes did not make sense at the time. However, after a closer inspection, the AI system was found to be taking a course of action that reduced the energy consumption.

The increase and decrease in temperature, humidity levels, and atmospheric pressure force the change in weather conditions, and they can stir a storm. This weather data is used by Google’s AI to adjust the cooling system accordingly.

Joe Kava, Google’s Vice President of data centers, revealed Google’s use of AI for data centers back in 2014. At that time, Kava explained that the company designed a neural network to assess the data which is collected from its data centers and suggested a few strategies to enhance its processing. These suggestions were later utilized as a recommendation engine.

Kava explained that they had a single solution which would provide them with recommendations and suggestions. Afterward, the qualified staff at Google would begin modifying the pumps, heat exchangers, and chillers settings according to the results of AI-based recommendations. In the last four years, Google’s AI usage has evolved beyond Kava’s proposed vision.

Presently, Google is adopting a more aggressive approach. Instead of only dishing out recommendations to the human operators could act on them, the new system would itself go onto adjust the cooling settings. Jim Gao, a data engineer at Google, said that the previous system saved 20 percent energy costs and that the newer update would save up to 40 percent in energy consumption.

Little Adjustments

The tornado watch is only a single real-world instance of Google’s powerful AI and its impact on energy savings to an extent which was impossible with manual processes. While at first glance, the minor adjustments done by the AI-enabled system might not seem enough. However, the sum of each savings results in a huge total.

Kava explains that the detailing performed by the AI systems makes it matchless. For instance, if the temperature in the surroundings of the data center goes from 60 degrees Fahrenheit to 64 degree Fahrenheit while the wet-bulb temperature is unaffected, then an individual from the data center staff would not go think much about updating the settings of the cooling system. However, the AI-based system is not so negligent. Whether there is a difference of 4 degrees or 40 degrees, it keeps on going.

One interesting observation regarding the system was its noticeably improved performance during the launch of new data centers. Generally, new data centers are not efficient as they are unable to get the most of the available capacity.

From Semi to Full Automation

The transfer of critical tasks of the infrastructure to the AI system has its own implications and considerations.

With the increase of data and runtime, the AI system becomes more and more powerful and therefore, management also starts to have faith in the system, enough to give it some control. Kava explained that after some experimentation and results, slowly and gradually the semi-automated tools and equipment are replaced by fully automated tools and equipment.

Uniformity is the key to Google’s AI exploits; it is not possible to implement AI at such a massive scale without uniformity. All the data centers are designed to be distinct such that a single AI system is not possible to be integrated across all of them at the same time.

The cooling system of all the data centers are constructed for maximum optimization according to their geographical locations. Google has tasked its data engineering team to continuously look for any possible techniques for energy savings.

Additionally, ML-based models are trained according to their sites. The models have to be programmed to follow that site’s architecture. This process takes some time. However, Google is positive that this consumption of time would result in better results in the future.

The Fear of Automation

One major discussion point with this rapid AI automation and similar AI-based ventures is the future of “humans” or the replacement of the humans. Are the data center engineers from Google going to lose their jobs? This question contains one of mankind’s biggest fears regarding AI. As AI progresses, this uncertainty has crept into the minds of workers. However, Kava is not worried. Kava stated that Google still has staff at its disposal at data centers that is responsible for maintenance. While AI may have replaced some of their “chores”, the staff still has to perform corrective repairs and preventative maintenance.

Kava also shed some light on some of AI’s weaknesses. For instance, he explained that whenever the AI system finds itself in the midst of uncharted territory, it struggles to choose the best course of action. Therefore, it is unable to mimic the brilliance of humans in making astute observations. Kava concluded that it is recommended to use AI for cooling and other data center related tasks, though he cautioned that there must be some “human” presence to ensure that nothing goes amiss.

Final Thoughts

Google’s vision, planning, and execution of AI in its data centers are promising for other industries too. Gao’s model is believed to be applicable to manufacturing plants that also have similar setups like cooling and heating systems. Similarly, a chemical plant could also take advantage of AI and likewise, a petroleum refinery may use AI in the same way. The actual realization is that, in the end, such AI-based systems can be adopted by other companies to enhance their systems.

Advertisement

The Growing Role of Artificial Intelligence in Data Centers


According to Infosys, more than 75 percent of IT experts view artificial intelligence as a permanent strategic priority which can assist them in innovating their organization’s structure. Infosys’ survey receives credibility from the fact that the AI systems are expected to receive investments worth $57 billion by the year 2021. That being said, the implementation of AI is a complex task which requires considerable time and decision-making to go smoothly. Today, AI has initiated the transformation of all the global industries.

One of such industries is the data center industry where AI is making its mark slowly and gradually. Data centers power the operations of organizations all around the world. The data volumes are increasing daily, putting more and more strain on the hardware and software setups in the organization. Consequently, managers are forced to introduce new servers and hardware equipment so their IT infrastructure becomes powerful enough to store and process data without any issue. Currently, most of the centers are not able to maximize their output because they use legacy systems. So how is AI transforming data centers?

Energy Consumption

Energy consumption remains one of the most critical and dire issues in data centers. Bear in mind that as of now, about 6 percent of the world’s electricity is used by data centers. With the computing requirements climbing up day-by-day, it is fair to assume that the energy consumption of data centers will also increase.

On one hand, companies have to address the cost factor, and on the other hand, global warming is mounting pressure on organizations to do their part and act more ‘responsibly’ towards the environment. Particularly, the data center industry is one of those industries that are viewed negatively by the supporters of green energy.

Some data centers have attempted to address such issues by accepting renewable energy. However, there are qualms about its ineffectiveness for smaller setups. There are few companies that have resorted to AI as the answer to their common problems.

AI is being used for real-time monitoring to reduce energy consumption. Moreover, AI is used for parallel and distributed computing to achieve a greater level of productivity. Some organizations have identified and resolved networking troubleshooting via AI. Similarly, there are those who adjust their heating and cooling mechanisms via AI. Due to the widespread use of artificial intelligence, there is no need for staff members to continuously manage mundane tasks such as setting the office temperature.

Security

Security is also one of the most pressing issues for data centers. Cybercriminals have particularly set their eyes on the data centers. With the amount of sensitive data being stored in these data centers, it is not surprising that hackers try to target these centers. For instance, if a cybercriminal group succeeds in a ransomware attack on a data center then by just locking the servers, they bring the entire organization down on its knees. Dreading the losses due to downtime and reputational damage, the company has no option but to pay a ransom to save their data center from complete destruction. Unfortunately, ransom payment does not guarantee the return of data. While organizations are trying their best to infuse the most effective measures to restrict such attacks, they have found AI as an underrated ally in their proactive action against cyber attacks.

AI’s addition in the equation offers a greater level of flexibility and sophistication to protect the data and minimize the dependence of systems on manual intervention. Unlike humans, AI can be available 24/7 and may become the wall that ultimately safeguards you from a cyber attack. For instance, Darktrace—a British organization—leveraged AI to specify a normal network behavior where cyber threats are assessed and identified on the basis of a deviated activity.

Data Center Staffing

AI is also offering a chance for organizations to reduce their staff shortages so they can assign their qualified personnel to the relevant areas. It is expected that with AI in the mix, the standard tech support responsibilities in the center would be handed over to AI-based systems.  These responsibilities would include automation of routine and mundane tasks like the following:

  • Resolving any incoming issue.
  • Working on the help desk support.
  • Provision of services and resources.

Additionally, AI would provide an edge by capturing new symptoms, events, and scenarios for the generation of a functional knowledge base to aid the external and internal stakeholders to learn from the past issues and avoid repeating the same mistakes in future.

However, there will be times when human intervention would be necessary. In such cases, a connection can be established with senior staff members who can fulfill the required task through their years of experience.

Predictive Analytics

With enhanced outage monitoring, AI is providing a major advantage to data centers. AI systems are able to detect and predict any incoming data outage. They can continuously track the performance of all the servers and assess the storage operations like the utilization of disk.

All of this has been made possible through contemporary predictive analytics tools which do not only increase reliability but also are fairly easy to use. Probably the biggest advantage of predictive analytics is that it supervises the workload through optimization, lessens the burden from systems, and distributes the workload more evenly among all the hardware tools.

This modern outlook of data centers is widely different from the conventional data center practices. Traditionally, such troubleshooting was based completely on manual assistance, research, and computation—computers were merely a tool to execute and command their strategies. AI, on the other hand, positions itself as an independent player which can be seen more as a professional colleague rather than a tool.

Final Thoughts

As the management of data centers becomes tougher and more complex with the passing time, AI has been a welcome entry in the space as an IT technology. AI has improved the overall output without any notable compromise. It remains to be seen what more advancements arrive in data centers in the near future. For the time being, AI has done a marvelous job at managing data centers.

How Does a Decision Tree Function?


In the last post, we discussed how artificial neural networks are modeled on a human brain. There are other algorithms too which have been inspired by the real world. For instance, we have the decision tree algorithm in machine learning which is founded on the basis of a tree. Such an algorithm is used for decision analysis. The algorithm is also frequently used in data mining to derive meaningful results. So, how exactly does it work?

To understand a decision tree, let’s suppose an elementary example in which we have a dataset of passengers of a ship. It is expected that a violent storm would cause the ship to get wrecked. Now the problem at hand is to predict the survival rate of a passenger based on their characteristics. Their attributes (also known as features) are mainly their age, spch (any spouse or children with them), and age.

dtree

As you can see, a decision tree is visually represented in an upside-down approach where instead of placing the root at the top, we present it at the top. The italicized text which shows our condition represents the internal node which divides the three into edges (branches). The branch which is not divided any further is referred to as the decision (leaf).

If you analyze the above example, then you can recognize the fact that all the relevant relations are easily viewable, thereby making for strong feature importance. This approach is also called a “learning decision tree from data”. The tree in our drawn example is categorized under a classification tree as it is used to classify the survival rate or fatality rate of a passenger tree. The other category is known as regression tree which is not too dissimilar, except for the fact that they deal with continuous values. The decision tree algorithms are broadly depicted as CAT— Classification and Regression Trees.

The growth of a decision tree depends upon its features (attributes or characteristics) and the conditions which are used to divide the tree with a clear intent about the stopping point of the three. Often, the growth of tree exceeds to arbitrary levels where some trimming is required for better results.

Recursive Binary Splitting

Recursive binary splitting uses a cost function to test all the features and the split points. For instance, in the above example from the root, all features were analyzed after which groups were formed from the divisions of the training data. Our example has 3 examples which mean we require 3 splits. Subsequently, we are going to compute the cost of each split in terms of accuracy. When the least costly split is discovered, which refers to the sex feature in our example then the feature is chosen. This approach of the algorithm is naturally recursive because more groups can undergo subdivisions by repeating the same process. Therefore, the algorithm falls into the category of greedy algorithms. This also means that the most effective classifier is the root node.

Split Cost

Let’s try to understand cost functions more closely while working with classification and regression. Cost functions always attempt to identify the branches which exhibit similarity. Therefore, it is certain that any input which is test data is bound to adhere to the specific path.

Regression: sum(y-prediction)^2

For instance, consider the real estate industry a problem requires the prediction of house prices. In this case, the decision tree initiates the splitting processing and analyzes all the features from the training data. It calculates the input of training data to generate mean for responses which are treated as a prediction for their respective groups. The function is performed for all the data points while a cost is generated for the candidate splits. In the end, the split which consumed the smallest cost is chosen.

Classification: G = sum(pk * (1 — pk))

To determine the quality of a split, the gini score is used which assesses the mixing of the response classes in the split’s groups. In the above equation, pk refers to the proportion in which a particular group has similar class inputs. Maximum purity of a class is achieved when it established that a group encompasses the same class’ inputs. In such a scenario the value of pk maybe either 0 or 1 while G remains 0. The worst purity is established when a node gets 50-50 split for a group’s classes. In binary classification, the values of pk and G would be 0.5 each in such a scenario.

Putting a Stop to Split

There is a point at which the split of the tree must be stopped. Generally, problems have several features which means that the resulting split is also huge, thereby creating a large tree. This is an undesired scenario because such trees raise over-fitting issues. One strategy for stopping a split is to define the lowest number for training inputs which are to be assigned for all the leaves. For instance, in the above example, we can take 15 passengers to reach a consensus or decision for survival or death whereas any leaf which is bombarded with less than 15 passengers is duly rejected. Conversely, you can also define the max depth for the model. Max depth is the longest path’s total length which exists between a root and a tree.

Pruning is used to enhance the performance of a decision tree. In pruning, any branch with low or weak feature importance is eliminated, thereby minimizing the tree’s complexity and boosting its predictive strength. Pruning can either initiate from the leaves or the root. In simpler scenarios, pruning begins from the leaves where it eliminates nodes that have the most popular class of that leaf unless they are not violating accuracy. This strategy is also called as reduced error pruning.

Final Thoughts

The above-mentioned knowledge is enough to complete your initial understanding of a decision tree. You can begin its coding by using Python’s Scikit-Learn library.

How Is Machine Learning Assisting Organizations to Tackle Sophisticated Cyberattacks?


Previously, cybercriminals were limited in their approach. With the passage of time, they evolved and firmed their grasp on newer technologies. As a result, they were able to initiate highly sophisticated campaigns against businesses and individuals alike. One such example is of the attack on LapCorps—one of the prominent names in the healthcare industry in the USA.

Over the past few years, the trend has worsened as cybercriminals are directly challenging governments through attacks in cities and town governments. For instance, just a few months ago, the American cities of Atlanta and Baltimore faced city-wide cyberattacks that halted their public services. What’s more worrisome is the fact that authorities have discovered that some of the cyber attacks were backed by other countries, thereby changing the face of modern-day warfare to cyber warfare.

In such challenging times in the cybersecurity industry, the recent advancements in machine learning have made it highly useful against cyber attacks. The entire purpose of machine learning is to “learn” from the past and update itself with the passage of time. This vision is perfectly suited to address cyber attacks where machine learning can learn from the historical data of cyber attacks like the information of their victims, their target industries, their patterns, and other related information and can then use it to prevent any future attacks while evolving at the same time. Following are some of the cases where machine learning has been pretty impressive against some major threats.

Classification

Traditionally, burglars and robbers used to analyze and research targets and carry out crimes accordingly. Today, the situation is the same but the battleground is different as criminals have transformed into cybercriminals. These cybercriminals target specific businesses or persons to infect their servers with a technique called spear phishing.

In order to combat these cyber attacks, several phishing detection solutions have been released albeit with limited success because they do not fare well on the precision and quickness of their actions against such infections. As a consequence, users are left alone to fight off cyber attacks.

Machine learning is providing a breakthrough by using classification to assess recurring hacking patterns and decoding the encrypted emails of the senders. For analysis, ML-based models are trained to pinpoint any anomaly in the punctuation, email headers, body-data, and other relevant metrics. The purpose of these models is to identify whether or not an email is filled with a malicious phishing threat or not.

Traversal Detection Algorithms

Cybercriminals are increasingly keeping an eye on digital users like which websites do they use the most as well as the network of such websites. For instance, consider a restaurant business. As all of the customers order their food on the website of the restaurant, hackers exploit such websites, gain access to private customer data such as credit card details, and misuse the credentials of the visitors. This type of attack is known as a watering hole.

In these types of attacks, machine learning (ML) can be a game-changer by improving the traditional web security. For instance, it can determine if users are going to be forwarded to a dangerous website’s link through the destination path’s traversal. To attain this goal, traversal detection algorithms are integrated in ML. Likewise; ML can look for any sudden or unusual redirecting from a web-page on the host server.

Deep Learning

Ransomware is a type of cyberthreat that paralyzes and effectively locks the data of its victim. In order to provide access to this data, cybercriminals ask for ransom in exchange for data. The data is encrypted through cryptographic algorithms which generate an encryption key and sends it to the command and control center of the cybercriminals.

In such scenarios, a division of machine learning called deep learning is utilized. Deep learning is used to recognize any fresh ransomware threat. Datasets are trained for analyzing the common ransomware behaviors to predict any upcoming ransomware attack.

To make the system learn, a huge amount of ransomware files along with a bigger amount of non-malicious files are needed for training of the model. ML-based algorithms search and identify the major features from the dataset. These attributes are then subdivided in order to initiate the training of the model. Afterward, whenever a ransomware strain attempts to infect a system, the ML tool runs it against the trained model and computes a set of actions to respond to the attack, thereby saving the computer from being locked.

Remote Attacks

When a single computer or multiple computers are targeted by a cybercriminal, it is known as a remote attack. Such a hacker searches for loopholes in the network or the machine to enter a system. Usually, such attacks are carried out to copy sensitive data or completely ravage a network through a malware infection.

Remote attacks can be caused from a DDoS attack. In such types of attack, the server is damaged by repeatedly flooding it with fake requests. Consequently, as the servers are frozen, the cybercriminals make their move.

With machine learning algorithms, these attacks can be thwarted by a thorough analysis of the system behavior and pinpointing of any unusual instances which does not make sense according to the standard network activities. ML algorithms can be empowered to monitor and detect a malicious payload before it is too late.

Webshell

Webshell is a malware threat which facilitates a hacker in accessing and changing the settings of a website from the server’s web root directory. Hence, the cybercriminal has his/her hands on the complete database.

For e-commerce websites, cybercriminals can even get their hands on financial details like credit card data which can be exploited in a wide range of crimes. This is the major reason that webshell is mostly used against e-commerce websites.

By using machine learning, the figures and data of shopping carts can be analyzed and learnt by ML-based models to differ between malicious actions and standard actions. Malicious files can be fed to ML in order to enhance the training and capability of the model. This training then assists ML-based systems to pick webshells and quarantine them before they can perform harm the system.