What Are the Components of a Sharded Cluster in MongoDB?


A sharded cluster in MongoDB is composed of 3 elements: shards, config servers, and mongos. These are the following.

Shards

Data from a sharded cluster is encompassed in a subset which is stored in a shard. All shards grouped together to maintain the data of the entire cluster. For deployment, you have to configure a shard as a replica set in order to achieve high availability and redundancy. Shards are used for maintenance and administrative operations.

Primary Shard

All the databases have their own primary shard. Primary shards reside in a sharded cluster’s database. They store all of those collections which are not sharded. Bear in mind, that there is no link between the primary shard and the primary replica member of the replica set, hence do not be confused by the similarity in their names.

A primary shard is chosen by the mongos during the generation of a new database. To choose a shard, mongos picks the one which contains minimum data.

It is possible to change the primary shard via a command, “movePrimary”. However, do not change your primary member so casually. The change of a primary shard is a time-consuming procedure. During the primary shard migration, you cannot use collections of that shard’s database. Likewise, cluster operations can be disrupted, and the extent of this disturbance is reliant on the data which is currently migrating.

To check a general view of the cluster in terms of sharding, you can use the sh.status method via the mongo shell. The results generated by this method also specify the primary shard of the database. Other useful information includes information about how the chunks are distributed among the shards.

Config Servers

The sharded cluster’s metadata is stored in the config servers. This metadata can include the organization of the components in the cluster along with the states of these components and data. Information about all the shards and their chunks are maintained in the config servers.

This type of data is then used for caching by the mongos instances after which it is used for routing associated with read and writes operations with respect to the appropriate shards. When new metadata is updated, then the cache is also updated by the mongos.

It must be noted that the configuration of “authentication” in MongoDB like internal authentication and RBAC (role-based access control) is also stored in these config servers. Additionally, MongoDB utilizes them for the management of distributed locks.

While it is possible for a single config server to be used with all the sharded clusters, such practice is not recommended. If you have multiple sharded clusters, then use a separate config server for all of them.

Config Servers and Read/Write Operations

Write Operations

If you are a MongoDB user, then you must be familiar with the admin and config databases in MongoDB. Both of these databases are maintained in the config servers. Collections associated with the authorization, authentication, and system collections are stored by the “admin” database. On the other hand, the metadata of the sharded cluster is stored in the “config” database.

When the metadata is modified like when a chunk is split or a chunk is migrated, then MongoDB directs write operations on the config DB. During these write operations, MongoDB uses “majority” for the write concern.

However, as a developer, you should refrain from writing to the config DB by yourself in the midst of maintenance or standard operations.

Read Operations

The admin database is used for read operations by the MongoDB. These reads are associated with authorization, authentication, and internal operations.

If the metadata is modified like when a chunk is being migrated or mongos is initiated, then read operations are processed to the config database by the MongoDB. MongoDB uses the “majority” read concern during these read operations. Moreover, it is not only the MongoB which reads from config servers. Shards also require them for read operations associated with the metadata of the chunks.

Mongos

The mongos instances in MongoDB are responsible for shard related write operations while they are also utilized for the routing of queries. For the sharded cluster, mongos is the only available interface which offers the application perspective. Keep this in mind that there is no direct communication between the shards and applications.

Mongos monitors the contents of a shard via metadata caching with the help of the config servers. The metadata provided by the config servers is then used by the MongoDB for routing operations between clients and applications with the instances of the mongod. Mongos instances lack a persistent state. This is helpful because it assists in the lowest possible consumption of the available resources.

Usually, mongos instances are run on a system which also houses the applications servers. However, if your situation demands, then you can also use them on any other dedicated resources like shards.

Routing and Results

While routing a query for a cluster, a mongos instance assesses all the shards through a list and identifies which shard needs the query. It then forms a cursor for all of these specific shards.

Afterward, the data in these shards is merged by the mongos instance which is then displayed in the result document. There are some modifiers for queries like sorting which maybe required to be executed on a shard. Subsequently, the retrieval of the results is carried out by the mongos. To manage the query modifiers, mongos does the following.

  • In the case of un-sorted query results, mongos applies a “round-robin” strategy for the generation of results from the shards.
  • In the scenario in which the result size is restricted because of the limit() method, then shards receive this information from the mongos. Afterward, they re-implement limit on the result and then send it to the client.

If the skip() method is used in the query, then like the previous case, it is not possible for mongos to forward the information. Instead, it searches the shards and fetches the unskipped results after which the specified skip limit is processed during the arrangement of the entire result.

Advertisements

What Are Replica Set Members in MongoDB?


Among the several database concepts out there, one of the most important ones is replication. Replication involves data copying between multiple systems so each user has the same type of information. This allows for data availability which in turn improves the performance of the application. Likewise, it can also help in backups, for instance when one of the systems is damaged by a cyberattack. In MongoDB, replication is achieved with the help of grouped mongod processes known as a replica set.

What Is a Replica Set?

A replica set is an amalgamation of multiple mongod processes which are combined together. They are the key to offering redundancy in MongoDB along with ensuring strong availability of data. Members in the replica set are classified into three categories.

  1. Primary member.
  2. Secondary member.
  3. Arbiter.

Primary Member

There can only be one primary member in a single replica set. It is a sort of “leader” among all the other members. The major difference between a primary member and other members is that a primary member gets write operations. This means that when MongoDB has to process writes, then it forwards them to the primary member. These writes are then recorded in the operation log (oplog) of the primary member. Afterward, the secondary members look up in the oplog and use that recorded data to implement changes in their stored data.

In the case of read operations, all the members of the replica set can process them but the read operation is first directed to the primary member due to default settings in the MongoDB.

All members in a replica set acknowledge their availability through a “heartbeat”. If the heartbeat is missing for the specified time limit (usually defined in seconds), then it means that the member is disconnected. In replica sets, there are some cases when the primary member gets unavailable due to multiple issues, for instance, the data center in which the primary member resides is disconnected due to the power outage.

Since replication cannot operate without a primary member, an “election” is held to appoint a primary member from the secondary members.

Secondary Member

A secondary member is dependent on the primary member to maintain its data. For replication, it uses the oplog of the primary member asynchronously. Unlike primary members, a replica set can have multiple secondary members—the more the merrier. However, as explained before, write operations are only conducted by the primary member and thus, a secondary member is not allowed to process them.

If a secondary member is allowed to vote, then it is eligible to trigger an election and take part as a candidate to become the new primary member.

A secondary member can also be customized to achieve the following.

  • Made ineligible to become the primary member. This is done so it can be utilized in a secondary data center where it is valuable as a standby option.
  • Block the contact of applications from the secondary member such that they cannot “read” through it. This can ensure that they work with special applications which necessitate detachment from the standard traffic.
  • You can modify a secondary member to act as a snapshot for historical purposes. This is a backup strategy which can aid as a contingency plan. For example, when a user deletes a database by mistake, then the snapshot can be utilized.

Arbiter

An arbiter is distinct due to the fact that it does not store its own data copy. Likewise, it is ineligible for the primary member position. So, why exactly are they used? Arbiters provide efficiency in the elections; they can vote once. They are added so they can vote and break the possibility of scenarios where it is a “tie” between members.

For example, there are 5 members in a replica set. If the primary member gets unavailable, then an election is held where two of the secondary members bag two votes each, so it is not possible for either of them to become the primary member. Therefore, to break such standoffs, arbiters are added so their vote can complete the election and select a primary member.

It is also possible to add a secondary member in the replica set to improve the election but a new secondary member is heavy as it stores and maintains data. Since arbiters are not constrained by such overheads; therefore they are the most cost-effective solution.

It is recommended, that arbiters should never reside in sites which are responsible to host both the primary and secondary members.

If the configuration option, authorization, is enabled in the arbiter, then it can also swap credentials with members of the replica set where authentication is implemented using the “keyfiles”. In such scenarios, MongoDB applies encryption on the entire authentication procedure.

The use of cryptographic techniques ensures that this authentication is secure from any exploitation. However, since arbiters are unable to store or maintain data, this means that the internal table—which has information of users/roles in authentication—can not be owned by the arbiters. Therefore, the localhost exception is used with the arbiter for authentication purposes.

Do note that in the recent MongoDB versions (from MongoDB 3.6 onwards), if you upgrade the replica set from an older version, then its priority value is changed from 1 to 0.

What Is the Ideal Replica Set?

There is a limit on the number of members in the replica set. At most, a replica set can contain 50 members. However, the number of voting members must not cross the 7-member limit. To configure voting and allow a member to vote, the members[n]. votes setting of a replica set must be 1.

Ideally, a replica set should have at least three members which can bear data: 1 primary member and 2 secondary members. It is also possible to use a three-member replica set where with one primary, secondary, and arbiter but at the expense of redundancy.

How to Work with Data Modeling in MongoDB with an example


Were you assigned to work on the MEAN Stack? Did your organization choose MongoDB for data storage? As a beginner in MongoDB, it is important to familiarize yourself with data modeling in MongoDB.

One of the major considerations for data modeling in MongoDB is to assess the DB engine’s performance, balance the requirements of the application, and think about the retrieval patterns. As a beginner in MongoDB, think about how your application works with queries and updates and processes data.

MongoDB Schema

The schema of MongoDB’s NoSQL is considerably different from the relational and SQL databases. In the latter, you have to design and specify the schema of a table before you can begin with the insert operations to populate the table. There is no such thing in MongoDB.

The collections used by MongoDB operate on a different strategy which means that it is not necessary that two documents (similar to rows in SQL databases) can have the same schema. As a result, it is not mandatory for a document in a collection to adhere to the same data types and fields.

If you must create fields in the document or you have to modify the current fields, then you have to update a new structure for the document.

How to Improve Data Modeling

During data modeling in MongoDB, you must analyze your data and queries and consider the following tips.

Capped Collections

Think about how your application is going to take advantage of the database. For instance, if your application is going to use too many insert operations, then you can benefit from the use of capped collections.

Manage Growing Documents

Write operations increase data exponentially like when an array is updated with new fields, the data increases quickly. It can be a good practice to keep a track of the growth of your documents for data modeling.

Assessing Atomicity

On the document level, the operations in MongoDB are strictly atomic. A single write operation can only change the contents of a single document. There are some write operations that can modify multiple documents, but behind the scenes, they only process one document at a time. Therefore, you have to ensure that you are able to factor in accurate atomic dependency according to your requirements.

The use of embedded data models is often recommended to utilize atomic operations. If you use references, then your application is forced in working with different read/write operations.

When to Use Sharding?

Sharding is an excellent option for horizontal scaling. It can be advantageous when you have to deploy datasets in a large amount of data and where there is a major need for read and write operations. Sharding helps in categorizing database collections. This helps in the efficient utilization for the documents of the collection.

To manage and distribute data in MongoDB, you are required to create a shared key. Shard key has to be carefully selected or it may have an adverse impact on the application’s performance. Likewise, the right shard key can be used to prevent a query’s isolation. Other advantages include a notable uplift in the write capacity. It is imperative that you take your time to select a field for the position of shard key.

When to Use Indexes?

The first option to improve the query performance in MongoDB is an index. To consider the use of an index, go through all your queries. Look for those fields which are repeated the most often. Make a list of these queries and use them for your indexes. As a result, you can begin to notice a considerable improvement in the performance of your queries. Bear in mind, that indexes consume space both in the RAM and hard disk. Hence, you have to keep these factors in mind while creating indexes or your strategy can backfire instead.

Limit the Number of Collections

Sometimes, it is necessary to use different collections based on application requirements. However, at times, you may have used two collections when it may have been possible to do your work through a single one. Since each collection comes with its own overhead, hence you cannot overuse them.

Optimize Small Document in Collections

In case, you have single or multiple collections where you can notice a large number of small documents, then you can achieve a greater degree of performance by using the embedded data model. To increase a small document into a larger one via roll-up, you can try to see if it is possible to design a logical relationship between them during grouping.

Each document in MongoDB comes up with its own overhead. The overhead for a single document may not be much but multiple documents can make matters worse for your application. To optimize your collections, you can use these techniques.

  • If you have worked with MongoDB then you would have noticed that it automatically creates an “_id” field for each document and adds a unique 12-byte “ObjectId”. MongoDB indexes the “_id” field. However, this indexing may waste storage. To optimize your collection, modify “_id” field’s value when you create a new document. What this does is that it helps to save a value which could have been assigned to another document. While there is no explicit restriction to use a value for the “_id” field but do not forget the fact that it is used as a primary key in MongoDB. Hence, your value must be unique.
  • MongoDB documents store the name of all fields which does not bode well for small documents. In small documents, a large portion of their size can be dictated due to the number of their fields. Therefore, what you can do is that make sure that you use small and relevant names for fields. For instance, if there is the following field.

father_name: “Martin”

Then you can change it by writing the following.

f_name : “Martin”

At first glance, you may have only reduced 5 letters but for the application, you have decreased a significant number of bytes which are used to represent the field. When you will do the same for all queries then it can make a noticeable difference.

Let’s see how this works with an example 

The major choice which you have to think around while working for data modeling in MongoDB is what does your DOCUMENT structure entail? You have to decide what the relationships which represent your data are like whether you are dealing with a one-to-one relationship or a one-to-many relationship. In order to do this, you have two strategies.

Embedded Data Models

MongoDB allows you to “embed” similar data in the same document. The alternative term for embedded data models is de-normalized models. Consider the following format for embedded data models. In this example, the document begins with two standard fields “_id” and “name” to represent the details of a user. However, for contact information and residence, one field is not enough to store data as there are more than one attributes for data. Therefore, you can use the embedded data model to add a document within a document. For “contact_information” curly brackets are used to change the field to a document and add “email_address” and “office_phone”. A similar technique is applied to the residence of the user.

{

_id: “abc”,

name: “Bruce Wayne”,

contact_information: {

email_address: “bruce@ittechbook.com”,

office_phone: ” (123) 456–7890″,

},

residence: {

country: “USA”,

city: “Bothell”,

}

 

}

The strategy to store similar types of data in the same document is useful for a reason; they limit the number of queries to execute standard operations. However, the questions which might be going around in your mind is, when is the correct time to use an embedded data model? There are two scenarios which can be marked as appropriate for the use of embedded data models.

Firstly, whenever you find a one-to-many relationship in an entity, then it would be wise to infuse embedding. Secondly, whenever you identify an entity with “contains” relationship, then it is also a good time to use this data model. In embedded documents, you can access information by using the dot notation.

One-to-Many Relationship with Embedding

One-to-many is a common type of database relationship where a collection A can match multiple documents in collection B but the latter can only match for one document in A.

Consider the following example to see when de-normalized data models hold an advantage over normalized data models. To select one, you have to consider the growth of documents, access frequency, and similar factors.

In this example, reference is using three examples.

{

_id: “barney”,

name: “Barney Peters”

}

{

user_id: “Barney”,

street: “824 XYZ”,

city: “Baltimore”,

state: “MD”,

zip: “xxxxx”

}

{

user_id: “Barney”,

street: “454 XYZ”,

city: “Boston”,

state: “MA”,

zip: “xxxx”

}

If you can see closely, then it is obvious that the application has to continuously retrieve data for address along with other field names which are not required. As a result, several queries are wasted.

On the other hand, embedding your address field can ensure that the efficiency of your application is boosted and your application will only require a single query.

{

_id: “barney”,

name: “Barney Peters”,

addresses: [

{

user_id: “barney”,

street: “763 UIO Street”,

city: “Baltimore”,

state: “MD”,

zip: “xxxxx”

},

{

user_id: “barney”,

street: “102 JK Street”,

city: “Boston”,

state: “MA”,

zip: “xxxxx”

}

]

}

Normalized Data models (References)

In normalized data models, you can utilize references to represent relationship between multiple documents. To understand references, consider the following diagram.

References or normalized data models are used in the cases of one-to-many relationship models and many-to-many relationship models. In some instances of embedded documents model, we might have to repeat some data which could be avoided by using references. For example, see the following example.

mangos

In this example, the “_id” field in the user document references to two other documents that are required to use the same field.

References are suitable for datasets which are based on hierarchy. They can be used to describe multiple many-to-many relationships.

While references provide a greater level of flexibility in comparison to embedding, they also require the applications to issue the suitable follow-up queries for their resolution. Put simply, references can increase the processing between the client-side application and the server.

One-to-Many Relationship with References

There are some cases in which a one-to-many relationship is better off with references rather than embedding. In these scenarios, embedding can cause needless repetition.

{

title: “MongoDB Guide”,

author: [ “Mike Jones”, “Robert Johnson” ],

published_date: ISODate(“2019-1-1”),

pages: 1,200,

language: “English”,

publisher: {

name: “ASD Publishers”,

founded: 2009,

location: “San Francisco”

}

}

 

{

title: “Java for Beginners”,

author: “Randall James”,

published_date: ISODate(“2019-01-02”),

pages: 800,

language: “English”,

publisher: {

name: “BNM Publishers”,

founded: 2005,

location: “San Francisco”

}

}

As you can realize whenever a query requires the information of a publisher, that information is repeated continuously. By leveraging references, you can improve your performance by storing the information of the publisher in a different collection. In cases, where a single book has limited publishers and the likelihood of their growth is low, references can make a good impact. For instance,

{

name: “ASD Publishers”,

founded: 2009,

location: “San Francisco”

books: [101, 102, …]

}

 

{

_id: 101

title: “MongoDB Guide”,

author: [ “Mike Jones”, “Robert Johnson” ],

published_date: ISODate(“2019-1-1”),

pages: 1,200,

language: “English }

{

_id: 102

title: “Java for Beginners”,

author: “Randall James”,

published_date: ISODate(“2019-01-02”),

pages: 800,

language: “English”,

 

}

Design Patterns for Functional Programming


In software circles, a design pattern is a methodology and documented approach to a problem and its solution which is bound to be found repeatedly in several projects as a tumbling block. Software engineers customize these patterns according to their problem and form a solution for their respective applications. Patterns follow a formal structure to explain a problem and then go over a proposed answer as well as key points which are related to either the problem or the solution. A good pattern is one which is well known in the industry and used by the IT masses. For functional programming, there are several popular design patterns. Let’s go over some of these.

Monad

Monad is a design pattern which takes several functions and integrates them as a single function. It can be seen as a type of combinatory and is a core component of functional programming. In monad, a value is wrapped in a box which is then unwrapped and a function is passed to use the wrapped value.

To go into more technicalities, a monad can be classified into running on three basic principles.

·         A parameterized type M<T>

According to this rule, T can possess any type like String, Integer, hence it is optional.

·         A unit function T -> M<T>

According to this rule, there can be a function taking a type and its processing may return “Optional”. For instance, Optional.of(String) returns Optional<String>.

·         A bind operation: M<T> bind T -> M <U> = M<U>

According to this rule which is also known as showed operator due to the symbol >>==. For the monad, the bind operator is called. For instance, Optional<Integer>. Now this takes a lambda or function as an argument for instance like (Integer -> Optional<String> and returns and processes a Monad which has a different type.

Persistent Data Structures

In computer science, there is a concept known as a persistent data structure. Persistent data structure at their essence work like normal data structure but they preserve their older versions after modification. This means that these data structures are inherently immutable because apparently, the operations performed in such structures do not modify the structure in place. Persistent data structures are divided into three types:

  • When all the versions of a data structure can be accessed and only the latest version can be changed, then it is a partially persistent data structure.
  • When all the versions of a data structure can be accessed as well as changed, then it is a fully persistent data structure.
  • Sometimes due to a merge operation, a new version can be generated from two prior versions; such type of data structure is known as confluently persistent.

For data structure which does not show any persistence, the term “ephemeral” is used.

As you may have figured out by now, since persistent data structures enforce immutability, they are used heavily in functional programming. You can find persistent data structure implementations in all major functional programming language. For instance, in JavaScript Immutable.js is a library which is used for implementing persistent data structures. For example,

import { MapD } from ‘immutable’;

let employee = Map({

employeeName: ‘Brad’,

age: 27

});

employee.employeeName; // -> undefined

employee.get(’employeeName’); // -> ‘Brad’

Functors

In programming, containers are used to store data without assigning any method or properties to them. We just put a value inside a container which is then passed with the help of functional programming. A container only has to safely store the value and provide it to the developer in need. However, the values inside them cannot be modified. In functional programming, these containers provide a good advantage because they help with forming the foundation of functional construct and assist with asynchronous actions and pure functional error handling.

So why are we talking about containers? Because functors are a unique type of container. Functors are those containers which are coded with “map” function.

Among the simplest type of containers, we have arrays. Let’s see the following line in JavaScript.

const a1 = [10, 20,30, 40, 50];

Now to see a value of it, we can write.

Const x=y[1];

In functional, the array cannot be changed like.

a1.push(90)

However, new arrays can be created from an existing array. An array is theoretically a function. Technically, whenever a unary function is mapped with a container, then it is a functor. Here ‘mapped’ means that the container is used with a special function which is then applied to a unary function. For arrays, the map function is the special function. A map function processes the contents of an array and performs a special function for all the elements of the element step-by-step after which it responds with another array.

Zipper

A zipper is a design pattern which is used for the representation of an aggregate data structure. Such a pattern is good for codes where arbitrarily traversal is common and the contents can be modified, therefore it is usually used in purely functional programming environments. The concept of Zipper dates back to 1997 where Gérard Huet introduced a “gap buffer” strategy.

Zipper is a general concept and can be customized according to data structures like trees and lists. It is especially convenient for data structures which used recursion. When used with zipper, these data structure are known as “a list with zipper” or “a tree with zipper” for making it apparent that their implementation makes use of zipper pattern.

In simple terms, zipper with data structure has a hole. They are used for the manipulation and traversal in data structures where the hole indicates the present focus for the traversal. Zipper facilitates developers to easily move within the data structure.

Java Lambdas


So far we have talked a lot about functional programming. We discussed the basics and even experimented with some coding of functional interfaces. Now is the right time to touch one of the most popular features of Java for functional paradigm, known as lambda expressions or simply lambdas.

What Is a Lambda Expression?

A lambda expression provides functionality for one or more functional interface’s instances with “concrete implementations”. Lambdas do not require the use of a class for their use. Importantly, these expressions can be viewed and worked by coding them as objects. This means that, like objects, it is possible to pass or run a lambda expression. The basic style for writing a lambda expression requires the use of an “arrow”. See below:

parameter à the expression body

On the left side, we have a “parameter”. We can write single or multiple parameters for our program. Likewise, it is not mandatory to specify the parameter type because compilers already ascertain the parameter type. If you are using a single parameter, then you may or may not add a round bracket.

However, if you intend to add multiple parameters, then make sure to use round brackets (). Sometimes, there is no need of parameters in a lambda expression. For such cases, it is possible to signify them by simple adding an unfilled round bracket. To avoid error, use round brackets for parameter whether you are using a 0, 1, or more parameters.

On the right side of the lambda expression, we can have an expression. This expression is entailed in curly brackets. Like parameters, you do not require brackets for a single expression while multiple expressions require one. However, unlike parameter, the return type of a function can be signified by the body expression.

Without Lambdas

To understand lambdas, check this simple example.

package fp;

public class withoutLambdas {

public static void main(String[] args) {

withoutLambdas wl = new withoutLambdas(); // generating instance for our object

String lText2 = “Working without lambda expressions”; // here we assign a string for the object’s method as a parameter

wl.printing(lText2);

}

public void printing(String lText) { // initializing a string

System.out.println(lText);              // creating a method to print the String

}

}

 

The output of the program is “Working without lambda expressions”. Now if you are familiar with OOP, then you can understand how the caller was unaware of the method’s implementation i.e. it was hidden from it. What is happening here is that the caller gets a variable which is then used by the “printing” method. This means we are dealing with a side effect here—a concept we explained in our previous posts.

Now let’s see another program in which we go one step ahead, from a variable to a behavior.

package fp;

public class withoutLambdas2 {

interface printingInfo {

void letsPrint(String someText);  //a functional interface

}

public void printingInfo2(String lText, printingInfo pi) {

pi.letsPrint(lText);

 

}

 

public static void main(String[] args) {

withoutLambdas2 wl2 = new withoutLambdas2(); // initializing instance

String lText = “So this is what a lambda expression is”; // Setting a value for the variable

printingInfo pi = new printingInfo() {

@Override // annotation for overriding and introducing new behavior for our interface method

public void letsPrint(String someText) {

System.out.println(someText);

}

};

wl2.printingInfo2 (lText, pi);

}

 

 

}

In this example the actual work to print the text was completed by the interface. We basically formulated and designed the code for our interface’s implementation. Now let’s use Lambdas to see how they provide an advantage.

 

package fp;

public class firstLambda {

 

interface printingInfo {

void letsPrint(String someText);    //a functional interface

}

public void printingInfo2(String lText, printingInfo pi) {

pi.letsPrint(lText);

}

 

public static void main(String[] args) {

firstLambda fl = new firstLambda();

String lText = “This is what Lambda expressions are”;

printingInfo pi = (String letsPrint)->{System.out.println(letsPrint);};

fl.printingInfo2(lText, pi);

}

See how we improved the code by integrating a line of lambda expression. As a result, we are able to remove the side effect too. What the expression did was use the parameter and processed it to generate a response. The expression after the arrow is what we call as a “concrete implementation”.

Core Concepts of Functional Programming


Now that you have learned about the paradigm shift to functional programming, let’s go into the depths of the fundamentals concepts that power functional programming. The comprehension of these basic concepts is important to create high-quality functional programming applications. You may be tempted to directly begin coding but these concepts can help you become a better coder.

1- Pure Functions

In functional programming, everything is seen as a function. Each function has to be “pure”. Pure here refers to two basic capabilities:

No Side Effects

A function can never be pure if it carries even a single side effect. A side effect is a property when a function’s states are modified by other functions. By states, we mean the data like variables or data structures. Pure functions do not carry any side effects; hence their memory or I/O operations can’t be affected. Now, you might wonder that why exactly does their presence considered bad. Well, because they make functions “unpredictable” where a function has to rely on its system’s state.

On the contrary, if a function’s state cannot be changed, then the same output is generated for the given input. A side effect of a function can also mean to write any operation which has been applied to the disk or turning on/off a control of your front-end UI’s function.

Same Result with Multiple Calls

Whenever a function is called without any modifications in its arguments, the same results are generated. Consider an example where you have designed a simple function “multiply (4, 5). Now, this function is expected to generate the same results .i.e. for each invocation. However, if you were programming in other paradigms then random functions or global variables may not allow your result to remain the same.

Pure functions also offer “memoisation”. Memeoisation refers to a technique in which pure functions’ output (always same result) is saved in the cache memory. Now, whenever such functions are invoked, caching helps to enhance the performance and speed of the application.

2- Higher Order Function

The concept of a function which is higher order is known in mathematics as well as computer science. Generally, they possess two fundamental characteristics.

Return Type

The return type of a higher-order function has to be a function. For example, review the following code in Java.

package fp;

 

public class higherOrderfunction {

public static void main(String args[]) {

 

System.out.println(A(4));

 

}

static int marks() {

int a=5;

return a;

}

static int A(int total) {

total =4;

return total+marks();

}

}

To simplify things, we have constructed a simple function marks. This function holds an integer value “a” which is returned. Now, we have a function A. A takes an argument for an integer total and assigns it a value. Now, comes the actual part. See how in return, we have used marks method as a return type. Since we have processed our function by returning another function; hence this function is a higher-order function.

Arguments

Another characteristic of a higher-order function can be its use of functions as input parameters. For instance, see this see pseudo-code:

Public areaRect (lb) // Here areaRect represents a function that takes arguments from another function lb to calculate length  and breadth

{

int area;

area =l*b;

return lb; //

}

This function is a higher-order function because it processes itself using another function as a parameter.

3- First Class Functions

After higher-order functions, we have first-class functions. These are not too dissimilar to higher-order functions. It is important to note that a first-class function always adheres to the terms and conditions of a higher-order function. So what this means is that a first-class function has to return another function as well as contain a parameter in the form of function. Hence, by default, all first-class functions are higher-order functions. So what exactly is the difference between them? Well, context matters!

By referring a language to have support for first-class functions, we generally mean that uses its functions as values that can be easily passed around.

On the other hand, the term higher-order is more associated with the mathematics outlook when pure problem-solving requires a more theoretical and general perspective of the problem.

4- Evaluation

Some languages support strict evaluation while others offer non-strict evaluation support. This evaluation is targeted at the language’s processing of an expression while considering the function parameters or arguments. To familiarize yourself better with the concept, check this simple example:

Print length([4-3,4+6,1/0,7*7,3-8])

When a programming language uses non-strict evaluation to process this expression, then it simply returns back a value of 5 .i.e. the total number of elements. Such evaluation does not concern itself with the depth of values.

On the other hand, the same expression returns an error with strict evaluation because it found the third element “1/0” to be incorrect. Hence, this means that strict evaluation is more stringent and processes an expression more deeply.

In a few scenarios, non-strict evaluation has to enforce processing of strict evaluation when a function needs to be evaluated on a “stricter” basis due to an invocation.

5- Referential Transparency

In functional programming, the usual assignment of values is not offered. Variables are immutable; a value defined once is the final one which is not possible to be changed in future. It is the property which makes them without any side-effects. To understand further, consider the following example, where the value of x is changed after each evaluation.

x=x*15

In the start of the program, x was assigned a value of 1. The first evaluation made it 15.

Now, in the second evaluation, the value of x changes to 225.

Now, this function is not referentially transparent because the value of x is continuously changing.

So now, if we go by the functional concepts, then our example can be altered into this pseudocode:

int sum(int x)

{

x=x+2;

}

This type of function ensures that the value of x remains constant and it cannot be altered implicitly.

What Is Functional Programming? Why the Paradigm Changed the Game?


The Path to Functional Programming

Before understanding functional programming, ask yourself how much do you know about another programming “styles”? When programming initially emerged to solve the major problems of the world through a few lines of code, Computer Scientists realized that they required a standard format or style which could help them to program effectively and efficiently. This style is commonly known as “programming paradigm”.

Soon developers began coding in C by using the procedural paradigm. Procedural programming mainly deals with coding with a step-to-step design similar to kitchen recipes; where a set of instructions is followed sequentially. At that time, the paradigm was indeed excellent at solving problems. However, as technologies evolved and programming became much more complex—websites were built and businesses began to adopt IT—the flaws of procedural programming bugged developers.

Enter Object-Oriented Programming, the next popular paradigm. The vision behind OOP was simple; it modeled programming on the basis of real-world examples. For example, a car could be seen as an object which possessed certain behavior (methods in programming) and states (members like variables in programming). OOP succeeded in decreasing global codebases. OOP concepts like encapsulation and inheritance were fundamental to attain an unexpected degree of productivity.

However, soon developers realized that OOP was not up to the task for a number of things. Hence, to address certain issues, functional programming came into the scene. Popular languages like PHP, Python, etc., are examples of languages which support the functional paradigm. Remember, not each language is built to support all paradigms. For example, while C can support procedural programming, it does not offer support for object-oriented programming.

However, most modern languages support procedural, OOP, and functional programming. There are some like Haskell, who received recognition due to features for functional programming. Java was initially not supportive of the paradigm, but since the last few years, Java releases have introduced features like Lambda Expressions to support functional programming. However, the question is, what exactly is functional programming?

What Is Functional Programming

The name suggests that it is linked to “functions”. Now, if you think that this function relates to the programming of methods, then your assumption is flawed. Functional programming refers to functions which incorporate a certain piece of code in the form of a feature or operation to an application. This function facilitates programmers to avoid changing the other parts of the application.

Functional programming highly borrows from mathematics. These functions are highly interlinked with those “mathematical functions” that you may have studied in your college courses.  In mathematics, a specific question is solved with a method without going much in the theoretical complexities of that method; similarly, functional programming is there to gain a higher degree of abstraction in applications.

One key aspect of functional programming is that it avoids change in the mutability and states of data. Another thing to note is that unlike “statements”, which are generally used in the OOP landscape, “expressions” power functional codebases.

Paradigm Shift to Functional Programming

Previously business applications used C++, a phenomenon that can prove to be a nightmare for modern developers. In those days, software engineers focused a lot on low-level aspects where issues related to memory management did not allow them to achieve much productivity. Then, Java came and provided more abstractions and managed these tasks through new features, thus resulting in saving developers from de-coding low-level complexities.

Today, languages like Scala (JVM) and F# (.NET) are promising a similar level of convenience to developers. Even outside of JVM and .NET ecosystems, you can clearly see Apple going with Swift and Facebook with React JS. Many of today’s “cool” languages are known for their functional paradigms. So why exactly has it become famous? Maybe, the following factors may have something to do with it.

Parallelism

One of the major reasons behind the positive reception of functional programming is its support for parallel computing. Traditionally, basic computing requirements like data storage and processing were not too intensive .i.e. there was no need of running several things at once. However, this changed as several technologies came and evolved one after another, expanding the world of software development and transforming it to power as a backbone of global operations.

Today, parallel computing is a highly valued domain of computer science where multiple applications and data require processing at the same time. Here, functional programming has made its impact due to its natural features which support independence components to run and support modern software infrastructure like microservices. This means if you try to add a specific functionality or modify an existing one, then “functions” ensure that you do not affect the other components of your system.

Data Streaming

Another reason which can be associated with the ‘functional leap’ is the advent of “streaming services”. a few years ago, entertainment was mainly associated with TV—the only medium for watching shows and movies.

However, today streaming services like Netflix have changed the game. Entertainment has shifted online. People prefer to take advantage of the luxury of watching football matches on mobile apps while commuting, rather than hurrying their way back to home for their television sets. Since functional programming works well with streaming due to its natural concepts; hence its growth makes a lot of sense.

AI

AI is perhaps the most exciting branch of computer science. Over the past few decades, extensive research has been carried out about AI while its applications and sub-branches like machine learning, deep learning, speech recognition, etc have become their own field of studies. AI is seen with a bright hope that it can modernize the world and take it to the next level through robots and intelligent machines. However, all this AI implementation requires coding where a paradigm has to be ultimately chosen.

Since AI is closely related to mathematics, logic, statistics, and theoretical computer science, hence you can certainly understand why functional paradigm—a paradigm founded on mathematics—is gaining so much prominence among the AI community.

What’s in it for the Developer?

Well, so far you might have understood how the elevation of some technologies and necessities of some requirements raised the need for functional programming. However, why should you as a developer working with “objects” (OOP) think about learning something new? Is the hassle worth it? Well, to be honest, functional programming is not a magic potion that can generate great results for all OF YOUR applications.

However, there are several cases where the use of functional programming can assist you to achieve better results in your projects where the time spent on learning the paradigm may result in saving you a great deal of resources. Following are a few of the reasons to learn it.

Decreasing LOC

What was one of the most well-liked thing aspects about OOP? The programs that used to require 5,000 line of codes via procedural programming, were reduced to less than 1,000 lines of codes by adopting OOP’s fundamental concepts like inheritance.

Similarly, functional programming also promises the same .i.e. it provides shorter code-bases while maintaining the performance of the application. As a result, productivity is increased where fewer lines of codes do more while maintenance is also relatively easier.

Testing and Debugging

When everything is a function; testing is less challenging. You just have to test each function where one’s output does not affect the result of the other function.

The states of functions cannot be modified from outside of its scope. Hence, its input can be tracked effectively, resulting in an efficient management of its output too. As a result, debugging is easy because it is easy to check what went wrong.

Career Growth

There are no negatives in learning a paradigm which goes well with all the hottest modern technologies. With a chance to work on IoT, AI, and, other futuristic fields, functional programming can prove pivotal to your success as a computer scientist. Learning and adapting are the hallmarks of every successful computer science. Also, if you are tired of doing the same old programming, then it can provide a new challenge that can motivate you to work contentedly.

Now that you have learned about some basic definition, importance, and advantages of functional programming, now is the right time to understand its basics.