Azure Cosmos DB Tutorial for Beginners | Azure CosmosDB
In today's article we will be learning about Azure Cosmos DB. First, we'll see what a database is and what the different types of databases are. Then we'll explain what Azure Cosmos DB is, what its key features are, and how it works. We'll also look at the backup and restore options that Azure Cosmos DB offers. Finally, we'll go through the common use cases and a hands-on session on how to create a Cosmos DB database.
First of all, what is a database? In simple words, a database is a collection of data that is organized so that it can be easily retrieved and managed. A very simple example is the attendance register we used to have in our classes, where our roll numbers and names were listed and attendance was marked. That register is a database: it contains information about us and whether we were present on a given day.
The databases we are talking about today, however, are the ones stored on computers, so that the associated applications can access, manage, and retrieve data from them. These databases are not as simple as a register. They are far more complex, and they store very large volumes of data.
Now let's look at the different types of databases. There are a lot of them, but since this article is focused on Azure Cosmos DB, we need to know four: relational databases, NoSQL databases, distributed databases, and multi-model databases.
First of all, let's see what relational databases are. In our previous example we were talking about the register. There, the data is stored in rows and columns, which is really a table. If the first column is the roll number, each roll number identifies a person, and all the other columns hold information about that person. The data is related to each other, and such databases are called relational databases.
All the data is stored in tables, and a table can be related to other tables. For example, take a table that contains the names of the songs on different albums, with a column called genre. There can be a second table that lists the artists and the genre in which each artist makes music. These tables can be related using keys, which are unique identifiers for the rows of a table. Because different tables can be related in this way, such databases are known as relational databases. The next one is the NoSQL database.
Earlier we were talking about relational databases. To access, edit, and manage relational databases we commonly use a query language known as SQL, which stands for Structured Query Language.
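As a minimal sketch of two related tables queried with SQL, the following uses Python's built-in sqlite3 module with an in-memory database. The table names, column names, and sample rows are made up for illustration and are not taken from any real music catalog.

```python
import sqlite3

# In-memory database, so the sketch leaves nothing behind on disk.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE artists (
        artist_id INTEGER PRIMARY KEY,   -- the key: one unique ID per artist
        name      TEXT NOT NULL,
        genre     TEXT NOT NULL
    );
    CREATE TABLE tracks (
        track_id  INTEGER PRIMARY KEY,
        title     TEXT NOT NULL,
        artist_id INTEGER NOT NULL REFERENCES artists(artist_id)
    );
    """
)
conn.executemany(
    "INSERT INTO artists VALUES (?, ?, ?)",
    [(1, "Artist A", "Jazz"), (2, "Artist B", "Rock")],
)
conn.executemany(
    "INSERT INTO tracks VALUES (?, ?, ?)",
    [(10, "Track One", 1), (11, "Track Two", 2), (12, "Track Three", 1)],
)


def tracks_with_genre(connection):
    """Join the two tables on the shared key and return (title, artist, genre) rows."""
    query = """
        SELECT t.title, a.name, a.genre
        FROM tracks AS t
        JOIN artists AS a ON a.artist_id = t.artist_id
        ORDER BY t.track_id
    """
    return connection.execute(query).fetchall()


rows = tracks_with_genre(conn)
for title, artist, genre in rows:
    print(f"{title} by {artist} ({genre})")
```

The artist_id column is what relates the two tables: each track stores only the key, and the JOIN looks up the artist's name and genre from the other table.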
Databases that do not store data in tables, but instead store it as documents, key-value pairs, graphs, and so on, are NoSQL databases. Coming to distributed databases: when the data is stored in multiple places rather than in a single place, it is known as a distributed database. And finally, what is a multi-model database? A multi-model database supports more than one data model, so it can work with both SQL-style and NoSQL-style data: documents, key-value pairs, graphs, objects, and so on.
Moving on: conventionally, the databases we are familiar with are relational databases. NoSQL databases are the other kind we are going to learn about, because Cosmos DB is a NoSQL database.
So we have to see the differences between these two. First, as we have already discussed, in a relational database the data is stored in tables, but in a NoSQL database it can be stored as documents, graphs, key-value pairs, and so on.
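To make the contrast concrete, here is a small sketch that holds the attendance-register data from earlier in three shapes: fixed-column rows, JSON-style documents, and a key-value store. The names, the extra guardian field, and the phone number are invented for the example.

```python
import json

# Relational style: every row must fit the same fixed columns.
columns = ("roll_number", "name", "present")
rows = [
    (1, "Asha", True),
    (2, "Ravi", False),
]

# Document style: each record is a self-describing JSON document,
# and documents in the same collection may carry different fields.
documents = [
    {"id": "1", "name": "Asha", "present": True},
    {
        "id": "2",
        "name": "Ravi",
        "present": False,
        "guardian": {"name": "Meena", "phone": "555-0100"},
    },
]

# Key-value style: an opaque value looked up by a single key.
key_value_store = {doc["id"]: json.dumps(doc) for doc in documents}


def fields_used(docs):
    """Return the set of field names that appear in any document."""
    return {field for doc in docs for field in doc}


print(dict(zip(columns, rows[0])))
print(sorted(fields_used(documents)))
print(json.loads(key_value_store["2"])["guardian"]["name"])
```

Notice that the second document carries a nested guardian field that the first one lacks. In the row-based version, adding that field would mean changing the table definition for every row.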
The next difference is scalability. What is scalability? When we create a database and it grows bigger at some point in the future, we need to expand it. The ability of a database to expand is known as scalability. We can expand a database either by adding more resources to the server on which the database is stored, or by adding more servers.
If we add more resources to one particular server, we call it vertical scaling. If we add more servers, that is, more machines, it is known as horizontal scaling.
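Horizontal scaling only helps if the data can be spread across the servers. One common way to do that, sketched below under simplifying assumptions, is to hash each record's key and use the hash to pick a server. The key format and server counts are illustrative, and this is not a description of how any particular database places data.

```python
import hashlib


def server_for(key: str, server_count: int) -> int:
    """Pick a server index for a key using a stable hash."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % server_count


def distribute(keys, server_count):
    """Group keys by the server that would store them."""
    servers = {index: [] for index in range(server_count)}
    for key in keys:
        servers[server_for(key, server_count)].append(key)
    return servers


keys = [f"user-{n}" for n in range(1000)]
two_servers = distribute(keys, 2)
four_servers = distribute(keys, 4)

# Adding servers lowers the number of keys each one has to hold.
print({index: len(held) for index, held in two_servers.items()})
print({index: len(held) for index, held in four_servers.items()})
```

Going from two servers to four roughly halves the load on each one. Note that plain modulo hashing like this moves many keys whenever the server count changes, which is why production systems tend to use more careful partitioning schemes.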
A relational database is typically scaled vertically, and a NoSQL database is scaled horizontally. Next, a relational database has a predefined schema, a particular way in which the data has to be stored. In a NoSQL database there is no such requirement, and hence it is much easier to update. A relational database also supports a very powerful query language, SQL, which we have already talked about, whereas in a NoSQL database the query language is usually simpler and not as powerful.
Relational databases are generally suited to small to moderate volumes of data, while NoSQL databases are designed to handle very high volumes. Relational databases also have a centralized structure, whereas NoSQL databases are decentralized. Finally, in a relational database data can usually be written from only one or a few locations, but in a NoSQL database it can be written from any number of locations.
That covers the differences between relational and NoSQL databases, so let's get into what we came here for today: what is Azure Cosmos DB? Azure Cosmos DB is a globally distributed, low-latency, multi-model database for managing data at large scale. It is a cloud-based NoSQL database offered as a platform as a service (PaaS) by Microsoft Azure. It is highly available, offers high throughput, is very reliable, and is often described as a serverless database.
As we have discussed, Cosmos DB is a database that can support data in huge volumes, is multi-model so it can support different kinds of data, and is globally distributed. Now we'll see what the features of Cosmos DB are.
The first is global distribution. Azure has data centers spread out across multiple geographic areas, known as regions. Since Azure has such a large network, the data can be distributed to any of them, and that is why Cosmos DB is called globally distributed.
The next one is scalability, which we have already defined. Cosmos DB is horizontally scalable, which means it can be scaled out when the need arises, up to hundreds of millions of reads and writes per second.
Then there is indexing. Azure Cosmos DB has automatic, schema-agnostic indexing. This means we don't need to worry about what the schema is: whatever data we provide, it is indexed automatically.
Azure Cosmos DB is also multi-model: it can store data as key-value pairs, documents, graphs, and so on.
It is highly available. Whenever you or your application retrieves data from the database, Azure Cosmos DB is almost always there to answer: its availability SLA is 99.99 percent for a single-region account and 99.999 percent when the account spans multiple regions.
Then there is low latency. As we've already said, Cosmos DB is a globally distributed database, and this means that data can be stored very close to the users.
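The automatic, schema-agnostic indexing mentioned above can be pictured with a toy inverted index that records every path and value found in each document, without being told the schema in advance. This is only an illustration of the idea, not Cosmos DB's actual index structure, and the sample documents and field names are made up.

```python
from collections import defaultdict


def flatten(document, prefix=""):
    """Yield (path, value) pairs for every leaf value in a nested document."""
    if isinstance(document, dict):
        for key, value in document.items():
            yield from flatten(value, f"{prefix}/{key}")
    elif isinstance(document, list):
        for element in document:
            yield from flatten(element, f"{prefix}/[]")
    else:
        yield prefix, document


def build_index(documents):
    """Map every (path, value) pair to the IDs of the documents containing it."""
    index = defaultdict(set)
    for doc in documents:
        for path, value in flatten(doc):
            index[(path, value)].add(doc["id"])
    return index


documents = [
    {"id": "1", "category": "books", "price": 12, "tags": ["new", "sale"]},
    {"id": "2", "category": "games", "publisher": {"country": "IN"}},
    {"id": "3", "category": "books", "tags": ["sale"]},
]
index = build_index(documents)

print(sorted(index[("/category", "books")]))
print(sorted(index[("/publisher/country", "IN")]))
print(sorted(index[("/tags/[]", "sale")]))
```

The three documents have different shapes, yet all of their fields become searchable without anyone declaring columns first. That is the practical meaning of "schema-agnostic".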
This means that when a user retrieves data, the latency will be very low. Now we will move on to how Azure Cosmos DB works. As we've already said, Cosmos DB has multi-master support, which means that there can be several copies of the database spread out globally that accept writes at the same time.
In this way, data is replicated to the user's region so that it can be accessed faster. In the simpler setup with a single write region, the data is written to the primary node and then replicated to the secondary nodes. For example, suppose it's a website, the data is written to a server in the US, and the users are from all over the world.
A user in the US who accesses the data will get the update much sooner than people elsewhere in the world. There is a term that describes this behavior, called consistency, and we'll see what consistency is.
Consistency indicates whether everyone reading the data sees it in the same state at any given point in time. Cosmos DB offers multiple levels of consistency, with varying performance and availability. Now we'll see the consistency levels on offer.
First of all, there is eventual consistency. Here the data is written to the primary node and is propagated eventually to the read-only secondary nodes, so it might take some time for users to get the updated data. This level is powerful in the sense that it offers good performance. Going back to the previous example, if the data is written in the US and people are accessing it from, say, India, their reads are served quickly from a nearby replica, but they will see the update only after some delay because of network latency.
Then we have strong consistency. Here, updated data is not made available to any user until it has been updated everywhere. If I have committed some new data to the primary node in the US and a user is accessing it from India, they will not get the updated data until it has been replicated everywhere. Even a user reading from the primary node will not get the updated data until then.
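The difference between these two levels can be sketched with a toy model of one primary region and one read-only secondary. This is a deliberately simplified simulation, not how Cosmos DB is implemented: the class name, region labels, and the explicit replicate() step are all assumptions made for the example.

```python
class ReplicatedStore:
    """Toy model: one primary region plus read-only secondary regions."""

    def __init__(self, primary, secondaries, consistency):
        if consistency not in ("eventual", "strong"):
            raise ValueError("consistency must be 'eventual' or 'strong'")
        self.primary = primary
        self.consistency = consistency
        self.data = {region: {} for region in [primary, *secondaries]}
        self.pending = []  # writes not yet copied to the secondaries

    def write(self, key, value):
        if self.consistency == "strong":
            # Strong: the write only completes once every region has it.
            for replica in self.data.values():
                replica[key] = value
        else:
            # Eventual: acknowledge after the primary, replicate later.
            self.data[self.primary][key] = value
            self.pending.append((key, value))

    def replicate(self):
        """Deliver queued writes to the secondaries (the 'eventually' part)."""
        for key, value in self.pending:
            for region, replica in self.data.items():
                if region != self.primary:
                    replica[key] = value
        self.pending.clear()

    def read(self, region, key):
        return self.data[region].get(key)


eventual = ReplicatedStore("US", ["India"], "eventual")
eventual.write("score", 42)
stale_read = eventual.read("India", "score")   # replication has not run yet
eventual.replicate()
fresh_read = eventual.read("India", "score")   # now the secondary has caught up

strong = ReplicatedStore("US", ["India"], "strong")
strong.write("score", 42)
immediate_read = strong.read("India", "score")

print(stale_read, fresh_read, immediate_read)
```

In the eventual case the write returns quickly but the reader in India briefly sees nothing new. In the strong case the reader always sees the latest value, at the cost of the write having to touch every region first.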
The drawback of strong consistency is lower performance: every write has to reach every replica before it becomes visible, so people do not get updated data as quickly as they do with eventual consistency. Between eventual consistency and strong consistency, three other consistency levels are offered.
The three intermediate levels are consistent prefix, session, and bounded staleness.
With consistent prefix, a client reads data in the same order in which it was written. For example, if someone commits A first, then B, then C, and then D, a user reading the data will also see it in that order: first A, then B, C, and D. They never see a later write without the earlier ones.
With session consistency, the users who committed the data are able to see it right away, but for other people it takes some time for the data to be replicated to their location.
Then there is bounded staleness. Here we can set a staleness period, which is the maximum amount by which the secondary nodes are allowed to lag behind the primary. If I set the staleness bound to two hours, then once I commit the data, readers elsewhere may keep seeing the older version for up to two hours, but by the end of that window the data is updated everywhere. I can set the staleness to whatever I want within the allowed range, and the closer the bound gets to zero, the closer the behavior gets to strong consistency.
Now we will see how data is provisioned in Azure Cosmos DB, that is, what resources Cosmos DB provides for storing data.
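To make consistent prefix and bounded staleness more concrete, here is a toy replica that applies the primary's writes strictly in order and is never allowed to lag by more than a configured number of seconds. It models the worst case, a replica that is as lazy as the bound permits. The class name, the timestamps, and the choice to bound lag only by time are assumptions for this sketch.

```python
class BoundedStalenessReplica:
    """Toy secondary replica that may lag the primary by at most `max_lag` seconds."""

    def __init__(self, max_lag):
        self.max_lag = max_lag
        self.log = []      # (timestamp, value) in commit order on the primary
        self.applied = 0   # how many log entries this replica has applied

    def primary_write(self, timestamp, value):
        self.log.append((timestamp, value))

    def read(self, now):
        # Any write older than `now - max_lag` must already be visible,
        # so catch up on those entries before answering.
        while (
            self.applied < len(self.log)
            and self.log[self.applied][0] <= now - self.max_lag
        ):
            self.applied += 1
        # Entries are applied strictly in log order, so the replica always
        # exposes a prefix of the primary's history (consistent prefix).
        return [value for _, value in self.log[: self.applied]]


TWO_HOURS = 2 * 60 * 60

replica = BoundedStalenessReplica(max_lag=TWO_HOURS)
replica.primary_write(0, "A")
replica.primary_write(3600, "B")
replica.primary_write(7000, "C")

after_two_hours = replica.read(now=7200)
after_three_hours = replica.read(now=10800)
after_catch_up = replica.read(now=14200)

# With a bound of zero the replica can never lag, which behaves like strong consistency.
no_lag = BoundedStalenessReplica(max_lag=0)
no_lag.primary_write(5, "A")
zero_bound_read = no_lag.read(now=5)

print(after_two_hours, after_three_hours, after_catch_up, zero_bound_read)
```

Every read returns a prefix such as A, or A then B, and never B without A. With the two-hour bound, a write made at time 0 is guaranteed to be visible by time 7200 at the latest.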
There are three levels: first the Cosmos DB database, then containers, and then items. A Cosmos DB database, as we already said, is the database itself. It has rich API support, because of which we can access and manage the data using the SQL API, the Cassandra API, the MongoDB API, and others. These APIs can be used to enumerate, read, create, and update databases.
Then come containers. Containers are horizontally partitioned and replicated across multiple regions for scalability and throughput. A container lives inside a database and groups the items stored in it, and containers can also be created and updated using the Cosmos DB APIs.
Then there are Cosmos DB items. These are the data committed into the containers, and depending on the API they can be rows, documents, graph nodes, and so on. Items can also be created, updated, replaced, read, and so on using the APIs. Now we'll see the backup and restore options that Cosmos DB offers.
There are two kinds of backup. The first is the periodic backup mode. In this mode, backups are taken periodically, and we can set both the backup interval and the retention period. The interval says how much time should elapse between two consecutive backups, for example if we want a backup to be taken every day. The retention period says how long the backed-up data is kept. The backup data is stored on different servers, so that it is not destroyed if something happens to ours.
The second is the continuous backup mode. Here the data is backed up continuously and can be restored to any point in time within the last 30 days.
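The two backup modes can be compared with a little date arithmetic. The sketch below assumes, for simplicity, that periodic backups land exactly one interval apart and that a backup is dropped once it reaches the retention age. The function names and the sample interval and retention values are illustrative only.

```python
from datetime import datetime, timedelta


def retained_backups(latest_backup, interval, retention):
    """Periodic mode: list the backup timestamps still kept, oldest first."""
    kept = []
    moment = latest_backup
    while latest_backup - moment < retention:
        kept.append(moment)
        moment -= interval
    return list(reversed(kept))


def can_restore_to(target, now, window=timedelta(days=30)):
    """Continuous mode: any point in time inside the window is restorable."""
    return now - window <= target <= now


latest = datetime(2024, 1, 10, 12, 0)

# A backup every 4 hours, each kept for 8 hours.
kept = retained_backups(latest, timedelta(hours=4), timedelta(hours=8))
print([moment.isoformat() for moment in kept])

# Continuous mode with the 30-day window described above.
print(can_restore_to(latest - timedelta(days=29), latest))
print(can_restore_to(latest - timedelta(days=31), latest))
```

With periodic backups you can only go back to one of the listed timestamps. With continuous backups any moment inside the window is a valid restore point, which is what makes it possible to undo a small mistake made a few minutes ago.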
So I can undo any small issue that has taken place within the last 30 days. Those are the backup and restore options.
Now, coming to the common use cases. The first one is IoT. As you all know, IoT devices generate huge volumes of data, and this data comes from different locations all over the world. It needs to be written, analyzed, and retrieved quickly, and Cosmos DB can be leveraged here because, as we have already discussed, it can do all of this.
The next one is retail and marketing. Here too, Cosmos DB can handle the addition, update, and retrieval of huge volumes of data related to product catalogs, logistics, inventory, and so on.
Then there is gaming. Now that the whole world is connected over the internet, games generate a lot of data, such as game statistics. Cosmos DB can help by providing low-latency in-game statistics and scoreboards, social media integration, and so on.
It can also be used in web and mobile applications, for modeling social interactions, integrating with third-party services, building personalized experiences, and so on.
We have discussed most of the theory regarding Azure Cosmos DB. Now we will see how a Cosmos DB database can be created. First of all, we need to have an Azure account and log in to the Azure portal. To create an Azure Cosmos DB database, click Create a resource, where we can see the list of resources that Azure allows us to create. We need a Cosmos DB database, so we'll click Azure Cosmos DB.
As we already mentioned in the theory part, Cosmos DB has multi-API support. That means you can use the SQL API, MongoDB, the Cassandra API, Gremlin (graph), and so on to query the data stored in the database. For the purposes of the demo we will select the SQL API. We can then click Create and give the basic information about the database.
First of all, we have to select the subscription; I have selected Pay-As-You-Go. Then we have to select the resource group this database will belong to, or create a new one if we don't have one. Next we give an account name, which is the name by which the database will be identified. We can also choose the location and the capacity mode.
Then we go to the Global Distribution tab. We have already mentioned that Cosmos DB is globally distributed, and here we can enable geo-redundancy. The Networking tab is fairly self-explanatory: it sets the kind of network from which you will be accessing the database. Under Backup Policy we can select the kind of backup we need, periodic or continuous, and, if it is periodic, the backup interval, the retention period, and so on.
You don't have to worry about the Encryption and Tags tabs. Finally, you can review and create the account. Creating an account takes some time, because it has to happen on a global scale, so expect around 15 to 20 minutes. For the sake of time I've already created one. I named it cosmosdbsql123, which is just a random name.
If I click on the account, I can see its overview: its status, which is Online; the resource group it belongs to, which is one that I created; the type of subscription; the read and write locations; and the URI. The name I gave is how the database is identified, so cosmosdbsql123 appears in the URI. The overview also shows the backup policy.
We have now created a Cosmos DB account. Next we have to create a database into which data can be added. For that, we first go to Data Explorer. In Data Explorer we click New Container and then fill in the details of the database and the container.
First of all, we need to give a database ID, so I'll use db1001. Then we have to give the container ID, for which I'm using con1001. Then we have to give the partition key, and I'm using category. Then I can click OK. It takes some time for Cosmos DB to create the container, and we can see the loading indicator while it does.
As we already mentioned in the theory, data is provisioned in Azure Cosmos DB in the form of databases, containers, and items. Databases form the biggest unit, containers come inside them, and inside the containers we have items. Once the container has been created, we can see the database name we gave, db1001, in the list. If you click on it, you can see the container that we created.
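The hierarchy described above, and the role of the partition key, can be sketched with plain Python dictionaries. Items that share the same partition key value belong to the same logical partition, which is what lets a container be spread horizontally. The sample items and the container ID "example-container" are placeholders invented for this sketch.

```python
from collections import defaultdict

PARTITION_KEY_PATH = "/category"  # the partition key chosen in the walkthrough

# Hypothetical sample items; only "id" and "category" matter for the sketch.
items = [
    {"id": "101", "category": "books", "title": "Item A"},
    {"id": "102", "category": "games", "title": "Item B"},
    {"id": "103", "category": "books", "title": "Item C"},
]

# database -> container -> items, modeled as nested dictionaries.
# "example-container" is a placeholder container ID.
account = {"db1001": {"example-container": items}}


def partition_value(item, path=PARTITION_KEY_PATH):
    """Read the partition key value from an item, e.g. '/category' -> item['category']."""
    return item[path.lstrip("/")]


def logical_partitions(container_items):
    """Group item IDs by their partition key value."""
    groups = defaultdict(list)
    for item in container_items:
        groups[partition_value(item)].append(item["id"])
    return dict(groups)


partitions = logical_partitions(account["db1001"]["example-container"])
print(partitions)
```

A query that filters on category only needs to look inside one of these groups, which is why a partition key that matches how the data is usually queried tends to be a good choice.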
If you click on the container, you can see Items, which is the data that we enter into this database. If we click Items, we get a New Item button. Clicking it lets me add a new item to the container, and an item is just a JSON document.
I can set the id of the item to, for example, 101, and as soon as we click Save it gets saved, so a new entry has been made in the database. Similarly, I can create a new item with a different id, say 102, and when I save it I can see that it has also been added.
This is how you add entries to the database. At the top of the item list you can see SELECT * FROM c. This is a query, and you can use it to query and access different data in the container. That is how you create a database and query its data in Cosmos DB. Now we'll look at the different features that we talked about in the theory session.
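The same steps performed in the portal could also be scripted. The following is a hedged sketch using the azure-cosmos Python package (installed with pip install azure-cosmos). It assumes the account's URI and one of its keys are supplied through two environment variables, COSMOS_ENDPOINT and COSMOS_KEY, whose names are made up for this example, and it uses a placeholder container ID. Nothing runs against Azure unless both variables are set.

```python
import os


def make_item(item_id, category, **extra):
    """Build a JSON-serializable item with the required 'id' and the partition key field."""
    return {"id": str(item_id), "category": category, **extra}


def run_walkthrough(endpoint, key, database_id="db1001", container_id="example-container"):
    """Create the database and container, add two items, and query them back."""
    # Imported here so make_item() above works even without the SDK installed.
    from azure.cosmos import CosmosClient, PartitionKey

    client = CosmosClient(endpoint, credential=key)
    database = client.create_database_if_not_exists(id=database_id)
    container = database.create_container_if_not_exists(
        id=container_id,
        partition_key=PartitionKey(path="/category"),
    )
    container.upsert_item(make_item(101, "books"))
    container.upsert_item(make_item(102, "games"))
    # The same query that Data Explorer shows above the item list.
    return list(
        container.query_items(
            query="SELECT * FROM c",
            enable_cross_partition_query=True,
        )
    )


# Hypothetical environment variable names; set them to the account URI and a key.
endpoint = os.environ.get("COSMOS_ENDPOINT")
key = os.environ.get("COSMOS_KEY")
if endpoint and key:
    for item in run_walkthrough(endpoint, key):
        print(item["id"], item["category"])
```

Cosmos DB adds a few system properties (names starting with an underscore) to each stored item, so the documents that come back from the query contain slightly more than what was written.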
First of all, we go to Replicate data globally. We have already mentioned that Cosmos DB is a globally distributed database, and here we can select multiple read and write regions for it. In my case West US and East US have already been selected, but if we need the data to be closer to India, we can choose a region near India to store our data.
For our convenience, we can choose the regions to which the data should be written or from which it should be read. If the primary write location fails, Cosmos DB has a feature that automatically switches to a secondary read location and makes it the new primary write location.
That is what failover is. In the failover settings, if you have multiple read locations, you can set a priority order for them, so that if the write location fails, Cosmos DB automatically switches to the highest-priority read location and makes it the write location. That way our database keeps functioning. Next we can look at the consistency options, which we have already covered in the theory.
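The priority-based failover described above boils down to a simple rule: walk the priority list and use the first region that is still healthy. The sketch below shows only that rule. The function name and the region list are illustrative, and the real service performs this selection for you.

```python
def choose_write_region(priority_order, healthy_regions):
    """Return the highest-priority region that is currently healthy."""
    for region in priority_order:
        if region in healthy_regions:
            return region
    raise RuntimeError("no healthy region available")


# Index 0 is the current write region; the rest are read regions in failover order.
priority_order = ["West US", "East US", "Central India"]

normal = choose_write_region(priority_order, {"West US", "East US", "Central India"})
after_outage = choose_write_region(priority_order, {"East US", "Central India"})

print(normal)
print(after_outage)
```

When West US becomes unavailable, East US is promoted because it is next in the list. Reordering the list is how you would make the region near India take over instead.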
With strong consistency, users won't get the updated data until it has been updated everywhere. In the portal's diagram we can see the write region (West US) and the read regions (West US and East US), and the data is consistent everywhere: every user gets the same updated data, and nobody gets it until it has been updated everywhere.
With bounded staleness, we can set the staleness period, which is the maximum time by which reads are allowed to lag behind writes. If you give two hours as the staleness period, then once the data has been written, someone reading from another region may keep seeing the older data for up to two hours, but no longer. This gives every server enough time to receive the replicated data.
Then there are session consistency, consistent prefix, and eventual consistency. With eventual consistency, people who are close to the write location get the updated data much sooner than those reading from a location that is farther away from it.
That is what consistency is about. Then there are the backup and restore options, where you can select the backup interval and the retention period for which the data is stored. Then there are the keys. These keys are for when you're using a programming language to query the database. For example, if you're using Python or C#, you need these keys so that your code can access the database. That is how you create an Azure Cosmos DB database, and that ends the article.