
August 30, 2019

Microsoft Azure: End-to-End Technologies for Data Storage and Processing


Azure Storage (Blob storage) -- Stores data files and serves as the landing area for data arriving from multiple systems. Create separate containers for landing, staging, processing, and archiving. Folders are virtual: they are a naming convention inside blob names, not real directories. Permissions are set at the container level. Storage cost is lower than Azure Data Lake. Cost and data availability differ across the Hot, Cool, and Archive access tiers.
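To make the "virtual folders" idea concrete, here is a minimal sketch in plain Python. It does not call the Azure SDK; the blob names and the helper list_virtual_folder are hypothetical and only illustrate how a flat list of blob names can be browsed as if it were a folder tree by using a prefix and a "/" delimiter.

```python
# Hypothetical blob names inside a hypothetical "landing" container.
# The container itself is flat; the "/" characters are just part of each name.
blob_names = [
    "sales/2019/08/orders.csv",
    "sales/2019/08/returns.csv",
    "sales/2019/07/orders.csv",
    "hr/employees.csv",
    "readme.txt",
]


def list_virtual_folder(names, prefix="", delimiter="/"):
    """Return (subfolders, blobs) found directly under the given prefix."""
    folders, blobs = set(), []
    for name in names:
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        if delimiter in rest:
            # Anything before the next delimiter acts as a virtual subfolder.
            folders.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            blobs.append(name)
    return sorted(folders), sorted(blobs)


root_folders, root_blobs = list_virtual_folder(blob_names)
august_folders, august_blobs = list_virtual_folder(blob_names, "sales/2019/08/")
print(root_folders, root_blobs)
print(august_folders, august_blobs)
```

Because the hierarchy exists only in the names, access control in this model cannot be attached to a "folder", which is consistent with permissions being set at the container level.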

Azure Data Lake (Gen-1) -- Optimized for big data storage and processing. Data is stored in a hierarchical folder structure, and security can be set at the folder level. Parallel reads and writes are supported. Storage costs more than Azure Storage.

Azure Data Lake (Gen-2) -- Optimized for big data storage and processing. Supports both object-based storage and hierarchical storage, combining the advantages of Azure Storage and Azure Data Lake (Gen-1).

Azure Data Lake Analytics -- Serverless processing of big data at cluster scale using U-SQL. You submit jobs without managing a cluster.

Azure Databricks -- Cluster-based processing of big data using Apache Spark.

Azure Data Factory -- Similar to SSIS combined with SQL Server Agent. Used to build control flows and tasks for processing data. Jobs can run once or on a schedule. Only UTC time is supported for job schedules. You can use SSIS, U-SQL, Databricks Spark, and Azure Data Factory tasks to process the data.
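Since schedules are expressed in UTC, a local run time has to be converted before it is entered. The snippet below is a minimal sketch using only the Python standard library; the helper to_utc_schedule and the example run time are assumptions for illustration, and the fixed offset does not account for daylight saving time.

```python
from datetime import datetime, timedelta, timezone


def to_utc_schedule(local_time, utc_offset_hours):
    """Convert a naive local wall-clock time to the equivalent UTC time.

    utc_offset_hours is a fixed offset (for example 5.5 for UTC+05:30);
    daylight saving changes are NOT handled in this sketch.
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    return local_time.replace(tzinfo=local_tz).astimezone(timezone.utc)


# Hypothetical example: a job that should run at 06:00 India Standard Time.
run_local = datetime(2019, 8, 30, 6, 0)
run_utc = to_utc_schedule(run_local, 5.5)
print(run_utc.strftime("%Y-%m-%dT%H:%M:%SZ"))  # 2019-08-30T00:30:00Z
```

For regions that observe daylight saving time, the offset changes during the year, so a schedule fixed in UTC will drift by an hour relative to local time unless it is adjusted.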

Azure SQL Database -- Supports databases up to 4 TB in size. Mainly used for OLTP systems. When you need a full SQL Server instance, use Azure SQL Database Managed Instance. You can also create a single database.

Azure SQL Data Warehouse -- Used for OLAP data marts and for data larger than 4 TB. Uses a massively parallel processing (MPP) architecture: data is processed on multiple nodes and stored in Azure Storage. You can create external tables to access HDFS and other data sources. Not all SQL Server features are currently supported.
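The MPP idea can be sketched in a few lines of Python: rows are assigned to distributions by hashing a key, each distribution aggregates its own rows, and the partial results are combined. This is only a conceptual illustration, not the service's implementation; the CRC32 hash, the function names, and the sample rows are assumptions, while the figure of 60 distributions matches how SQL Data Warehouse spreads a table.

```python
import zlib
from collections import Counter

NUM_DISTRIBUTIONS = 60  # SQL Data Warehouse spreads table data over 60 distributions


def distribution_for(key, n=NUM_DISTRIBUTIONS):
    """Map a distribution key to a distribution (CRC32 is a stand-in hash)."""
    return zlib.crc32(str(key).encode("utf-8")) % n


def partial_sums(rows):
    """Aggregate (key, amount) rows per distribution, as each node would do locally."""
    sums = Counter()
    for key, amount in rows:
        sums[distribution_for(key)] += amount
    return sums


# Hypothetical fact rows: (customer_id, sale_amount).
rows = [(customer_id, 10) for customer_id in range(1000)]
partials = partial_sums(rows)   # work done independently per distribution
total = sum(partials.values())  # final step combines the partial results
print(len(partials), total)
```

The same key always lands in the same distribution, which is why choosing a distribution key with many distinct, evenly spread values helps keep the nodes equally busy.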

Azure Analysis Services -- Hosts Analysis Services tabular models.

Power BI -- Used for reporting.

Azure DevOps -- Used for source control and automated deployment of code.
