August 31, 2019

Microsoft Azure: Free Microsoft Azure learning videos for different roles at Pluralsight


Go to the following link and register with your email. Three roles are available free of charge. The videos for each role are arranged in a sequence.

The Azure environment can be used for different purposes:
  1. Web-based platforms
  2. Big Data Platform
  3. AI platform

Choose the technologies based on the platform you are interested in.

August 30, 2019

Microsoft Azure: End to End technologies used in Azure for data storage and processing


Azure Storage (Blob storage) -- Used to store data files. Serves as the landing or storage area for data arriving from multiple systems. Create separate storage containers for landing, staging, processing, and archiving. Folders are virtual. Permissions are set at the container level. Storage cost is low. Cost and data availability differ across the Hot, Cool, and Archive access tiers.
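The "virtual folders" point can be illustrated with a minimal sketch. Inside a container the blob namespace is flat, and a folder is only a "/"-separated prefix in the blob name. The container names, the `blob_name` helper, and the source-system/date layout below are assumptions made for illustration, not a required convention.

```python
from datetime import date

# Hypothetical container names, one per stage mentioned above.
STAGE_CONTAINERS = ("landing", "staging", "processing", "archive")


def blob_name(source_system: str, load_date: date, file_name: str) -> str:
    """Build a blob name whose '/' separators act as virtual folders."""
    return f"{source_system}/{load_date:%Y/%m/%d}/{file_name}"


def blob_location(stage: str, source_system: str, load_date: date, file_name: str) -> str:
    """Return '<container>/<blob name>' for a file at the given stage."""
    if stage not in STAGE_CONTAINERS:
        raise ValueError(f"unknown stage: {stage!r}")
    return f"{stage}/{blob_name(source_system, load_date, file_name)}"


if __name__ == "__main__":
    # prints "landing/sales/2019/08/30/orders.csv"
    print(blob_location("landing", "sales", date(2019, 8, 30), "orders.csv"))
```

Moving a file from landing to archive then amounts to copying the blob to the same name in a different container.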

Azure Data Lake (Gen-1) -- Optimized for big data storage and processing. Data can be stored in a hierarchical folder structure. Security can be applied at the folder level. Parallel reads and writes are supported. Storage cost is higher than Azure Storage.

Azure Data Lake (Gen-2) -- Optimized for big data storage and processing. Supports both object-based storage and hierarchical storage. Combines the advantages of Azure Storage and Azure Data Lake (Gen-1).

Azure Data Lake Analytics -- Serverless, on-demand processing of big data with U-SQL jobs. There is no cluster to provision or manage.

Azure Databricks -- Cluster-based processing of big data using Spark.

Azure Data Factory -- Similar to SSIS and SQL Server Agent. Used to develop control flows and tasks for processing data. Jobs can be run once or on a schedule. Only UTC time is supported for job schedules. You can use SSIS, U-SQL, Databricks Spark, and Azure Data Factory tasks to process the data.
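Because schedules are defined in UTC, a local run time has to be converted before it is entered. The following is a minimal sketch using only the Python standard library (Python 3.9+). The `local_to_utc` helper and the example time zone are assumptions for illustration and are not part of Azure Data Factory.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def local_to_utc(local_time: str, tz_name: str) -> datetime:
    """Convert a 'YYYY-MM-DD HH:MM' wall-clock time in tz_name to UTC."""
    naive = datetime.strptime(local_time, "%Y-%m-%d %H:%M")
    aware = naive.replace(tzinfo=ZoneInfo(tz_name))
    return aware.astimezone(timezone.utc)


if __name__ == "__main__":
    # 06:00 in Chicago is 11:00 UTC in summer and 12:00 UTC in winter.
    print(local_to_utc("2019-08-30 06:00", "America/Chicago").isoformat())
    print(local_to_utc("2019-12-30 06:00", "America/Chicago").isoformat())
```

The two results differ by an hour, so a schedule fixed in UTC drifts against local time when daylight saving time starts or ends.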

Azure SQL Database -- Supports up to 4 TB of data. Mainly used for OLTP systems. When you need a full SQL Server instance, use Azure SQL Database Managed Instance. You can also create a single database.

Azure SQL Data Warehouse -- Used for OLAP data marts and when the data size exceeds 4 TB. Uses an MPP (massively parallel processing) architecture, so data is processed on multiple nodes. Data is stored in Azure Storage. You can create external tables to access HDFS and other data sources. Not all SQL features are currently supported.

Azure Analysis Services -- Tabular model for Analysis services.

Power BI -- For reporting.

Azure DevOps -- For source control and automated deployment of code.

August 29, 2019

Azure Key Vault -- To store sensitive data

Azure Key Vault is used to hide sensitive information such as SQL connection strings, SQL user names, and passwords. Its advantage is that you define secrets as key-value pairs: you give the connection string a name, and the connection string itself stays hidden from all the applications.

This avoids storing connection strings in source control or in applications. Applications and source control refer to the secret only by its name.
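Looking up a secret by name might be sketched as follows in Python, assuming the `azure-identity` and `azure-keyvault-secrets` packages are installed. The vault URL and the secret name `sql-connection-string` are hypothetical placeholders, and the split into two helper functions is a choice made for this example.

```python
def build_client(vault_url: str):
    """Create a Key Vault secret client (needs the Azure SDK packages)."""
    # Imported here so the lookup helper below can be used without the SDK.
    from azure.identity import DefaultAzureCredential
    from azure.keyvault.secrets import SecretClient

    return SecretClient(vault_url=vault_url, credential=DefaultAzureCredential())


def get_connection_string(client, secret_name: str) -> str:
    """Fetch a secret value by name; only the name appears in application code."""
    return client.get_secret(secret_name).value


# Usage, with a hypothetical vault URL and secret name:
#   client = build_client("https://my-vault.vault.azure.net")
#   conn_str = get_connection_string(client, "sql-connection-string")
```

Only the secret name appears in the code and in source control. The value is fetched at run time by an identity that has been granted access to the vault.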

For more information, check the following

August 28, 2019

Azure Data Factory - GetMetaData activity



The GetMetadata activity is used to get information about files stored in Azure Storage. It can return the file size, column count, last modified date, whether the file exists, and other information.

The following screenshot shows how to get information about all the files in a particular folder. The folder name is passed as a pipeline parameter.



The output of this activity can be used as input to a Stored Procedure activity, which can store the metadata in Azure SQL Database.
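To make the shape of that output concrete, here is a minimal Python sketch that turns a `childItems` result into one row per file, ready to be passed to a stored procedure. The sample payload, the folder name, and the `file_rows` helper are assumptions for illustration. In a pipeline the same list would typically be read with an expression such as `@activity('Get Metadata1').output.childItems`, where the activity name is hypothetical.

```python
import json

# Hypothetical sample of the activity output when "childItems" is requested.
SAMPLE_OUTPUT = json.loads(
    """
    {
      "childItems": [
        {"name": "orders.csv", "type": "File"},
        {"name": "2019", "type": "Folder"}
      ]
    }
    """
)


def file_rows(activity_output: dict, folder_name: str) -> list:
    """Return one dict per file (sub-folders skipped) for the given folder."""
    return [
        {"folder_name": folder_name, "file_name": item["name"]}
        for item in activity_output.get("childItems", [])
        if item.get("type") == "File"
    ]


if __name__ == "__main__":
    # prints [{'folder_name': 'landing/sales', 'file_name': 'orders.csv'}]
    print(file_rows(SAMPLE_OUTPUT, "landing/sales"))
```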

For more information check the following