Search This Blog

September 10, 2019

Azure Data Factory - Execute Pipeline Activity

Execute Pipeline activity is used to call another Azure Data Factory Pipeline. This will help to develop reusable pipelines and use it in other pipelines

For example, you can create a separate pipeline to archive the files after processing. You can create one pipeline for archiving files (move the blobs or files to different folders) and call that pipeline in multiple pipelines. This will help you to reduce development time and increase code reusability.

Similarly, you can create a separate pipeline for sending emails in Azure data factory and re-use in multiple pipelines.


In the following screen-shot, we are calling two different pipelines (Archiving Pipeline, Send_Error_Message)









For more information, please check the following
https://docs.microsoft.com/en-us/azure/data-factory/control-flow-execute-pipeline-activity




September 09, 2019

Azure Data Warehouse - Materialized View

Materialized View

Materialized Views are similar to Indexed Views in SQL Server. These are used to increase query performance and hide complex query logic from the users.

To enable disabled materialized view, you need to use the "Alter" statement.

ALTER MATERIALIZED VIEW My_Indexed_View REBUILD;

Materialized views are visible within SQL Server Management Studio under the views folder of the Azure SQL Data Warehouse instance

Limitations

  • Materialized views are disabled during an update or delete operations on the base tables.
  • Only Clustered Cloumnstore Index is supported.
  • Only Hash and Round Robin distributions are supported.

Sample Code

CREATE MATERIALIZED VIEW mv_test2
WITH (distribution = hash(i_category_id), FOR_APPEND)
AS
SELECT MAX(i.i_rec_start_date) as max_i_rec_start_date, MIN(i.i_rec_end_date) as min_i_rec_end_date, i.i_item_sk, i.i_item_id, i.i_category_id
FROM syntheticworkload.item i
GROUP BY i.i_item_sk, i.i_item_id, i.i_category_id


For more information, check the following links

https://docs.microsoft.com/en-us/sql/t-sql/statements/create-materialized-view-as-select-transact-sql?view=azure-sqldw-latest


September 08, 2019

Azure Data Factory - If Condition activity

If activity is used to perform different activities based on the input expressions. In the following, we are checking if the file is present or not using "GetMetaData" activity. If file presents perform the activities in if True activities. If file does not present then Edit the False activities.

Passing input values for If condition activity



Set validation expression in settings for If condition activity.


Adding activity for True or False conditions.


For more information, please check https://docs.microsoft.com/en-us/azure/data-factory/control-flow-if-condition-activity

September 07, 2019

Azure Data Factory - For Each Activity

"ForEach" activity in Azure Data Factory is used to loop thru each row of the input and perform the actions based on the data.

Following screenshot shows two activities one is GetMetaData and second one is ForEach loop. The output of the GetMetaData is passed to the ForEach activity in the Settings. If the checkbox for "Sequential" is selected all the activities are performed in sequence for all the child items. If parallel execution is needed we need to un-check "Sequential".


"ForEach" activity can be used to an output of Lookup activity as well.

You can perform multiple activities for each row of data received for the input of ForEach activity. You can add all the activities by going to Activities and Edit Activities.

Limitations and workarounds

Here are some limitations of the ForEach activity and suggested workarounds.

Limitation Workaround
You can't nest a ForEach loop inside another ForEach loop (or an Until loop). Design a two-level pipeline where the outer pipeline with the outer ForEach loop iterates over an inner pipeline with the nested loop.
The ForEach activity has a maximum batchCount of 50 for parallel processing, and a maximum of 100,000 items. Design a two-level pipeline where the outer pipeline with the ForEach activity iterates over an inner pipeline.

For more information, please check the documentation at  https://docs.microsoft.com/en-us/azure/data-factory/control-flow-for-each-activity

September 06, 2019

Azure Analysis Services

Supported Features

  • AAS is available in 3 tiers (Developer for development, Basic and Standard)
  • We can pause and resume the server when not used to reduce the cost.
  • Supports only the tabular model.
  • Scale-out up to 8 nodes when the load on the model is increased.
  • Partitions are supported
  • Row-level permissions are supported.
  • Perspectives are supported.
  • Bi-directional relations are supported.
  • Metadata is stored in Azure Blob storage.

 Limitations
  • A multidimensional model is not supported
  • Power-pivot for SharePoint is not supported.