How to – improve data migration performance – SSIS & Azure Data Factory (Dataverse / Dynamics 365)


In one of our projects, we were executing SSIS Packages (KingswaySoft’s Dynamics 365 SSIS Integration Toolkit) under Azure-SSIS Integration Runtime in Azure Data Factory.

Check out –

Deploy and run SSIS Package in Azure Data Factory

Deploy and run SSIS Packages that use KingswaySoft’s SSIS Integration Toolkit on Azure Data Factory.

After trying out different combinations, we eventually settled on a batch size of 10 and a thread count of 15.

https://nishantrana.me/2021/06/08/data-migration-optimum-batch-size-and-threads-for-maximum-throughput-microsoft-dataverse-dynamics-365/

Also, we used multiplexing – running the CRM Destination Component under different application users.

To be precise, four in our case; this can be increased further to improve the throughput.

Also, based on the recommendation of our Microsoft FastTrack architect, we raised a support ticket to increase the number of web servers allocated from 2 to 3.

Below were our findings.

The earlier run used a batch size of 100 and 20 threads, with 2 servers allocated.

On updating the batch size to 10 and the thread count to 15, and with the number of allocated servers increased to 3, there was a huge performance gain.

Check the table below – 

The above table shows a sample run in the sandbox environment; during the final run in production, the number of allocated servers was increased to 6, giving a further improvement.

Also, check out the blog post below to understand the affinity cookie and its effect on performance, in case the migration is done using custom code –

https://markcarrington.dev/2021/05/26/improving-bulk-dataverse-performance-with-enableaffinitycookie/

Hope it helps..


Write batch size, data integration unit, and degree of copy parallelism in Azure Data Factory for Dynamics CRM / 365 Dataset


Let us take a simple example where we are moving contact records (.CSV) stored in an Azure File Share to Dataverse / Dynamics 365 (UPSERT).

The CSV file has 50,000 sample contact records (generated using https://extendsclass.com/csv-generator.html) stored in Azure File Storage.

Another option for generating sample data –

https://nishantrana.me/2020/05/26/using-data-spawner-component-ssis-to-generate-sample-data-in-dynamics-365/
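
If you would rather script the test data yourself, a minimal PowerShell sketch could generate and upload a CSV like this; the storage account, key, file share, and column names below are all assumptions.

```powershell
# Sketch only - assumed storage account, file share, and column names.
$rows = 1..50000 | ForEach-Object {
    [pscustomobject]@{
        firstname    = "First$_"
        lastname     = "Last$_"
        emailaddress = "contact$_@example.com"
    }
}
$rows | Export-Csv -Path ".\contacts.csv" -NoTypeInformation

# Upload the CSV to the Azure File Share used as the source of the copy activity.
$ctx = New-AzStorageContext -StorageAccountName "migrationstorage" -StorageAccountKey "<storage-account-key>"
Set-AzStorageFileContent -ShareName "sourcefiles" -Source ".\contacts.csv" -Path "contacts.csv" -Context $ctx
```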

The Source in our Data Factory pipeline.

The Sink is our Dynamics 365 / Dataverse sandbox environment; here we are using the Upsert write behavior.

For the Sink, the default Write batch size is 10.

Max concurrent connections specifies the upper limit of concurrent connections that can be established to the data store during the activity run.

Below is our Mapping configuration

The Settings tab of the Copy data activity allows us to specify the following –

Data Integration Unit specifies the power of the copy execution – a measure representing the combination of CPU, memory, and network resource allocation.

Degree of copy parallelism specifies the maximum number of parallel threads the copy activity can use.

Let us run the pipeline with the default values.

  • Write Batch Size (Sink) – 10
  • Degree of copy parallelism – 10
  • Data integration unit – Auto (4)

The result – it took around 58 minutes to create the 50K contact records.

We then ran the pipeline a few more times, specifying different batch sizes and degrees of copy parallelism.

We kept Max concurrent connections blank and Data Integration Unit as Auto (during our testing, even when we set it to higher values, the DIUs used was always 4).

Below are the results we got –

Write Batch Size   Degree of copy parallelism   Data Integration Unit (Auto)   Total Time (Minutes)
100                8                            4                              35
100                16                           4                              29
1000               32                           4                              35

250                8                            4                              35
250                16                           4                              25
250                32                           4                              55

500                8                            4                              38
500                16                           4                              29
500                32                           4                              28

750                8                            4                              37
750                16                           4                              25
750                32                           4                              17

999                8                            4                              36
999                16                           4                              30
999                32                           4                              20

The results show that increasing the batch size and degree of copy parallelism improves the performance in our scenario.

Ideally, we should run a few tests with different combinations before settling for a specific configuration as it could vary.
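
For reference, below is a minimal sketch of how these settings map to the copy activity JSON definition (using the best-performing combination from the table above – batch size 750, parallelism 32), and how the pipeline could be deployed with the Az.DataFactory cmdlets. The resource group, data factory, pipeline, and dataset names are assumptions, not the actual ones used in this run.

```powershell
# Sketch only - assumed resource, pipeline, and dataset names.
$pipelineJson = @'
{
  "name": "CopyContactsToDataverse",
  "properties": {
    "activities": [
      {
        "name": "CopyContacts",
        "type": "Copy",
        "inputs":  [ { "referenceName": "ContactsCsvDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "DataverseContactDataset", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "DelimitedTextSource" },
          "sink": {
            "type": "DynamicsSink",
            "writeBehavior": "upsert",
            "writeBatchSize": 750
          },
          "parallelCopies": 32,
          "dataIntegrationUnits": 4
        }
      }
    ]
  }
}
'@
Set-Content -Path ".\CopyContactsToDataverse.json" -Value $pipelineJson

# Deploy (or update) the pipeline definition in the data factory.
Set-AzDataFactoryV2Pipeline -ResourceGroupName "rg-dataverse-migration" -DataFactoryName "adf-dataverse-migration" `
    -Name "CopyContactsToDataverse" -DefinitionFile ".\CopyContactsToDataverse.json" -Force
```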

On trying to set the batch size to more than 1000, we would get the below error –

ExecuteMultiple Request batch size exceeds the maximum batch size allowed.

This is because Dataverse limits an ExecuteMultiple request to a maximum of 1,000 requests per batch.

Also refer –

Optimizing Data Migration – https://community.dynamics.com/crm/b/crminthefield/posts/optimizing-data-migration-integration-with-power-platform

Using Data Factory with Dynamics 365 – https://nishantrana.me/2020/10/21/posts-on-azure-data-factory/

Optimum batch size with SSIS – https://nishantrana.me/2018/06/04/optimum-batch-size-while-using-ssis-integration-toolkit-for-microsoft-dynamics-365/

Hope it helps..


Posts on Azure Data Factory


  • How to – improve data migration performance – SSIS & Azure Data Factory (Dataverse / Dynamics 365)
  • Write batch size, data integration unit, and degree of copy parallelism in Azure Data Factory for Dynamics CRM / 365 Dataset
  • Using SQL Server Management Studio to deploy and run SSIS package in Azure Data Factory
  • Deploy and run SSIS Integration Toolkit for Dynamics 365 on Azure Data Factory (KingswaySoft)
  • Deploy and run SSIS package in Azure Data Factory
  • Use Azure Data Factory V2 to load data into Dynamics 365
  • Failed to get response from server error while trying to connect to Dynamics 365 using linked services – Azure Data Factory

Using SQL Server Management Studio to deploy and run SSIS package in Azure Data Factory


In our previous post, we created the SSIS Catalog (SSISDB) in Azure and deployed the SSIS package using SSDT.

Supported versions of SSDT (SQL Server Data Tools) for deploying SSIS packages to Azure:

  • For Visual Studio 2017, version 15.3 or later.
  • For Visual Studio 2015, version 17.2 or later.

In this post, we’d use SSMS to deploy the packages in Azure.

Connect to the Azure SQL Server

Expand the Integration Services Catalog, right-click the Projects folder, and select the Deploy Project option.

Enter the source details in the deployment wizard

Select the option SSIS in Azure Data Factory

Select the existing or create a new folder for the project

Click on Deploy after successful validation and review.

Here, in our case, it failed with the below message –

There is no available node. Please check node status on the monitoring page of the ADF portal and ensure that at least one node is in running state and try again. (Microsoft SQL Server, Error: 50000)

The error occurs because the Azure-SSIS Integration Runtime is in the Stopped status.

Navigate to your Azure Data Factory instance and start the runtime.
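
Alternatively, the runtime can be started and checked from PowerShell using the Az.DataFactory module; the resource group, data factory, and runtime names below are assumptions.

```powershell
# Start the Azure-SSIS Integration Runtime from PowerShell instead of the portal
# (resource group, data factory, and runtime names are assumptions).
Start-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -Name "Azure-SSIS-IR" -Force

# Confirm that the runtime (and at least one node) is running before retrying the deployment.
Get-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -Name "Azure-SSIS-IR" -Status
```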

After 10 minutes or so, the runtime will be up and running.

This time the deployment is successful.

We can see the packages available within the pipeline.

Hope it helps..

Deploy and run SSIS Integration Toolkit for Dynamics 365 on Azure Data Factory (KingswaySoft)


In the previous post, we saw how to deploy and run SSIS packages on the cloud.

Here we take it one step further and will deploy and run SSIS packages that use KingswaySoft's SSIS Integration Toolkit components.

Here we will need an Azure subscription, where we will host the SSISDB, followed by provisioning an Azure-SSIS Integration Runtime instance.

We will also need an Azure Blob Storage account along with Azure Storage Explorer to upload the installation files of the SSIS Integration Toolkit.

Let us first start by creating an Azure SQL Server instance.

We have specified the below details.

Next, create the database inside the server.

Now, with the Azure SQL Server and database created, the next step is to create the Storage account.

With the Azure Storage account created, let us connect to it using Azure Storage Explorer.

Create a new blob container in the Azure Storage account.

For the blob container created, right-click and select Get Shared Access Signature.

Specify the expiry time along with Write permissions; this is for logging purposes when the Azure-SSIS IR is being provisioned.

Copy the URL (it will be used in the PowerShell script later).
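
If you prefer scripting over Storage Explorer, the container SAS URI can also be generated with PowerShell; the account, key, and container names below are assumptions.

```powershell
# Sketch only - assumed storage account and container names.
$ctx = New-AzStorageContext -StorageAccountName "ssiscustomsetup" -StorageAccountKey "<storage-account-key>"
$sasToken = New-AzStorageContainerSASToken -Name "ssis-setup-files" -Permission "rwl" `
    -ExpiryTime (Get-Date).AddYears(1) -Context $ctx

# Combine into the container SAS URI expected by the provisioning script.
"https://ssiscustomsetup.blob.core.windows.net/ssis-setup-files?" + $sasToken.TrimStart('?')
```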

Now let us get the installation files and programs from the KingswaySoft Shared Blob Container, which we’d place in the blob container we just created.

https://kingswaysoftgeneral.blob.core.windows.net/ssis-integration-toolkit-ultimate?st=2019-07-04T16%3A10%3A25Z&se=2059-07-05T16%3A10%3A00Z&sp=rl&sv=2018-03-28&sr=c&sig=LAGvouFpkZHEk%2BH8%2B0pK%2FDNg7B3jPUf%2FJ91%2BJ%2FEeKg0%3D

Right-click Storage Accounts and select Connect to Azure Storage

Select Use a shared access signature (SAS) URI

Paste the KingswaySoft blob container URL.

We can see the below contents added to the blob container.

Select all and copy all the files.

Paste them into the blob container we had created earlier.

With things now set up, let us get the PowerShell script used to provision the Azure-SSIS Integration Runtime (Initializations.ps1) and update it.

Specify the appropriate values and run the script. (This requires the Azure PowerShell – Az – module to be installed.)
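
For context, the core of such a provisioning script typically looks something like the sketch below; every name, size, and credential here is an assumption to be replaced with your own values.

```powershell
# Sketch only - all names, sizes, and credentials below are assumptions.
$resourceGroup = "rg-dataverse-migration"
$dataFactory   = "adf-dataverse-migration"
$irName        = "Azure-SSIS-IR"

# Provision the Azure-SSIS IR, pointing SetupScriptContainerSasUri at the blob
# container holding the KingswaySoft setup files copied earlier.
Set-AzDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroup -DataFactoryName $dataFactory `
    -Name $irName -Type Managed -Location "WestEurope" `
    -NodeSize "Standard_D4_v3" -NodeCount 2 -MaxParallelExecutionsPerNode 8 `
    -Edition Standard -LicenseType LicenseIncluded `
    -CatalogServerEndpoint "myazuresqlserver.database.windows.net" `
    -CatalogAdminCredential (Get-Credential) -CatalogPricingTier "S1" `
    -SetupScriptContainerSasUri "<SAS URI of the blob container with the setup files>"

# Start the runtime and check its status until the nodes report as running.
Start-AzDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroup -DataFactoryName $dataFactory -Name $irName -Force
Get-AzDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroup -DataFactoryName $dataFactory -Name $irName -Status
```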

Also, make sure to update the firewall rules to allow the client to connect.
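
For example, a rule for your client IP could be added like this (server name, rule name, and IP address are placeholders):

```powershell
# Allow the client IP to reach the Azure SQL server hosting SSISDB
# (server name, rule name, and IP address are placeholders).
New-AzSqlServerFirewallRule -ResourceGroupName "rg-dataverse-migration" -ServerName "myazuresqlserver" `
    -FirewallRuleName "AllowMyClient" -StartIpAddress "203.0.113.10" -EndIpAddress "203.0.113.10"
```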

Update the PowerShell Script appropriately

We can check the status as shown below.

In parallel, we can see our Azure Data Factory created with the integration runtime, which is in Starting status.

After a few minutes, we will have the integration runtime up and running.

Below is our SSIS Package that we would be deploying to the cloud.

It uses Data Spawner Component to generate test data for Contacts and the CDS Destination component to create those records inside CDS.

Right-click the integration project and select Deploy

Specify connection details along with Path

After successful deployment, let us create a new pipeline inside the Azure Data Factory.

Drag and drop the Execute SSIS Package activity and click on the Settings tab.

Connect to the deployed package, then Validate and Debug to test the pipeline.
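
Once the pipeline is published, it can also be triggered and monitored from PowerShell rather than the Debug option; a sketch with assumed names:

```powershell
# Sketch only - assumed resource group, data factory, and pipeline names.
$runId = Invoke-AzDataFactoryV2Pipeline -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -PipelineName "ExecuteKingswaySoftPackage"

# Check the run status (Queued / InProgress / Succeeded / Failed).
Get-AzDataFactoryV2PipelineRun -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -PipelineRunId $runId
```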

The pipeline run will initially be in Queued status.

After successful execution, navigate to our Dynamics 365 Sales Hub.

We can see 10 contact records created by the SSIS Package.

Hope it helps..


Use Azure Data Factory V2 to load data into Dynamics 365


Let us take a simple example where we will set up an Azure Data Factory instance and use Copy data activity to move data from the Azure SQL database to Dynamics 365.

Log in to the Azure Portal.

https://portal.azure.com

Search for Data factories

Create a new data factory instance

Once the deployment is successful, click on Go to resource

Inside the data factory click on Author & Monitor

Click on Author in the left navigation

Create a new Pipeline

And drag the Copy data activity to it

Go to the Source tab, and create a new dataset.

Below is our Azure SQL database with the contacts table, which will be our source here.


Select Azure SQL Database as the source dataset.


Create a new linked service to specify the connection properties.


Specify the details to connect to the Azure SQL Database.


We have selected the contacts table here.


Similarly, let us define a new dataset for Sink which will connect to our Dynamics 365 Instance.



Select the Dynamics dataset and specify the linked service.

Specify the details of the Dynamics 365 instance to connect to.
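
For reference, a Dynamics linked service created through the UI roughly corresponds to a JSON definition like the sketch below, which could also be deployed from PowerShell. The URL, user name, and resource names are placeholders, and newer environments typically use service principal authentication instead of Office365 credentials.

```powershell
# Sketch only - placeholder URL, user, and resource names.
$linkedServiceJson = @'
{
  "name": "Dynamics365LinkedService",
  "properties": {
    "type": "Dynamics",
    "typeProperties": {
      "deploymentType": "Online",
      "authenticationType": "Office365",
      "serviceUri": "https://contoso.crm.dynamics.com",
      "username": "user@contoso.onmicrosoft.com",
      "password": { "type": "SecureString", "value": "<password>" }
    }
  }
}
'@
Set-Content -Path ".\Dynamics365LinkedService.json" -Value $linkedServiceJson

Set-AzDataFactoryV2LinkedService -ResourceGroupName "rg-d365-demo" -DataFactoryName "adf-d365-demo" `
    -Name "Dynamics365LinkedService" -DefinitionFile ".\Dynamics365LinkedService.json" -Force
```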

We have selected contact entity as the destination.

Within the Mapping tab, we can specify the fields to be mapped.

Below is how we have specified the mapping.

Click on Validate and after successful validation, click on Debug to run the pipeline.

Within the Output window, we can see the status.

After the successful run, we can see the contact records created inside Dynamics 365.

We can specify a trigger for the pipeline as shown below.
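
For example, a simple schedule trigger could also be defined and started from PowerShell, as sketched here (trigger, pipeline, and resource names are assumptions):

```powershell
# Sketch only - assumed trigger, pipeline, and resource names.
$triggerJson = @'
{
  "name": "DailyContactLoadTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": { "frequency": "Day", "interval": 1, "startTime": "2021-07-01T02:00:00Z", "timeZone": "UTC" }
    },
    "pipelines": [
      { "pipelineReference": { "referenceName": "CopyContactsPipeline", "type": "PipelineReference" } }
    ]
  }
}
'@
Set-Content -Path ".\DailyContactLoadTrigger.json" -Value $triggerJson

# Create the trigger and start it so it begins firing on schedule.
Set-AzDataFactoryV2Trigger -ResourceGroupName "rg-d365-demo" -DataFactoryName "adf-d365-demo" `
    -Name "DailyContactLoadTrigger" -DefinitionFile ".\DailyContactLoadTrigger.json"
Start-AzDataFactoryV2Trigger -ResourceGroupName "rg-d365-demo" -DataFactoryName "adf-d365-demo" `
    -Name "DailyContactLoadTrigger" -Force
```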

Publish All will publish the changes to the data factory.

Hope it helps..
