How to – improve data migration performance – SSIS & Azure Data Factory (Dataverse / Dynamics 365)


In one of our projects, we were executing SSIS Packages (KingswaySoft’s Dynamics 365 SSIS Integration Toolkit) under Azure-SSIS Integration Runtime in Azure Data Factory.

Check out –

Deploy and run SSIS Package in Azure Data Factory

Deploy and run SSIS Packages that use KingswaySoft’s SSIS Integration Toolkit on Azure Data Factory.

After trying out different combinations, we eventually settled on a batch size of 10 and a thread count of 15.

https://nishantrana.me/2021/06/08/data-migration-optimum-batch-size-and-threads-for-maximum-throughput-microsoft-dataverse-dynamics-365/

Also, we used multiplexing – running the CRM Destination Component under different application users.

To be precise, four in our case; this can be increased further to improve the throughput.

Also, based on the recommendation of our Microsoft FastTrack architect, we raised a support ticket to increase the number of web servers allocated from 2 to 3.

Below were our findings.

The earlier run used a batch size of 100 and 20 threads, with 2 servers allocated.

On updating the batch size to 10 and the thread count to 15, and with the number of allocated servers increased to 3, there was a huge performance gain.

Check the table below – 

The above table shows a sample run in the sandbox environment; during the final run in production, the number of allocated servers was increased to 6, giving a further improvement.

Also, check out the blog post below to understand the affinity cookie and its effect on performance, in case the migration is done using custom code –

https://markcarrington.dev/2021/05/26/improving-bulk-dataverse-performance-with-enableaffinitycookie/

Hope it helps..


Write batch size, data integration unit, and degree of copy parallelism in Azure Data Factory for Dynamics CRM / 365 Dataset


Let us take a simple example where we are moving contact records (.CSV) stored in an Azure File Share to Dataverse / Dynamics 365 (UPSERT).

The CSV file has 50,000 sample contact records (generated using https://extendsclass.com/csv-generator.html) stored in Azure File Storage.

Another option for generating sample data –

https://nishantrana.me/2020/05/26/using-data-spawner-component-ssis-to-generate-sample-data-in-dynamics-365/
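
If you would rather script the test data yourself, a minimal PowerShell sketch could generate and upload a CSV like this; the storage account, key, file share, and column names below are all assumptions.

```powershell
# Sketch only - assumed storage account, file share, and column names.
$rows = 1..50000 | ForEach-Object {
    [pscustomobject]@{
        firstname    = "First$_"
        lastname     = "Last$_"
        emailaddress = "contact$_@example.com"
    }
}
$rows | Export-Csv -Path ".\contacts.csv" -NoTypeInformation

# Upload the CSV to the Azure File Share used as the source of the copy activity.
$ctx = New-AzStorageContext -StorageAccountName "migrationstorage" -StorageAccountKey "<storage-account-key>"
Set-AzStorageFileContent -ShareName "sourcefiles" -Source ".\contacts.csv" -Path "contacts.csv" -Context $ctx
```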

The Source in our Data Factory pipeline.

The Sink is our Dynamics 365 / Dataverse sandbox environment; here we are using the Upsert write behavior.

For the Sink, the default Write batch size is 10.

Max concurrent connections specifies the upper limit of concurrent connections that can be established to the data store during the activity run.

Below is our Mapping configuration

The Settings tab of the Copy data activity allows us to specify the following –

Data Integration Unit specifies the power of the copy execution – a measure representing the combination of CPU, memory, and network resource allocation.

Degree of copy parallelism specifies the maximum number of parallel threads the copy activity can use.

Let us run the pipeline with the default values.

  • Write Batch Size (Sink) – 10
  • Degree of copy parallelism – 10
  • Data integration unit – Auto (4)

The result – it took around 58 minutes to create the 50K contact records.

We then ran the pipeline a few more times, specifying different batch sizes and degrees of copy parallelism.

We kept Max concurrent connections blank and Data Integration Unit as Auto (during our testing, even when we set it to higher values, the DIUs used was always 4).

Below are the results we got –

Write Batch Size   Degree of copy parallelism   Data Integration Unit (Auto)   Total Time (Minutes)
100                8                            4                              35
100                16                           4                              29
1000               32                           4                              35

250                8                            4                              35
250                16                           4                              25
250                32                           4                              55

500                8                            4                              38
500                16                           4                              29
500                32                           4                              28

750                8                            4                              37
750                16                           4                              25
750                32                           4                              17

999                8                            4                              36
999                16                           4                              30
999                32                           4                              20

The results show that increasing the batch size and degree of copy parallelism improves the performance in our scenario.

Ideally, we should run a few tests with different combinations before settling for a specific configuration as it could vary.
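
For reference, below is a minimal sketch of how these settings map to the copy activity JSON definition (using the best-performing combination from the table above – batch size 750, parallelism 32), and how the pipeline could be deployed with the Az.DataFactory cmdlets. The resource group, data factory, pipeline, and dataset names are assumptions, not the actual ones used in this run.

```powershell
# Sketch only - assumed resource, pipeline, and dataset names.
$pipelineJson = @'
{
  "name": "CopyContactsToDataverse",
  "properties": {
    "activities": [
      {
        "name": "CopyContacts",
        "type": "Copy",
        "inputs":  [ { "referenceName": "ContactsCsvDataset", "type": "DatasetReference" } ],
        "outputs": [ { "referenceName": "DataverseContactDataset", "type": "DatasetReference" } ],
        "typeProperties": {
          "source": { "type": "DelimitedTextSource" },
          "sink": {
            "type": "DynamicsSink",
            "writeBehavior": "upsert",
            "writeBatchSize": 750
          },
          "parallelCopies": 32,
          "dataIntegrationUnits": 4
        }
      }
    ]
  }
}
'@
Set-Content -Path ".\CopyContactsToDataverse.json" -Value $pipelineJson

# Deploy (or update) the pipeline definition in the data factory.
Set-AzDataFactoryV2Pipeline -ResourceGroupName "rg-dataverse-migration" -DataFactoryName "adf-dataverse-migration" `
    -Name "CopyContactsToDataverse" -DefinitionFile ".\CopyContactsToDataverse.json" -Force
```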

On trying to set the batch size to more than 1000, we would get the below error –

ExecuteMultiple Request batch size exceeds the maximum batch size allowed.

This is because Dataverse limits an ExecuteMultiple request to a maximum of 1,000 requests per batch.

Also refer –

Optimizing Data Migration – https://community.dynamics.com/crm/b/crminthefield/posts/optimizing-data-migration-integration-with-power-platform

Using Data Factory with Dynamics 365 – https://nishantrana.me/2020/10/21/posts-on-azure-data-factory/

Optimum batch size with SSIS – https://nishantrana.me/2018/06/04/optimum-batch-size-while-using-ssis-integration-toolkit-for-microsoft-dynamics-365/

Hope it helps..


Posts on Azure Data Factory


  • How to – improve data migration performance – SSIS & Azure Data Factory (Dataverse / Dynamics 365)
  • Write batch size, data integration unit, and degree of copy parallelism in Azure Data Factory for Dynamics CRM / 365 Dataset
  • Using SQL Server Management Studio to deploy and run SSIS package in Azure Data Factory
  • Deploy and run SSIS Integration Toolkit for Dynamics 365 on Azure Data Factory (KingswaySoft)
  • Deploy and run SSIS package in Azure Data Factory
  • Use Azure Data Factory V2 to load data into Dynamics 365
  • Failed to get response from server error while trying to connect to Dynamics 365 using linked services – Azure Data Factory

Using SQL Server Management Studio to deploy and run SSIS package in Azure Data Factory


In our previous post, we created the SSIS Catalog (SSISDB) in Azure and deployed the SSIS package using SSDT.

Supported versions of SSDT (SQL Server Data Tools) for deploying SSIS packages to Azure:

  • For Visual Studio 2017, version 15.3 or later.
  • For Visual Studio 2015, version 17.2 or later.

In this post, we’d use SSMS to deploy the packages in Azure.

Connect to the Azure SQL Server

Expand the Integration Services Catalog, right-click the Projects folder, and select the Deploy Project option.

Enter the source details in the deployment wizard

Select the option SSIS in Azure Data Factory

Select the existing or create a new folder for the project

Click on Deploy after successful validation and review.

Here, in our case, it failed with the below message –

There is no available node. Please check node status on the monitoring page of the ADF portal and ensure that at least one node is in running state and try again. (Microsoft SQL Server, Error: 50000)

The error occurs because the Azure-SSIS Integration Runtime is in the Stopped status.

Navigate to your Azure Data Factory instance and start the runtime.
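
Alternatively, the runtime can be started and checked from PowerShell using the Az.DataFactory module; the resource group, data factory, and runtime names below are assumptions.

```powershell
# Start the Azure-SSIS Integration Runtime from PowerShell instead of the portal
# (resource group, data factory, and runtime names are assumptions).
Start-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -Name "Azure-SSIS-IR" -Force

# Confirm that the runtime (and at least one node) is running before retrying the deployment.
Get-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -Name "Azure-SSIS-IR" -Status
```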

After 10 minutes or so, the runtime will be up and running.

This time the deployment is successful.

We can see the packages available within the pipeline.

Hope it helps..

Deploy and run SSIS Integration Toolkit for Dynamics 365 on Azure Data Factory (KingswaySoft)


In the previous post, we saw how to deploy and run SSIS packages on the cloud.

Here we take it one step further and will deploy and run SSIS packages that use KingswaySoft's SSIS Integration Toolkit components.

Here we will need an Azure subscription, where we will host the SSISDB, followed by provisioning an Azure-SSIS Integration Runtime instance.

We will also need an Azure Blob Storage account along with Azure Storage Explorer to upload the installation files of the SSIS Integration Toolkit.

Let us first start by creating an Azure SQL Server instance.

We have specified the below details.

Next, create the database inside the server.

Now, with the Azure SQL Server and database created, the next step is to create the Storage account.

With the Azure Storage account created, let us connect to it using Azure Storage Explorer.

Create a new blob container in the Azure Storage account.

For the blob container created, right-click and select Get Shared Access Signature.

Specify the expiry time along with Write permissions; this is for logging purposes when the Azure-SSIS IR is being provisioned.

Copy the URL (it will be used in the PowerShell script later).
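
If you prefer scripting over Storage Explorer, the container SAS URI can also be generated with PowerShell; the account, key, and container names below are assumptions.

```powershell
# Sketch only - assumed storage account and container names.
$ctx = New-AzStorageContext -StorageAccountName "ssiscustomsetup" -StorageAccountKey "<storage-account-key>"
$sasToken = New-AzStorageContainerSASToken -Name "ssis-setup-files" -Permission "rwl" `
    -ExpiryTime (Get-Date).AddYears(1) -Context $ctx

# Combine into the container SAS URI expected by the provisioning script.
"https://ssiscustomsetup.blob.core.windows.net/ssis-setup-files?" + $sasToken.TrimStart('?')
```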

Now let us get the installation files and programs from the KingswaySoft Shared Blob Container, which we’d place in the blob container we just created.

https://kingswaysoftgeneral.blob.core.windows.net/ssis-integration-toolkit-ultimate?st=2019-07-04T16%3A10%3A25Z&se=2059-07-05T16%3A10%3A00Z&sp=rl&sv=2018-03-28&sr=c&sig=LAGvouFpkZHEk%2BH8%2B0pK%2FDNg7B3jPUf%2FJ91%2BJ%2FEeKg0%3D

Right-click Storage Accounts and select Connect to Azure Storage

Select Use a shared access signature (SAS) URI

Paste the KingswaySoft blob container URL.

We can see the below contents added to the blob container.

Select all and copy all the files.

Paste them into the blob container we had created earlier.

With things now set up, let us get the PowerShell script used to provision the Azure-SSIS Integration Runtime (Initializations.ps1) and update it.

Specify the appropriate values and run the script. (This requires the Azure PowerShell – Az – module to be installed.)
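
For context, the core of such a provisioning script typically looks something like the sketch below; every name, size, and credential here is an assumption to be replaced with your own values.

```powershell
# Sketch only - all names, sizes, and credentials below are assumptions.
$resourceGroup = "rg-dataverse-migration"
$dataFactory   = "adf-dataverse-migration"
$irName        = "Azure-SSIS-IR"

# Provision the Azure-SSIS IR, pointing SetupScriptContainerSasUri at the blob
# container holding the KingswaySoft setup files copied earlier.
Set-AzDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroup -DataFactoryName $dataFactory `
    -Name $irName -Type Managed -Location "WestEurope" `
    -NodeSize "Standard_D4_v3" -NodeCount 2 -MaxParallelExecutionsPerNode 8 `
    -Edition Standard -LicenseType LicenseIncluded `
    -CatalogServerEndpoint "myazuresqlserver.database.windows.net" `
    -CatalogAdminCredential (Get-Credential) -CatalogPricingTier "S1" `
    -SetupScriptContainerSasUri "<SAS URI of the blob container with the setup files>"

# Start the runtime and check its status until the nodes report as running.
Start-AzDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroup -DataFactoryName $dataFactory -Name $irName -Force
Get-AzDataFactoryV2IntegrationRuntime -ResourceGroupName $resourceGroup -DataFactoryName $dataFactory -Name $irName -Status
```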

Also, make sure to update the firewall rules to allow the client to connect.
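
For example, a rule for your client IP could be added like this (server name, rule name, and IP address are placeholders):

```powershell
# Allow the client IP to reach the Azure SQL server hosting SSISDB
# (server name, rule name, and IP address are placeholders).
New-AzSqlServerFirewallRule -ResourceGroupName "rg-dataverse-migration" -ServerName "myazuresqlserver" `
    -FirewallRuleName "AllowMyClient" -StartIpAddress "203.0.113.10" -EndIpAddress "203.0.113.10"
```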

Update the PowerShell Script appropriately

We can check the status as shown below.

In parallel, we can see our Azure Data Factory created with the integration runtime, which is in Starting status.

After a few minutes, we will have the integration runtime up and running.

Below is our SSIS Package that we would be deploying to the cloud.

It uses Data Spawner Component to generate test data for Contacts and the CDS Destination component to create those records inside CDS.

Right-click the integration project and select Deploy

Specify connection details along with Path

After successful deployment, let us create a new pipeline inside the Azure Data Factory.

Drag and drop the Execute SSIS Package activity and click on the Settings tab.

Connect to the deployed package, then Validate and Debug to test the pipeline.
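
Once the pipeline is published, it can also be triggered and monitored from PowerShell rather than the Debug option; a sketch with assumed names:

```powershell
# Sketch only - assumed resource group, data factory, and pipeline names.
$runId = Invoke-AzDataFactoryV2Pipeline -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -PipelineName "ExecuteKingswaySoftPackage"

# Check the run status (Queued / InProgress / Succeeded / Failed).
Get-AzDataFactoryV2PipelineRun -ResourceGroupName "rg-dataverse-migration" `
    -DataFactoryName "adf-dataverse-migration" -PipelineRunId $runId
```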

The pipeline run will initially be in Queued status.

After successful execution, navigate to our Dynamics 365 Sales Hub.

We can see 10 contact records created by the SSIS Package.

Hope it helps..


Use Azure Data Factory V2 to load data into Dynamics 365


Let us take a simple example where we will set up an Azure Data Factory instance and use Copy data activity to move data from the Azure SQL database to Dynamics 365.

Log in to the Azure Portal.

https://portal.azure.com

Search for Data factories

Create a new data factory instance

Once the deployment is successful, click on Go to resource

Inside the data factory click on Author & Monitor

Click on Author in the left navigation

Create a new Pipeline

And drag the Copy data activity to it

Go to the Source tab, and create a new dataset.

Below is our Azure SQL database with the contacts table, which will be our source here.


Select Azure SQL Database as the source dataset.


Create a new linked service to specify the connection properties.


Specify the details to connect to the Azure SQL Database.


We have selected the contacts table here.


Similarly, let us define a new dataset for Sink which will connect to our Dynamics 365 Instance.



Select the Dynamics dataset and specify the linked service.

Specify the details of the Dynamics 365 instance to connect to.
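
For reference, a Dynamics linked service created through the UI roughly corresponds to a JSON definition like the sketch below, which could also be deployed from PowerShell. The URL, user name, and resource names are placeholders, and newer environments typically use service principal authentication instead of Office365 credentials.

```powershell
# Sketch only - placeholder URL, user, and resource names.
$linkedServiceJson = @'
{
  "name": "Dynamics365LinkedService",
  "properties": {
    "type": "Dynamics",
    "typeProperties": {
      "deploymentType": "Online",
      "authenticationType": "Office365",
      "serviceUri": "https://contoso.crm.dynamics.com",
      "username": "user@contoso.onmicrosoft.com",
      "password": { "type": "SecureString", "value": "<password>" }
    }
  }
}
'@
Set-Content -Path ".\Dynamics365LinkedService.json" -Value $linkedServiceJson

Set-AzDataFactoryV2LinkedService -ResourceGroupName "rg-d365-demo" -DataFactoryName "adf-d365-demo" `
    -Name "Dynamics365LinkedService" -DefinitionFile ".\Dynamics365LinkedService.json" -Force
```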

We have selected contact entity as the destination.

Within the Mapping tab, we can specify the fields to be mapped.

Below is how we have specified the mapping.

Click on Validate and after successful validation, click on Debug to run the pipeline.

Within the Output window, we can see the status.

After the successful run, we can see the contact records created inside Dynamics 365.

We can specify a trigger for the pipeline as shown below.
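
For example, a simple schedule trigger could also be defined and started from PowerShell, as sketched here (trigger, pipeline, and resource names are assumptions):

```powershell
# Sketch only - assumed trigger, pipeline, and resource names.
$triggerJson = @'
{
  "name": "DailyContactLoadTrigger",
  "properties": {
    "type": "ScheduleTrigger",
    "typeProperties": {
      "recurrence": { "frequency": "Day", "interval": 1, "startTime": "2021-07-01T02:00:00Z", "timeZone": "UTC" }
    },
    "pipelines": [
      { "pipelineReference": { "referenceName": "CopyContactsPipeline", "type": "PipelineReference" } }
    ]
  }
}
'@
Set-Content -Path ".\DailyContactLoadTrigger.json" -Value $triggerJson

# Create the trigger and start it so it begins firing on schedule.
Set-AzDataFactoryV2Trigger -ResourceGroupName "rg-d365-demo" -DataFactoryName "adf-d365-demo" `
    -Name "DailyContactLoadTrigger" -DefinitionFile ".\DailyContactLoadTrigger.json"
Start-AzDataFactoryV2Trigger -ResourceGroupName "rg-d365-demo" -DataFactoryName "adf-d365-demo" `
    -Name "DailyContactLoadTrigger" -Force
```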

Publish All will publish the changes to the data factory.

Hope it helps..
