Fixed – AuthorizationFailed. The client with object id does not have authorization to perform action 'Microsoft.Authorization/roleAssignments/write' over scope 'storageaccount' – Azure Data Lake


While configuring the Azure Synapse Link / Export to Data Lake service, we were getting the below error for one of the users.

{"code":"AuthorizationFailed","message":"The client 'nishantr@pmaurua105.onmicrosoft.com' with object id 'd56d5fbb-0d46-4814-afaa-e429e5f252c8' does not have authorization to perform action 'Microsoft.Authorization/roleAssignments/write' over scope '/subscriptions/30ed4d5c-4377-4df1-a341-8f801a7943ad/resourceGroups/RG/providers/Microsoft.Storage/storageAccounts/saazuredatalakecrm/providers/Microsoft.Authorization/roleAssignments/2eb81813-3b38-4b2e-bc14-f649263b5fcf' or the scope is invalid. If access was recently granted, please refresh your credentials."}

The current Role assignments of the user were as below.

To resolve it, we need to give the Owner role on the Storage Account to the user configuring the Export to Data Lake service (apart from the System Administrator role within Dynamics 365).

https://docs.microsoft.com/en-us/powerapps/maker/data-platform/export-to-data-lake#prerequisites
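For reference, the role assignment can also be scripted; below is a minimal Azure CLI sketch (run from PowerShell), with placeholder subscription, resource group, storage account, and user values rather than the ones from the error above.

# Assign the Owner role on the storage account to the user configuring Export to Data Lake
az role assignment create `
  --assignee "someuser@yourtenant.onmicrosoft.com" `
  --role "Owner" `
  --scope "/subscriptions/<subscription-id>/resourceGroups/<resource-group>/providers/Microsoft.Storage/storageAccounts/<storage-account-name>"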

After assigning the Owner Role (and removing the other roles), it worked successfully.

Check other posts on Azure Data Lake / Azure Synapse Link – Posts on Azure Data Lake

Hope it helps..


How to set up – Azure Synapse Link – Microsoft Dataverse


Azure Synapse Link (earlier known as the Export to Data Lake service) provides seamless integration of Dataverse with Azure Synapse Analytics, making it easy to do ad hoc analysis using familiar T-SQL in Synapse Studio, build Power BI reports using the Azure Synapse Analytics connector, or use Apache Spark in Azure Synapse for analytics.

As a first step, we need to create the Azure Synapse Workspace.

Log in to the Azure portal (https://portal.azure.com/) with the appropriate roles and create a Synapse workspace.

https://docs.microsoft.com/en-us/azure/synapse-analytics/get-started-create-workspace
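If you prefer scripting over the portal, the workspace can also be created through the Azure CLI; a rough sketch with placeholder names (the ADLS Gen2 account, file system, and SQL admin credentials below are assumptions to be replaced with your own):

# Create a Synapse workspace backed by an ADLS Gen2 account (placeholder values)
az synapse workspace create `
  --name <workspace-name> `
  --resource-group <resource-group> `
  --storage-account <adls-gen2-account-name> `
  --file-system <file-system-name> `
  --sql-admin-login-user <sql-admin-user> `
  --sql-admin-login-password "<strong-password>" `
  --location <region>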

Wait for the deployment to be complete. Below are the resources created as part of the deployment.

Log in to the maker portal (https://make.powerapps.com/) and select the appropriate environment.

Click Azure Synapse Link, check the Connect to your Azure Synapse Analytics workspace (preview) option, and specify the storage account created in the previous step.

Here we have selected the Contact and System User tables for export.

On clicking Save, we got this error a couple of times, even though the account being used had the Owner role across the Azure resources. Retrying the save 3-4 more times allowed the link to be created.

Clicking Go to Azure Synapse Analytics workspace opens Azure Synapse Studio.

Within Synapse Studio, we can see our Dataverse database and run queries against it.
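For example, a simple query against the exported Contact table can be run from Synapse Studio, or from the command line through sqlcmd against the serverless SQL endpoint; a hedged sketch, where the database name and the contact columns (contactid, fullname, createdon) are assumptions based on a typical environment:

# Query the serverless SQL database from the command line, using Azure AD authentication (-G)
sqlcmd -S <workspace-name>-ondemand.sql.azuresynapse.net -d <dataverse-database-name> -G -Q "SELECT TOP 10 contactid, fullname, createdon FROM dbo.contact"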

Similarly, we can create a Power BI report with the Azure Synapse Analytics (SQL DW) connector.

Copy the Serverless SQL endpoint of the Synapse workspace.
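The serverless SQL endpoint typically follows the format <workspace-name>-ondemand.sql.azuresynapse.net; it can also be read through the Azure CLI (the connectivityEndpoints.sqlOnDemand property used below is an assumption based on what the CLI returns at the time of writing):

# Read the serverless (on-demand) SQL endpoint of the workspace
az synapse workspace show `
  --name <workspace-name> `
  --resource-group <resource-group> `
  --query connectivityEndpoints.sqlOnDemand `
  --output tsv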

Select the DirectQuery option.

https://azure.microsoft.com/en-gb/blog/power-your-business-applications-data-with-analytical-and-predictive-insights/

Hope it helps..


Azure Synapse Link / Export to Data Lake Service – Performance (initial sync)


Recently we configured the Export to Data Lake service for one of our projects.

Just sharing the performance we got during the initial sync.

Entity | Count
Contact | 2,362,581
Custom entity | 1,613,554

The sync started at 11:47 A.M. and was completed around 4:50 P.M. – around 5 hours, i.e. 300 minutes.

Let us consider the total records synced as roughly 3,975,000 (1,613,446 + 2,362,581 + 2 = 3,976,029).

So the performance here comes down to

  • 795,000 records per hour
  • 13,250 records per minute
  • around 220 records per second

Of course, the throughput will vary depending on the specific environment, the table/entity type, the attributes in it, etc., but it gives us a rough idea.

Hope it helps..


Advanced configuration settings – Azure Synapse Link / Export to Data Lake service (Dataverse/ Dynamics 365)


The Export to Data Lake service now has some Advanced configuration settings available.

To learn more about the Export to Data Lake service, refer to

https://nishantrana.me/2020/12/10/posts-on-azure-data-lake/

The new settings allow us to configure how the Dataverse / CRM table data is written to Azure Data Lake.

  • In-Place update or upsert (default)
  • Append Only

With in-place update (the default setting), the file contains the full data set: any update in the source updates the corresponding row in the synced CSV file or data partition, and any record deleted in the source deletes the row from the data partition. With Append Only, in contrast, a new row is added for both updates and deletes.

For huge volumes of data, Microsoft recommends opting for Append Only mode. This mode is also preferable when an organization wants to incrementally review the changed data.

The other option is to define the data partition strategy.

  • By Month (default)
  • By Year

With this option, the generated files are partitioned either by year or, more granularly, by month, and the strategy can be specified on a per-table basis.

Microsoft recommends the monthly partition if the data volume is high.

Now, let us see it in action.

For the Lead table, we haven't selected the advanced configuration settings option and are going with the defaults.

  • Append Only – No
  • Partition Strategy – Month

For Contact, we have enabled the advanced configuration settings and opted for a Partition Strategy of Year.

For Account, we have opted for Append Only as true, for which the Partition Strategy option is disabled and set as Year.

The final configuration –

Within the container inside the Storage Account, we can see corresponding folders created per table/entity along with model.json as shown below.

Let us explore the Lead folder –

We can see 2 CSV files created in the format YYYY-MM.csv, i.e. with the month part in the name, because we had specified the Partition Strategy as Month, the default value.

For Contact and Account, the Partition Strategy was Year, so we have files generated in the format YYYY.csv.
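To verify the partition files without browsing the container in the portal, the blobs can also be listed from the command line; a minimal Azure CLI sketch, assuming placeholder storage account and container names and the lead/ folder prefix from the layout shown above:

# List the partition files generated under the Lead table folder
az storage blob list `
  --account-name <storage-account-name> `
  --container-name <container-name> `
  --prefix "lead/" `
  --query "[].name" `
  --output table `
  --auth-mode login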

Let us update one of the lead records by appending ‘Updated’ in the last name field.

After the successful sync, we can see the record updated in the .csv / partition.

The same is the case with the contact record.

Now let us update an account record; it had Append Only specified as Yes.

Here we update the Account Name field from Litware to Litware Updated.

After the sync

We can see a new row appended with the updated record along with the original record.

Let us delete the same account record

As expected, being Append Only mode, we can see a new row added for the Litware record.

We have 2 additional rows apart from the original row, one created for update and the other for delete action.

The Export to Data Lake service is Microsoft's recommended way of synchronizing Dataverse data with external storage, and we can see Microsoft continuously investing in and adding enhancements to it.

Get all the details here –

https://docs.microsoft.com/en-us/powerapps/maker/data-platform/export-to-data-lake#data-partition-strategy

Hope it helps..

Write batch size, data integration unit, and degree of copy parallelism in Azure Data Factory for Dynamics CRM / 365 Dataset


Let us take a simple example where we are moving contact records (.csv) stored in an Azure file share to Dataverse / Dynamics 365 (upsert).

The CSV file has 50,000 sample contact records (generated using https://extendsclass.com/csv-generator.html) stored in Azure File Storage.

Another option for generating sample data –

https://nishantrana.me/2020/05/26/using-data-spawner-component-ssis-to-generate-sample-data-in-dynamics-365/

The Source in our Data Factory pipeline.

The Sink is our Dynamics 365 / Dataverse sandbox environment, here we are using the Upsert write behavior.

For the Sink, the default Write batch size is 10.

Max concurrent connections specifies the upper limit of concurrent connections that can be established to the data store during the activity run.

Below is our Mapping configuration

The Settings tab of the copy activity allows us to specify the below options.

A Data Integration Unit (DIU) is a measure of the power (a combination of CPU, memory, and network resource allocation) of a single unit used for the copy execution.

Degree of copy parallelism specifies the maximum number of parallel threads used to read from the source and write to the sink.
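Behind the authoring UI, these settings map to properties in the copy activity JSON; below is a trimmed, illustrative sketch (the sink type CommonDataServiceForAppsSink and the exact property names are assumptions based on a typical generated pipeline, and the values shown mirror the defaults used in the first run below):

{
  "name": "CopyContactsToDataverse",
  "type": "Copy",
  "typeProperties": {
    "source": { "type": "DelimitedTextSource" },
    "sink": {
      "type": "CommonDataServiceForAppsSink",
      "writeBehavior": "upsert",
      "writeBatchSize": 10,
      "ignoreNullValues": false
    },
    "parallelCopies": 10,
    "dataIntegrationUnits": 4
  }
}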

Let us run the pipeline with the default values.

  • Write Batch Size (Sink) – 10
  • Degree of copy parallelism – 10
  • Data integration unit – Auto (4)

The results – it took around 58 minutes to create 50K contact records.

We then ran the pipeline a few more times, specifying different batch sizes and degrees of copy parallelism.

We kept Max concurrent connections blank and Data Integration Unit as Auto (during our testing, even when we set it to higher values, the DIUs actually used were always 4).

Below are the results we got –

Write Batch Size | Degree of copy parallelism | Data Integration Unit (Auto) | Total Time (Minutes)
100 | 8 | 4 | 35
100 | 16 | 4 | 29
1000 | 32 | 4 | 35
250 | 8 | 4 | 35
250 | 16 | 4 | 25
250 | 32 | 4 | 55
500 | 8 | 4 | 38
500 | 16 | 4 | 29
500 | 32 | 4 | 28
750 | 8 | 4 | 37
750 | 16 | 4 | 25
750 | 32 | 4 | 17
999 | 8 | 4 | 36
999 | 16 | 4 | 30
999 | 32 | 4 | 20

The results show that, in general, increasing the batch size and degree of copy parallelism improved performance in our scenario, with the occasional outlier such as the 250 / 32 run.

Ideally, we should run a few tests with different combinations before settling for a specific configuration as it could vary.

On trying to set the batch size to more than 1000, we would get the below error –

ExecuteMultiple Request batch size exceeds the maximum batch size allowed.

Also refer –

Optimizing Data Migration – https://community.dynamics.com/crm/b/crminthefield/posts/optimizing-data-migration-integration-with-power-platform

Using Data Factory with Dynamics 365 – https://nishantrana.me/2020/10/21/posts-on-azure-data-factory/

Optimum batch size with SSIS – https://nishantrana.me/2018/06/04/optimum-batch-size-while-using-ssis-integration-toolkit-for-microsoft-dynamics-365/

Hope it helps..


Using AzCopy to sync the local data with Azure Storage


Using the sync command of AzCopy, we can keep local data synchronized with Azure Blob storage.

https://docs.microsoft.com/en-us/azure/storage/common/storage-ref-azcopy-sync

Suppose below is our storage account named – samplestorageaccountcrm

having the container named – mycrmfilescontainer inside it as shown below.

Below is how the URL for the container will look.

https://samplestorageaccountcrm.blob.core.windows.net/mycrmfilescontainer

i.e. format –

https://[storagename].blob.core.windows.net/[containername]

Let us generate the SAS token for the storage account with the appropriate permissions.

Navigate to the Shared access signature blade, specify the permissions, and click Generate SAS and connection string.

Copy the generated SAS token and append it to the URL.

https://samplestorageaccountcrm.blob.core.windows.net/mycrmfilescontainer?sv=2020-02-10&ss=bfqt&srt=sco&sp=rwdlacupx&se=2021-03-09T02:50:19Z&st=2021-03-08T18:50:19Z&spr=https&sig=OKydecj8kMBzi%2Ff4dwutlHbIvYimQv9FGPQmKwott5w%3D
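The account-level SAS can also be generated from the command line; a minimal sketch, assuming the Azure CLI, a placeholder account key, and a blob-only scope (the portal-generated token above covered all services):

# Generate an account-level SAS token for the blob service (placeholder key and expiry)
az storage account generate-sas `
  --account-name samplestorageaccountcrm `
  --account-key "<storage-account-key>" `
  --services b `
  --resource-types sco `
  --permissions rwdlacup `
  --expiry 2021-03-09T02:50Z `
  --https-only `
  --output tsv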

Now we are ready to run the AzCopy command to sync the contents of the below folder with the container.

On executing the command within PowerShell, it will scan the files at the source first, followed by the files in the destination, and will copy the files from the source that are not present in the destination.
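For reference, based on the batch file shown later in this post, the interactive PowerShell command looks roughly like the below, with <SAS-token> standing in for the full token generated above:

# Sync the local folder with the blob container (placeholder SAS token)
azcopy sync "C:\Intel" "https://samplestorageaccountcrm.blob.core.windows.net/mycrmfilescontainer?<SAS-token>" --recursive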

Sample Run:-

We can see both the files uploaded in the container.

Now let us try to run the same command as a batch (.bat) file.

https://www.windowscentral.com/how-create-and-run-batch-file-windows-10

We might encounter the below error – "Server failed to authorize the request. Make sure the value of the Authorization header is formed correctly including the signature."

This is because the special characters within the SAS token's signature (sig) part need to be escaped when used in a batch file.

https://bornsql.ca/blog/using-azcopy-with-batch-files-and-task-scheduler/

https://www.robvanderwoude.com/escapechars.php

Here the special characters within the sig value are replaced with the appropriate escape sequences, e.g. "%" with "%%".

Updating the .bat file with the escaped command allows us to run it successfully.

@ECHO OFF

“D:\azcopy_windows_amd64_10.9.0\azcopy.exe” sync “C:\Intel” “https://samplestorageaccountcrm.blob.core.windows.net/mycrmfilescontainer?sv=2020-02-10&ss=bfqt&srt=sco&sp=rwdlacupx&se=2021-03-09T02:50:19Z&st=2021-03-08T18:50:19Z&spr=https&sig=OKydecj8kMBzi%%2Ff4dwutlHbIvYimQv9FGPQmKwott5w%%3D”

PAUSE

Next, we can run the batch file within the task scheduler.

https://stackoverflow.com/questions/4437701/run-a-batch-file-with-windows-task-scheduler
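As an alternative to the Task Scheduler UI, the task can also be registered from an elevated command prompt using schtasks; a hedged example with an assumed script path and schedule:

REM Register a daily 2 AM task that runs the AzCopy sync batch file (placeholder path)
schtasks /Create /TN "AzCopySync" /TR "D:\scripts\azcopy-sync.bat" /SC DAILY /ST 02:00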

Hope it helps..
