Fast data management in a limited CDS world

In May 2019 Dynamics 365 CE/CDS enacted some new throttling mechanisms that have caused some headaches for anyone wanting to manage a lot of data in CDS (I will refer to Dynamics 365/CDS as just CDS below). There are several different throttles, but the one that has caused me the most trouble is the concurrency throttle. Kingswaysoft will release support for handling this in the next release, and you can also request a special version from them if you ask nicely. In the meantime, this post can give you some help on how to work as fast as possible using application user multiplexing and a loop with a 5-minute wait to make sure that the throttles are reset.

The new throttling on the main CDS API, as described here: https://docs.microsoft.com/en-us/dynamics365/customer-engagement/developer/api-limits, needs to be carefully considered when doing heavy data manipulation in CDS. One of my customers has a large system with numerous integrations, of which the most data-heavy are the marketing automation and booking systems. And yes, this is business to consumer.
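When a service protection limit is hit, the API responds with a 429 status code and a Retry-After header telling you how long to back off. Here is a minimal sketch, not from the original tooling, of honouring that header in a plain script; $orgUrl and $token are placeholders for your instance URL and a valid OAuth bearer token, and Windows PowerShell 5.1 behaviour is assumed for the error object.

$headers = @{ Authorization = "Bearer $token"; 'OData-Version' = '4.0' }
$maxAttempts = 5

for ($attempt = 1; $attempt -le $maxAttempts; $attempt++) {
    try {
        $result = Invoke-RestMethod -Uri "$orgUrl/api/data/v9.0/WhoAmI" -Headers $headers
        break   # success, stop retrying
    }
    catch {
        $response = $_.Exception.Response
        if ($response -and [int]$response.StatusCode -eq 429) {
            # the service tells us how long to wait before trying again
            $wait = [int]$response.Headers['Retry-After']
            Start-Sleep -Seconds $wait
        }
        else {
            throw   # not a throttling error
        }
    }
}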

With the new per-GB pricing, keeping the database as small as possible has become an essential task, and the bulk delete just doesn't work for large data loads at the time of writing this article. I do hope that Microsoft increases its speed so that it becomes more useful, but currently it runs at somewhere around 1-2 records per second.

The bulk delete is also limited in that it can only base its selection on a query, i.e. a FetchXML. Often this is not enough, for instance when you want to remove "All emails except those that have any connection to either a case or a contact which has a case".

For these reasons I almost always opt for using SSIS with the Kingswaysoft connectors for CDS when working with complex data management. This article is about how to get decent performance now that there is tougher throttling to take into consideration.

User multiplexing

As the throttling is measured on a per-user basis, one trick is of course to use multiple users and spread the load over them. You can use normal users, but that will cost you licenses, so the smart move is to use application users instead. If you don't know how to create application users in Dynamics 365, check it out here: https://docs.microsoft.com/en-us/dynamics365/customer-engagement/admin/create-users-assign-online-security-roles#create-an-application-user . In the example below, I will be using four different application users, one as the source account and three as destinations. The reason for this is that it is typically efficient to read several thousand rows per request, but seldom efficient to do batch creates/writes/deletes of more than 10-20.

To do this with SSIS/Kingswaysoft, start by setting up the connections, in this case the four CDS/CRM connections, using the OAuth authentication type like below.

As you might want to have several packages in the same project and have them share the connections, it may be a good idea to use project connections. I also use an Azure SQL db for logging any errors. Previously I used CDS for this, but now with the throttling that is not such a good idea, as the error itself might be caused by throttling and hence logging the error could cause yet another error. Writing to a target that you know will not fail is therefore a good idea for error logging. When you are done with the connections, it should look something like this:

Now it is time to build the actual flow. Where you would normally have just a Source and a Target, it will now look something like the image below, which I will explain.

First of all, the Premium Derived Column creates a new column which simply contains the row number. It will look something like this: 

I like to use the components that are available in the Productivity Pack from Kingswaysoft, and this Premium Derived Column is one of them. In this case I think it does not actually matter whether you use IncrementalValue() or RowIndex(). I think you can create this logic with a normal Derived Column too, it just has fewer features.

Next we need to create a Conditional Split that divides the rows evenly between the three destination components. This is done using the mathematical operator modulus, which is written using the "%" sign. For those who didn't study this in school, it simply means the remainder of a division. For instance 5%3=2: if you divide 5 by 3 you get 1 with a remainder of 2. What we will do is assign RowNr%3 == 0 to Case 1, RowNr%3 == 1 to Case 2 and the rest to Case 3. That should divide them evenly. It looks like this:
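As a stand-alone illustration of the modulus routing (not part of the SSIS package itself), this tiny script spreads nine row numbers over three targets the same way the Conditional Split does:

# nine rows routed round-robin over three targets using modulus
1..9 | ForEach-Object {
    switch ($_ % 3) {
        0       { "Row $_ -> Target 1" }
        1       { "Row $_ -> Target 2" }
        default { "Row $_ -> Target 3" }
    }
}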

You then create the three destination components. I typically create one first, copy it and change it, as that is faster. Make sure that you set the Connection Manager to the three different Target Connections.

I also recommend that you fiddle a bit with the batch size and the number of threads and test out which gives the best results for you and the entity and action you are working on. There is no one answer here. I would typically start at Batch 10, Threads 16.

Tuning DataFlow property settings

If you back out to the Control Flow view and right-click on the Data Flow you have created, there are some other interesting settings you can tweak.

DefaultBufferMaxRows – 10 000

DefaultBufferSize – 10 485 760 (10MB)

EngineThreads – 10

These can also be tuned to allow the Data Flow to handle more rows, use more memory and run more parallel threads, which of course will make it faster (if that is the bottleneck, which is typically not the case when working with Dynamics).

What I have found is that changing max rows to 100k, the buffer size to 100 MB and engine threads to 32 will not hurt, but you can find several other blog articles specializing in SSIS that discuss this in more depth.

Crude throttle handler

What I have noticed is that many of my Data Flows simply seem to grind to a halt after 400-600k rows read from Dynamics. I am not sure if it is the read or the write part that is causing this, but I figured that the most pragmatic way of solving it would be to create a loop that runs a Data Flow limited in the number of records, typically 400k, waits 5 minutes and then iterates. The smartest version is of course to have a control variable which checks to see when there are no more rows and then breaks the loop; a simpler version is to just loop n number of times to cover the amount of data you are trying to move, i.e. number of rows per iteration x number of iterations. It would look something like the picture to the left.

If you would like to refine the loop a bit to make it more automatic, create a variable of type Int, for instance RowCount, and set the initial value to 10 or something else different from 0. Then set the EvalExpression to "@RowCount > 0". After this, add a Row Count component to the Data Flow and connect it to the variable RowCount. When the Data Flow runs and returns 0 rows, it will run to the end, the EvalExpression will evaluate to "False" and the loop will break.
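Expressed as plain pseudo-PowerShell, the logic of the loop is roughly this; Invoke-ChunkedDataFlow is a made-up placeholder for whatever processes one capped chunk (here, the SSIS Data Flow limited to roughly 400k rows) and returns the number of rows it handled:

$rowCount = 10   # anything greater than 0 so that the loop starts
while ($rowCount -gt 0) {
    $rowCount = Invoke-ChunkedDataFlow -MaxRows 400000   # placeholder call
    if ($rowCount -gt 0) {
        Start-Sleep -Seconds 300   # wait five minutes for the throttles to reset
    }
}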

Using this technique, I am able to remove several million records in just a few hours. With one of these jobs I managed to remove 20 GB of structured data in less than two days (no attachments or similar, just records). By adding more application accounts, both on the source and particularly on the destination side, you can increase the speeds even further.

I also advise you to be on the lookout for Kingswaysoft's new version, which I think will come soon, and do as I do: always download both the Dynamics and the Productivity Pack. I have read that there are great things coming to the Productivity Pack!

Azure guest accounts in CDS/Dyn365

Azure Active Directory (AAD) has a feature that allows users of foreign tenants to be granted access to the current tenant. In other words, if you are running contoso.com and a user of northwind.com would like to have access, you can add this user as a guest account in Azure. However, I have found that giving this user access to Dynamics is not fully straightforward, although it is far from rocket science. In this article I will show how this is done.

Do note that I have heard from people in the product team that there are features of the Power Platform that cannot currently be accessed using a guest account; I think it was Canvas Apps and Flow. I will have to try this out and get back to you (or someone else could! – I would appreciate a link back to this article) in a later article. I also do think that they are working on this.

On a high level, what we need to do is:

  1. Add user in AAD
  2. Grant License
  3. Wait for the user to pop up in CDS/Dynamics
  4. Assign a security role in CDS/Dynamics

To start with, we need to go to the Azure Portal: https://portal.azure.com – and click on the AAD menu item on the left.

Browse to portal.azure.com -> click Azure Active Directory (AAD) -> Click Users

Click “New guest user”

Enter the email address of the user, and perhaps some nice personal email message showing you are not some evil spammer!
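If you prefer scripting the invitation over clicking through the portal, the AzureAD PowerShell module can send the same invitation; this is just a sketch, and the email address and redirect URL below are examples, not values from this article:

# connect to AAD and invite the external user as a guest
Connect-AzureAD
New-AzureADMSInvitation -InvitedUserEmailAddress 'someone@northwind.com' `
    -InviteRedirectUrl 'https://yourorg.crm.dynamics.com' `
    -SendInvitationMessage $true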

Then go to portal.office.com, where you will now be able to see the new guest user.

Select the guest user and click “Edit product licenses” – Note, I have not been able to set licenses directly by opening the user, only this way.

Assign the required license, a P2 or a Dynamics Customer Engagement App or Plan – in the example above, a Dyn365CE Plan 1 (trial)

After you have assigned the guest user a license, you have to wait a while until the asynchronous service in O365 provisions the user in CDS. This is often rather quick, but sometimes takes more time. When I was writing this, it took more than 15 minutes.

To find the user in CDS/Dyn365 go to Settings and click on Security. (Old UI)

And then click on “Users” in the Security area.

This is what a guest user looks like in Dynamics 365/CDS. It has a # sign in front of it. As you can see, I have another one with my name, created previously.

The last thing that has to be done is to grant the guest user the correct role.

After this, just give the user the direct URL to the system and they should be able to log in with their normal accounts.

This is a very useful method when setting up trials for someone, as they do not have to sign in with another account to access the system. I strongly recommend it.

As mentioned in the beginning of this article, there might still be some issues with using canvas apps and Flow using guest users, so do be aware that not all features could be available.

Hybrid NTLM Server Side Sync and Exchange 2013 Cert secrets

Server Side Sync is a technology for connecting Dynamics 365 CE to an Exchange server. When connecting an online Dynamics 365 to an onprem Exchange there are some requirements that need to be met. These can be found here: https://technet.microsoft.com/sv-se/library/mt622059.aspx

Piping data to and from Exchange and Dynamics
By Quartl [CC BY-SA 3.0], from Wikimedia Commons

However, I just had a meeting with Microsoft and based on the version shown 2018-09-05, they have now added some new features that they haven’t had time to get into the documentation yet.

One of the most interesting parts of the integration is that it requires Basic Authentication for EWS (Exchange Web Services). Of the three types of authentication available, Kerberos, NTLM and Basic, Basic Authentication is, as the name might hint, the least secure. Hence it is also not very well liked by many Exchange admins and may be a blocker for enabling Server Side Sync in Dynamics 365.

In the meeting I just had with Microsoft, they mentioned that they now support NTLM as well! That is great news as that will enable more organizations to enable Server Side Sync.

There is still a requirement to use a user with Application Impersonation rights, which might be an issue as that can be viewed as granting too high rights within the Exchange server. For this there is currently no good alternative solution. I guess making sure that the Dynamics admins are trustworthy, and knowing that the password is encrypted in Dynamics, might ease some of that. But if the impersonation user is compromised, then a haxxor with the right tool or dev skills could compromise the entire Exchange server.

Microsoft also mentioned another common issue that can arise with the Outlook App when using SSS and a hybrid connection to an Exchange 2013 onprem. It will show a quick alert saying "Can't connect to Exchange", but it will still be able to load all the Dynamics parts.

According to Microsoft, this might be caused by the fact that Exchange 2013 doesn't automatically create a self-signed certificate that it can use for this communication. Hence this has to be done manually.

This can be fixed by first creating a self-signed certificate, then modifying the authorization configuration using the instructions found here, and lastly publishing the certificate. It can also be a good idea to check that the certificate is still valid and hasn't expired.
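For reference, a hedged sketch of what that usually looks like in the Exchange Management Shell is shown below; treat it as an outline and verify the exact commands and parameters against Microsoft's documentation for your Exchange version. The domain name is just an example.

# 1. Create a new self-signed auth certificate (domain name is an example)
$cert = New-ExchangeCertificate -KeySize 2048 -PrivateKeyExportable $true `
    -SubjectName 'cn=Microsoft Exchange Server Auth Certificate' `
    -FriendlyName 'Microsoft Exchange Server Auth Certificate' `
    -DomainName 'mail.contoso.com'

# 2. Point the authorization configuration at the new certificate
Set-AuthConfig -NewCertificateThumbprint $cert.Thumbprint -NewCertificateEffectiveDate (Get-Date)

# 3. Publish the certificate
Set-AuthConfig -PublishCertificate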

I will see if I can create a more detailed instruction on this later.

Gustaf Westerlund
MVP, Founder and Principal Consultant at CRM-konsulterna AB
www.crmkonsulterna.se

Setting up Data Export Service without PowerShell Script

Setting up Dynamics 365 Data Export Service requires an Azure Key Vault to be set up, which is typically done using a PowerShell script that can be found in the Data Export Service setup wizard. However, if you run into issues running it, it might be easier to perform the script's steps manually in the Azure portal. This was a tip that my friend and Business Solutions MVP Scott Durow recommended. He mentions it in his very instructive video, but doesn't actually show how, so I thought I'd detail how I made it work.

First some background. The reason why I even started investigating how to do this manually was that the PowerShell script supplied by Microsoft in the wizard failed when I tried running it.

Press the “i” icon to get a window containing the PowerShell Script that Microsoft recommends for setting up the Key Vault.

When running the PowerShell script, both as myself (not a global admin) and when asking a global admin to do it, it failed in the latter parts. The key vault was created but some of the access policies seemed to be missing, and it just didn't work. My user's rights in Azure were Contributor in the resource group, and it was a bit interesting because the global admin and I got different error messages. But when I finally managed to create the key vault manually, I could do it all with my own user, so it didn't seem I was missing any rights.

The first step is to make sure you have all your data straight. The PowerShell script is good for this. Check out Scott's clip if you want to know how to find the different strings; he shows it very clearly.

Just copied from the PS-Script:

$subscriptionId = '<subscription ID>'
$keyvaultName = 'MyVault'
$secretName = 'MySecretName'
$location = 'North Europe'
$connectionString = 'Server=tcp:<db-name>,1433;Initial Catalog=<catalog>;Persist Security Info=False;User ID={your_username};Password={your_password};MultipleActiveResultSets=False;Encrypt=True;TrustServerCertificate=False;Connection Timeout=30;'
$organizationIdList = '<DYN365GUID>'
$tenantId = '<AZURE TENANT ID>'

The highlighted parts have to be replaced by your settings. I will use these variable names as references further on in this article.

Search for Key Vault and add the “Key vault”, the top one in this picture

Then we have to set it up. Not so tricky if you have worked with Azure before. Consider whether you want to work in an existing resource group or create a new one. Typically you need to have Azure SQL running as well, so it might be good to keep everything together in one resource group to be able to see the costs and control who has access, and that resource group might therefore already exist. If not, you can create it. I would recommend keeping the Azure SQL database and the Key Vault in the same resource group; I am not sure if it actually works across different resource groups, it probably does, but I haven't tested it.

Creating the key vault – in this case I am creating a new resource group, normally it would already exist

Azure will add you as the default principal with access to the key vault. We will add Data Export Service to this later. For now, just create it.

Now we need to open the Key vault and select the “Secrets” section in the menu on the left hand side and press the button:

“+ Generate/Import” 

Then you have to enter your secret name ($secretName) as the Name and the connection string ($connectionString) as the Value.

Creating a secret – $secretname in Name and $connectionstring in Value

Press “Create”.

You should now return to the previous screen and see a row for your secret.
Select it.

It should open the settings panel for the Secret. Press the "Tags" part, which is located in the middle, and add a tag which has the organization id list ($organizationIdList) as the key and the tenant id ($tenantId) as the value. I have blurred them out below as they are rather private.

Adding a tag with OrgIdList and tenantId to a Secret
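If you would rather script these two steps than use the portal, a sketch with the Az PowerShell module (Set-AzKeyVaultSecret) could look like this, reusing the variables from the script above; adjust to whichever Azure PowerShell module you are using:

# create the secret with the connection string and tag it with the org id list / tenant id
$secretValue = ConvertTo-SecureString $connectionString -AsPlainText -Force
Set-AzKeyVaultSecret -VaultName $keyvaultName -Name $secretName `
    -SecretValue $secretValue -Tag @{ $organizationIdList = $tenantId }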

You then need to go back to the Key Vault and click on the "Access Policies" menu item. You should then see yourself as a principal, as this was set when we created the key vault. We now need to add Data Export Service as a principal with read access rights.

So click “Add”, click “Select Principal” and search for “b861dbcc-a7ef-4219-a005-0e4de4ea7dcf” which is the ID for Data Export Service. It should show up like this:

It needs to have “Secret Management Operations – GET” permissions and nothing else.
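The same grant can be done from PowerShell with the Az module; a sketch, reusing $keyvaultName from above:

# give the Data Export Service principal GET on secrets and nothing else
Set-AzKeyVaultAccessPolicy -VaultName $keyvaultName `
    -ServicePrincipalName 'b861dbcc-a7ef-4219-a005-0e4de4ea7dcf' `
    -PermissionsToSecrets get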

Now, go back to the Secret and copy the URI to the Secret.

Getting the URI for the Key Vault Secret

Paste it into the Data Export Service Wizard field for Key Vault.

Fill in the other information and press validate. Hopefully it will work out well!

Some issues

Being too cheap with the Azure SQL level
If you don't go for an Azure SQL P1 and choose a lower tier, you might get this warning:

We tried an S0 for our dev environment and tried to sync a couple of million records, and that just didn't work; we got tons of errors. We upgraded the Azure SQL to an S2 and then at least we didn't get any errors. We are planning for P1s in UAT and production.

Might have to set activation date on secret
It seems that you might have to set an activation date on the secret. I am not sure why this is, as the PS-Script doesn't seem to do it. But it is not very hard.

Added an activation date on the Secret, from June 4th

Using a database schema that has not been created
The default database schema in the Data Export Service wizard is "dbo". If you change this to something else, like "crm", and you haven't created that schema in the database, you will get an error. It is simple to fix: just go into the database and create the schema. To create the schema "crm", open a query window and run:
CREATE SCHEMA crm

For more information on how to create schemas, check this site: https://docs.microsoft.com/en-us/sql/t-sql/statements/create-schema-transact-sql?view=sql-server-2017

Once the schema has been created, there should be no problem using it, as long as the user has permissions to use it.

I hope this works for you. If you have any questions, don’t hesitate to leave a comment.

Gustaf Westerlund
MVP, Founder and Principal Consultant at CRM-konsulterna AB
www.crmkonsulterna.se

Deleting a lot of records fast

A quick one today…

I needed to delete a couple of million records for a customer and the natural thing was to use the Bulk Deletion service. Well, I turned it on and it was extremely slow; I only got about 10 records/s, which would cause the entire delete to take over a week. I have checked with Microsoft and this is not a bug, it is working as designed and is simply not designed to be super fast. According to Microsoft, bulk deletion jobs are put on the async queue at low priority to allow other, more important jobs higher priority.

And a favorite quote of mine from Purvin Patel of Microsoft: "Does a dump truck need to outrace a Ferrari?" – and I think that the answer to that question is: it depends. Sometimes it does.

Personally I would sometimes like it to be as fast as possible when removing a lot of records.

I also checked to see how fast the deletion would be with SSIS and Kingswaysoft. Used the following settings:

  • VM about 5 ms from the Dynamics 365 instance (important that it is not too far away; use an Azure VM for this)
  • Used 64 threads
  • Used Execute Multiple batching with 10 (you cannot use more than 10 if you are using a lot of threads, i.e. more than 2)
  • VM has 8 virtual cores and 32 GB memory
  • Loading in batches of 2000. Only loading the id-column, as that is all that is needed.

With this setup, I got somewhere around 345 records deleted per second. Which is a tad more than 34x faster than the bulk delete.
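For illustration, the same "read only the id, then delete in parallel" pattern can be sketched directly against the Web API; this is just an outline, not how Kingswaysoft does it internally. The entity set name ('contacts'), $orgUrl and $token are placeholders, and PowerShell 7+ is assumed for ForEach-Object -Parallel.

$headers = @{ Authorization = "Bearer $token"; 'OData-Version' = '4.0' }

# read only the primary key column, one page of 2000 records
$page = Invoke-RestMethod -Headers $headers `
    -Uri "$orgUrl/api/data/v9.0/contacts?`$select=contactid&`$top=2000"

# delete the records in parallel, 16 concurrent callers as a starting point
$page.value | ForEach-Object -Parallel {
    Invoke-RestMethod -Method Delete -Headers $using:headers `
        -Uri "$($using:orgUrl)/api/data/v9.0/contacts($($_.contactid))"
} -ThrottleLimit 16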

So, if you want to delete a lot of stuff, maybe Bulk Delete is not the way to go. Not yet anyway; let's hope Microsoft makes it faster!

(this post was updated on Feb 9th, 2018)

Gustaf Westerlund
MVP, Founder and Principal Consultant at CRM-konsulterna AB
www.crmkonsulterna.se