Thoughts on throttling updates

Microsoft recently (in February) published some updates to their documentation regarding service protection API limits, or, as they are sometimes referred to, throttling. Some of these, like the new recommendations on how to handle batching, are rather interesting, and I thought I’d give my 2 cents about them. They also allude a bit to how the network infrastructure is set up for the deployment and how to optimize for larger workloads using the affinity cookie setting. I found this rather interesting too.

So, first of all, a quick recap for those who aren’t familiar with throttling and the API service limits. There are three different limits set up for Power Platform. You can read the details here: https://docs.microsoft.com/en-us/powerapps/developer/data-platform/api-limits#how-service-protection-api-limits-are-enforced

In short, within a 5-minute sliding window, you cannot exceed the following for one specific user:

  1. No more than 6 000 requests.
  2. No more than 1 200 seconds (20 minutes) of combined execution time – equal to 4 parallel processes running at full capacity.
  3. No more than 52 requests at the same time (concurrently).

Generally, if you do not use batching, you will typically run into the first (1) limit, or the third (3) if you run unlimited threading. If you use connection pooling with 52 connections, then you will probably run into the first, and if you use complex requests that cause cascading behaviours, or batching, then you will typically run into the second (2). Each of these limits has its own corresponding error/exception; do refer to the official docs above for the details.
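
To make the error handling concrete, here is a minimal sketch in Python (the URL, payload and headers are placeholders, and this is not the SDK’s built-in retry logic) of the basic pattern: when the platform answers with HTTP 429 because one of the limits above was hit, wait the number of seconds given in the Retry-After header and then try again.

```python
import time
import requests

def post_with_retry(url, json_body, headers, max_retries=5):
    """Send a request and honour the Retry-After header on throttling (429)."""
    for _ in range(max_retries):
        response = requests.post(url, json=json_body, headers=headers)
        if response.status_code != 429:
            return response  # not throttled, we are done
        # The service tells us how long to back off before trying again.
        wait_seconds = float(response.headers.get("Retry-After", 60))
        time.sleep(wait_seconds)
    return response  # give up after max_retries attempts
```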

Now to the interesting part.
There is a new section that attempts to tell us how to maximize throughput. First of all, I think this is great. We really need this, and we need Microsoft to tell us not only how to use the platform, but how to use it efficiently.

“Let the server tell you how much it can handle”
This section is interesting as it recommends a rather complex approach to working with performance. As they further down recommend using threading, they essentially recommend building logic that dynamically increases and decreases the number of threads as the platform informs you that it has capacity. This brings me back to university math, trying to figure out the derivative of an unknown function by sampling and finding the local max. I do, however, think this is a rather tall order for your average developer. But it would be a great community tool, so feel free to build it. Consider the challenge set. Best of all would of course be if Microsoft included this in the SDK.
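
To illustrate what such logic could look like, here is a minimal sketch of the idea, not a finished tool, where send_request is a hypothetical function that performs one call and returns True when it got throttled: increase the number of parallel workers a little while the platform keeps up, and cut it in half as soon as throttling is reported.

```python
from concurrent.futures import ThreadPoolExecutor

def run_adaptive(work_items, send_request, max_workers=52):
    """Probe for the server's capacity by adjusting the worker count."""
    workers = 4                       # start conservatively
    items = list(work_items)
    while items:
        chunk, items = items[:workers * 10], items[workers * 10:]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            throttled = any(pool.map(send_request, chunk))
        if throttled:
            workers = max(1, workers // 2)           # back off hard
        else:
            workers = min(max_workers, workers + 1)  # probe for more capacity
        # A real implementation would also re-queue the requests that were
        # throttled instead of dropping them.
```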

Use multiple threads
In this part they recommend using multiple threads. My experience agrees that this is a good idea, as the processing time and latency cause a certain delay on a per-message basis. By utilizing multi-threading with multiple connections, this overhead can be reduced. As there is a limit of 52 concurrent requests, I would recommend using at most that number of connections/threads per user.
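
As a simple illustration of the multi-threading advice, and assuming a create_record function that performs a single call, something like this caps the parallelism at the 52-request concurrency limit:

```python
from concurrent.futures import ThreadPoolExecutor

def create_all(records, create_record, max_threads=52):
    """Fan single-record calls out over at most 52 parallel threads per user."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(create_record, records))
```

In practice you would combine this with the retry handling sketched earlier, and probably with fewer threads than 52, since the execution time limit also comes into play.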

Avoid batching
Now this is really interesting. The previous recommendation was to use batching to be “nicer” to the API and get increased performance. The recommendation now is the direct opposite. This is based on the fact that the overhead in a Web API JSON message is significantly smaller than in a SOAP message, which reduces the difference between batching and not batching. They do, however, still recommend using smaller batch sizes. This is also my experience when working with Kingswaysoft. I typically (it depends on the instance and which table I am using) start with 16 threads and batches of 10 or 20. This has typically given me the best performance, with throughput of 300+ records/s.
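
In code, the “smaller batches” advice is basically just chunking. A minimal sketch, where submit_batch is a hypothetical function that wraps one chunk in a $batch (or ExecuteMultiple) request:

```python
def chunked(records, batch_size=10):
    """Yield the records in small batches of around 10-20, as recommended."""
    for i in range(0, len(records), batch_size):
        yield records[i:i + batch_size]

def upload(records, submit_batch, batch_size=10):
    for batch in chunked(records, batch_size):
        submit_batch(batch)
```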

There is also a comment about the fact that using batching does not bypass the entitlement limits, i.e. 20 000 API calls/24 hours for an enterprise user, 100 000 API calls for all non-licensed users, and so on (see the documentation on entitlement limits for the details based on which license you have). In other words, a batch of 10 requests still counts as 10 calls, as the calculation is done after the batch has been exploded on the server side. This was also news to me, as I had previously been told that batching was exactly the way to go to limit the number of calls.

Removing the affinity cookie – server multiplexing
The details being alluded to in this section are very interesting. If I understand it correctly, the logic is that the affinity cookie normally pins all your requests to one specific frontend server, and without it the load balancer is free to route each request to any server.

The point being that shutting off the affinity cookie in the HttpClient allows for wider use of all the servers in the node (the entire setup of frontends, backends, NLB, etc.).
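
The documentation describes doing this on .NET’s HttpClient; as a rough illustration of the same idea in Python, the point is simply to not send the affinity cookie back, so the load balancer can route each request to any frontend (the cookie name ARRAffinity is my assumption here):

```python
import requests

session = requests.Session()

def call_without_affinity(url, json_body, headers):
    """Make a call but drop any affinity cookie so we are not pinned to one server."""
    response = session.post(url, json=json_body, headers=headers)
    session.cookies.clear()  # forget ARRAffinity (and anything else) before the next call
    return response
```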

What I do wonder is if it would be possible to store the affinity cookies, and hence pool them on the client side. Each time you hit a new frontend you will lose some time while it warms up your instance, so it would be advantageous to be able to control this more tightly. Maybe this could be another community tool for someone interested?
I also think, though I haven’t tested this, that you will get better results with the affinity cookie removed if you do use batching, at least until all the frontends have been warmed up for your instance.
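
And to be concrete about what such a community tool could do, here is a rough sketch of the pooling idea, with the cookie name and the exact behaviour being assumptions on my part: remember every affinity cookie the servers hand out and rotate over them, so each frontend only has to warm up your instance once.

```python
import requests

class AffinityPool:
    """Collects the affinity cookies seen so far and hands them out round-robin."""

    def __init__(self):
        self.values = []
        self._next = 0

    def add_from(self, response):
        value = response.cookies.get("ARRAffinity")  # assumed cookie name
        if value and value not in self.values:
            self.values.append(value)

    def next_value(self):
        if not self.values:
            return None
        value = self.values[self._next % len(self.values)]
        self._next += 1
        return value

pool = AffinityPool()

def call_with_pooled_affinity(url, json_body, headers):
    cookie = pool.next_value()
    cookies = {"ARRAffinity": cookie} if cookie else {}
    response = requests.post(url, json=json_body, headers=headers, cookies=cookies)
    pool.add_from(response)  # learn about any new frontend we just hit
    return response
```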

Do also note a very important sentence: “This increases throughput because limits are applied per server”. We do not know how many servers are used in the frontend of a node, but probably more than 10. Removing the affinity cookie could hence increase performance by at least one order of magnitude.

User multiplexing
As all API limits are calculated on a per-user basis, another way to increase performance is to use what I like to call user multiplexing. This means that operations are performed using several different application users at the same time. There is of course some admin work needed to set these up, and there is no OOB way of doing this, but with SSIS and Kingswaysoft it is rather straightforward: just create several connections, one per user, configure them per user, and then use the “Balanced Data Distributor”, which can be found in the productivity pack, to spread the data to different destinations that use the different connections.
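
Outside of SSIS, the same idea is easy to sketch: keep one authenticated connection per application user and deal the records out round-robin over them (create_record here is a hypothetical helper that performs one call on a given connection).

```python
import itertools

def multiplex(records, connections, create_record):
    """Spread the writes over several connections, one per application user."""
    next_connection = itertools.cycle(connections)
    for record in records:
        create_record(next(next_connection), record)
```

Combined with the threading and small batches above, this effectively multiplies the per-user limits by the number of application users.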

My tips
Based on these new facts, my tips for getting good performance with large-scale datasets are hence the following:

  1. Continue to use batching, but don’t use huge batches. Probably around 5-20 will be ok.
  2. Use multithreading. I typically use around 16, but that was before I knew about the removal of the affinity cookie. Hence I would recommend 16 per server. But I cannot tell you how many servers there are.
  3. Use the remove affinity cookie setting, and if possible, figure out some way of pooling the affinity cookies instead.
  4. Make sure your application can handle the exceptions regarding the API limits and have some reasonable strategy for working with them. I have found that blasting the API for 5 minutes at max speed, then backing off for 5 minutes, then going full throttle again for 5 minutes, has given me better throughput overall than “being nice” and just finding the “right” speed to use to not be throttled. I am not sure this strategy will work in the long run though.
  5. Use application user multiplexing.

Suggestions to ETL vendors and others
My suggestions to ETL vendors and others who build connections to Dataverse that require high performance are:

  1. Start by surfacing the affinity cookie setting so that it is possible to set it as wanted.
  2. Include multithreading, batching and application user multiplexing in the standard Dataverse connections.
  3. Figure out if there is any point in pooling the affinity cookies, and if so, include this in the connection.
  4. Make the connection auto-optimize with the data it is currently sending, i.e. how many threads, the size of batches, the size of the affinity cookie pool and the number of application users to utilize.
  5. Have different strategies for utilizing application users instead of just spreading the data evenly. For instance, one user could be used until it receives an exception and is then put on hold for 5 minutes while another is used. Or a combination of the two: if there are five application users, three might be used for data transfer and two kept on hold in case one gets an exception and needs to be put on hold.

I hope this has given you some insights and that my 2 cents got you this far. Feel free to leave a comment if you have any questions!

Custom columns on systemuser – good or bad?

Some people might have heard about an industry best practice saying that you should never have custom columns (fields) on the systemuser table (entity) in Dataverse. Is this true, and why? This article is based on my understanding of the inner workings of Dataverse and hence what you need to think about when designing your application so that you do not unintentionally destroy your environment’s performance. In short, be careful about adding custom columns to systemuser, and if you do, only add fields that hold static data, i.e. data that doesn’t change often. Let me describe this in more detail.

First of all, I would like to give credit for a lot of this to my friend and former Business Applications MVP Adam Vero, who described it in detail for me. I have also discussed this with other people and have since had it confirmed, but I have not actually seen it documented as such, which is why it might not be fully official. I do, however, not see any problem with people understanding this, rather the opposite.

Dataverse is an application platform that has security built in as an integral part: security roles, system users, teams and business units form the core pieces of the security in the system. As the system often needs to query data from these four tables, it has a built-in “caching” functionality that, per environment, loads these four tables and precalculates them into an in-memory table for easy and fast access. This is then kept in memory for as long as the data in these four tables stays static, in other words, as long as nothing is changed: no updates, no creates, no deletes.

What could then happen if you add a column to the systemuser table? Well, that depends. If it is a column that you set when the user is created and then never change, that isn’t a problem, as it won’t affect the precalculated in-memory table. However, if the data in these columns is constantly being changed (for instance, if you add a column called “activities last 24h” and then create a Flow that increments it every time an email, appointment etc. is created, and resets it every night for every user), then every one of those writes, to any user, will flush the precalculated in-memory table, and it has to be recalculated before it can be used again, causing a severe performance hit that can be very hard to troubleshoot.

How would you then build the “activities last 24h” functionality described above? Well, I would probably create a related entity called userstatistics with a relationship to systemuser. It could even be smarter to make it a 1:N relationship to this other entity, as you could then have many userstatistics rows per user and measure differences in activities day by day.
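
As a sketch of what that could look like against the Web API, with all table and column names here being hypothetical, the volatile counter is written to the related row instead of to systemuser, so the cached security data is never invalidated:

```python
import requests

def log_daily_activity(base_url, headers, user_id, activity_count):
    """Create a daily statistics row related to the user instead of updating systemuser."""
    payload = {
        "new_name": "activities last 24h",
        "new_count": activity_count,                           # hypothetical column
        "new_userid@odata.bind": f"/systemusers({user_id})",   # lookup to systemuser
    }
    # "new_userstatisticses" is a guess at the entity set name for the custom table.
    return requests.post(f"{base_url}/api/data/v9.2/new_userstatisticses",
                         json=payload, headers=headers)
```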

But wouldn’t the NLBs (Network Load Balancers) make this irrelevant, as each environment is hosted together with many others? Well, due to NDA I cannot talk about the details of how the NLBs actually work for the online environments, but I can say this: no, it is still relevant. The NLBs are set up the way they are for performance reasons.

As for teams, it is only the owner teams that count in this equation; access teams are only used for sharing or other types of grouping and are hence never part of this precalculation.

And the savvy person would then of course realize that the multiplied size of:

systemusers x owner teams x business units x security roles

does make up the size of this precalculated table, and for large implementations this can give an indication of where performance can start to make a difference, as every time a user does not have organization-level privileges, the system has to go through the entire table to check what is allowed. And then of course there is the POA. But that table is a story for another day and another article.

Just a final word: the platform is constantly shifting, and even though this was true and probably still is, there might be changes going on, or that have already happened, that I am unaware of and that change how this works. If I hear of any, I will let you know.

Fast data management in a limited CDS world

In May 2019, Dynamics 365 CE/CDS enacted some new throttling mechanisms that have caused some headaches for anyone wanting to manage a lot of data in CDS (I will refer to Dynamics 365/CDS as just CDS below). There are several different throttles, but the one that has caused me the most trouble is the concurrency throttle. Kingswaysoft will release support for handling this in the next release, and you can also request a special version from them if you ask nicely. In the meanwhile, this post can give you some help on how to work as fast as possible using application user multiplexing and a loop with a 5-minute wait to make sure that the throttles are reset.

The new throttling on the main CDS API, as described here: https://docs.microsoft.com/en-us/dynamics365/customer-engagement/developer/api-limits, needs to be carefully considered when doing heavy data manipulation in CDS. One of my customers has a large system with numerous integrations, of which the most data-heavy are the marketing automation and booking systems. And yes, this is business-to-consumer.

With the new per-GB pricing, keeping the database as small as possible has become an essential task, and bulk delete just doesn’t work for large data loads at the time of writing this article. I do hope that Microsoft increases its speed so that it becomes more useful, but currently it runs at somewhere around 1-2 records per second.

Bulk delete is also limited in that it can only base its selection on a query, i.e. a FetchXML. Often this is not enough, for instance when you want to remove “all emails except those that have any connection to either a case or a contact which has a case”.

For these reasons I almost always opt for using SSIS with the Kingswaysoft connectors to CDS when working with complex data management. This article is about how to get decent performance now that there is tougher throttling to take into consideration.

User multiplexing

As the throttling is measured on a per-user basis, one trick is of course to use multiple users and spread the load over them. You can of course use normal users, but that will cost you licenses, so the smart move is to use application users instead. If you don’t know how to create application users in Dynamics 365, check it out here: https://docs.microsoft.com/en-us/dynamics365/customer-engagement/admin/create-users-assign-online-security-roles#create-an-application-user . In the example below, I will be using four different application users: one as the source account and three as destinations. The reason for this is that it is typically efficient to read several thousand rows per request, but seldom efficient to do batch creates/writes/deletes of more than 10-20.
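
If you are doing this outside of SSIS, each application user is just its own app registration, so getting a token per user with the client credentials flow is straightforward. A minimal sketch with MSAL, where the org URL, tenant, client IDs and secrets are placeholders (and error handling is left out):

```python
import msal

ORG_URL = "https://yourorg.crm.dynamics.com"
TENANT = "your-tenant-id"

app_users = [
    {"client_id": "app-user-1-client-id", "secret": "secret-1"},
    {"client_id": "app-user-2-client-id", "secret": "secret-2"},
    {"client_id": "app-user-3-client-id", "secret": "secret-3"},
]

def get_tokens(users):
    """Acquire one access token per application user for use in parallel connections."""
    tokens = []
    for user in users:
        client = msal.ConfidentialClientApplication(
            user["client_id"],
            client_credential=user["secret"],
            authority=f"https://login.microsoftonline.com/{TENANT}",
        )
        result = client.acquire_token_for_client(scopes=[f"{ORG_URL}/.default"])
        tokens.append(result["access_token"])
    return tokens
```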

To do this with SSIS/Kingswaysoft you should start by setting up the connections, in this case the four CDS/CRM connections, using the OAuth auth-type like below.

As you might want to have several packages in the same project and have them share the connections, it may be a good idea to use project connections. I also use an Azure SQL db for logging any errors. Previously I used CDS for this, but now with the throttling that is not such a good idea, as the error itself might be caused by throttling, and hence logging the error can cause yet another error. Writing to a target that you know will not fail is therefore a good idea for error logging. When you are done with the connections, it should look something like this:

Now it is time to build the actual flow. Where you would normally just have a Source and a Target, it will now look something like the image below, which I will explain.

First of all, the Premium Derived Column creates a new column which simply contains the row number. It will look something like this: 

I like to use the components available in the Productivity Pack from Kingswaysoft, and this Premium Derived Column is one of them. In this case I think it makes no difference whether you use IncrementalValue() or RowIndex(). I think you can create this logic with a normal Derived Column too, it just has fewer features.

Next we need to create a Conditional Split that divides the rows evenly between the three destination components. This is done using the modulus operator, written with the “%” sign. For those who didn’t study this in school, it simply means “the remainder” of a division: for instance 5 % 3 = 2, because if you divide 5 by 3 you get 1 with a remainder of 2. What we will do is assign RowNr % 3 == 0 to Case 1, RowNr % 3 == 1 to Case 2 and the rest to Case 3. That should divide them evenly. It looks like this:
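
Expressed in a few lines of Python, the conditional split does nothing more than this:

```python
def split_rows(rows):
    """Send row n to destination n % 3, spreading the rows evenly over three targets."""
    destinations = {0: [], 1: [], 2: []}
    for row_nr, row in enumerate(rows):
        destinations[row_nr % 3].append(row)
    return destinations
```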

You then create the three destination components. I typically create one first, copy it and change it, as that is faster. Make sure that you set the Connection Manager to the three different Target Connections.

I also recommend that you fiddle a bit with the batch size and the number of threads and test which gives the best results for you and for the entity and action you are working on. There is no single answer here. I would typically start at batch size 10 and 16 threads.

Tuning DataFlow property settings

If you back out to the Control Flow view and right-click on the Data Flow you have created, there are some other interesting settings you can tweak.

  • DefaultBufferMaxRows – 10 000
  • DefaultBufferSize – 10 485 760 (10 MB)
  • EngineThreads – 10

These can be tuned to allow the Data Flow to handle more rows, use more memory and run more parallel threads, which of course will make it faster (if that is the bottleneck, which it typically is not when working with Dynamics).

What I have found is that changing the max rows to 100k, the buffer size to 100 MB and the engine threads to 32 will not hurt, but you can find several other blog articles specializing in SSIS that discuss this.

Crude throttle handler

What I have noticed is that many of my Data Flows simply seem to grind to a halt after 400-600k rows read from Dynamics. I am not sure if it is the read or the write part that causes this, but I figured that the most pragmatic way of solving it would be to create a loop that runs a Data Flow limited in the number of records, typically 400k, waits 5 minutes and then iterates. The smartest version is of course to have a control variable which checks when there are no more rows and then breaks the loop; a simpler version is to just loop n times to cover the amount of data you are trying to move, i.e. number of rows per iteration x number of iterations. It would look something like the picture to the left.

If you would like to refine the loop a bit to make it more automatic, create a variable of type Int, for instance RowCount, and set the initial value to 10 or something else different from 0. Then set the EvalExpression to “@RowCount > 0”. After this, add a Row Count control to the Data Flow and connect it to the variable RowCount. When the Data Flow runs and returns 0 rows, it will run to the end, the EvalExpression will evaluate to “False” and the loop will break.
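
The same loop, sketched in Python rather than SSIS to show the idea (fetch_and_process is a hypothetical function that reads and writes at most max_rows records and returns how many it handled):

```python
import time

def run_in_iterations(fetch_and_process, max_rows=400_000, wait_seconds=300):
    """Process a capped number of rows per pass and pause so the throttles reset."""
    while True:
        row_count = fetch_and_process(max_rows)
        if row_count == 0:
            break                    # nothing left to do, exit the loop
        time.sleep(wait_seconds)     # wait 5 minutes before the next iteration
```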

Using this technique, I am able to remove several million records in just a few hours. With one of these jobs I managed to remove 20 GB of structured data in less than two days (no attachments or similar, just records). By adding more application accounts, both on the source and particularly on the destination side, you can increase the speed further.

I also advise you to be on the lookout for Kingswaysoft’s new version, which I think will come soon, and do as I do: always download both the Dynamics and the Productivity Pack. I have read that there are great things coming to the Productivity Pack!

Deleting a lot of records fast

A quick one today…

I needed to delete a couple of million records for a customer, and the natural thing was to use the Bulk Deletion service. Well, I turned it on and it was extremely slow: only about 10 records/s, which would cause the entire delete to take over a week. I have checked with Microsoft and this is not a bug; it is working as designed and is not designed to be super fast. According to Microsoft, bulk deletion jobs are put on the async queue at low priority to give other, more important, jobs higher priority.

And a favorite quote of mine, from Purvin Patel of Microsoft: “Does a dump truck need to outrace a Ferrari?” – and I think that the answer to that question is: it depends. Sometimes it does.

Personally I would sometimes like it to be as fast as possible when removing a lot of records.

I also checked how fast the deletion would be with SSIS and Kingswaysoft, using the following settings:

  • VM about 5 ms from the Dynamics 365 instance (important that it is not too far away; use an Azure VM for this)
  • 64 threads
  • Execute Multiple batching with a batch size of 10 (you cannot use more than 10 if you are using a lot of threads, i.e. more than 2)
  • VM with 8 virtual cores and 32 GB memory
  • Loading in batches of 2 000, and only loading the id column, as that is all that is needed.

With this setup, I got somewhere around 345 records deleted per second, which is a tad more than 34 times faster than the bulk delete.

So, if you want to delete a lot of records, maybe Bulk Delete is not the way to go. Not yet anyway; let’s hope Microsoft makes it faster!

(this post was updated on February 9th, 2018)

Gustaf Westerlund
MVP, Founder and Principal Consultant at CRM-konsulterna AB
www.crmkonsulterna.se

Common performance problem

Now Convergence is over and I feel stuffed with info and experiences. There is so much I would like to write about and so little time. I will try to spread it out a bit.

During one of the sessions, Clint Warriner, an Escalation Engineer (he writes hotfixes), held a very interesting chalk-and-talk about CRM system maintenance and performance best practices.

It was crammed with goodies, and I will write more about it later, but one interesting thing he talked about was the possible delay in loading forms. He said this most often depends on the SQL Server. When loading forms, the list of which reports are relevant, and so on, is also loaded from the Reporting Server. This can, in some cases, take some time, and in some bad cases really affect the load time of forms.

They have created a hotfix for this, which can be requested from Microsoft support (at no cost) if you give the reference KB 941592. It simply caches the information on which reports exist so that the request to the SQL Server doesn’t have to be executed every time.

I will get back to the subject of performance tuning later.

Gustaf Westerlund
CRM and SharePoint Consultant

Humandata AB
www.humandata.se