“Do you really need to delete records like a Ferrari?” – that question was posed to me a few years ago when I complained about the poor performance of the Bulk Deletion functionality in Power Platform (at that time Dynamics 365 Online) to a friend at Microsoft, whom I will not name. My simple answer is yes, we do need to delete records like a Ferrari, for many reasons. I will discuss why in this article, and I have for that reason also created an Idea on the Power Apps Community site on this subject; I hope that you agree with me and vote for it! You will find it on the link below.
So, why is fast bulk deletion important? I would say there are several reasons, and I will walk through the ones that I have thought of; if you have any others, please drop a comment.
Keep your data in check – remove unnecessary data
GDPR and other compliance and legal issues
Power Platform growing into a citizen developer platform
Entitlements effectively block the use of external tools
Keeping data in check
For larger organizations, especially those with many integrated modules and systems, many running Flows, workflows, Customer Voice surveys, etc., the system will generate a lot of data, especially in a B2C scenario. A few of these have built-in features that automatically remove old logs, but most don’t, and we as admins and system caretakers (isn’t that a fancy title!) need to tend to this, typically by setting up jobs that clean out old data. I would recommend looking at the PPAC statistics showing which tables are the largest, and making a practice of downloading them at regular intervals. That way you can see the trends over time. A suggested addition to the CoE Starter Kit would be a trend analysis of all tables, with growth numbers per week for each table, warnings for quickly growing tables, and a prognosis.
However, as instances grow beyond 50-100 GB in size (of structured data), they soon become too large to handle with bulk deletion. Some tables might still be manageable this way, but in general the performance, when I tried to measure it (albeit a few years ago), was around 1-3 records per second. One of my customers, working in B2C, had a heavily used Voice of the Customer installation that I wanted to remove, with over 50 million survey invites. It is not possible to remove the solution without first removing the data, and even if we put Bulk Delete on crack and it reached 10 records per second, it would still take around 2 months. I eventually did it with SSIS/Kingswaysoft and it took a few days. If Bulk Delete could reach around 200 records/second, it would take a little less than 3 days.
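To make the arithmetic concrete, here is a quick back-of-the-envelope check in plain Python (the record count is from the case above; the rates are my own rough measurements):

```python
records = 50_000_000          # survey invites to delete
seconds_per_day = 86_400

for rate in (2, 10, 200):     # records per second
    days = records / rate / seconds_per_day
    print(f"{rate:>4} rec/s -> {days:,.1f} days")

#    2 rec/s -> 289.4 days
#   10 rec/s -> 57.9 days
#  200 rec/s -> 2.9 days
```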
I have also noted that when trying to Bulk Delete very large datasets, Bulk Delete simply fails. I suspect the underlying FetchXML query hits a SQL timeout or something like that, but I am not exactly sure what happens. The same deletions work with Kingswaysoft, so I don’t know what the difference is.
GDPR and other compliance and legal issues
As GDPR and similar compliance regulations have come into play in many countries around the world, it has become ever more important to strictly follow these detailed rules. They might look simple from a C-level PowerPoint perspective, but when you dig down to the detailed level where they actually have to be implemented, things are seldom as simple as the slides suggest.
Power Platform growing into a citizen developer platform
As the Power Platform grows from being just the platform on which Dynamics 365 is delivered into a huge platform for digitalization, entire organizations, with almost 100% user saturation, will start to use Dataverse. The amount of data stored in Dataverse will hence grow to massive volumes, and an effective tool to manage this data becomes all the more important. Bulk Delete probably cannot even scratch the surface of what we need to be able to do from a data management perspective, as data will grow by leaps and bounds and admins will need to manage not only Flows and Apps but also data, in both size and content.
Entitlements effectively block the use of external tools
The soon-to-be-enacted entitlements, as mentioned in my previous post, Entitlements are not throttling | Powerplatform.se, also effectively stop the use of external tools like SSIS/Kingswaysoft for deleting unwanted data. One of the customers I am working with generates between 10-20 million API requests PER DAY, and the bulk of these come from deletion jobs and other maintenance jobs keeping the instances in shape. With the new entitlement charges there is no way this can continue, but the customer is caught between a rock and a hard place: either the data grows by leaps and bounds, or the API calls become a huge cost, and there is no easy way to handle it. What advice am I to give this customer? The most reasonable thing would be for the platform to provide the tools to maintain the data without those costs. Whether that is Bulk Delete or some other, more elaborate feature is up to the product team, but I do think they should hold off on activating the entitlements until there is a good alternative for managing an instance’s data within the platform (without generating API requests).
What else is missing?
Not only is bulk deletion not performant enough, it also lacks the filtering logic required for more complex queries. For some customers I have had to construct rather elaborate SSIS scripts that start with a complex FetchXML and then filter the data through several Cache Transforms, for instance against GDPR consents and similar, to arrive at the final list. I must admit that I haven’t tried the new T-SQL endpoint for this; I would hope it could handle full T-SQL complexity, and that it gets implemented in Bulk Delete or Kingswaysoft as a means to make querying more powerful.
Microsoft recently (in February) published some updates to their documentation regarding service protection API limits, or throttling as they are sometimes called. Some of these, like the new recommendations on how to handle batching, are rather interesting, and I thought I’d give my 2 cents on them. They also allude a bit to how the network infrastructure is set up for the deployment, and how to optimize larger workloads using the affinity cookie setting. I found this rather interesting too.
In short, within a 5-minute sliding window, you cannot exceed the following for one specific user:
Not more than 6 000 requests.
Not more than 1 200 seconds (20 minutes) of combined execution time – equal to 4 parallel processes running at full capacity.
Not more than 52 requests at the same time (concurrently).
Generally, if you do not use batching, you will typically run into the first limit, or the third if running with unlimited threading. If you use connection pooling with 52 connections, you will probably hit the first, and if you use complex requests that cause cascading behaviours, or batching, you will typically hit the second. There are specific exceptions matching each of these; refer to the official docs for the details.
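When you do hit a limit, the platform returns HTTP 429 and tells you how long to back off via the Retry-After header. Here is a minimal sketch in Python of honouring that, assuming a requests.Session that already carries a valid bearer token (the org URL is a placeholder):

```python
import time

import requests

def send_with_retry(session: requests.Session, method: str, url: str,
                    max_retries: int = 5, **kwargs) -> requests.Response:
    """Send a Dataverse Web API request, backing off as instructed
    whenever a service protection limit returns HTTP 429."""
    for _ in range(max_retries):
        response = session.request(method, url, **kwargs)
        if response.status_code != 429:
            return response
        # The server states how many seconds to wait before retrying.
        time.sleep(float(response.headers.get("Retry-After", "30")))
    raise RuntimeError(f"Still throttled after {max_retries} retries: {url}")

# Usage (placeholder org URL):
# r = send_with_retry(session, "GET",
#     "https://yourorg.api.crm.dynamics.com/api/data/v9.2/accounts?$top=10")
```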
Now to the interesting part. There is a new section that attempts to tell us how to maximize throughput. First of all, I think this is great. We really need this, and we need Microsoft to tell us not only how to use the platform, but how to use it efficiently.
“Let the server tell you how much it can handle” – this section is interesting, as it recommends a rather complex approach to performance. Since they recommend threading further down, they essentially recommend building logic that dynamically increases and decreases the number of threads as the platform signals how much capacity it has. This brings me back to university math, trying to find the local maximum of an unknown function by sampling. I do, however, think this is a rather tall order for your average developer. But it would be a great community tool, so feel free to build it – consider the challenge set. Best of all would be if Microsoft included this in the SDK, of course.
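For anyone picking up that challenge, here is a minimal sketch of the idea in Python: an additive-increase/multiplicative-decrease controller (my interpretation, not an algorithm from the docs) that a dispatcher could consult before starting new parallel requests.

```python
import threading

class AdaptiveConcurrency:
    """Grow the allowed number of parallel requests slowly while calls
    succeed; cut it sharply whenever the platform answers with a 429."""

    def __init__(self, start: int = 4, floor: int = 1, ceiling: int = 52):
        self._lock = threading.Lock()
        self.limit = start
        self.floor = floor
        self.ceiling = ceiling      # 52 = the documented concurrency cap
        self._successes = 0

    def on_success(self) -> None:
        with self._lock:
            self._successes += 1
            if self._successes >= 100 and self.limit < self.ceiling:
                self.limit += 1     # add a worker per 100 successful calls
                self._successes = 0

    def on_throttled(self) -> None:
        with self._lock:
            self.limit = max(self.floor, self.limit // 2)
            self._successes = 0
```

A dispatcher would simply refrain from starting new requests while the number in flight has reached `limit`, sampling its way towards the server’s actual capacity.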
“Use multiple threads” – in this part they recommend using multiple threads. My experience agrees that this is a good idea, as processing time and latency add a certain delay on a per-message basis. By utilizing multithreading with multiple connections, this overhead can be reduced. As there is a limit of 52 concurrent connections, I would recommend using at most that number of connections/threads per user.
“Avoid batching” – now this is really interesting. The previous recommendation was to use batching to be “nicer” to the API and get increased performance. The recommendation now is the direct opposite. This is based on the fact that the overhead in a Web API JSON message is significantly smaller than in a SOAP message, which shrinks the difference between batching and not batching. They do, however, still recommend using smaller batch sizes. This matches my experience with Kingswaysoft: I typically (depending on the instance and which table I am using) start with 16 threads and batches of 10 or 20. This has usually given me the best performance, upwards of 300 records/s.
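As an illustration of the 16-threads-times-small-batches pattern, here is a rough Python sketch. The OData $batch body it builds follows the documented multipart format but is simplified and untested here; the org URL is a placeholder, the token is assumed to be acquired elsewhere, and tools like Kingswaysoft construct all of this for you.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor

import requests

API = "https://yourorg.api.crm.dynamics.com/api/data/v9.2"   # placeholder

def delete_batch(token: str, ids: list[str]) -> requests.Response:
    """Submit one $batch request containing a changeset of DELETEs."""
    batch, changeset = f"batch_{uuid.uuid4()}", f"changeset_{uuid.uuid4()}"
    lines = [f"--{batch}",
             f"Content-Type: multipart/mixed;boundary={changeset}", ""]
    for n, rec in enumerate(ids, 1):
        lines += [f"--{changeset}",
                  "Content-Type: application/http",
                  "Content-Transfer-Encoding: binary",
                  f"Content-ID: {n}", "",
                  f"DELETE {API}/emails({rec}) HTTP/1.1", ""]
    lines += [f"--{changeset}--", f"--{batch}--", ""]
    return requests.post(
        f"{API}/$batch", data="\r\n".join(lines),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": f"multipart/mixed;boundary={batch}"})

def delete_all(token: str, ids: list[str],
               batch_size: int = 10, threads: int = 16) -> None:
    """Small batches, fanned out over a modest thread pool."""
    chunks = [ids[i:i + batch_size] for i in range(0, len(ids), batch_size)]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for f in [pool.submit(delete_batch, token, c) for c in chunks]:
            f.result()    # surface any errors
```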
There is also a comment about the fact that batching does not bypass the entitlement limits, i.e. 20 000 API calls/24 hours for an enterprise user, 100 000 API calls for all non-licensed users, and so on (see more on the entitlement limits per license type here). Hence that calculation is done after exploding the batch on the server side. This was also news to me, as I was previously told that batching was exactly the way to go to limit the number of calls.
Removing the affinity cookie – server multiplexing. The details alluded to in this section are very interesting. If I understand it correctly, requests are normally pinned to one specific frontend server by the affinity cookie, and shutting off the affinity cookie in the HttpClient therefore allows much wider use of all the servers in the node (the entire setup of frontends, backends, NLBs, etc.).
What I do wonder is whether it would be possible to store the affinity cookie and pool it on the client side. Each time you hit a new frontend you will lose some time while it warms up for your instance, so it would be advantageous to control this more tightly. Maybe this could be another community tool for someone interested? I also think, though I haven’t tested it, that you will get better results with the affinity cookie removed if you do use batching, at least until all the frontends have been warmed up for your instance.
Do also note a very important sentence: “This increases throughput because limits are applied per server”. We do not know how many servers are used in a node frontend, but probably more than 10. Removing the affinity cookie could hence increase performance by at least one order of magnitude.
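The docs describe doing this on .NET’s HttpClient (setting HttpClientHandler.UseCookies to false). For the same effect in Python with requests, a session can simply refuse all cookies, so the ARRAffinity cookie is never stored or sent back; a sketch:

```python
from http import cookiejar

import requests

class BlockAllCookies(cookiejar.CookiePolicy):
    """Never store and never return cookies, so the affinity cookie is
    dropped and requests can land on any server in the node."""
    return_ok = set_ok = domain_return_ok = path_return_ok = \
        lambda self, *args, **kwargs: False
    netscape = True
    rfc2965 = hide_cookie2 = False

session = requests.Session()
session.cookies.set_policy(BlockAllCookies())
# Every call through this session now ignores ARRAffinity, spreading the
# load (and the per-server limits) across the frontend servers.
```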
User multiplexing. As all API limits are calculated on a per-user basis, another way to increase performance is to use what I like to call user multiplexing. This means that operations are performed by several different application users at the same time. There is of course some admin work needed to set these up, and there is no OOB way of doing it, but with SSIS and Kingswaysoft it is rather straightforward: just create several connections, one per user, configure them per user, and then use the “Balanced Data Distributor” (found in the Productivity Pack) to spread the data to different destinations that use the different connections.
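Here is a sketch of the same idea outside SSIS (Python). `acquire_token` stands in for an MSAL client-credentials call for each application user, and `delete_batch` is the helper from the batching sketch above; all the names are illustrative, not a real API.

```python
import itertools
from typing import Callable

def multiplex(app_users: list[tuple[str, str]],
              acquire_token: Callable[[str, str], str],
              batches: list[list[str]],
              delete_batch: Callable[[str, list[str]], object]) -> None:
    """Deal batches round-robin across several application users, so the
    per-user service protection limits apply to each identity separately."""
    tokens = [acquire_token(cid, secret) for cid, secret in app_users]
    for token, batch in zip(itertools.cycle(tokens), batches):
        delete_batch(token, batch)
```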
My tips
My tips for getting good performance with large-scale datasets, based on these new facts, are the following:
Continue to use batching, but don’t use huge batches. Probably around 5-20 will be ok.
Use multithreading. I typically use around 16, but that was before I knew about the removal of the affinity cookie. Hence I would recommend 16 per server. But I cannot tell you how many servers there are.
Use the remove affinity cookie setting, and if possible, figure out some way of pooling the affinity cookies instead.
Make sure your application can handle the exceptions regarding the API limits and have some reasonable strategy for working with them. I have found that blasting the API for 5 minutes at max speed, then backing off for 5 minutes, then going full throttle again for 5 minutes, has given me better throughput overall than “being nice” and trying to find the “right” speed that avoids throttling (a rough sketch of this on/off pattern follows the list). Not sure this strategy will work in the long run, though.
Use application user multiplexing.
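Here is a rough sketch of the blast-and-back-off pattern from the fourth tip, in plain Python:

```python
import time

def duty_cycle(work, do_work, burst: float = 300, pause: float = 300):
    """Run at full speed for `burst` seconds, sleep `pause` seconds so the
    5-minute sliding window resets, and repeat until the work runs out."""
    items = iter(work)
    while True:
        deadline = time.monotonic() + burst
        while time.monotonic() < deadline:    # blast at max speed
            try:
                do_work(next(items))
            except StopIteration:
                return                        # nothing left to do
        time.sleep(pause)                     # let the window reset
```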
Suggestions to ETL vendors and others
My suggestions to ETL vendors and others who build connections to Dataverse that require high performance are:
Start by exposing the affinity cookie setting so that it can be configured as wanted.
Include multithreading, batching and application user multiplexing in the standard Dataverse connections.
Figure out if there is any point in pooling the affinity cookies, and if so, include this in the connection.
Make the connection auto-optimize based on the data it is currently sending, i.e. tune the number of threads, the size of the batches, the size of the affinity cookie pool, and the number of application users to utilize.
Have different strategies for utilizing application users instead of just spreading the data evenly. One user could be used until it receives an exception, then be put on hold for 5 minutes while another is used. Or a combination of the two: with five application users, three might be used for data transfer and two held in reserve in case one gets an exception and needs to be put on hold.
I hope this has given you some insights and that my 2 cents got you this far. Feel free to leave a comment if you have any questions!
The reason I often recommend Kingswaysoft over other methods of data migration, or even sometimes data integration, is rather simple: it can do what others can’t. Not even Microsoft. And it can do it fast.
So, have I been paid by Kingswaysoft to write this? No, not a penny; I haven’t even been given a free license, even though they might give me one if I asked. I just think it is fair to explain why I am such a strong proponent of this product, and if anyone disagrees, please feel free to drop a comment below.
Kingswaysoft know how the APIs of the Power Platform/CDS/Dataflex (called CDS below) work, sometimes even better than Microsoft themselves; I have seen them give feedback in such detail on what is missing from specific APIs to reach feature parity that it is scary. Kingswaysoft also has a blog with detailed recommendations, and recommendations built into the product, on how to maximize performance with the CDS API. They have also built in handling for throttling, multithreading, batching and more, just to make it transparent. Looking at other integration products and connectors, I have never seen anything close to this depth.
Data model and Dynamics knowledge
Dynamics 365 Sales, Customer Service and the other first-party apps have some very peculiar oddities that need to be taken into consideration, for example:
How to delete Marketing List members – by using the party member and list as key
How audit logs are handled
How Activities work, with activity pointer and activity parties
Reading/Writing personal views (saved advanced finds)
Setting status values of some fringe entities
Even though many tools, like Power Automate, Data Flows, Azure Data Factory, etc. all seem to handle common tasks like creating or reading a contact or account rather well, it soon becomes a problem when you start looking at some of the areas above. And you often need to in a migration.
It can use the power of SSIS
SQL Server Integration Services is a very powerful framework once you start getting your head around it. Yes, it has some quirks, but in general it can do some rather cool things and is good when you want to sequence many different tasks to reduce overall runtime. Using built-in features like Cache Transforms with memory storage, you can run filters as memory-based lookups against millions of records super fast, once everything is loaded.
…and extend it
And of course Kingswaysoft didn’t settle for just building connectors to Dynamics 365. They have a pack called “Productivity Pack” that adds a lot of nice features to SSIS and makes your day a lot easier.
And if you have problems – great support
And if you ever run into problems, their support is great. We have identified bugs, and they have fixed them and sent us a special build just a day or two later.
There is always a flip side. One is that SSIS isn’t always the most stable of products. I don’t think Kingswaysoft are to blame for this, but it affects their product experience nonetheless.
Another obvious flip side is that this requires a skillset that is a bit “off the tracks”. Even though it isn’t super hard to learn the basics, becoming really proficient with SSIS takes time.
So, a product as good as this is probably super expensive, right? No. It isn’t. If you are running things from within Visual Studio, you don’t even need a license, but if you plan to run a migration or similar, I certainly recommend one anyway, to get access to support. And buy the Ultimate license to get access to the Productivity Pack and all the other connectors while you’re at it. It is worth it.
Just to be clear, I am not saying that the other products are bad. I am just saying that once you choose a product, you will start to descend the rabbit hole. If the product isn’t up to speed, you will have a couple of options:
Back up and redo with another technology
Patch with another technology
Create some kind of workaround
None of these options is particularly palatable. Hence it is often tempting to choose a product that you know can do the job. And apart from coding, SSIS with Kingswaysoft will very seldom have any issues.
In May 2019, Dynamics 365 CE/CDS enacted some new throttling mechanisms that have caused headaches for anyone wanting to manage a lot of data in CDS (I will refer to Dynamics 365/CDS as just CDS below). There are several different throttles, but the one that has caused me the most trouble is the concurrency throttle. Kingswaysoft will release support for handling this in their next release, and you can also request a special version from them if you ask nicely. In the meanwhile, this post can give you some help on how to work as fast as possible using application user multiplexing and a loop with a 5-minute wait to make sure the throttles are reset.
With the new per-GB pricing, keeping the database as small as possible has become an essential task, and bulk delete just doesn’t work for large data loads at the time of writing this article. I do hope that Microsoft increases its speed so that it becomes more useful, but currently it sits somewhere around 1-2 records per second.
Bulk delete is also limited in that it can only base its selection on a single query, i.e. a FetchXML. Often this is not enough, for instance when you want to remove “all emails except those that have any connection to either a case or a contact which has a case”.
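To illustrate, here is the shape of that logic in Python, assuming a hypothetical `fetch_ids(fetchxml)` helper that pages through a query and returns a set of record ids. This mirrors what I do with Cache Transforms in SSIS: build in-memory keep-lists, then take the difference.

```python
# fetch_ids: hypothetical helper that runs one FetchXML query (elided
# here) against Dataverse and returns the matching record ids as a set.
all_emails = fetch_ids("<fetch><entity name='email'>...</entity></fetch>")
emails_on_cases = fetch_ids("...emails regarding a case...")
emails_to_case_contacts = fetch_ids("...emails to contacts with a case...")

# Set difference expresses the "except" that a single query cannot:
to_delete = all_emails - emails_on_cases - emails_to_case_contacts
# to_delete can then feed a batched, multithreaded delete.
```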
For these reasons I almost always opt for SSIS with the Kingswaysoft connectors to CDS when working with complex data management. This article is about how to get decent performance now that there is tougher throttling to take into consideration.
As the throttling is measured on a per-user basis, one trick is of course to use multiple users and spread the load across them. You can use normal users, but that will cost you licenses, so the smart person will of course use application users instead. If you don’t know how to create application users in Dynamics 365, check it out here: https://docs.microsoft.com/en-us/dynamics365/customer-engagement/admin/create-users-assign-online-security-roles#create-an-application-user . In the example below, I will be using four different application users: one as the source account and three as destinations. The reason is that it is typically easy to read several thousand rows per request, but seldom efficient to do batch creates/writes/deletes of more than 10-20.
To do this with SSIS/Kingswaysoft, start by setting up the connections – in this case the four CDS/CRM connections – using the OAuth auth-type.
As you might want several packages in the same project to share the connections, it may be a good idea to use project connections. I also use an Azure SQL database for logging any errors. I previously used CDS for this, but now with the throttling that is not such a good idea, as the error itself might be caused by throttling, and hence logging the error can itself fail. Writing errors to a target that you know will not fail is therefore a good idea. When you are done, you should have the four CDS connections plus the SQL logging connection at the project level.
Now it is time to build the actual flow. Where you would normally have just a Source and a Target, the flow now gets a row-numbering step and a conditional split in front of three parallel destinations, which I will explain.
First of all, the Premium Derived Column creates a new column that simply contains the row number.
I like to use the components available in the Productivity Pack from Kingswaysoft, and this Premium Derived Column is one of them. In this case I think it makes no difference whether you use IncrementalValue() or RowIndex(). You can probably create this logic with a normal Derived Column too; it just has fewer features.
Next we need a Conditional Split that divides the rows evenly between the three destination components. This is done using the mathematical operator modulus, written with the “%” sign. For those who didn’t study this in school, it simply means the remainder of a division: for instance 5%3=2, since dividing 5 by 3 gives 1 with a remainder of 2. What we do is assign RowNr%3 == 0 to Case 1, RowNr%3 == 1 to Case 2, and the rest to Case 3. That divides the rows evenly.
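The same split in a few lines of plain Python, just to show the round-robin effect:

```python
cases = {0: [], 1: [], 2: []}
for row_nr in range(1, 10):          # row numbers 1..9
    cases[row_nr % 3].append(row_nr)

print(cases)  # {0: [3, 6, 9], 1: [1, 4, 7], 2: [2, 5, 8]}
```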
You then create the three destination components. I typically create one first, copy it and change it, as that is faster. Make sure that you set the Connection Manager to the three different Target Connections.
I also recommend that you fiddle a bit with the batch size and the number of threads and test out which gives the best results for you and the entity and action you are working on. There is no one answer here. I would typically start at Batch 10, Threads 16.
Tuning DataFlow property settings
If you back out to the Control Flow view and right-click the Data Flow you have created, there are some other interesting settings you can tweak:
DefaultBufferMaxRows – 10 000
DefaultBufferSize – 10 485 760 (10MB)
EngineThreads – 10
These can be tuned to allow the Data Flow to handle more rows, use more memory and run more parallel threads, which of course makes it faster (if that is the bottleneck, which it typically is not when working with Dynamics). What I have found is that changing max rows to 100k, the buffer size to 100 MB and engine threads to 32 will not hurt, but there are several blog articles specializing in SSIS that discuss this in depth.
Crude throttle handler
What I have noticed is that many of my Data Flows simply seem to grind to a halt after 400-600k rows read from Dynamics. I am not sure whether it is the read or the write part that causes this, but I figured the most pragmatic solution would be a loop that runs a Data Flow limited in the number of records, typically 400k, waits 5 minutes, and then iterates. The smartest version has a control variable that checks when there are no more rows and then breaks the loop; a simpler version just loops n times to cover the amount of data you are trying to move, i.e. rows per iteration × number of iterations. In SSIS this is a For Loop container wrapped around the Data Flow.
If you would like to refine the loop a bit and make it automatic, create a variable of type Int, for instance RowCount, and set its initial value to 10 or anything other than 0. Then set the loop’s EvalExpression to “@RowCount > 0”. After this, add a Row Count component to the Data Flow and connect it to the RowCount variable. When the Data Flow runs and returns 0 rows, it will run to the end, the EvalExpression will evaluate to False, and the loop will break.
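The equivalent loop logic outside SSIS looks something like this (Python; `fetch_chunk` and `delete_rows` are hypothetical stand-ins for the Data Flow’s source and destinations):

```python
import time

def purge(fetch_chunk, delete_rows, chunk: int = 400_000,
          pause: int = 5 * 60) -> None:
    """Process at most `chunk` rows per pass, wait for the throttles to
    reset, and stop when a pass returns no rows (RowCount == 0)."""
    while True:
        rows = fetch_chunk(chunk)    # like the Data Flow's source query
        if not rows:                 # RowCount == 0 -> break the loop
            break
        delete_rows(rows)            # the batched, multithreaded delete
        time.sleep(pause)            # let the throttles reset
```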
Using this technique, I am able to remove several million records in just a few hours. With one of these jobs I managed to remove 20 GB of structured data in less than two days (no attachments or similar, just records). By adding more application accounts, on the source side and particularly on the destination side, you can increase the speed further.
I also advise you to be on the lookout for Kingswaysoft’s new version, which I think will come soon, and to do as I do and always download both the Dynamics and the Productivity Pack. I have read that there are great things coming to the Productivity Pack!
One of my customers is a B2C business with a very large online database exceeding 500 GB. With a very active marketing automation tool integrated, a lot of data is generated in Dynamics 365 CE that needs to be removed after defined retention periods. This has had a side effect: a table called SubscriptionTrackingDeletedObject has become very large. This article describes how to set a configuration that reduces its size. UPDATED – based on some new learnings and information from Microsoft, this article has been updated!
We often monitor Organizational Insights, and lately the brand new capacity feature found in the left-hand menu at https://admin.powerplatform.microsoft.com, if you have a CDS/Dynamics 365 CE instance.
An interesting table started growing rapidly and we had no clue what it was; in my now 15 years of working with Dynamics 365 I had never seen it. It was called SubscriptionTrackingDeletedObject. When I came back from my Swedish summer vacation, it had grown to an amazing 181 million records. Time to fix this.
First thing, as usual, is of course to google it (yes, it is a verb, get used to it). All I found was this somewhat informative post by my good friend Chris Cognetta, who is an ace with infrastructure issues.
However, they had simply truncated the table, and we were online, so that was a bit tricky, to say the least. I was at this point a bit upset that Microsoft were taking up around 50 GB of space for my customer without giving me any way of managing it, or any direct use of it. I counted to ten and called Microsoft Support.
After a few emails back and forth, the excellent support technician at Microsoft informed me that there is actually a setting for this in the infamous super-secret settings tool with the Star Trek-sounding name OrgDBOrg (it is pronounced “Org-D-Borg”, in case you ever get stuck in Dynamics trivia). The setting is called ExpireSubscriptionsInDays. I will quote the support technician on what this table is used for; if anyone has more information, please leave a comment.
“The SubscriptionTrackingDeletedObject table is the table that logs records for number of days before deleting inactive subscriptions as well as timed out deletion services.”
I am not sure of its exact purpose, whether it relates to GDPR or some restore mechanism; I would like to know, though. The default value is 90, which means that these logs are stored for 90 days. The minimum it can be set to is 1. As I am currently not entirely sure what these logs are used for, I would not recommend setting it to 1, but I did set my customer’s to 5, hoping that this is not going to come back with a vengeance.
During the day we have seen a dramatic drop in the number of records in this table, by about 30 million, and it is still ongoing, though hard to measure as there is a delay in the capacity measurements in the Power Platform admin portal.
Update! The size of the table fell to 160 million rows but never below that, so after some further discussions, Microsoft support did some more investigation and came back with the following recommendations:
1. Reduce the value gradually, from 90 to 60 and then onwards.
2. Never go below 15.
There is, however, another related setting called ExpireChangeTrackingInDays, which is located right next to ExpireSubscriptionsInDays. It defaults to 30. We reduced it to 15.
Based on these recommendations we tried 60 days and this resulted in a most dramatic drop to around 20M rows. – End of update
So, how do you do this? First, download the OrgDbOrgSettings tool and install it in the instance where you are having issues. Check out these links below for that:
A word of advice regarding OrgDBOrg: don’t think that you are Captain Kirk, go flying off into the Beta Quadrant, and beam every single setting just because you can. It won’t make your system better; rather the opposite. Make really sure of what you are doing, and don’t even trust a blog article like this – read the KB article linked in the tool and make up your own mind. It is a powerful tool, like a jackhammer.
After you have installed the OrgDbOrgSettings tool, you can see it and open it by clicking the display name.
Then just find “ExpireSubscriptionsInDays”, press “Edit”, and change it to whatever you would like it to be. You will typically have to confirm to save it to Dynamics 365 CE/CDS.
With that done, you should just have to wait for the magic to happen.
As far as I have understood, these two settings and this table indicate how long changes and deletes are stored in this table and related tables, so that integrating systems can read them. This can, for example, be the Data Export Service, the old Dynamics for Outlook client, etc. Hence, reducing the number to, for instance, 15 (the lowest number recommended by Microsoft) can result in some changes not being propagated to these integrated systems if an integration, or an offline client, is offline for more than 15 days. I also got the feeling that setting it to 5 was below some internal threshold and hence wasn’t really supported, despite the fact that OrgDBOrg says the lowest value is 1.