On the last project I worked on as a consultant, we used NServiceBus throughout the system.
After some time in production we started to experience issues. First the system became slow, then we saw transactional deadlocks in the database, and data was lost when the database server resolved those deadlocks automatically.
I had the pleasure of being the one tasked with tracking down the culprit and resolving the issue. (No irony, I love tracking down issues… :))
Problem
Some of our NServiceBus endpoints were concerned with high-level business tasks, others with more infrastructural tasks. It turned out that the issue could be traced to an NServiceBus endpoint which had the task of continuously importing content from RSS feeds owned by our customers. That is, for each feed, it would create, update (yes, some feeds updated existing feed items) or delete items locally according to the feed contents. Importing also meant downloading images referenced in the feed items. Each RSS feed import was represented by a saga instance, and used timeouts to initiate imports.
The logic was placed in an NServiceBus endpoint so that we could “interact” with the import. We wanted to be able to control the import remotely and make it possible for users to start an import when needed.
The implementation was unfortunately much like what you would find in a typical batch job. It consisted of a message handler which contained calls to private methods, more or less like this:
    public void Handle(StartRSSImport message) {
        var feedContent = _feedLoader.GetFeedItems(message.Url);
        UpdateExistingItems(feedContent); // Updates already known items
        AddNewItems(feedContent);         // Adds new items from the RSS feed
        RemoveItems(feedContent);         // Removes items that are no longer present in the RSS feed
    }
It worked without problems in the beginning. But then the RSS import feature got popular (oh no! :), more feeds were added and they contained a lot more items. Basically, the big RSS imports, which of course were inherently dependent on various external systems, ended up keeping transactions open for a long time and therefore blocked other clients from reading or updating the DB tables involved. Issue identified!
A solution
The good thing was that the import code itself was pretty well organized, and methods handling one item at a time were already present (UpdateItem(), RemoveItem(), AddItem()).
However, in the “plural” methods, the code for discovery of RSS feed changes was entangled with the code actually performing the changes. This was my goal and what I ended up doing under the circumstances:
- My idea was: importing items from an RSS feed into the local system did not need to be viewed as one big task. It could instead be viewed as a list of separate tasks with no transactional relation, identified by one initial task. So I wanted to...
- Have the StartRSSImport message handler load the feed and only discover which actions needed to be taken locally.
  This was considered a lightweight operation on the production DB, as we already stored hashes locally representing each RSS item’s previously known state. That meant we didn’t need to traverse our object graph to discover potential changes for each item in the feed.
  For each identified task, this message handler would then send a matching message to its own queue: UpdateRSSItemMessage, DeleteRSSItemMessage or AddRSSItemMessage.
  These messages would contain all the information needed for their respective message handlers to perform their task (local item id, hashes, text content, image URLs).
- Create new message handlers for UpdateRSSItemMessage, DeleteRSSItemMessage and AddRSSItemMessage.
  Each of these would perform its particular task at item level.
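To make the discovery step concrete, here is a minimal sketch of how comparing stored hashes against the current feed could yield the three task lists. It is an illustration only, not the code from the project: FeedItem, ImportPlan, FeedDiff.Discover and the idea of keying the stored hashes by a feed item guid are all assumptions made for the example, and nothing here touches NServiceBus or the database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; the real project types are not shown in this post.
public record FeedItem(string Guid, string ContentHash);

public record ImportPlan(
    IReadOnlyList<FeedItem> Added,
    IReadOnlyList<FeedItem> Updated,
    IReadOnlyList<string> Deleted);

public static class FeedDiff
{
    // knownHashes maps an item guid to the hash stored at the previous import (assumed layout).
    public static ImportPlan Discover(
        IEnumerable<FeedItem> feedItems,
        IReadOnlyDictionary<string, string> knownHashes)
    {
        var added = new List<FeedItem>();
        var updated = new List<FeedItem>();
        var seen = new HashSet<string>();

        foreach (var item in feedItems)
        {
            seen.Add(item.Guid);
            if (!knownHashes.TryGetValue(item.Guid, out var storedHash))
                added.Add(item);        // not known locally yet
            else if (storedHash != item.ContentHash)
                updated.Add(item);      // known, but the content changed
        }

        // Anything known locally that no longer appears in the feed should be deleted.
        var deleted = knownHashes.Keys.Where(id => !seen.Contains(id)).ToList();
        return new ImportPlan(added, updated, deleted);
    }
}

public static class Program
{
    public static void Main()
    {
        var known = new Dictionary<string, string>
        {
            ["a"] = "h1", ["b"] = "h2", ["c"] = "h3"
        };
        var feed = new[]
        {
            new FeedItem("a", "h1"),
            new FeedItem("b", "h2-changed"),
            new FeedItem("d", "h4")
        };

        var plan = FeedDiff.Discover(feed, known);
        Console.WriteLine(
            $"added={string.Join(",", plan.Added.Select(i => i.Guid))} " +
            $"updated={string.Join(",", plan.Updated.Select(i => i.Guid))} " +
            $"deleted={string.Join(",", plan.Deleted)}");
        // prints: added=d updated=b deleted=c
    }
}
```

The point of the sketch is that discovery only reads the feed and a flat set of hashes, so it stays cheap. Each entry in the resulting plan would then become one message, and the expensive work happens later, one item and one transaction at a time.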
After refactoring, our code looked more or less like this (compacted for readability; I’m not a fan of train wrecks):
    public void Handle(StartRSSImport message) {
        var feedContent = _feedLoader.GetFeedItems(message.Url);
        foreach (var item in FindUpdatedItems(feedContent)) { Bus.SendLocal(new UpdateRSSItemMessage( … )); }
        foreach (var item in FindAddedItems(feedContent)) { Bus.SendLocal(new AddRSSItemMessage( … )); }
        foreach (var item in FindDeletedItems(feedContent)) { Bus.SendLocal(new DeleteRSSItemMessage( … )); }
    }

    public void Handle(UpdateRSSItemMessage message) {
        // Updates existing item
    }

    public void Handle(AddRSSItemMessage message) {
        // Adds new item
    }

    public void Handle(DeleteRSSItemMessage message) {
        // Deletes item
    }
This solution is basically comparable to reading a menu at a restaurant. You order the items you want and then they arrive later. You don’t, like earlier, call the chef and make him create your food on the spot. It’s just bad style… ;o)
Separating the discovery of changes from actually performing the changes locally meant we moved away from an N+1 task in one message handler and, more importantly, in one transaction. It was a huge step forward, and the errors disappeared.
Final thoughts
The ability to interact with an RSS import instance was a popular feature of our system, and one we achieved pretty easily by using NServiceBus. This issue and the way we resolved it also showed us how NServiceBus can help make our implementation simpler. But it also taught us that we first had to get our heads around slicing the control flow into separate flows before we could benefit from this.
So we were ok, right? Well, it’s true that the transaction scopes were now much smaller, fitting item-level handling instead of feed-level handling. But there were still some issues with this implementation. I’ll follow up on this in a later blog post.