Monday, May 16, 2011

Processing Inter-dependent files using a Non-Uniform Sequential Convoy


Processing Inter-dependent files using a Non-Uniform Sequential Convoy - Kent

The Problem


A re-occurring topic that I have recently come across on some of the BizTalk forums(here and here) is the ability to process interdependent files. More specifically we want the ability to process a particular file after receiving a "trigger" or "signal" file.

The use of trigger and signal files is very common in SAP and mainframe systems. A common pattern in the SAP world is to write "work" files to a "work" folder and a "signal" file to a "sig" folder. For SAP outbound files, the SAP system will write data files to the work folder, once the file has been completely written, a signal file will be written to the sig folder. This sig file gives the Middleware, or downstream system, an indication that this file is safe for processing. This is important as some files that get written from these types of systems are very large or may be written to over a period of time via batch jobs.


BizTalk supports messaging patterns that provide the ability to wait for messages to arrive before completing a business process(or orchestration). These mechanisms tend to use correlation and the use of convoys. Stephen W Thomas has written a whitepaper that dives into these topics further.


The messaging pattern that I have chosen to aid in this solving this challenge is the Non-Uniform Sequential Convoy. This pattern's mandate is to process 2 or more messages in a known order.


The challenge with this pattern, for our scenario, is that our work file is written prior to the sig file, but in terms of BizTalk processing messages we want BizTalk to only process the work file once the sig file has been written.


In order to solve this problem we are going to leverage a .Net Helper class to do some of the lifting.


Solution
I have written a proof of concept(POC) app to demonstrate the pattern. The solution is pretty light and easy to implement. It contains 2 message schemas(1 for the signal file, 1 for the work file), a property schema, an orchestration and a .Net Helper class. Note, that I have left what you do with both files once you get them into the same instance of the orchestration out of scope. At this point your requirements will determine what you need to do with both files.


The first artifact that will dive into is the Signal file. The file itself does not need to contain a lot of data. For my sample I have two elements, a timestamp and a WorkFileName. In order to use my convoy, I need to create a correlation type and correlation set. A correlation type requires a promoted property(or BizTalk System property) in order to "link" multiple messages to one running instance of an orchestration. For more information on Correlation, please see the following document.






The second artifact represents data that could be generated by an upstream system such as SAP or a Mainframe. This document is entirely fictitious so don't look too much into it. Also note that I have a promoted property called FileNamethat is used in my Correlation Type.


So for this example I am using FileName but you can correlate based upon any data, as long as your signal file and work file promote the same data values.







Below is a snapshot of what our orchestration looks like.


A few things to note is that Non-Uniform Sequential convoys to require that both receive shapes connect to the same receive port. This logical receive port also needs to be marked for Ordered Delivery.







For the initial receive shape we need to set a few properties. Since this is the first Receive shape we need to set theActivate property to True in order to instantiate the orchestration.


The other property that we need to populate is the Initializing Correlation Sets. In order for BizTalk to "wait" for the work file to be picked up by the same instance of the orchestration that consumed the signal file we need to initialize a correlation set. (Keep reading for more info on how to create the Correlation Set).




Prior to creating a Correlation Set, you need to create a Correlation Type. You can do this from the Orchestration Viewtab.
We want to create the Correlation Type based upon the element/attribute that we promoted in each of the two schemas.
We then need to create a Correlation Set, which is instantiated by the initial Receive Shape, that is based upon the Correlation Type that was just created. Correlation Sets are also configured in the Orchestration View tab.




In the Rename Work File Expression shape we are going to call a .Net Helper class that will aid in renaming work file in the source folder. This step is a critical step in the process. Since the work file is completely written before the sig file is, we cannot use the original file extension in the Receive Location file mask. Otherwise, this would prompt BizTalk to consume the work file prior to us wanting it to be consumed. By appending a temporary extension, like .BIZ, to the end of the work file name, we can be assured that BizTalk will pick the file up when we want it to. So in the Receive Location we will use a *.BIZ extension instead of a *.XML.


Note that I have hard coded the path of the source location. Since this is a POC, this is ok but this would not be a suitable solution for a production environment




In order for BizTalk to "wait" for the work file to be picked up, we need to set the Following Correlation Sets property with the same Correlation Set that we initialized in the first Receive shape. Since the rename operation occurs the step before, the "wait" time will be extremely small. The main point here is that we want to control when BizTalk picks up the work file. It is essentially the rename operation and the second receive shape/location that controls this.
Now that you have consumed both the signal file and work file, in order, you can finish up any processing that is required by your business requirements. Since this is just a POC, I output the work file to a folder.




Testing
In order to simulate how an upstream system would write the files, I drop a file into the work folder with the original extension. I then drop a sig file into the sig folder. The sig file will get picked up and BizTalk, via the .Net Helper, will rename the work file. The receive location, for work files, will then pick up the work file and the orchestration will finish processing both files.




Conclusion
Through the use of a Messaging Pattern and with the help of a .Net Helper class we can process a set of files in a known order.

Kent Weare's BizTalk Blog: Processing Inter-dependent files using a Non-Uniform Sequential Convoy

No comments:

Post a Comment

FEEDJIT Live Traffic Map