Wednesday, October 24, 2012

UltraESB – Notes for beginners

 

UltraESB is an open source ESB (Enterprise Service Bus) developed by AdroitLogic. It claims to have a very small memory footprint and extremely fast performance.

I am evaluating this tool for a future product that I am planning to build. I will share my feedback over the next few months. I would also love to get your feedback on this and any other good ESBs that you are aware of.

Installing the software was extremely easy. Two versions are available for download: full and minimal. I installed both. At first glance I didn’t see much difference in capabilities as far as the ESB itself is concerned. The minimal install does not include the “toolbox”, “uconsole” and “uterm” tools. You will find these tools useful during development, so I would suggest downloading the full version for development or evaluation purposes.

AdroitLogic has also created some nice documentation. The Quickstart guide in this documentation provides a sample for beginners. If you are on Windows 7, you may have difficulty running this sample. To be specific, you may get a timeout error in the response panel of the toolbox.


If that happens, most probably another process is already using the port specified in the Jetty configuration. You can change it to another port, change the corresponding port configuration in ultra-dynamic.xml and retry the example.

If you are curious about which process is using the port, you can use the TCPView tool available from the Microsoft Sysinternals web site. In my case I found an “aeagent.exe” process using port 9000, which I initially wanted to use.
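Before editing the configuration, it can help to confirm that a candidate port really is taken. Here is a minimal sketch, assuming the service listens on the local loopback address; the function name `port_in_use` is my own and is not part of UltraESB or Jetty. Unlike TCPView, it only tells you whether something is listening, not which process owns the port.

```python
import socket

def port_in_use(port, host="127.0.0.1"):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise
        return s.connect_ex((host, port)) == 0

if __name__ == "__main__":
    # 9000 is the port mentioned in this post; try others the same way
    print(9000, "in use" if port_in_use(9000) else "free")
```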

More later. Do write to me with your feedback and suggestions.

JConsole – not able to connect to java processes on Windows 7 – solution

Today I was trying to connect JConsole from Java 1.6.0_27 to an UltraESB process. I could see the process IDs of local Java processes. However, there was no process description or path information. The Connect button was grayed out for all local Java processes except JConsole itself.

After wasting 45 minutes restarting all processes and making sure that the Java versions were in sync between JConsole and all processes to be monitored, I fortunately found a useful blog post which explained that this is due to a file name quirk. Apparently, every Java process adds an entry about itself in a folder under the TEMP folder. This folder is named hsperfdata_Username.

Now here is the catch. JConsole wants the Username part of this folder name to be exactly the same as the process owner name shown in the Windows 7 Task Manager. On my machine the Java processes were running as user DGoyal, but the name of the “hsperfdata*” folder was hsperfdata_dgoyal. I stopped all Java processes, killed the agent*32.exe process and then renamed this folder to hsperfdata_DGoyal. I then started all processes and JConsole again. Now JConsole was able to connect to the Java processes.

One more note: on Windows 7 this temp folder is set under your profile. The default is C:\Users\<username>\AppData\Local\Temp. To change it, you should set both the TEMP and TMP environment variables to your desired location.
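To spot this kind of mismatch quickly, something like the following sketch may help. It lists the temp folder and reports any hsperfdata_* folder whose name matches the user name only when case is ignored. The function name is hypothetical, and you need to pass the user name exactly as Task Manager shows it (for example "DGoyal").

```python
import os
import tempfile

def find_hsperfdata_mismatches(username, temp_dir=None):
    """Return hsperfdata_* folder names that differ from the expected name only by case."""
    if temp_dir is None:
        temp_dir = tempfile.gettempdir()  # follows TMPDIR/TEMP/TMP when set
    expected = "hsperfdata_" + username
    mismatches = []
    for name in os.listdir(temp_dir):
        # same letters, different case: the situation described above
        if name.lower() == expected.lower() and name != expected:
            mismatches.append(name)
    return mismatches

# Example (user name typed exactly as shown in Task Manager):
# print(find_hsperfdata_mismatches("DGoyal"))
```

An empty list means the folder name already matches, or no such folder exists yet.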

Thursday, October 4, 2012

Data exchange format for enterprise application integration – XML, EDI and JSON?

The design of message exchange between modules of an application and across applications is often an afterthought. In my experience, this results in severe limitations on the functionality, performance and scalability of the applications. In this note I will discuss the relative merits of formatting data exchange as XML, JSON and EDI-like flat file formats.

The choice of data format depends on multiple factors. Among the most important of these are the size and structure of the message.

Consider a relational database for orders. It will have multiple tables related to the order header and order lines. It will also have tables for items, suppliers, customers, addresses and currencies that are referenced from the order header and order lines.

Technically it is possible to design a structure, in all three formats, that allows you to send the complete database in a single huge message.

The size of a message depends on its content, and it affects communication throughput, processing throughput, error handling and the disk and memory resources required.

Creating such a large message is cumbersome but not very difficult. The real issues arise when a recipient attempts to receive, log, parse and consume the message.

Let us consider the issues related to the size and structure of a message in more detail.

Communication time: Messages are exchanged serially. Considering network overheads, it may take several seconds for a large message to travel from source to destination, and this time increases with message size. Any intermittent disruption may require the source to resend the message to the destination. (Network compression can be used to partially mitigate this issue.)

Logging: If traceability, retransmission and non-repudiation are required, both source and destination systems will need to log the messages. Writing and reading large messages further reduces throughput and also requires significant disk space.

Parsing: It is important to consider this aspect of integration. If your application needs to understand the complete message before it can take any action, it will need to parse the complete message and create an object in memory. Large messages require more memory and time to parse. Standard DOM (Document Object Model) parsing of XML requires significant memory, often several times the size of the message itself. Parsing a JSON message typically requires less memory than DOM parsing of the equivalent XML. One reason for needing to parse the complete message is a message structure that does not enforce a specific order in which elements can appear. You can partially mitigate the need to parse the complete message by enforcing such an order.
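To make the difference between loading a complete message and reading it sequentially more concrete, here is a minimal sketch using the `iterparse` function from Python's standard `xml.etree.ElementTree` module. The element names (LINE, ITEM, QTY) are taken from the order sample shown later in this post; the function name `iter_order_lines` is my own.

```python
import io
import xml.etree.ElementTree as ET

def iter_order_lines(xml_stream):
    """Yield (item, qty) for each <LINE> as soon as its end tag has been read."""
    for event, elem in ET.iterparse(xml_stream, events=("end",)):
        if elem.tag == "LINE":
            yield elem.findtext("ITEM"), elem.findtext("QTY")
            # Drop the children and text of the processed element so they
            # can be freed; the emptied element stays attached to its parent.
            elem.clear()

sample = (b"<ORDERS><ORDER><LINES>"
          b"<LINE><ITEM>Pen</ITEM><QTY>10</QTY></LINE>"
          b"<LINE><ITEM>Tablet</ITEM><QTY>05</QTY></LINE>"
          b"</LINES></ORDER></ORDERS>")

for item, qty in iter_order_lines(io.BytesIO(sample)):
    print(item, qty)
```

A DOM-style parse (`ET.parse`) would build the whole tree before you could look at the first line; the incremental version handles each order line as it arrives.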

Exception handling: As stated, you can transmit the content of a complete database (multiple orders, in the example above) in a single message. If your receiving application handles each message as a single transaction, then a single failure will require reprocessing of the complete message.
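One way to limit the damage of a single failure, sketched here under the assumption that the orders in a message are independent of each other, is to treat each order as its own unit of work and collect the failures for a later retry. The names `process_orders` and `handle_order` are hypothetical placeholders for whatever your receiving application does.

```python
def process_orders(orders, handle_order):
    """Handle each order separately; return the (order, error) pairs that failed."""
    failed = []
    for order in orders:
        try:
            handle_order(order)  # e.g. one database transaction per order
        except Exception as exc:
            # Keep going: only this order needs to be reprocessed later
            failed.append((order, exc))
    return failed
```

With this approach a message containing 1,000 orders and one bad order leaves you with one order to retry rather than the whole message.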

Let us now evaluate the relative merits of the three formats: XML, JSON and EDI/TEXT.

XML is a well-established and popular standard. It is self-describing, open and extensible. Extensibility allows you to add elements to a message with limited programming impact. Out-of-the-box parsing libraries exist in almost all popular languages. XML by itself does not enforce a sequence in which elements must appear (a schema such as XSD can, but only if both sides validate against it). This becomes a major weakness, as the receiving application will need to parse the complete message to find the elements that it is interested in. Often this may require multiple passes through the message. One option is to parse the complete message at once into a structure (the Document Object Model, or DOM) that can be used later to find the elements of interest. However, DOM parsing requires memory and is overkill if you only need a few elements from the message. The self-describing structure of XML is another of its major weaknesses. XML uses start and end tags around each data element. While these descriptive tags are useful for a human reader, they add to the overall size of the message.

JSON is an up-and-coming message format. It is also open, extensible and self-describing. As the format is derived from JavaScript object literal notation and maps directly onto in-memory structures such as maps and lists, parsing and loading it into memory typically takes significantly less time than parsing the equivalent XML. However, as JSON also does not enforce a sequence for the members of an object, you still need to load the complete message. JSON does not require an end tag. Hence a JSON message is typically 30% to 40% smaller than the corresponding XML message. Parsing libraries exist for Java, JavaScript and most other popular languages.

I am using the term “EDI-like flat file format” for a generic, variable-length, text-based message structure where each line in the message corresponds to an individual record of data. The first few characters of each line, called the record identifier, identify the record type. Elements within each line (record) are separated by an element separator. This message format is the most compact and can be 60% to 70% smaller than the corresponding XML. The format is obviously not self-describing. Hence the source and recipients need to share a previously agreed definition of the message structure. Parsing libraries exist for standard EDI. However, one may need to write custom parsers for custom message formats. As formats are mutually agreed between senders and receivers, record and element sequences are usually enforced as part of that agreement. This guaranteed sequencing allows extremely large messages to be parsed sequentially, using limited system resources, at significantly higher throughput. As messages are not self-describing, programmers need to be careful before changing formats.
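A custom parser for such a format can be very small. The sketch below assumes the conventions of the sample in this post: every line starts with the separator "*", followed by a one-letter record identifier (B, H, L, E) and the elements. The function name `parse_records` is my own, and a real format would also need an agreed way to escape the separator inside data.

```python
def parse_records(lines, separator="*"):
    """Yield (record_id, fields) for each non-empty line, one line at a time."""
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        parts = line.split(separator)
        # "*L*Pen*10" splits into ["", "L", "Pen", "10"]:
        # parts[0] is empty, parts[1] is the record identifier
        yield parts[1], parts[2:]

message = ["*B*ORD*", "*L*Pen*10", "*L*Laptop*100", "*E*ORD"]
for record_id, fields in parse_records(message):
    print(record_id, fields)
```

Because `lines` can be any iterable, you can pass an open file and the parser will hold only one record in memory at a time.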

I will now show the same data in XML, JSON and EDI-like formats.

XML: size – 341 characters

<ORDERS>
  <ORDER>
    <HEADER>
      <SHIP_TO>Don Trump</SHIP_TO>
      <SHIP_TO_ADDRESS>Atlanta, GA</SHIP_TO_ADDRESS>
      <ORDER_DATE>April 21 2012</ORDER_DATE>
      <PAYMENT_INFO>XYZ BANK</PAYMENT_INFO>
    </HEADER>
    <LINES>
      <LINE>
        <ITEM>Pen</ITEM>
        <QTY>10</QTY>
      </LINE>
      <LINE>
        <ITEM>Tablet</ITEM>
        <QTY>05</QTY>
      </LINE>
      <LINE>
        <ITEM>Laptop</ITEM>
        <QTY>100</QTY>
      </LINE>
    </LINES>
  </ORDER>
</ORDERS>

 

JSON: size – 241 characters

{
  "ORDERS": {
    "ORDER": {
      "HEADER": {
        "SHIP_TO": "Don Trump",
        "SHIP_TO_ADDRESS": "Atlanta, GA",
        "ORDER_DATE": "April 21 2012",
        "PAYMENT_INFO": "XYZ BANK"
      },
      "LINES": {
        "LINE": [
          {
            "ITEM": "Pen",
            "QTY": "10"
          },
          {
            "ITEM": "Tablet",
            "QTY": "05"
          },
          {
            "ITEM": "Laptop",
            "QTY": "100"
          }
        ]
      }
    }
  }
}

 

EDI: size – 88 characters

*B*ORD*
*H*Don Trump*Atlanta, GA*April 21 2012*XYZ Bank
*L*Pen*10
*L*Tablet*05
*L*Laptop*100
*E*ORD
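If you want to repeat this comparison with your own data, a sketch along the following lines may be useful. It builds the same order in all three formats without any indentation and prints the character counts. The helper names are my own, and the exact numbers depend on how whitespace and line breaks are counted, so they may differ somewhat from the sizes quoted above.

```python
import json
import xml.etree.ElementTree as ET

HEADER = {"SHIP_TO": "Don Trump", "SHIP_TO_ADDRESS": "Atlanta, GA",
          "ORDER_DATE": "April 21 2012", "PAYMENT_INFO": "XYZ BANK"}
LINES = [("Pen", "10"), ("Tablet", "05"), ("Laptop", "100")]

def to_xml(header, lines):
    root = ET.Element("ORDERS")
    order = ET.SubElement(root, "ORDER")
    head = ET.SubElement(order, "HEADER")
    for name, value in header.items():
        ET.SubElement(head, name).text = value
    lines_el = ET.SubElement(order, "LINES")
    for item, qty in lines:
        line = ET.SubElement(lines_el, "LINE")
        ET.SubElement(line, "ITEM").text = item
        ET.SubElement(line, "QTY").text = qty
    return ET.tostring(root, encoding="unicode")

def to_json(header, lines):
    doc = {"ORDERS": {"ORDER": {
        "HEADER": header,
        "LINES": {"LINE": [{"ITEM": i, "QTY": q} for i, q in lines]}}}}
    # compact separators: no spaces after "," and ":"
    return json.dumps(doc, separators=(",", ":"))

def to_edi(header, lines):
    rows = ["*B*ORD*", "*H*" + "*".join(header.values())]
    rows += ["*L*{}*{}".format(i, q) for i, q in lines]
    rows.append("*E*ORD")
    return "\n".join(rows)

for name, text in (("XML", to_xml(HEADER, LINES)),
                   ("JSON", to_json(HEADER, LINES)),
                   ("EDI-like", to_edi(HEADER, LINES))):
    print(name, len(text))
```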

 

You must have noticed that the EDI-like format is the most compact. The following table summarizes the pros and cons of the three formats.

 

| Criterion | XML | JSON | EDI-like flat file |
|---|---|---|---|
| Relative size (XML = 100) | 100 | 70 | 25 |
| Can enforce element sequencing | No | No | Yes |
| Standard-based | Yes | Yes | No |
| Availability of parsers | Best | Good (most popular languages) | Limited, custom parsers may be needed |
| Ease of parsing huge messages | Need custom parsers | | Easiest |

Even though the EDI-like format requires more programming, I would recommend using it as much as possible. If another application or module can handle only XML or JSON, you can implement translators to convert your EDI-like format to XML, JSON or any other open standard format.
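Such a translator does not need to be complicated. Here is a minimal sketch that converts the EDI-like order sample from this post into the JSON structure shown earlier. The field names for the H record are assumptions taken from that sample; in practice they would come from the message definition agreed between sender and receiver.

```python
import json

# Assumed element order of the "H" record, per the sample in this post
HEADER_FIELDS = ("SHIP_TO", "SHIP_TO_ADDRESS", "ORDER_DATE", "PAYMENT_INFO")

def edi_to_json(text, separator="*"):
    """Translate one EDI-like order message into a JSON string."""
    order = {"HEADER": {}, "LINES": {"LINE": []}}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        parts = line.split(separator)
        record_id, fields = parts[1], parts[2:]
        if record_id == "H":
            order["HEADER"] = dict(zip(HEADER_FIELDS, fields))
        elif record_id == "L":
            order["LINES"]["LINE"].append({"ITEM": fields[0], "QTY": fields[1]})
        # "B" and "E" records only mark the start and end of the order
    return json.dumps({"ORDERS": {"ORDER": order}})

edi = "*B*ORD*\n*H*Don Trump*Atlanta, GA*April 21 2012*XYZ Bank\n*L*Pen*10\n*E*ORD"
print(edi_to_json(edi))
```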

Note: you can use compression to further reduce the size of your text messages, by as much as another 80% depending on the content. I will write another note on message compression.
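To see what compression does for your own messages, you can measure it with a few lines of code. This is a minimal sketch using Python's standard `zlib` module; the function names are my own, and the saving you get depends heavily on how repetitive the message is.

```python
import zlib

def compress_message(text):
    """Compress a text message; 9 is the highest (slowest) zlib level."""
    return zlib.compress(text.encode("utf-8"), 9)

def decompress_message(data):
    """Restore the original text from compressed bytes."""
    return zlib.decompress(data).decode("utf-8")

message = "*L*Pen*10\n" * 1000  # a deliberately repetitive test message
packed = compress_message(message)
print(len(message.encode("utf-8")), "bytes before,", len(packed), "bytes after")
```

Very short messages may not shrink at all, so it is worth measuring with realistic data before deciding.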