NetKernel News Volume 6 Issue 5 - On Reliable Messaging - Part 1, RADAR ROC and the Power of Pi

April 17th 2015

Catch up on last week's news here, or see full volume index.

Repository Updates

NOTE: As announced previously, Java 6 is now end-of-life and, since January 2015, all modules are built for and target Java 7. Do not attempt to install updates if you are still running Java 6.

The following updates are available in the NKSE and NKEE repositories...

  • wink-1.23.1
    • Enhancement to provide support for , and tags which when present create well formed tags in the page header - enhancing SEO indexability.

Tony's Blog: RADAR, ROC and the power of Pi

As I hinted last time, Tony's push to create the world's most comprehensive IoT DeathStar shows no limits. This week he's written up a detailed account of how, with a Raspberry Pi and NetKernel, he's built a RADAR traffic monitoring system for his PoleStar stack, complete with a live mobile-friendly speed gun...

I can't comment on the rumours that he's now working on an accompanying gigawatt impulse infrared laser to blow out the tyres of the speeders, but it's not a coincidence that we call his home automation system "The DeathStar".

On Reliable Messaging - Part 1

As I discussed last time, being able to reliably deliver a message from one system to another is a core requirement of many enterprise architectures.

The fact that it is important means there are tons of expensive products offering many varied solutions.

There are "standard" APIs such as JMS that attempt to abstract the details - but the APIs are still very complex and, in any real-world solution, each technology ends up needing customised code and so becomes a very rigid, fragile, hard-wired piece of the architectural landscape.

A big issue with the complex monolithic messaging systems we use is that we don't actually truly know what's going on - they are opaque and unwieldy and we just have to "trust them".

When you're trying to build a provably secure trusted system (such as the Software Fortress pattern we discussed last time), then "not knowing what's going on" is not the best recipe for trust.

Nor, from a system engineering perspective, is it a good position if you need to understand the latency or scaling potential of your solution. Which is why you hear endless horror stories about messaging bottlenecks being the limiting factor in system performance.

Now of course, just like any other Java middleware container, NetKernel is used with all the common messaging systems you might already have in production. But as I speculated last time, if we know that ROC has a dramatic effect on the interactions in a simple 3-party trust relationship, what could we learn if we applied ROC to the problem of reliable messaging?

It feels like instead of "trusting monoliths" we ought to be able to create some light-weight messaging resources from which we can build evolvable and scalable composite reliable messaging architectures...

What is a message? And what is reliability?

Everyone knows what a message is, don't they? It's information that a first party wants a second party to receive.

But it's not so easy to define reliability.

In an ideal system (i.e. one that cannot fail), we can say that a message is reliably delivered when the first party receives information that proves that the second party has received the message.

In legend, Pheidippides ran the first marathon (26 and a bit miles) to deliver a message from Marathon to Athens announcing victory over the Persians.

But on arrival he died!

The Greeks were not using a reliable messaging system.

To be reliable he'd have had to stay alive and run back with another message saying "Cool! The wine's open, let's party".

Simple Messaging

In an ideal world, reliable messaging would be very simple. As shown in the diagram below...

A message would be sent from the sender (S) to the recipient (R), who would return an acknowledgement (perhaps using a checksum generated from the message). When the sender gets the receipt, they know that the message has been delivered.
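The ideal exchange can be sketched in a few lines of plain Python. This is only an illustration of the push-and-acknowledge idea, not NetKernel code: the `Recipient` class, the `send` function and the choice of SHA-256 as the checksum are all assumptions made for the example.

```python
import hashlib


def checksum(message: bytes) -> str:
    """Receipt that both parties derive from the message (SHA-256 is an assumption)."""
    return hashlib.sha256(message).hexdigest()


class Recipient:
    """Hypothetical recipient R that stores whatever it is given."""

    def __init__(self):
        self.received = []

    def deliver(self, message: bytes) -> str:
        # R keeps the message and acknowledges with a checksum of it.
        self.received.append(message)
        return checksum(message)


def send(recipient: Recipient, message: bytes) -> bool:
    # S pushes the message, then compares the receipt with its own checksum.
    ack = recipient.deliver(message)
    return ack == checksum(message)


r = Recipient()
delivered = send(r, b"Victory at Marathon")
print(delivered)
```

Note that both sides must agree on the checksum algorithm for the receipt to mean anything, which is exactly where failure modes 4 and 5 below come from.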

Unfortunately the real world is not perfect like the ideal case. Things can and will go wrong. Here are some of the ways this system can fail...

  1. The network fails as the message is going from S to R
  2. R is overloaded and does not have the capacity to receive the message
  3. R sends an ack but the network fails before it gets to S
  4. S receives the ack but R has used a different algorithm, or has a bug in the code, that calculates the receipt.
  5. R sends a correct ack but S has a bug in its ack verification algorithm/code.
  6. R crashes (either due to software or hardware) mid-message.
  7. S crashes mid-message.
  8. S and/or R is destroyed in a data-center catastrophe.

When we consider what happens to recover from these failures we begin to see that we need to further assess what we mean by reliability.

For example...

  • If R is overloaded, how long does S wait?
  • If R's ack is lost, how long does S wait?
  • If S attempts to recover does it try to resend? What if R has already accepted the message and processed it? What are the consequences of sending the message multiple times?
  • What if R thinks it's sent the correct ack - but it's wrong (by bug or algorithm)? Does S tell R and if so how? If S does tell R then the error message can fail in all the ways the original message failed!
  • What if S crashes before R's ack is delivered? Does R send a new ack? But that ack is then a new message (S and R are inverted) and can fail in all the ways the original message failed.
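The resend question can be made concrete with a small simulation. It is a sketch under stated assumptions: `FlakyNetwork`, `LostAck` and `send_with_retry` are hypothetical names, and the "network" simply drops the first acknowledgement to mimic failure mode 3.

```python
import hashlib


class LostAck(Exception):
    """Stands in for failure mode 3: the ack never reaches S."""


class Recipient:
    def __init__(self):
        self.processed = []

    def deliver(self, message: bytes) -> str:
        # R processes the message as soon as it arrives.
        self.processed.append(message)
        return hashlib.sha256(message).hexdigest()


class FlakyNetwork:
    """Delivers every message but drops the first `drop_acks` acknowledgements."""

    def __init__(self, recipient, drop_acks):
        self.recipient = recipient
        self.drop_acks = drop_acks

    def push(self, message: bytes) -> str:
        ack = self.recipient.deliver(message)  # R has already processed it
        if self.drop_acks > 0:
            self.drop_acks -= 1
            raise LostAck()
        return ack


def send_with_retry(network, message, attempts=3):
    expected = hashlib.sha256(message).hexdigest()
    for attempt in range(1, attempts + 1):
        try:
            return attempt, network.push(message) == expected
        except LostAck:
            continue  # S cannot tell a lost message from a lost ack
    return attempts, False


r = Recipient()
attempts_used, ok = send_with_retry(FlakyNetwork(r, drop_acks=1), b"pay 100")
print(attempts_used, ok, len(r.processed))
```

Here S eventually gets a valid ack, yet R has processed the message twice. For a message like "pay 100" that is precisely the consequence the question above is worried about.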

We quickly see that our first definition of reliability is not sufficient.

Now let's add another layer of complexity. How do we introduce trust to this system? How can we assert that it satisfies security requirements?

Things go from complex to insanely complex very quickly indeed. It's no wonder we just buy black box messaging products and fall back to the default "trust the monolith" position.

But does it really have to be this complex? Doesn't ROC allow us to think differently about the problem?

When is a Message more than a Message? When it's a Resource.

I think there is a deep flaw in our naive definition of a message: Information that a first party wants a second party to receive. To understand why, we need to get back to ROC first principles...

With ROC we have a richer way of defining a message...

A message is a resource. A message is said to be delivered to a receiver when the receiver is able to reify the message resource state independent of the sender's context.

Don't be mistaken - I haven't just substituted the word "resource" for "information". When I say resource I mean by Axiom 1 of ROC: A resource is abstract information.

Axiom 2 states that a resource has any number of identifiers. So a message resource must be identifiable - so, in ROC, a message should have one (or more) identities.

For a message to be delivered the resource must be reifiable in the receiver's context independent from the context of the sender - we'll see why I carefully stated this in the definition in a moment - however, the short explanation is that a receiver can only be said to have received the message if the message has no contextual coupling with the sender's request scope.

ROC also gives a way of defining reliability...

A messaging system is said to be reliable if there exists no ambiguity between the resource state of the sender and receiver. We can get really formal: a messaging system is reliable when the complement of the intersection of the sender's and receiver's sets of message resources, taken relative to their union (that is, their symmetric difference), is the empty set.
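One way to read the formal statement is as a comparison of two sets of message identities. The sketch below assumes that reading: "reliable" means no message resource is known to one side but not the other, which is the symmetric difference of the two sets being empty. The function name and the `msg:` identifiers are made up for illustration.

```python
def is_reliable(sender_view: set, receiver_view: set) -> bool:
    """True when neither side holds a message resource the other lacks."""
    # `^` is Python's symmetric difference: items in exactly one of the sets.
    return not (sender_view ^ receiver_view)


# Both sides agree on which message resources have been exchanged.
agreed = is_reliable({"msg:1", "msg:2"}, {"msg:1", "msg:2"})

# S believes msg:2 was delivered, but R was never able to reify it.
ambiguous = is_reliable({"msg:1", "msg:2"}, {"msg:1"})

print(agreed, ambiguous)
```

Any identity left over in the symmetric difference is a message whose delivery state the two parties disagree about.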

With these definitions we can see why the simple message passing approach discussed above is so limited.

  1. The message has no identity.
  2. The state of the receiver and sender is ambiguous.
  3. In the event of failure the sender and receiver are not able to validate that the message state is reifiable to the receiver.
  4. The existence of the resource state at the receiver is not itself a resource but is instead a representation of a derived acknowledgement.

For the moment let's leave the formal stuff and return to it in the context of a simple ROC messaging solution...

An Elementary ROC Message Pattern

What follows is a discussion of a pattern that I am pretty sure can be composed into higher-order messaging architectures. It took less than an hour to implement, far less time than it's taken to write about it! I guess it took me a couple of days to think through the problem - the reason being that this pattern instantly jumped out at me but then I had to justify it to myself and work out all that stuff I just said above!!

The figure below depicts an elementary ROC message pattern...

The system comprises two address spaces: a Send space (S) and a Receive space (R). We need to send a message from S to R. That is, we want a message resource in S to be reifiable in R without the need of the scope of S.

For the implementation (provided below) we have Send and Receive endpoints in the S and R spaces respectively. Don't confuse this with the simple messaging above - we have called them Send and Receive to keep things clear, but they are neither sending nor receiving in the classical sense!

Here's how it works, step by step...

0. A message needs to be sent from S, so we request active:send with the message state as the operand argument. This request is resolved by the Send endpoint.

1. The Send endpoint sources the message (operand) and obtains a representation of the message state. The Send endpoint constructs a NEW request to active:message with the message state as the primary argument. The Message endpoint receives the NEW request, constructs a unique identifier for the NEW resource and persists the state of the message such that it can be SOURCEd by its new identifier. Finally, before returning the identity of the NEW message resource, it calculates a unique ACK GUID and stores this with the message.

1b. The Message endpoint returns the identity of the NEW resource to the Send endpoint. Also returned in a header is the ACK GUID. The Send endpoint now has an identity for the message resource and the ACK GUID.

2. The Send endpoint issues a SOURCE request for active:inbox and supplies the message identity as an argument of the request. The Receive endpoint in R resolves the request. Note the Send endpoint does not send the message representation nor does it share the ACK GUID identifier! The Send endpoint simply tells the Receive endpoint that there is a message for it when it is ready.

3. The Receive endpoint SOURCEs the message argument, which is the identity of the message created in the NEW request. The SOURCE request is resolved in the superstack to S's scope and is received by the Message endpoint.

3b. The Message endpoint returns the message state and also the ACK GUID. The Receive endpoint now has its own representation of the state of the message resource. It may now optionally choose to SINK it in its own message resource set or otherwise "classically process the message". Either way, by our definition, the message resource is reifiable in R without the need of the scope of S.

2b. To complete the message delivery the ACK GUID is returned as the response to the active:inbox request. The Send endpoint compares its copy of the ACK GUID (obtained with the NEW request) to the value from the active:inbox SOURCE request. The message is delivered if the ACK GUIDs are identical. If the message delivery is successful the Send endpoint issues a DELETE request for the message resource. The Message endpoint receives the DELETE request and cleans up the persisted message state, the ACK GUID etc. The Send endpoint returns the ACK GUID to the original message requestor which, if both sides retain a copy of this, serves as a mutual proof of delivery.
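The whole sequence can be mimicked outside NetKernel to check that the steps hang together. The following is a plain-Python simulation, not the demo module: the three classes stand in for the active:message, active:inbox and active:send endpoints, method calls stand in for NEW, SOURCE and DELETE requests, and all names are assumptions made for the sketch.

```python
import uuid


class MessageEndpoint:
    """Holds message state in S; stands in for active:message."""

    def __init__(self):
        self.store = {}

    def new(self, state):
        # NEW: persist the state under a fresh identity and mint an ACK GUID.
        identity = "message:" + uuid.uuid4().hex
        ack_guid = uuid.uuid4().hex
        self.store[identity] = (state, ack_guid)
        return identity, ack_guid

    def source(self, identity):
        # SOURCE: return the state together with the ACK GUID.
        return self.store[identity]

    def delete(self, identity):
        # DELETE: clean up the persisted state and the ACK GUID.
        del self.store[identity]


class ReceiveEndpoint:
    """Lives in R; stands in for active:inbox."""

    def __init__(self, message_endpoint):
        self._messages = message_endpoint
        self.inbox = []

    def inbox_request(self, identity):
        # Steps 3 and 3b: pull the state by identity and keep a local copy.
        state, ack_guid = self._messages.source(identity)
        self.inbox.append(state)
        # Step 2b: the ACK GUID is the response to the inbox request.
        return ack_guid


class SendEndpoint:
    """Lives in S; stands in for active:send."""

    def __init__(self, message_endpoint, receive_endpoint):
        self._messages = message_endpoint
        self._receiver = receive_endpoint

    def send(self, state):
        identity, ack_guid = self._messages.new(state)      # steps 1 and 1b
        returned = self._receiver.inbox_request(identity)   # step 2: identity only
        if returned != ack_guid:
            raise RuntimeError("ACK GUID mismatch: delivery not proven")
        self._messages.delete(identity)                     # step 2b clean-up
        return ack_guid                                     # proof of delivery


messages = MessageEndpoint()
receiver = ReceiveEndpoint(messages)
sender = SendEndpoint(messages, receiver)
proof = sender.send({"body": "The wine's open"})
print(receiver.inbox, len(messages.store))
```

The Send endpoint never hands the message state or the ACK GUID to the receiver directly. The receiver can only return the right ACK GUID if it has actually SOURCEd the message.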

Demo Implementation

You can play with the solution I implemented for yourself, here's the module...

To run the demo, go to the XUnit tests and find the reliable messaging test here. You'll see some chat in the stdout illustrating the sequence discussed above.

Looking at the implementation, the first thing you'll notice is that it's only a single module containing local copies of the S and R address spaces. That's deliberate - I wanted to develop the pattern first before I implement the distributed networking etc. You'll also notice that I've not worried too much about the message representation - it's a resource so it can have many representations - but when in doubt, or when you don't know yet, use a tree! So I used HDS2 as the representation.

I'm still in two minds about how we should best implement the message delivery to the receiver in step 3b - you can see that I actually incorporate the ACK GUID into the HDS structure. I could equally have returned it in a response header. However my feeling is that the ACK GUID is useful metadata augmenting the message. The metadata doesn't change the core message resource state but, by being incorporated, it still gets to tag along. Additionally, by embedding it, there can be no doubt that the received message structure has been successfully delivered, since the Receiver has to extract it from the message before it can return it as the response to the active:inbox request.
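The two options can be compared side by side in a toy form. In this sketch a nested Python dict stands in for the HDS2 tree, and the node names (`message`, `body`, `ack`) and the `headers` dict are hypothetical, chosen only to show where the ACK GUID travels in each case.

```python
import uuid


def wrap(state: dict, ack_guid: str) -> dict:
    # Option A: the ACK GUID travels inside the tree, alongside the body.
    return {"message": {"body": state, "ack": ack_guid}}


def receive_embedded(tree: dict):
    # R must walk the delivered structure to find the ACK GUID.
    message = tree["message"]
    return message["body"], message["ack"]


def receive_header(state: dict, headers: dict):
    # Option B: the ACK GUID arrives out of band in a response header.
    return state, headers["ack"]


guid = uuid.uuid4().hex
body = {"text": "hello"}
body_a, ack_a = receive_embedded(wrap(body, guid))
body_b, ack_b = receive_header(body, {"ack": guid})
print(ack_a == ack_b)
```

Both routes yield the same ACK GUID. The difference is that in option A the receiver cannot obtain it without having the delivered structure in hand.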

Finally you will notice that there is no error handling, recovery etc etc!! That's because I've not thought about it yet! However that doesn't mean we can't already show that we're handling some of the failure modes of the simple messaging discussion...

Progress So Far

This implementation is not the final solution, but we're making progress. In fact, of the eight failure modes of the simple messaging, we've already eliminated three: receiver overload (2) and the two ack algorithm/code failures (4 and 5).

Firstly, what we've done is invert the simple message passing approach from push state transfer to pull. That is, the Send endpoint does not attempt to push the message state to the receive side - it tells the receiver where the message is so that it can pull it when it is ready for it.

It is very unlikely that we will overload the receiver by just sending it the identity of a message. But, as we will see later, even if we do, the push-pull inversion means we can now scale out the receive side without any difficulty.
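The effect of the inversion on load can be sketched with a receiver that only queues identities and fetches state at its own pace. This is a variation for illustration, not what the demo module does: the `Receiver` class, its `notify` and `pull` methods, and the dict used as a message store are all assumptions.

```python
from collections import deque


class Receiver:
    def __init__(self, fetch):
        self._fetch = fetch      # callable that resolves an identity to state
        self.pending = deque()   # identities only, so cheap to accept
        self.processed = []

    def notify(self, identity):
        # Accepting a notification costs one small identifier, not the payload.
        self.pending.append(identity)

    def pull(self, limit):
        # The receiver decides how many messages to fetch, and when.
        pulled = 0
        while self.pending and pulled < limit:
            self.processed.append(self._fetch(self.pending.popleft()))
            pulled += 1
        return pulled


store = {"message:%d" % i: "payload %d" % i for i in range(5)}
r = Receiver(store.__getitem__)
for identity in store:
    r.notify(identity)
first_batch = r.pull(limit=2)
print(first_batch, len(r.pending))
```

Because the payload stays with the message resource until it is pulled, a burst of notifications fills a queue of small identifiers rather than the receiver's processing capacity.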

I'm playing with a Large Hadron Collider metaphor for this pattern - since it feels like the message identity goes clockwise, while the ACK GUID identity goes anti-clockwise. The two ACK GUID particles collide in the Send endpoint! While the message resource "collides" in the Receive endpoint!

My point with this stretched metaphor is that the coding/algorithm failures of the simple message story are completely removed - the ACK GUID is simply an identity - nobody except the Message endpoint infers any semantics from it. By circulating the ACK GUID around the collider and doing a simple correlation of the two copies of the ACK GUID we have sufficient non-repudiable proof that the message was received.

More significantly, by thinking of the message as a resource, I sense that we will be able to cleanly deal with recovery from the critical failure modes. As food for thought, nothing we've done so far says that the Message state needs to be co-located in S with the Send endpoint. The pattern will work exactly the same, with no change to either Send or Receive, if we decide to introduce a redundant message broker to reliably manage the message state!
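The claim that Send and Receive need not change can be illustrated by writing them against nothing more than new/source/delete operations and then swapping what sits behind those operations. This is a toy sketch: `LocalMessages` and `RedundantBroker` are hypothetical, and "redundancy" here is just mirroring into in-memory dicts.

```python
import uuid


class LocalMessages:
    """Message state co-located with the Send side."""

    def __init__(self):
        self.store = {}

    def new(self, state):
        identity, ack = "message:" + uuid.uuid4().hex, uuid.uuid4().hex
        self.store[identity] = (state, ack)
        return identity, ack

    def source(self, identity):
        return self.store[identity]

    def delete(self, identity):
        self.store.pop(identity, None)


class RedundantBroker:
    """Hypothetical broker that mirrors every message across replicas."""

    def __init__(self, replicas=2):
        self.replicas = [dict() for _ in range(replicas)]

    def new(self, state):
        identity, ack = "message:" + uuid.uuid4().hex, uuid.uuid4().hex
        for replica in self.replicas:
            replica[identity] = (state, ack)
        return identity, ack

    def source(self, identity):
        # Any replica that still holds the message can answer.
        for replica in self.replicas:
            if identity in replica:
                return replica[identity]
        raise KeyError(identity)

    def delete(self, identity):
        for replica in self.replicas:
            replica.pop(identity, None)


def receive(messages, identity, inbox):
    state, ack = messages.source(identity)
    inbox.append(state)
    return ack


def send(messages, state, inbox):
    identity, ack = messages.new(state)
    delivered = receive(messages, identity, inbox) == ack
    if delivered:
        messages.delete(identity)
    return delivered


inbox_a, inbox_b = [], []
ok_local = send(LocalMessages(), "hello", inbox_a)
ok_broker = send(RedundantBroker(), "hello", inbox_b)
print(ok_local, ok_broker)
```

The `send` and `receive` functions are identical in both runs. Only the object that resolves the message identity differs, which is the property the paragraph above points to.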

Next time we'll see what happens when we distribute the pattern and add some error handling...

Thanks to Richard Smith, City of London banking financial messaging expert, for sanity checking this!

Have a great weekend!


Please feel free to comment on the NetKernel Forum

Follow on Twitter:

@pjr1060 for day-to-day NK/ROC updates
@netkernel for announcements
@tab1060 for the hard-core stuff

NetKernel, ROC, Resource Oriented Computing are registered trademarks of 1060 Research

© 2008-2011, 1060 Research Limited