
NetKernel News Volume 2 Issue 44

September 9th 2011

What's new this week?

Repository Updates
Identifier State Duality and ROC Symmetry Breaking

Catch up on last week's news here

Repository Updates

There are no updates in the repos this week. Steady as you are.

Identifier State Duality and ROC Symmetry Breaking

Earlier in this volume of newsletters I wrote a series of articles about ROC and Languages. The third part offered extensive coverage of ROC arguments.

Let's pick this up again and think about the duality and innate symmetry between pass-by-reference and pass-by-value arguments. In the earlier article (I recommend you read it first, otherwise what follows might not make much sense) we learnt that in NetKernel there is fundamentally no special case required for pass-by-value - state is not push-transferred (as with a POST on the web); instead we create a transient space and pass a reference. So in ROC all state is pulled.

There are many reasons why this is a good thing. The most important is that all state is uniformly modelled: it all has an identifier and an associated spatial scope, and it is all therefore potentially reusable. Often we are able to reuse accumulated computational state (inside sub-requests) that would ordinarily look like it ought to be strictly transient - this is sometimes called micro-caching. Another way of saying it is that in ROC we have found a way to micro-cache even when the originating request is the equivalent of REST's POST.

Duality

When we think about any representation state we tend to think of it as concrete and tangible. It's the "value" in pass-by-value, right? Well, actually it is and it isn't. Fundamentally, representation state is always just a set of 1's and 0's - which, depending on your local relative ordering convention, is another identifier (it's a, possibly very large, number). Don't be confused here - this isn't a human-readable URI identifier, but it is still truly an "Identifier" in the ROC sense of being an opaque token for the resource.
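To make the "it's just a number" point concrete, here is a small sketch in Python. It is purely illustrative and has nothing to do with the NKF API; the function names are mine, and "big" is simply the byte ordering convention I happened to pick.

```python
def representation_to_number(representation: bytes) -> int:
    """Read the bits of a representation as one big-endian integer."""
    return int.from_bytes(representation, "big")


def number_to_representation(number: int, length: int) -> bytes:
    """Recover the bytes; the length is needed to restore leading zero bytes."""
    return number.to_bytes(length, "big")


state = "Hello World".encode("utf-8")
token = representation_to_number(state)

# The number and the bytes carry exactly the same information.
assert number_to_representation(token, len(state)) == state
```

Pick the other ordering convention ("little") and the same state gets a different number, which is why the convention matters.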

This all gets a bit weird to take in. But here's a much more tangible sense of identifier/representation duality.

data:text/plain,Hello%20World

(Try clicking the link. If your browser supports data: URIs, as Firefox does, you'll see another, equivalent, representation.)

A data: URI is simultaneously an identifier and a representation of the resource. Even if you're not familiar with data: URIs, you'll definitely have used a REST web interface where the identifier is "value-bearing", i.e. some parts of the path contain the values which will be used as state by the code. We'll talk more about this in a moment...
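If you want to see the duality without a browser, here is a rough Python sketch that "dereferences" a simple data: URI using nothing but the identifier itself. The function name is my own invention, and the sketch deliberately handles only plain, non-base64 payloads.

```python
from urllib.parse import unquote


def dereference_data_uri(identifier):
    """Split a simple, non-base64 data: URI into (media type, text)."""
    scheme, _, rest = identifier.partition(":")
    if scheme.lower() != "data" or "," not in rest:
        raise ValueError("not a data: URI: " + identifier)
    media_type, _, payload = rest.partition(",")
    if media_type.endswith(";base64"):
        raise ValueError("base64 payloads are outside this sketch")
    # RFC 2397 says an omitted media type means text/plain;charset=US-ASCII.
    return media_type or "text/plain;charset=US-ASCII", unquote(payload)


media_type, text = dereference_data_uri("data:text/plain,Hello%20World")
```

No lookup, no network, no resource space: everything needed to produce the representation was already in the identifier.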

Symmetry Breaking

Setting aside the philosophical dilemmas exposed here, ultimately ROC is practical and we use it to do useful things.

When we write an endpoint to do useful computation with state we are implicitly required to break the symmetry (for once I'm not to blame for the physics analogy - this is Tony's expression). So when a software function receives an argument, the argument can be dereferenced to provide a representational value to our local endpoint code.

At the NKF API level we can do something like this...

myrep=context.source("arg:myarg")

The variable myrep now holds a dereferenced representation of the argument "myarg". At the code level we never had to care what the identifier was.

But sometimes this boundary gets blurred. Notably in REST interfaces, in which the identifier is treated as being "value bearing". Take for example a REST path like this...

res:/service/12345/654321

With a grammar like this I can split the identifier into positional argument parts...

<grammar>res:/service/
    
  <group name="part1">
    <regex type="integer" />
  </group>
    /
    
  <group name="part2">
    <regex type="integer" />
  </group>
</grammar>

And now if this is the grammar for an endpoint, in the internals of the endpoint code I can do this...

part1rep=context.getThisRequest().getArgumentValue("part1")
part2rep=context.getThisRequest().getArgumentValue("part2")

But notice that I've now tightly coupled my code to the assumption that this will always be a value-bearing identifier (I've explicitly had to break the symmetry in my code). This is ugly. Why should the resource provided by this endpoint be explicitly coupled to a Web-centric view of identity? What if I wanted to request it with arbitrary pass-by-value arguments too, for example from other internal tools? Furthermore, by using the identifier in this way I get no benefit of type decoupling through transreption. If I want to use the identifier value as an integer, I am responsible for parsing it myself. Ugly tight coupling.
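To show what that coupling looks like outside NetKernel, here is a toy Python endpoint. It is a sketch only: I am assuming the grammar's integer type means "one or more digits", and the endpoint simply adds its two arguments so there is something to compute.

```python
import re

# Assumption: "integer" in the grammar is treated here as one or more digits.
REST_PATH = re.compile(r"res:/service/(?P<part1>\d+)/(?P<part2>\d+)")


def endpoint(identifier):
    """A toy endpoint that adds the two values carried in its identifier."""
    match = REST_PATH.fullmatch(identifier)
    if match is None:
        raise ValueError("identifier does not match the grammar: " + identifier)
    # The endpoint only ever sees text, so it has to do its own parsing.
    part1 = int(match.group("part1"))
    part2 = int(match.group("part2"))
    return part1 + part2


result = endpoint("res:/service/12345/654321")
```

The endpoint knows the shape of the path and the textual encoding of its arguments. Neither has anything to do with the work it is actually doing.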

Fortunately there is a set of tools that allow us to solve this interface pattern and stay cleanly decoupled.

Firstly we can use a normalized active grammar for our resource...

<grammar>
  <active>
    <identifier>active:service</identifier>
    <argument name="part1" />
    <argument name="part2" />
  </active>
</grammar>

In our code we can then do this...

part1rep=context.source("arg:part1", Integer.class)
part2rep=context.source("arg:part2", Integer.class)

With this we are saying, we will always source the argument and we will always allow transreption pipelining to ensure the representation is always a valid Integer object.

So now any internal service can uniformly request the active:service resources and they can be pass-by-reference or by-value and the impedance matching will happen for us.
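Here is a toy Python model of that uniformity. To be clear about the assumptions: ToyContext, ByValue and the "transient:" prefix are all made up for this sketch and are not the NKF API, and calling int() on a value is only a crude stand-in for transreption.

```python
class ByValue:
    """Marks an argument whose state is handed over directly by the requestor."""

    def __init__(self, value):
        self.value = value


class ToyContext:
    """Toy stand-in for a request context; not the NKF API."""

    def __init__(self, space, arguments):
        self.space = dict(space)  # identifier -> representation
        self.arguments = {}       # argument name -> identifier
        for name, argument in arguments.items():
            if isinstance(argument, ByValue):
                # Pass-by-value: park the state in a transient entry, keep a reference.
                transient_identifier = "transient:" + name
                self.space[transient_identifier] = argument.value
                self.arguments[name] = transient_identifier
            else:
                # Pass-by-reference: the argument already is an identifier.
                self.arguments[name] = argument

    def source(self, identifier, representation_type=None):
        """Resolve an identifier (or arg:name) and convert it if a type is requested."""
        if identifier.startswith("arg:"):
            identifier = self.arguments[identifier[len("arg:"):]]
        representation = self.space[identifier]
        if representation_type is not None and not isinstance(representation, representation_type):
            # Crude stand-in for transreption: ask the type to convert the value.
            representation = representation_type(representation)
        return representation


def service(context):
    """The endpoint neither knows nor cares how its arguments arrived."""
    part1 = context.source("arg:part1", int)
    part2 = context.source("arg:part2", int)
    return part1 + part2


by_reference = ToyContext(
    {"res:/first": "12345", "res:/second": 654321},
    {"part1": "res:/first", "part2": "res:/second"},
)
by_value = ToyContext({}, {"part1": ByValue("12345"), "part2": ByValue(654321)})
assert service(by_reference) == service(by_value) == 666666
```

The service code is identical in both cases. Whether the state came as a reference or a value, and whether it arrived as a string or an integer, is dealt with before the endpoint ever sees it.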

But how do we marry this to the outside Web REST path we started the discussion with? Well, we introduce a decoupling, symmetry-breaking mapping with the mapper...

<mapper>
  <config>
    <endpoint>
      <grammar>res:/service/
                
        <group name="part1">
          <regex type="integer" />
        </group>
                /
                
        <group name="part2">
          <regex type="integer" />
        </group>
      </grammar>
      <request>
        <identifier>active:service</identifier>
        <argument name="part1" method="as-string">arg:part1</argument>
        <argument name="part2" method="as-string">arg:part2</argument>
      </request>
    </endpoint>
  </config>
  <space>
    ...import space implementing active:service in here...
  </space>
</mapper>

Here we're translating the request for the REST path identifier to a request to the normalized active:service.

Notice that we are relaying arg:part1 and arg:part2. But instead of the default of relaying the identifiers straight through (which would require active:service to have special code to deal with the value-bearing request argument identifiers), we are using the method="as-string" construct to take the parsed identifier argument value and pass it by-value (as a string) to the service (see the declarative request docs for details).

This mapping is moving the state bearing value from the identifier to a representation in a resource space. Along the way, and for free, the string will automatically be transreptable to an integer for the implementation code.
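In plain Python terms, the translation amounts to something like the sketch below. It is not how the mapper is implemented; map_request and the dictionary layout are just my illustration, and again I am assuming integer means one or more digits.

```python
import re

# Assumption: "integer" in the grammar is treated here as one or more digits.
REST_PATH = re.compile(r"res:/service/(?P<part1>\d+)/(?P<part2>\d+)")


def map_request(identifier):
    """Translate the outer REST path into a normalized request description."""
    match = REST_PATH.fullmatch(identifier)
    if match is None:
        return None  # not ours; some other endpoint may resolve it
    return {
        "identifier": "active:service",
        # The values leave the identifier and travel as by-value strings.
        "arguments": dict(match.groupdict()),
    }


inner_request = map_request("res:/service/12345/654321")
```

All knowledge of the path layout now lives in one place, and the service behind it only ever sees named arguments.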

You might now be worried about the implications for caching. But there is no need to worry. The pass-by-value state is inserted into a transient state space (see the earlier article), and the identifier of the pass-by-value argument is always consistent. Furthermore, the pass-by-reference space will appear equal so long as the value of the state it contains is equivalent in a comparison. This means that, to the cache, the sub-request with PBV state is exactly as cacheable as it was when expressed purely in the identifier. So this symmetry-breaking normalization does not change the cacheability at all.
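A toy model may help here. The Python sketch below is emphatically not NetKernel's cache; ToyCache is a made-up class that only captures the idea that two requests count as the same if their identifier matches and their by-value state compares equal.

```python
class ToyCache:
    """Toy cache keyed on the identifier plus the by-value state."""

    def __init__(self, compute):
        self.compute = compute
        self.entries = {}
        self.misses = 0

    def request(self, identifier, **by_value):
        # Equal state gives an equal key, whatever order the arguments came in.
        key = (identifier, frozenset(by_value.items()))
        if key not in self.entries:
            self.misses += 1
            self.entries[key] = self.compute(**by_value)
        return self.entries[key]


def add(part1, part2):
    """Stand-in for the real work done by the service."""
    return int(part1) + int(part2)


cache = ToyCache(add)
first = cache.request("active:service", part1="12345", part2="654321")
second = cache.request("active:service", part2="654321", part1="12345")
assert first == second and cache.misses == 1
```

The second request is served without recomputation even though its state arrived by value rather than inside the identifier.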

REST uneasy

So this gets us to a slightly controversial conjecture. REST paths are not very good. (There I've said it!). There are a number of problems with them.

Firstly they are positional and so are fragile. Positional interfaces are prone to simple syntax bugs when requests are being constructed. By implication, they are also hard to evolve as a system needs to change.

Secondly, they are not good for expressing optional arguments. Fundamentally you can't. You have to create more than one interface - and then you have the problem of how you impedance match that to code.

A more subtle but even more important problem with REST paths is that they are linear and single dimensional (flat). It's a pretty sure bet that your underlying data model is not single dimensional. You therefore are confronted with the conceptual challenge of mapping your multidimensional data space into a linear mono-dimensional interface. (This is ultimately why you'll pay good money to the growing number of REST architectural consultants).

However, when we chose to implement our normalized service with an active URI we bought ourselves out of this straitjacket.

The active interface is not positionally rigid. It uses named arguments which can be expressed in any order. Having names also means they can be given semantic value to help with understanding and future evolution.

Active arguments can be optional. They can also be constrained to fixed formats by having requirements for minimum and maximum number of occurrences.

More significantly, the active URI is multidimensional: it can express any number of arguments in any order. And each argument itself can be another multidimensional active URI (and so on and so on). There are as many degrees of freedom as you need to map your data model. (No more mind-bending linear coercion to the flat REST path.)
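As a rough illustration of those three properties (named, optional, nestable), here is a toy request model in Python. It says nothing about the real active URI syntax; Request, active, and the identifiers active:report and active:customer are hypothetical names made up for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """Toy model of a request: an identifier plus named arguments."""

    identifier: str
    arguments: frozenset = frozenset()

    def argument(self, name, default=None):
        """Look up an argument by name; absent (optional) ones give the default."""
        return dict(self.arguments).get(name, default)


def active(identifier, **arguments):
    """Build a request whose arguments are named, so their order is irrelevant."""
    return Request(identifier, frozenset(arguments.items()))


# Named arguments: order does not matter.
assert active("active:service", part1="1", part2="2") == active(
    "active:service", part2="2", part1="1"
)

# Optional arguments: simply leave them out.
partial = active("active:service", part1="1")
assert partial.argument("part2", default="0") == "0"

# Nesting: an argument can itself be another request.
report = active(
    "active:report",
    customer=active("active:customer", id="42"),
    format="pdf",
)
assert report.argument("customer").argument("id") == "42"
```

Try expressing that last request as a flat positional path and you quickly have to invent conventions for which slot means what.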

So I can understand that your first-level perspective may well be that you're "creating a REST web interface" and so you need a REST path grammar. But it really, really pays to think of active URI identifiers as your primary addressing model and introduce symmetry-breaking mappings. Your code will be cleaner. Your resource state will be transreptable and impedance matched for free. You will be very unlikely to create syntax bugs when constructing requests. And your services will be completely normalized and reusable with arbitrary transports and, to cap it all, will be easy to evolve.

No Rules

Having just presented a case for this approach, I am now going to frustrate those who want the world to be black-and-white. Guess what? This is just an architectural style. There are no rules.

In fact, I went and reviewed my code style in the NetKernel portal. Almost entirely I use the complete opposite approach: I never pass by value, and all my services use active identifiers where the arguments are primary keys. (Although I never use REST paths internally and always decouple them as discussed above.)

I use arguments as "value-bearing" because my underlying data model is uniform and it is very simple to take the primary key id out of the request and use it to construct a query etc. I never really need to source the state of the representation and can almost always use the identifier argument as a value.
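For what it's worth, that style looks roughly like the Python sketch below. The customer table, its columns and the customer_name function are invented for the example and are not the portal's schema; sqlite3 is used only because it makes the sketch self-contained.

```python
import sqlite3


def customer_name(connection, arguments):
    """Use the request's id argument directly as the primary key of a query."""
    key = int(arguments["id"])  # the identifier argument *is* the value
    row = connection.execute(
        "SELECT name FROM customer WHERE id = ?", (key,)
    ).fetchone()
    return None if row is None else row[0]


# Hypothetical data, just enough to run the sketch.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
connection.execute("INSERT INTO customer (id, name) VALUES (42, 'Ada')")

name = customer_name(connection, {"id": "42"})
```

There is no representation to source here. The argument is only ever a key, which is why treating it as value-bearing is harmless in this case.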

I guess the lesson here is that it fundamentally doesn't matter which approach you choose to break the symmetry. The thing to aim for is to be uniform and consistent across your solution. But if in doubt, the style I outline in the first section will see you come to no harm.

Talk OTUG: NetKernel and the Resource Oriented Computing Revolution - Minneapolis/St Paul, USA 20th September

If you're in the Twin Cities area I'll be giving a presentation to the OTUG group on the 20th September. Details and venue are here.

NetKernel Europe Bootcamp - Brussels, Belgium, Thursday 27th October 2011

Reminder: Java 5 Support - End of Life Heads Up, October 2011

Java 5 support will reach end-of-life in October 2011. Please see the notice for details.

Please let us know if you have concerns or need assistance with planning/testing for this transition.

Have a great weekend,

Comments

Please feel free to comment on the NetKernel Forum

Follow on Twitter:

@pjr1060 for day-to-day NK/ROC updates
@netkernel for announcements
@tab1060 for the hard-core stuff

To subscribe for news and alerts

Join the NetKernel Portal to get news, announcements and extra features.


NetKernel, ROC, Resource Oriented Computing are registered trademarks of 1060 Research