NetKernel News Volume 3 Issue 6
December 2nd 2011
What's new this week?
Catch up on last week's news here
Repository Updates
The following NetKernel 5.1.1 updates are available in the NKSE and NKEE repositories...
- demo-addressbook 1.1.1
- NEW demo CRUD address book featuring RESTOverlay (see below)
- http-server 2.11.1
- Updated with the new RESTOverlay (see below)
- lang-javascript 1.6.1
- Fixed an NPE if context.createResponse() was not called.
- lang-trl 1.2.1
- Fixed escaping of any potential regex substitution patterns in the included resources. Added ability to explicitly terminate recursion with $t{...} syntax to eliminate any possible injection attack from untrusted resource includes. Thanks to Joe Devon for alerting us to this.
- restoverlay-test 1.1.1
- Test suite for the RESTOverlay
- wink 1.20.1
- Updated to use the RESTOverlay (see below)
Tom's Book - Tom's Blog
Here's the latest news from Tom...
The book is going steadily forward. The latest build contains an almost complete Chapter 7 on XRL (the rough edges still need trimming). Can you solve the exercise in that chapter without looking at the solution?
The latest build is available from:
http://www.netkernelbook.org/serving/pdf/practical_netkernel_nk5.pdf.
As always, your feedback and input is highly appreciated. You can send it to tom(dot)geudens(at)hush(dot)ai.
Next to the book I also started a blog which you can find at:
http://practical-netkernel.blogspot.com/
It is meant as a counterweight for Tony's "Durable Scope" blog and Peter's "ROC of ages" explanations in this newsletter. The entry level is "beginner" although I dare say intermediate level ROC'ers will find something useful in there as well.
The blog will have a weekly entry (this week it is about templates) together with the regular newsletter. If you have a NetKernel subject you always wanted to know about but were afraid to ask, let me know at practical(dot)netkernel(at)gmail(dot)com.
Attentive readers will have noticed the book changed name (again). All this is obviously going to lead somewhere... so watch this space!
RESTOverlay Released
Following a couple of weeks of public discussion (thanks for all your input) - the RESTOverlay is ready for prime time and has been shipped in today's update to the http-server package.
I had a small epiphany earlier in the week and realised we could do something very elegant. Any managed REST target endpoint can specify a <wrapperTarget> endpoint id.
The RESTOverlay will invoke the wrapper target first and pass it an operand argument comprising the request to the actual target endpoint. The target endpoint request is only triggered when the wrapper SOURCEs the operand. Therefore the wrapper can perform both a pre-processing phase (before sourcing the operand) and a post-processing phase (after sourcing the operand).
Better still, the RESTOverlay pulls off this trick while ensuring that the spacial scopes are well defined - so the response from the wrapper acquires the same cacheability as the response from the target endpoint.
I have a feeling this is going to be a very powerful pattern.
Also added this week are application-level 404/406 handlers, a global exception handler and endpoint-specific exception handlers.
All the other proposed features are implemented, with a few amendments to support multi-valent matching declarations where appropriate.
Examples/Demos
The feature set of the RESTOverlay is very rich. The reference documentation is reproduced below, but it's one of those tools that's much easier to understand when you see it used for real...
AddressBook
In the repos you'll find a new demo application demo-addressbook. Actually this is a very old application - first seen on NK2 and not touched since the NK3 days. It took an hour to port to NK5 and, more importantly, introduce the RESTOverlay as the front-tier.
It implements a web-application (notice how I kept the "tasteful" beige look-n-feel for all the fans from back in NK2 days)...
It also implements content-negotiation and will switch to being a REST service if the Accept header priority is for text/xml...
Other key features are its use of negotiated compression content-encoding and MD5 Etag generation. These two features alone make this an astonishingly fast application.
The application is deliberately very simple HTML. It doesn't use any AJAX bells and whistles. It also shows off the use of the RDBMS tools to do a basic CRUD application.
You'll need to point the H2 database to a directory and run the installer. So, after install, read the docs here...
http://localhost:1060/book/view/book:addressbook:demo/
restoverlay-test
The repos also contain the restoverlay-test package - which provides detailed examples of all the features of the RESTOverlay.
wink
Finally, this very application, the "WiNK" wiki, has been updated and now uses the RESTOverlay to do its compression and Etag generation. Use Firebug on this page to see that it was compressed (probably with gzip, but it depends on your client preference) and that the Etags are now MD5 hashes (which remain consistent and so stay valid between restarts of NK). So, even though the content is dynamic, you'll mostly be seeing 304 responses from now on.
Reference Documentation
Below is the reference documentation with all the gory details, and after that, towards the end, is a detailed discussion of what we've learned through this process...
Design Overview
Broadly speaking the RESTOverlay is an endpoint routing switch for HTTP REST services/applications.
For a given internal resource request from the HTTPBridge, it will select and invoke a target endpoint based upon a best match with the external HTTP client's expressed preferences and the set of internal target endpoints.
Target endpoints declare their capabilities by specifying user-supplied endpoint metadata. This is ROC metadata and can loosely be considered as a richer generalisation of Java annotations.
Features
The RESTOverlay supports the following capabilities...
- flexible simple grammars
- dynamic content negotiation
- dynamic language negotiation,
- dynamic compression negotiation
- HTTP method routing
- automatic Etag generation
- automatic 404/406 handlers
- automatic exception handling - global or fine-grained
- targeted pre-processes
- lazily evaluated functional wrapping of target endpoints
The RESTOverlay manages the ROC spacial scope so that the caching semantics of the responses of targeted endpoints are conserved.
Architecture
The RESTOverlay is a companion to the HTTPBridge and must be implemented downstream of the HTTPBridge since it interacts with the httpRequest:/ and httpResponse:/ resource sets.
A schematic diagram of where the RESTOverlay sits and how it routes requests to target endpoints is shown below...
Instantiation
The RESTOverlay is instantiated from the prototype found in the urn:org:netkernel:tpt:http space. The broad structure of an instance will look like this...
<rootspace>
  <overlay>
    <prototype>RESTOverlay</prototype>
    <config>
      <basepath>/somepath/</basepath>
    </config>
    <space>
      <!-- Space containing target endpoints with suitable REST metadata -->
    </space>
  </overlay>
  <import>
    <private />
    <uri>urn:org:netkernel:tpt:http</uri>
  </import>
</rootspace>
The space contained within the RESTOverlay is the location where the managed REST endpoints should be present.
Configuration
The RESTOverlay requires a <config> parameter with the following values:
- <basepath> (mandatory): the basepath from which managed REST services will be offset. All paths must start with "/". If the root is to be the basepath then use "/" alone.
- <auto404> (optional): automatically handle 404 responses for any request into the basepath that is not matched. If the tag has a value, this is a user-specified target id for a custom 404 handler (see below).
- <auto406> (optional): automatically handle 406 responses for resources that were found but are not sufficiently acceptable to the client. The tag must contain a user-specified target id for a custom 406 handler (see below).
- <globalExceptionTarget> (optional): the target id of a catch-all exception handler endpoint (see <exceptionTarget> below for interface requirements).
- <strict> (optional): when specified, the RESTOverlay uses strict mode, in which matching of target endpoints must be exact. The default is to be tolerant and find a best-match target endpoint.
Endpoint REST Metadata Declaration
To make an endpoint a potential routing target for the RESTOverlay it must declare suitable metadata. An endpoint declares metadata by providing a <meta> tag on its declaration...
<endpoint>
  <meta> ...User specified metadata... </meta>
  <grammar>foo</grammar> ...etc...
</endpoint>
The RESTOverlay searches for a <rest> metadata tag within the <meta> declaration. For example the following endpoint declares that it wishes to handle GET requests for a REST sub-path with a simple grammar of "hello"...
<endpoint>
  <meta>
    <rest>
      <method>GET</method>
      <simple>hello</simple>
    </rest>
  </meta>
  <grammar>res:/helloImpl</grammar> ...
</endpoint>
Grammars
A target REST endpoint must always provide a <simple> grammar definition. This is a full simple grammar and so it may use the pattern matching and named arguments capabilities of the simple grammar.
The value of the <simple> tag is prefixed with the value of the RESTOverlay <basepath> configuration parameter to construct a logical endpoint.
For example, if the RESTOverlay has a basepath of /foo/ and a target endpoint has a simple grammar of baa, the RESTOverlay will automatically create a resolvable logical REST service with the simple grammar...
res:/foo/baa
If the declared simple grammar has pattern matching fields, these are passed through intact. For example, if you have a basepath of /foo/ and a simple grammar of {country}/{state}/{city}, the RESTOverlay will automatically create a resolvable logical REST service with the simple grammar...
res:/foo/{country}/{state}/{city}
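The prefixing rule can be sketched in a few lines of Python. This is an illustration only and not RESTOverlay code: the function names are hypothetical, and the matcher assumes each {name} field matches exactly one path segment, which is a simplification of what the real simple grammar can do.

```python
import re

def logical_grammar(basepath, simple):
    """Prefix an endpoint's <simple> grammar with the overlay's <basepath>."""
    if not basepath.startswith("/"):
        raise ValueError('basepath must start with "/"')
    # Join the two parts with exactly one "/".
    return "res:" + basepath.rstrip("/") + "/" + simple.lstrip("/")

def grammar_to_regex(grammar):
    """Assumption: each {name} field matches one non-empty path segment."""
    parts = re.split(r"\{(\w+)\}", grammar)
    # parts alternates literal text (even index) and field names (odd index).
    pattern = "".join(
        re.escape(p) if i % 2 == 0 else "(?P<%s>[^/]+)" % p
        for i, p in enumerate(parts)
    )
    return re.compile("^" + pattern + "$")

grammar = logical_grammar("/foo/", "{country}/{state}/{city}")
match = grammar_to_regex(grammar).match("res:/foo/USA/California/LA")
arguments = match.groupdict()  # the named fields extracted from the request
```

Here `grammar` is the constructed logical grammar and `arguments` holds the values captured for country, state and city.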
Multiple Targets
You may provide any number of target endpoints with the same <simple> grammar. At the level of the RESTOverlay these are logically combined into a single logical endpoint with the common constructed grammar.
When a request is resolved by the logical endpoint's grammar the RESTOverlay will then start its matching algorithm to determine which of the internal target endpoints that share the same <simple> grammar should receive the request.
Internal "True Grammar"
Whilst many target endpoints can share the same external <simple> grammar (declared at /endpoint/meta/rest/simple), each target endpoint must have its own unique endpoint grammar (declared at /endpoint/grammar). To distinguish the two, let's call the latter the "true grammar".
The "true grammar" must be unambiguous so that a potential external request can be uniquely routed to that endpoint.
The RESTOverlay will detect if any target endpoints have ambiguous "true grammars" and will log a warning. It will also show that it is incorrectly configured in the space explorer. If you see this warning, you must locate the endpoint with the ambiguous true grammar and make sure it is changed to be unique.
Argument Relaying
The external REST service logical endpoint may have a simple grammar containing argument fields. If you use arguments in your <simple> grammar you must declare the target endpoint's "true grammar" with the corresponding named arguments so that they can be relayed to the target endpoint.
For example this target endpoint implements the /country/state/city REST path <simple> grammar and its "true grammar" accepts the same named arguments.
<endpoint>
  <meta>
    <rest>
      <method>GET</method>
      <!--The simple external logical grammar-->
      <simple>{country}/{state}/{city}</simple>
    </rest>
  </meta>
  <!--The true grammar-->
  <grammar>
    <active>
      <identifier>active:CityService</identifier>
      <argument name="country" />
      <argument name="state" />
      <argument name="city" />
    </active>
  </grammar> ...
</endpoint>
The external simple grammar arguments (country, state, city) are relayed onto the request to the target endpoint. So for example a request
res:/foo/USA/California/LA
would be resolved by the RESTOverlay and then internally routed as a request for
active:CityService +country@USA +state@California +city@LA
Note: In this case we declared the "true grammar" using an active grammar syntax. You don't have to make your internal endpoints use active grammars but we recommend it since they are normalized and ensure that you cannot get confused between external logical REST paths and normalized internal ROC resources.
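As a rough illustration of argument relaying, the Python sketch below takes the values captured by the external grammar and writes them onto the internal identifier in the form shown above. The function name and the missing-argument check are assumptions made for the example, not documented RESTOverlay behaviour.

```python
def relay(identifier, declared_arguments, matched):
    """Build the internal request identifier from the matched external fields.

    identifier         -- the true grammar's active identifier
    declared_arguments -- argument names declared in the true grammar
    matched            -- values captured by the external <simple> grammar
    """
    missing = [name for name in declared_arguments if name not in matched]
    if missing:
        # Assumption: a mismatch between the two grammars is treated as an error.
        raise KeyError("no value for arguments: %s" % missing)
    relayed = ["+%s@%s" % (name, matched[name]) for name in declared_arguments]
    return " ".join([identifier] + relayed)

internal = relay(
    "active:CityService",
    ["country", "state", "city"],
    {"country": "USA", "state": "California", "city": "LA"},
)
```

The point of the sketch is that the names, not the positions, tie the two grammars together, which is why the true grammar must declare the same named arguments.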
Endpoint REST Declarations
We have seen that the meta/rest tag must contain a <simple> tag. The following optional tags are also supported...
Routing Options
<method>
Purpose
Indicates which HTTP method or methods will be routed to this target endpoint.
Tag Value
A comma separated list of one or more GET, POST, PUT, PATCH, HEAD, DELETE etc.
<produces>
Purpose
The RESTOverlay can perform dynamic content negotiation in order to locate a target endpoint that satisfies the expressed content requirements of the HTTP client (via the incoming HTTP Accept header).
The value of the produces tag must be the mimetype which this endpoint produces.
An endpoint must only produce one mimetype. Declare multiple endpoints with different <produces> tags if you wish to support multiple resource types.
Tag Value
A single valued mimetype.
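To make the idea of matching <produces> declarations against an Accept header concrete, here is a minimal Python sketch of generic q-value ranking. It is not the RESTOverlay's actual matching algorithm, which is not spelled out here. The endpoint ids are hypothetical and the declared mimetypes are assumed to be lower case.

```python
def parse_accept(header):
    """Return a list of (media_range, q) pairs from an Accept header value."""
    prefs = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        if not fields[0]:
            continue
        q = 1.0  # q defaults to 1.0 when it is not given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((fields[0].lower(), q))
    return prefs

def quality(mimetype, prefs):
    """Quality the client gives a concrete mimetype (0.0 if not acceptable)."""
    main = mimetype.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in prefs:
        if media_range == mimetype:
            specificity = 2
        elif media_range == main + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        # The most specific matching range decides the quality.
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q

def select_endpoint(endpoints, accept_header):
    """endpoints maps an endpoint id to the single mimetype it <produces>."""
    prefs = parse_accept(accept_header)
    ranked = [(quality(mt, prefs), eid) for eid, mt in endpoints.items()]
    if not ranked:
        return None
    q, eid = max(ranked, key=lambda pair: pair[0])
    return eid if q > 0 else None  # None is where a 406 handler would apply

endpoints = {"htmlView": "text/html", "xmlView": "text/xml"}
chosen = select_endpoint(endpoints, "text/xml;q=0.8,bizarre/elephant;q=1.0")
```

Because each endpoint produces exactly one mimetype, the selection reduces to scoring one value per endpoint and taking the best.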
<consumes>
Purpose
The RESTOverlay will perform dynamic content delivery for entity bearing methods (POST, PUT, PATCH) and will use the mimetype of the body content to select a target endpoint. The <consumes> tag indicates which mimetypes of the body content the target endpoint is prepared to accept.
The values of the consumes tag must be the set of mimetypes which this endpoint is able to consume.
Tag Value
A comma separated list of one or more mimetypes.
<language>
Purpose
The RESTOverlay can perform dynamic language negotiation in order to locate a target endpoint that satisfies the expressed language requirements of the HTTP client (via the incoming HTTP Accept-Language header).
Tag Value
The value of the language tag must be an HTTP standard language code, e.g. "en" or "fr".
<mustBeSSL/>
Purpose
The RESTOverlay will ensure that the requested URL was on the https: SSL channel. If it was not, it will automatically issue a 302 redirect to the same URL with the https: scheme.
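The redirect rule amounts to a scheme swap, as in the Python sketch below. The function name is hypothetical, and the sketch assumes the host and port can be reused unchanged, which will not hold if https is served on a different port.

```python
from urllib.parse import urlsplit

def ssl_redirect(url):
    """Return (302, https_url) if url is not already on https, else None."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        return None  # already secure, no redirect needed
    # Same host, path and query, only the scheme changes.
    return 302, parts._replace(scheme="https").geturl()

redirect = ssl_redirect("http://server.com/resource?type=XML")
```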
Content Processing Options
<Etag>
Purpose
If specified the RESTOverlay will automatically perform content hashing and associate an Etag header with the resource.
Tag Value
The algorithm to use for the hash. Must be one of MD2, MD5, SHA1, SHA256, SHA384, SHA512
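The reason a content hash makes a good Etag is that it depends only on the bytes of the representation, so it survives restarts. The Python sketch below shows the general technique, with a conditional 304 check. It assumes a quoted hex digest as the header value, which may differ from what the RESTOverlay emits. MD2 is left out because Python's hashlib does not guarantee it.

```python
import hashlib

# Tag values from the list above mapped to hashlib algorithm names.
ALGORITHMS = {
    "MD5": "md5",
    "SHA1": "sha1",
    "SHA256": "sha256",
    "SHA384": "sha384",
    "SHA512": "sha512",
}

def make_etag(body, algorithm="MD5"):
    """Hash the representation bytes into a quoted Etag value."""
    digest = hashlib.new(ALGORITHMS[algorithm], body).hexdigest()
    return '"' + digest + '"'

def respond(body, if_none_match=None, algorithm="MD5"):
    """Return (status, etag, body), answering 304 when the client copy is current."""
    etag = make_etag(body, algorithm)
    if if_none_match == etag:
        return 304, etag, b""
    return 200, etag, body

status, etag, _ = respond(b"hello")
revisit_status, _, revisit_body = respond(b"hello", if_none_match=etag)
```

The second call stands for a client revisiting the resource with the Etag it was given earlier.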
<compress/>
Purpose
If specified the RESTOverlay will automatically attempt to compress the representation using a negotiated compression algorithm that is acceptable to the HTTP client. Implemented formats include gzip and deflate compression.
The HTTP Header Content-Encoding is automatically set as appropriate.
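A minimal Python sketch of negotiated compression follows. It picks between gzip and deflate from the client's Accept-Encoding header and sets Content-Encoding to match. The function names are hypothetical and the tie-breaking rule (first listed wins) is an assumption, not a documented RESTOverlay rule.

```python
import gzip
import zlib

def negotiate_encoding(accept_encoding, supported=("gzip", "deflate")):
    """Pick the supported content coding with the highest q-value, or None."""
    best, best_q = None, 0.0
    for item in accept_encoding.split(","):
        name, _, params = item.strip().partition(";")
        name = name.strip().lower()
        q = 1.0  # q defaults to 1.0 when it is not given
        key, _, value = params.strip().partition("=")
        if key.strip().lower() == "q":
            try:
                q = float(value)
            except ValueError:
                q = 0.0
        if name in supported and q > best_q:
            best, best_q = name, q
    return best

def compress(body, accept_encoding):
    """Return (payload, headers) using whichever coding the client accepts."""
    coding = negotiate_encoding(accept_encoding)
    if coding == "gzip":
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    if coding == "deflate":
        # The HTTP "deflate" coding is the zlib format.
        return zlib.compress(body), {"Content-Encoding": "deflate"}
    return body, {}  # nothing acceptable: send the representation as-is

page = b"<html>" + b"x" * 1000 + b"</html>"
payload, headers = compress(page, "gzip, deflate")
```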
Processing Options
<exceptionTarget>
Purpose
The RESTOverlay will catch any exception in the main target request and issue a request to the specified exception handling target.
The target endpoint must have a grammar that accepts request and exception arguments. For example...
<endpoint>
  <id>MyExceptionHandler</id>
  <grammar>
    <active>
      <identifier>active:MyExceptionHandler</identifier>
      <argument name="request" />
      <argument name="exception" />
    </active>
  </grammar>
</endpoint>
The request argument will be the INKFRequest which was issued to the failed internal target endpoint. The exception argument will be an org.netkernel.http.rest.RESTOverlayWrappedException which wraps the thrown exception.
A local declaration of <exceptionTarget> will override any setting of the <globalExceptionTarget> config parameter.
<preTarget>
Purpose
Before the main request to the resolved target, the RESTOverlay will issue a request to the endpoint specified by the preTarget identifier.
The target endpoint must have a grammar that accepts a request argument. The request argument will be the INKFRequest which will be issued to the target endpoint.
The preTarget endpoint can perform authentication, validation, logon etc. If necessary it can make requests to the httpRequest:/ and httpResponse:/ resource sets to do any necessary processing.
Response
The preTarget endpoint must return a boolean indicating whether the RESTOverlay should proceed with issuing the request to the target endpoint.
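Taken together, the <exceptionTarget> and <preTarget> rules describe a small control flow, sketched below in Python. Plain callables stand in for endpoints and all names are hypothetical. What the overlay returns when the preTarget vetoes the request is not stated here, so the sketch uses None as a placeholder.

```python
def dispatch(request, target, pre_target=None,
             exception_target=None, global_exception_target=None):
    """Issue request to target, honouring the pre-process and exception rules."""
    # The preTarget sees the request that would be issued and may veto it.
    if pre_target is not None and not pre_target(request):
        return None  # placeholder: the veto response is an assumption
    # A local exceptionTarget overrides the global one.
    handler = exception_target or global_exception_target
    try:
        return target(request)
    except Exception as exc:
        if handler is None:
            raise
        # The handler receives both the failed request and the exception.
        return handler(request, exc)

def failing_target(request):
    raise ValueError("target failed")

result = dispatch(
    "res:/foo/hello",
    failing_target,
    exception_target=lambda request, exc: "local handler",
    global_exception_target=lambda request, exc: "global handler",
)
```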
<wrapperTarget>
Purpose
The RESTOverlay will automatically wrap the target request with a request to the wrapperTarget endpoint.
A wrapper endpoint must accept an operand argument. The operand is a pass-by-request containing the request to the primary target. Only when the wrapper actually SOURCEs its operand argument is the primary target request triggered.
This pattern is functional lazy evaluation and looks like this: w(f(x))
The beauty of the pattern is that before sourcing the operand, the wrapper is able to perform a pre-processing phase. After sourcing the operand, it can move to a post-processing phase.
The RESTOverlay carefully constructs a spacial scope so that requests to the wrapper are cacheable and have a dependency expiry. The net effect is that GET endpoints that generate a cacheable resource can be combined into a composite in a wrapper handler and the composite resource is also cacheable.
Response
The response from the wrapper is returned as the final response from the overlay.
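The lazy evaluation at the heart of the wrapper pattern can be shown in plain Python. In this sketch a small class stands in for the pass-by-request operand and ordinary functions stand in for the endpoints. None of these names come from the NetKernel API, and caching is not modelled.

```python
class Operand:
    """Stand-in for the pass-by-request operand: nothing runs until source()."""

    def __init__(self, target, request):
        self._target = target
        self._request = request

    def source(self):
        # Only now is the request to the primary target actually issued.
        return self._target(self._request)

def invoke_wrapped(wrapper, target, request):
    """The overlay calls the wrapper first, handing it the unevaluated request."""
    return wrapper(Operand(target, request))

log = []

def target(request):
    log.append("target")
    return "<p>%s</p>" % request

def wrapper(operand):
    log.append("pre")        # pre-processing phase
    body = operand.source()  # triggers the target
    log.append("post")       # post-processing phase
    return "<html>%s</html>" % body

page = invoke_wrapped(wrapper, target, "hello")
```

After the call, `log` records the order in which the phases ran. A wrapper that never sources its operand would return without the target ever being invoked.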
Examples
The restoverlay-test package can be installed with Apposite and provides a comprehensive set of examples of the use of the RESTOverlay.
The installed module is located at modules/urn.test.org.netkernel.tpt.http.RESTOverlay-x.x.x/
The unit tests can be executed here
The demo-addressbook package has a simple CRUD address-book that presents both a web-application interface and a REST service interface using the RESTOverlay for content negotiation, a decorated wrapping template pattern, compression and Etag generation.
RESTOverlay - What have we learned?
I have to confess that I feel somewhat ambivalent about the world of REST services. On the one hand, it is plain as the nose on your face that a resource oriented approach to information engineering is proven to be the only game in town. (It's called the Web.) On the other hand, the rise of the "REST Framework" leaves me cold - even though, just as Iggy Pop is the godfather of punk, NetKernel is the "godfather of the REST Frameworks".
Yes, really. Did you know that RESTlet (2005) was originally inspired by NetKernel (circa 2000)? They used to give us credit in their acknowledgements but it seems to have gone now (edit: it's been pointed out to me that the evidence is still in archive.org here). And RESTlet was the first of several REST frameworks, which eventually influenced JSR-311.
So what's my problem?
Well first of all, the idea of a binding layer between the resource oriented world of REST and "code/objects" is abhorrent. All the beautiful emergent potential of resource oriented composition gets flattened into a brutal, narrow, hard-wired, boundary-layer.
Oh my god! We have in our hands the knowledge that resource-oriented systems are proven to be the world's best-ever model for information architecture and you want to bind it to code? Forget. The. Bloody. Code.
It's the resources, stupid.
You also know by now that I have the physicist's disease of needing to find a general model. REST is an instance - actually a partial instance - of a resource oriented system. It's a good start, but...
REST isn't sufficient
You can see this even in some basic terminology. Consider "Content Negotiation". If you've looked at JSR-311, or kicked the tyres on the RESTOverlay, you will understand that:
- It's not "content" that is under discussion but representation
- There's no "negotiation" going on at all.
When a REST framework says it does "automatic content negotiation" what it really means is it has "weighted type binding".
And then there's the angels on the head of a pin discussion about whether the resource URL should be distinct for each type (a channel for XML, another for JSON etc etc). Or whether it is "purer" to steer the selection using the Accept header and "content negotiation".
Let's consider this a bit deeper. Here are some URLs with and without additional headers...
http://server.com/resource?type=XML
http://server.com/resource Accept: text/xml
http://server.com/resource Accept: text/xml;q=0.8,bizarre/elephant;q=1.0
http://server.com/resource+Accept@text/xml
http://server.com/resource+Accept@text/xml;q=0.8,bizarre/elephant;q=1.0
http://server.com/resource+GIMME_XML
In fact, with the general perspective of ROC, we can see that these are all exactly equivalent.
Whether it feels like it or not, an HTTP header is an additional argument on the resource identifier. It just happens to travel "in-band". So whether the type is explicit in the URL, or in the header, or is qualified by a weighting factor - it's still ultimately part of the resource identifier.
There is no problem here. Remember the ROC axioms?
- A resource is an abstract set of information
- Each resource may be identified by one or more logical identifiers
It's "abstract", and the "one or more" is significant. All of the examples above are identifiers for the same abstract resource.
The job of a resource oriented system is to provide a mechanism by which a resource identifier can be reified into a representation of that resource. Or, as axioms 3 and 4 say: "A logical identifier may be resolved within an information-context to obtain a physical resource-representation" and "Computation is the reification of a resource to a physical resource-representation".
So with this perspective we can now understand precisely what the RESTOverlay is doing. It is a request identifier normalizer, which maps the external denormalized requests to a normalized internal resource. In terms of pure set-theoretic functions it is a surjection (you might remember I discussed set-theory more generally in respect of ROC this time last year).
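The normalizer idea can be made concrete with a toy Python function that maps each of the six external spellings listed earlier onto one internal (path, mimetype) pair. It is a deliberately naive sketch: the parsing rules for the "+" forms are invented for the example, q-weights are ignored, and text/xml is assumed to be the only representation the resource can produce.

```python
from urllib.parse import urlsplit, parse_qs

PRODUCIBLE = ("text/xml",)  # assumption: the only representation on offer

def normalize(url, accept=None):
    """Map many external request spellings onto one internal identifier."""
    parts = urlsplit(url)
    path, _, inline = parts.path.partition("+")
    query = parse_qs(parts.query)
    if query.get("type", [""])[0].upper() == "XML" or inline == "GIMME_XML":
        accept = "text/xml"
    elif inline.startswith("Accept@"):
        accept = inline[len("Accept@"):]
    # Keep only the media types the client named, ignoring q-weights,
    # and choose the first one that can actually be produced.
    named = [item.split(";")[0].strip() for item in (accept or "").split(",")]
    for mimetype in PRODUCIBLE:
        if mimetype in named:
            return path, mimetype
    return None

weighted = "text/xml;q=0.8,bizarre/elephant;q=1.0"
results = {
    normalize("http://server.com/resource?type=XML"),
    normalize("http://server.com/resource", accept="text/xml"),
    normalize("http://server.com/resource", accept=weighted),
    normalize("http://server.com/resource+Accept@text/xml"),
    normalize("http://server.com/resource+Accept@" + weighted),
    normalize("http://server.com/resource+GIMME_XML"),
}
```

Collecting the results in a set makes the many-to-one nature of the mapping visible: six distinct external identifiers, one internal resource.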
Also, armed with this general perspective, we can see that there is no difference between GET and POST. No, really. The entity bearing POST method contains state to be "transferred" to the URL resource. Or maybe not? What if, as we did with headers, we consider the body as simply being a part of the identifier (encode it however you like - a resource can have one or more identifiers)?
Why can't I consider this to be a very specific identifier into the abstract resource set? If you give me exactly the same state (the same identifier), why can't I return a 304 Not Modified? What? Yes, it's a POST. You didn't cache my response last time? More fool you!
So we are exploring state/identifier duality. It is a fact. It's an explicit part of the general ROC abstraction.
Here's a final thought. The RESTOverlay and the NK ROC system know when two POSTs are the same and can hit a cached resource. I would have written the overlay to return an Etag for the POSTs, but I doubt very much that any browser or REST client would ever even think to cache the response to a POST.
So I have vented some frustration. The practical and pragmatic engineer in me thinks the RESTOverlay is pretty damned cool. The physicist in me qualifies that with "for the particular special case of HTTP REST apps and services".
But, maybe, possibly, just maybe, you now see why it's taken until now to provide the RESTOverlay - the idealist in me was hoping I'd never have to. You can go so much beyond REST in the general world of ROC.
It's not hard. It's easy if you try.
Have a great weekend,
Comments
Please feel free to comment on the NetKernel Forum
Follow on Twitter:
@pjr1060 for day-to-day NK/ROC updates
@netkernel for announcements
@tab1060 for the hard-core stuff
To subscribe for news and alerts
Join the NetKernel Portal to get news, announcements and extra features.