NetKernel News Volume 1 Issue 45
September 10th 2010
What's new this week?
Catch up on last week's news here
Repository Updates
The following updates are available in both the NKEE and NKSE repositories...
- http-client 2.0.0
- Significant update to use Apache Client version 4.x. Also adds OAuth (see below)
- layer0 1.41.1
- module-standard 1.26.1
- Changes to introduce the Boot Order Optimizer (see below)
- twitter 1.2.1
- A new twitter demo using the OAuth http client (see below)
The following updates are in the NKEE repository only...
- nkee-apposite 1.18.1
- Updated to use the new http-client 2.0.0 with a custom Connection Manager.
HTTP Client Update
The update of the HTTP client library is a fairly significant step forward. The implementation has been migrated from Apache HTTP Client 3.x to the considerably revised version 4.x family.
The ROC service interface is 100% compatible with the previous library and passes all our existing unit tests including the SOAP library tests (which is an "application" layer above the low-level accessors).
Whilst compatible, it does have several enhancements...
Connection Manager
The Apache 4.x client has a cleaner implementation, which means that it can now be configured with a more RESTful model. One immediate benefit is that, if you need to, you can supply your own Connection Manager as an argument on a request.
This allows you, for example, to provide different socket factories for your application: you might want https: requests to use an SSL socket factory that is preconfigured with your own SSL certificate authority. (This is the very use case we have for the Apposite client tool in NKEE, and it is the reason that library has been updated in the repos to use the new capability.)
Also, amongst other things, the Connection Manager allows you to control socket and connection timeouts per request too.
OAuth
Probably the most significant enhancement is the general support of OAuth. If you have OAuth credentials then using the http client tools with signed requests is as simple as just specifying the oauth argument referencing your credentials resource.
If you don't have credentials, there is a pair of tools that makes the "OAuth dance" really simple. Here's an excerpt from the http library's docs showing how it's done...
The HTTP client tools all accept an optional oauth argument which when specified means the HTTP client request will be signed using the OAuth 1.0a standard before being issued.
The oauth argument must be transreptable to a set of OAuth credentials with the following form...
<oauth>
  <consumerKey>WWxWuphwcIKoOFvoLSPg</consumerKey>
  <consumerSecret>21uEnu9HCMDGoVmi9HVYIqOiFK7N3RAWq2CILYIkH8</consumerSecret>
  <accessToken>19526217-65R9sZW1ExXS9k9It0hX0Ecf6udMI0MzTYbjJ9nGY</accessToken>
  <accessTokenSecret>6QfAJs3zYslV3aq5uMS8HqhJdEi94SJDU62YFvaciI</accessTokenSecret>
</oauth>
These credentials are all that is required to use OAuth and your service provider may be able to supply these to you without any other steps being necessary.
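For readers curious what "signed using the OAuth 1.0a standard" involves, here is a minimal Python sketch of the HMAC-SHA1 signature calculation that the OAuth 1.0a specification defines. It is an illustration only and not the http-client's own code: the function names, the example URL and the parameter values are all hypothetical, and nonce and timestamp generation are left out.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def pct(value):
    """Percent-encode as OAuth 1.0a requires (only unreserved characters stay literal)."""
    return quote(str(value), safe="~")


def signature_base_string(method, url, params):
    """Join the upper-cased method, the encoded URL and the encoded, sorted parameters."""
    pairs = sorted((pct(k), pct(v)) for k, v in params.items())
    normalized = "&".join(f"{k}={v}" for k, v in pairs)
    return "&".join([method.upper(), pct(url), pct(normalized)])


def sign_hmac_sha1(base_string, consumer_secret, token_secret=""):
    """Sign the base string with a key built from the consumer and token secrets."""
    key = f"{pct(consumer_secret)}&{pct(token_secret)}".encode("ascii")
    digest = hmac.new(key, base_string.encode("ascii"), hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Hypothetical request: the URL, parameters and secrets are made up for illustration.
base = signature_base_string("get", "http://example.com/r", {"b": "2", "a": "1 x"})
signature = sign_hmac_sha1(base, "consumer-secret", "token-secret")
print(base)
print(signature)
```

In a real request the parameter set would also carry the oauth_* fields (consumer key, token, nonce, timestamp and so on), and the resulting signature would be sent along with them. The oauth argument on the client tools takes care of all of this for you.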
OAuth Authentication Workflow
If your service provider has an automated workflow for authenticating and generating credentials then a pair of accessors are provided for initiating and performing the retrieval of authenticated OAuth credentials.
The first step is to initiate the OAuth process by issuing an active:oauthPrepare request...
req = context.createRequest("active:oauthPrepare")
req.addArgument("settings", "res:/twitter-panel/resources/oauth/appSettings.xml")
prepareState = context.issueRequest(req)
The settings argument must have the following form...
<oauth>
  <consumerKey>8xsKhEw7JncwzTs9LNPxw</consumerKey>
  <consumerSecret>tKWgFVFxHgxyDzu28lQOxVFGj84yuSJ4B8E8USpNM</consumerSecret>
  <requestTokenURL>http://twitter.com/oauth/request_token</requestTokenURL>
  <accessTokenURL>http://twitter.com/oauth/access_token</accessTokenURL>
  <authorizeWebSiteURL>http://twitter.com/oauth/authorize</authorizeWebSiteURL>
</oauth>
Your OAuth service provider must provide you with a consumerKey and consumerSecret and also provide you with the locations of the three services required by the OAuth process.
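If you assemble the settings document by hand, it can be worth checking that all five fields are present before starting the OAuth process. The following is a small Python sketch of such a check. It assumes the root element is named oauth (inferred from the closing tag in the excerpt), and the function name and placeholder values are hypothetical rather than part of the http-client library.

```python
import xml.etree.ElementTree as ET

REQUIRED_FIELDS = (
    "consumerKey",
    "consumerSecret",
    "requestTokenURL",
    "accessTokenURL",
    "authorizeWebSiteURL",
)


def load_oauth_settings(xml_text):
    """Parse an <oauth> settings document and return its fields as a dict."""
    root = ET.fromstring(xml_text.strip())
    if root.tag != "oauth":
        raise ValueError(f"expected <oauth> root element, found <{root.tag}>")
    settings = {child.tag: (child.text or "").strip() for child in root}
    missing = [name for name in REQUIRED_FIELDS if not settings.get(name)]
    if missing:
        raise ValueError("missing settings: " + ", ".join(missing))
    return settings


# Placeholder values only; real keys come from your OAuth service provider.
EXAMPLE = """
<oauth>
  <consumerKey>YOUR-CONSUMER-KEY</consumerKey>
  <consumerSecret>YOUR-CONSUMER-SECRET</consumerSecret>
  <requestTokenURL>http://twitter.com/oauth/request_token</requestTokenURL>
  <accessTokenURL>http://twitter.com/oauth/access_token</accessTokenURL>
  <authorizeWebSiteURL>http://twitter.com/oauth/authorize</authorizeWebSiteURL>
</oauth>
"""

settings = load_oauth_settings(EXAMPLE)
print(sorted(settings))
```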
The response from active:oauthPrepare is a serializable HDS structure containing some transient state. This representation is needed in the second stage but it may be persisted to an intermediate persistence mechanism if required.
Importantly, the prepareState contains an HDS node with the path /hds/authorizeURL. This URL is the location to which you must send the user in order to authenticate themselves and approve the OAuth connection request. Once approved, the provider will give the user a PIN activation code, which is the final piece of state required to complete the authentication process.
At this stage you have three resources: the settings, the prepareState and the PIN activation code. With these you are able to download the authenticated credentials using the active:oauthPrime accessor...
req = context.createRequest("active:oauthPrime")
req.addArgument("settings", "res:/twitter-panel/resources/oauth/appSettings.xml")
req.addArgumentByValue("prepareState", prepareState)
req.addArgumentByValue("validationCode", "THE USERS PIN CODE")
myOAuthCredentials = context.issueRequest(req)
The response from active:oauthPrime is the OAuth credentials as shown in the first section. These credentials should be stored and used to sign all subsequent requests to the service provider. Typically the credentials will be valid until the user deauthorizes them with the service provider - which may mean they are effective for a long time.
Twitter Demo
In the repositories there is a new twitter package which is a demo twitter client with example code showing the OAuth workflow. Although OAuth has a number of stages to authenticating a connection, the two oauthPrepare and oauthPrime accessors make it very simple for the developer. And, as you can see below, the end-user experience is very smooth too...
Once authenticated the client uses the OAuth credentials to sign every REST service call to the Twitter Services API. The demo application provides a tweet editor and an interactive search tool...
The search results are formatted so that clicking on any @name or #tag will become the new search term. I've not seen this in any other twitter client apps but it gives a very nice "conversation" browser experience. With some more time and inclination it could be fleshed out with browsing history and multiple streams to be a fairly full featured client.
Breaking News: It appears all is not perfect in the world of twitter OAuth. Tony just pointed me to this article:
http://arstechnica.com/security/guides/2010/09/twitter-a-case-study-on-how-to-do-oauth-wrong.ars
...seems like there are some useful server-side implementation lessons to be learned here.
Tom Geudens' NetKernel Book
Tom Geudens has been in touch. He has been using NetKernel since version 3.x.x at his employer, a large European supermarket chain. Over the course of a few years he has created an internal getting-started reference manual for the in-house developer team. Recently they've started moving to NK 4.
Tom has now begun work on a more ambitious general introduction to NetKernel book for his company. He has kindly decided to make this openly available. He has a plan to work on this over the next four months. The first draft, featuring installation notes and how to mirror your own apposite repository is available here:
Tom is looking for feedback on progress so far, and also ideas for content or even contributions of sections. He can be contacted at tom (dot) geudens "at" hush {dot} ai
Thanks Tom!
Boot Order Optimizer
The changes to layer0 and standard-module were to introduce a new Boot Order Optimizer (BOO). To understand what it does we need to explain how NetKernel boots. The boot process goes through several phases...
1. The shell boot script fires up Java and runs a BootLoader class.
2. The bootloader creates the core system classloader.
3. The bootloader finds, loads and instantiates a kernel instance, and dynamically adds the core system Java-level libraries to the system classloader. These include layer0, module implementations such as standard-module, and the cache. (If you're interested you'll find these libraries listed in etc/bootloader.conf.)
4. The bootloader, using the classpath, discovers all the module factories present in the core system.
5. The bootloader initiates the ModuleManager in layer0 and gives it a reference to the kernel and to the module factories. At this point the system has dynamically booted a "classical java" system, albeit with a rich dynamic classpath. The Kernel is running but it is completely "empty". It doesn't know about any spaces, so the ROC domain is not live.
6. The bootloader now finds the stem-system. This is the bare minimum set of modules. We call it the "stem" system by analogy with stem cells: it is the base that can grow into any type of application stack. By default the stem system is defined in etc/stem-system.conf. For the current distributions it comprises layer1, ext-system, mod-security and coremeta. These are ROC modules containing spaces, endpoints etc.
7. By convention the stem modules have a runlevel of 1. The bootloader registers the stem modules with the ModuleManager and then tells the ModuleManager to step the runlevel from 0 (no ROC) to 1 (ROC is go).
8. This transition causes the ModuleManager to instantiate an instance of each module. It does this using the Module Factories, and will gracefully attempt to instantiate a module using a fallback approach until it finds a factory that understands the module.
9. NetKernel has no hard-wired module technology. In fact all the kernel cares about are ISpace interfaces, so you could go back to stage 3 and write an embedded ROC system without modules, using just class libraries containing spaces. However, by default, all the modules currently provided with the system are implemented with the Standard Module. The Standard Module understands the module.xml, is able to instantiate endpoints, implements a request resolution algorithm and enables sophisticated relative spacial patterns like overlays etc.
10. We are now able to step up into the ROC domain. Each standard module is now able to discover what spaces it contains, and it registers each public space with the kernel. The standard module can now start the process of booting itself.
11. The standard module proceeds to instantiate instances of all declared endpoints. At this point things get interesting...
There are several ways things can go at this point. In particular we have now fully stepped into the ROC domain and so the system can make resource requests in order to build up the system's operating state.
For example, consider endpoints instantiated from prototypes (nearly all transports use this pattern). When an endpoint is declared to be constructed from a prototype, the standard module must issue a resource request into the nascent address space to find an endpoint that can provide a prototype instance. This is in the twilight zone - the spaces are not fully booted yet!
As the endpoints are initialized, the standard module begins to drive the lifecycle events of the endpoints, such as commissioning each endpoint with its parameters. If you've been following these newsletters you'll recall that parameters are really resources that provide configuration state (see this discussion for details) and the standard-module/endpoint can issue ROC resource requests to source them. Again, these requests can be happening as the system is coming alive.
The standard module ultimately starts to call the postCommission() interface on each endpoint. At this point many endpoints issue requests into the ROC address space in order to source configuration state etc.
In the boot story so far, we have initialized the kernel, core system and the stem ROC system.
I 09:15:53 Kernel Starting NetKernel Enterprise Edition Resource Oriented Computing Platform Version 4.1.1 Copyright 2002-2010 1060 Research Limited http://www.1060research.com 1060, NetKernel, Resource Oriented Computing are Trademarks of 1060 Research Ltd.
I 09:15:54 ModuleManager Module Factory [org.netkernel.layer0.module.java.JavaModuleFactory] registered
I 09:15:54 ModuleManager Module Factory [org.netkernel.module.standard.StandardModuleFactory] registered
I 09:15:54 ModuleManager Module Factory [com.ten60.netkernel.module.encrypted.EncryptedModuleFactory] registered
I 09:15:54 ModuleManager System changing to RunLevel [1]
...DETAILS NOT SHOWN...
I 09:16:15 Kernel NetKernel Ready, accepting requests...
I 09:16:15 ModuleManager System now at RunLevel [7]
I 09:16:15 InitEndpoint Init completed - system at RunLevel [7]
I 09:16:20 ModuleListAc~ ModuleListAccessor complete
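The factory fallback described in the boot steps (the ModuleManager tries the registered Module Factories until one understands the module) can be pictured with a short Python sketch. This is only a model of the idea and not the layer0 implementation: the class names, the descriptor fields and the rules each factory applies are all hypothetical.

```python
class UnsupportedModule(Exception):
    """Raised by a factory that does not understand a module."""


class JavaFactory:
    # Hypothetical rule: only handles modules that are plain class libraries.
    def create(self, descriptor):
        if descriptor.get("kind") != "java":
            raise UnsupportedModule(descriptor["uri"])
        return ("java-module", descriptor["uri"])


class StandardFactory:
    # Hypothetical rule: only handles modules that carry a module.xml.
    def create(self, descriptor):
        if not descriptor.get("has_module_xml"):
            raise UnsupportedModule(descriptor["uri"])
        return ("standard-module", descriptor["uri"])


def instantiate(descriptor, factories):
    """Try each factory in turn and return the first module that is built."""
    for factory in factories:
        try:
            return factory.create(descriptor)
        except UnsupportedModule:
            continue  # fall back to the next registered factory
    raise UnsupportedModule(f"no factory understands {descriptor['uri']}")


factories = [JavaFactory(), StandardFactory()]
module = instantiate({"uri": "urn:example:layer1", "has_module_xml": True}, factories)
print(module)
```

The point of the pattern is that the manager never needs to know which module technology it is dealing with. It only needs an ordered list of factories, which matches the three factories registered in the boot log.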
Init Endpoint
One very important endpoint in the boot process is the Init endpoint in the ext-system module. This endpoint is somewhat similar to the init "process zero" in Unix. The Init endpoint uses the postCommission() event to start the process of bootstrapping the remainder of the ROC domain.
Init finds the etc/modules.xml and, in the same way as the BootLoader at stage 6, tells the ModuleManager about the higher level modules and their respective runlevels. When the ModuleManager is loaded with the modules, the Init endpoint steps the ModuleManager's runlevel up, which starts the process of registering the spaces of these modules with the kernel. Each standard module now starts its own bootstrap process.
The Kernel gets pretty busy since all this bootstrapping is done in the ROC domain. Fortunately the ROC abstraction comes into its own. Since all the boot system state is modeled as ROC resources then, as with anything else, they can be cached, have dependencies etc. Therefore, as it progresses, the system starts to find and hold on to the optimally efficient set of state required to boot the system. (This is why in the past I've glibly thrown off statements like "ROC boots ROC".)
Finally the system is booted and the transports can start to initiate root requests driving applications, services and whatever else you get up to.
Boot Hysteresis
In some cases during the ROC spacial boot cycle, an endpoint may not be initialized by the time some other unconfigured endpoint has made a request for its resources. In this case the returned representation is of no use (it is an exception) but it is cached for a small "boot-hysteresis" period of time. By default this is one second but it can be configured in etc/kernel.properties.
After the hysteresis period elapses the useless resource expires. The first time the unconfigured endpoint actually needs to do something, it finds that its (useless) config state has expired and so reissues the config request. Now, some time later, the endpoint that previously wasn't ready can provide the state resource! For the computer scientists out there, we've introduced an "averaging" over the discrete discontinuous state in order to cross through to the stable continuous post-boot state.
It's really a pretty interesting system. From a classical point of view the full richness of the ROC system is basically impossible*, but, by leveraging ROC itself, it is able to progressively find an optimal operating resource state.
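To make the hysteresis idea concrete, here is a small Python model of it, using a simulated clock instead of real time. It is a deliberately simplified sketch and not the kernel's cache: the class, the resource identifier, the 0.5 second readiness time and the returned config are all hypothetical. Only the one second default comes from the description above.

```python
class NotReady(Exception):
    """Stands in for the exception returned by an endpoint that has not booted yet."""


class BootCache:
    """Caches a failed response for a short hysteresis period, then lets it expire."""

    def __init__(self, hysteresis=1.0):
        self.hysteresis = hysteresis  # seconds; one second is the stated default
        self._failures = {}           # identifier -> (exception, expiry time)

    def request(self, identifier, provider, now):
        cached = self._failures.get(identifier)
        if cached and now < cached[1]:
            raise cached[0]           # inside the hysteresis window: reuse the failure
        try:
            value = provider(now)
        except NotReady as exc:
            self._failures[identifier] = (exc, now + self.hysteresis)
            raise
        self._failures.pop(identifier, None)
        return value


def config_provider(now):
    # Hypothetical provider endpoint that finishes booting 0.5s after startup.
    if now < 0.5:
        raise NotReady("config endpoint not initialised")
    return {"port": 8080}


cache = BootCache(hysteresis=1.0)
outcomes = []
for now in (0.0, 0.6, 1.2):           # simulated clock readings, in seconds
    try:
        outcomes.append(cache.request("res:/config", config_provider, now))
    except NotReady:
        outcomes.append("not ready")
print(outcomes)  # ['not ready', 'not ready', {'port': 8080}]
```

Note the second reading: the provider is ready at 0.6 seconds, but the cached failure has not yet expired, so the requester still sees it. Only after the hysteresis period has elapsed does the retry reach the provider and pick up the real state.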
Boot Order Optimizer: What is it?
So with this picture in mind, we can explain the Boot Order Optimizer. This is a tool that works with the ModuleManager and the standard modules to track and understand the spacial boot order (which is non-deterministic). It works out an ordering strategy to decide, for example, which spaces are leaves with no dependency requirements. It progressively finds an order that approximately minimizes the initialization graph. It builds up state dynamically, which helps the system get faster as it goes, but it also stores the final discovered order and uses this the next time you boot the system!
But NetKernel boots fast already? For sure, the distribution boots fast even with all its tools and system applications on top of a core set of libraries. But BOO has been introduced because we're anticipating seriously loaded systems. For example my everyday development system has about 200 modules, averaging about 3 address spaces each. This working set includes all our development work, unit tests and production applications (the NK portal, this wiki, bug trackers etc etc). My set of modules has dozens of dynamic imports including a virtual hosting front-end layer and general dynamic application frameworks (see the WiNK architecture tutorial for an example in a single application).
Even my dev system is not really stretching the limits (for example it all runs on a default 128MB heap and boots in about 20 seconds), but hopefully the boot story gives a clearer picture of how NetKernel is actually a Resource Oriented operating system. We're looking to the day when NK clouds are running thousands of modules per node.
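As a rough intuition for "leaves first" ordering, the Python sketch below sorts a made-up set of spaces so that each one boots after the spaces it depends on, then saves the result for reuse. It uses a plain topological sort from the standard library (graphlib, Python 3.9 or later) as a stand-in. The real optimizer discovers its ordering dynamically from a non-deterministic boot, so this is not its algorithm, and the space names and dependency map are hypothetical.

```python
import json
from graphlib import TopologicalSorter

# Hypothetical dependency map: each space lists the spaces it requests from while booting.
dependencies = {
    "urn:example:app": {"urn:example:services", "urn:example:config"},
    "urn:example:services": {"urn:example:config"},
    "urn:example:config": set(),      # a leaf: no dependency requirements
}


def boot_order(deps):
    """Return an order in which every space comes after the spaces it depends on."""
    return list(TopologicalSorter(deps).static_order())


order = boot_order(dependencies)

# Persist the discovered order so that the next boot can start from it.
saved = json.dumps(order)
restored = json.loads(saved)
print(restored)
```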
Biological Implications
I had an interesting conversation with a mathematical biologist some time ago. I was intrigued to learn that one of the big open questions in biology is how the first cluster of stem cells can start to differentiate and produce the final organism. The problem is that when you look at DNA there doesn't seem to be enough information.
You know I can never resist talking about stuff I don't understand... But if you think about the ROC boot process, it is contextually developing state that drives the process. I reckon the spacial structures, the enzymes and chemical context provide a dynamically evolving state that drives the organism boot process. So the information is distributed between the program code (DNA) and the dynamically evolving context.
Talk: Resource Oriented Cloud Architectures
Don't forget, I'll be giving a talk on Resource Oriented Cloud Architectures at SkillsMatter in London a week on Tuesday (the 21st September). Here's the blurb...
As usual it's a free evening event which usually ends up down the pub. It requires registration, so book here now.
Have a great weekend.
Comments
Please feel free to comment on the NetKernel Forum
Follow on Twitter:
@pjr1060 for day-to-day NK/ROC updates
@netkernel for announcements
@tab1060 for the hard-core stuff
To subscribe for news and alerts
Join the NetKernel Portal to get news, announcements and extra features.