NetKernel News Volume 4 Issue 9
March 15th 2013
Repository Updates
The following update is available in the NKSE and NKEE repositories
- layer1-1.44.1
- Adds a new active:listGoldenThread accessor to provide a filterable list of assigned golden threads. Thanks to Tom Geudens for creating this. (Documentation here.)
Tom's Blog
The update to layer1 today incorporates a new golden thread list endpoint, provided by Tom Geudens, that presents a filterable list of currently assigned golden threads. In his blog this week Tom explains his use case for this new tool and provides a brand new power tool that will have CDNs quaking...
http://practical-netkernel.blogspot.be/2013/03/going-gui.html
NetKernel Gradle Plugin - Release 0.2.0
Following last week's news item announcing the Gradle Plugin update, Randy has added some extra refinement and bumped the version to 0.2.0 - giving it a general release status...
I have refined and updated the documentation for the Gradle plugin and consider it ready for general use:
https://github.com/netkernelroc/netkernel-gradle-plugin
Included now are a lot of features focused on the daily tasks of setting up and developing NetKernel modules.
Still to come are tasks for compiling, packaging, downloading and installing NetKernel, etc.
Virtual Host Pattern
While it's starting to feel like cloud-server instances are ten-a-penny commodities, the truth is that they are not free, and pennies-an-hour adds up to serious money over a full year - especially if you have multiple domains for applications and services that end up requiring a set of independent hosts on a cluster of VMs.
Of course back in the good old days, amortizing the cost of a single host server across many domains was a standard trick. Just think of Apache's virtual host mapping. But there are other good reasons to have virtual hosting as a key tool in your locker.
For example it can make setting up staging and test infrastructure much simpler by deploying all your apps to one test box.
A favourite trick I use is to have "test.foo.com" mapped to the "localhost" address in my development box's /etc/hosts so that I can develop multiple applications simultaneously without the problem of "root collision" (where every app wants to own the root REST path / ) and without actually having to have an external host set up.
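For example (the test hostnames here are purely illustrative), the /etc/hosts entry might look like this...

127.0.0.1    localhost test.foo.com test.bar.com

Both test hostnames then resolve to the local machine, and each can be routed to its own virtual host space by the pattern described below.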
So here's a quick guide to virtual hosting with NetKernel. The following examples are typical production configurations used every day with our servers (including this wiki.netkernel.org virtual host you're viewing now)...
HTTP Fulcrum Customisation
The first step is to take a copy of the out-of-box (OOB) front-end-fulcrum (FEF) module and customize it for your needs. The OOB version gives a general default configuration that covers a good 80% of use-cases - but for hard-core systems it would be a good idea to take control and have your own production fulcrum.
I won't discuss setting up the Jetty configuration - but you will probably want to tune the threading and sockets to suit your particular throughput and latency requirements.
In this example let's look at the HTTPBridge architecture we use in production. Below is a working example focused on the HTTPBridge arrangement (you can imagine that the HTTP Transport endpoint sits immediately above this overlay declaration in the fulcrum space)...
<overlay>
<prototype>Throttle</prototype>
<config>
<concurrency>8</concurrency>
<queue>500</queue>
</config>
<space>
<overlay>
<prototype>HTTPBridge</prototype>
<exceptionHandler>res:/introspect/exceptionhandler</exceptionHandler>
<config>
<rewrite>
<match>(https?://[^/]*/[^?]*)(\?.*)?</match>
<to>$1</to>
</rewrite>
</config>
<space name="Frontend Fulcrum HTTP Bridge Overlay">
<!--Static Import application modules here-->
<!--Dynamic Imports-->
<endpoint>
<prototype>SimpleImportDiscovery</prototype>
<grammar>active:SimpleImportDiscovery</grammar>
<type>PrimaryHTTPFulcrum</type>
</endpoint>
<endpoint>
<prototype>DynamicImport</prototype>
<config>active:SimpleImportDiscovery</config>
</endpoint>
<import>
<uri>urn:org:netkernel:ext:layer1</uri>
<private />
</import>
<import>
<uri>urn:org:netkernel:ext:system</uri>
<private />
</import>
</space>
</overlay>
</space>
</overlay>
First thing to notice is that we wrap the HTTPBridge with a throttle overlay. This provides us with armour plating to prevent system overload under DoS.
However, it also has a much more valuable economic benefit - it shapes the internal load profile such that the concurrency is impedance matched to the hardware capabilities.
So, our throttle is configured to lock the throughput load-line on the peak capacity of our server (see this discussion about Software Load Lines for details).
Finally, it has a third significant benefit: it prevents the stop-start free-for-all scheduling of an unconstrained free-running server. With fewer executing threads, the net transient memory footprint for a given throughput is a fraction of that required by a free-running system. Generation and collection of transient garbage is therefore much smoother, which means less memory is needed and memory management is more efficient - which pays back in greater net throughput.
The throttle delivers a virtuous circle of engineering benefits from one simple engineering construct.
OK enough on the throttle - we're talking about the Virtual Host pattern...
HTTPBridge Rewrite
By default the HTTPBridge in the FEF moves all requests from the http:// address space to the res:/ space. It does this based upon the declarative rewrite configuration. Here's the default, which removes any query params from the URL, drops the hostname and switches the URI scheme to res:/
<rewrite>
<match>http://[^/]*/([^\?]*)(\?.*)?</match>
<to>res:/$1</to>
</rewrite>
When we want to do virtual hosting we need to preserve the hostname, since that is the determinant for which virtual host channel we will route the request to.
So we have to set up the HTTPBridge rewrite to strip off query parameters, but pass through the full http or https base URL, including the hostname part...
<rewrite>
<match>(https?://[^/]*/[^?]*)(\?.*)?</match>
<to>$1</to>
</rewrite>
The net result of this configuration change is that requests arriving into the HTTPBridge's overlay <space> are fully host-qualified REST paths.
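To make that concrete (the path and query are purely illustrative), an external request for

http://wiki.netkernel.org/books/view?id=7

arrives in the bridge's wrapped <space> as the host-qualified identifier

http://wiki.netkernel.org/books/view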
The only other thing we do to our FEF is to change the target name of the dynamic import discovery hook. Instead of looking to import "HTTPFulcrum" we instead change it to import "PrimaryHTTPFulcrum". This means that modules wanting to be exposed to a legacy HTTPFulcrum environment won't attach themselves here and get a shock that the requests are now fully qualified. (We'll see below how we provide a legacy context without impacting on existing applications that don't care about virtual hosting.)
So that's the customised FEF taken care of. It's really a two-line config change if you ignore the throttle wrapper we discussed.
Virtual Host Space
Virtual hosting in the ROC domain is really just a resource routing pattern. A request with a fully qualified hostname needs to be routed to an address space that offers the context for that particular virtual host. The following discussion shows one way we can do this, by using a set of <mapper> overlays, each wrapping a contextual host space.
Below is the full structure for a system with two virtual hosts and a default localhost context. We'll break it into pieces and discuss the design below, but the key idea is that requests arriving into this rootspace are routed to a specific "Virtual Fulcrum" based upon the hostname in their request URI.
<!--Boilerplate module metadata not shown for clarity-->
<rootspace>
<fileset>
<regex>res:/etc/system/SimpleDynamicImportHook.xml</regex>
</fileset>
<!-- ************************** download.netkernel.org ************************** -->
<mapper>
<config>
<endpoint>
<grammar>http://download.netkernel.org
<regex>.*?/</regex>
<group name="path">
<regex type="anything" />
</group>
</grammar>
<request>
<identifier>res:/[[arg:path]]</identifier>
</request>
</endpoint>
</config>
<space>
<endpoint>
<prototype>SimpleImportDiscovery</prototype>
<grammar>active:ImportDiscovery</grammar>
<type>download.netkernel.org</type>
</endpoint>
<endpoint>
<prototype>DynamicImport</prototype>
<config>active:ImportDiscovery</config>
</endpoint>
<import>
<uri>urn:com:1060research:virtual:hosts:common</uri>
</import>
</space>
</mapper>
<!-- ************************** wiki.netkernel.org ************************** -->
<mapper>
<config>
<endpoint>
<grammar>
<regex>(http|https)</regex>://wiki.netkernel.org
<regex>.*?/</regex>
<group name="path">
<regex type="anything" />
</group>
</grammar>
<request>
<identifier>res:/[[arg:path]]</identifier>
</request>
</endpoint>
</config>
<space>
<endpoint>
<prototype>SimpleImportDiscovery</prototype>
<grammar>active:ImportDiscovery</grammar>
<type>wiki.netkernel.org</type>
</endpoint>
<endpoint>
<prototype>DynamicImport</prototype>
<config>active:ImportDiscovery</config>
</endpoint>
<import>
<uri>urn:com:1060research:virtual:hosts:common</uri>
</import>
</space>
</mapper>
<!-- ************************** localhost ************************** -->
<mapper>
<config>
<endpoint>
<grammar>
<regex>https?://(localhost|ip6-localhost|127.0.0.1|192.168.200.130|XN--RC-39Z.ws)</regex>
<regex>.*?/</regex>
<group name="path">
<regex type="anything" />
</group>
</grammar>
<request>
<identifier>res:/[[arg:path]]</identifier>
<representation>org.netkernel.layer0.representation.IReadableBinaryStreamRepresentation</representation>
</request>
</endpoint>
</config>
<space>
<endpoint>
<prototype>SimpleImportDiscovery</prototype>
<grammar>active:ImportDiscovery</grammar>
<type>HTTPFulcrum</type>
</endpoint>
<endpoint>
<prototype>DynamicImport</prototype>
<config>active:ImportDiscovery</config>
</endpoint>
<import>
<uri>urn:com:1060research:virtual:hosts:common</uri>
</import>
</space>
</mapper>
<!--********************************* * Provide common services for transreptors etc - but prevent res:/etc/system/* leaking with a limiter. *********************************-->
<endpoint>
<prototype>Limiter</prototype>
<grammar>res:/etc/system/
<regex type="anything" />
</grammar>
</endpoint>
<import>
<uri>urn:com:1060research:virtual:hosts:common</uri>
</import>
</rootspace>
<!--********************************* * Common Library Import Space - used by each virtual host context *********************************-->
<rootspace uri="urn:com:1060research:virtual:hosts:common" public="false">
<import>
<uri>urn:org:netkernel:ext:layer1</uri>
</import>
<import>
<uri>urn:org:netkernel:ext:system</uri>
</import>
<import>
<uri>urn:org:netkernel:xml:core</uri>
</import>
<import>
<uri>urn:org:netkernel:tpt:http</uri>
</import>
<import>
<uri>urn:org:netkernel:mod:security</uri>
</import>
<import>
<uri>urn:org:netkernel:nkse:style</uri>
</import>
</rootspace>
</module>
Common Import Toolbox
Last things, first. At the bottom of the listing you'll see a space which simply acts as an aggregate import of common library spaces...
<!--********************************* * Common Library Import Space - used by each virtual host context *********************************-->
<rootspace uri="urn:com:1060research:virtual:hosts:common" public="false"> ...imports not shown here... </rootspace>
These provide transreptors, styling and other gubbins that ordinarily an application might expect the FEF to be providing for them. This is just a simple example of the Toolbox pattern.
Virtual Host Mapping
Let's look at the first mapper declaration at the top of the first <rootspace>. This provides a Virtual Fulcrum for the hostname download.netkernel.org - so when you requested your download of NetKernel, this is where your request was resolved...
<mapper>
<config>
<endpoint>
<grammar>http://download.netkernel.org
<regex>.*?/</regex>
<group name="path">
<regex type="anything" />
</group>
</grammar>
<request>
<identifier>res:/[[arg:path]]</identifier>
</request>
</endpoint>
</config>
<space>
<endpoint>
<prototype>SimpleImportDiscovery</prototype>
<grammar>active:ImportDiscovery</grammar>
<type>download.netkernel.org</type>
</endpoint>
<endpoint>
<prototype>DynamicImport</prototype>
<config>active:ImportDiscovery</config>
</endpoint>
<import>
<uri>urn:com:1060research:virtual:hosts:common</uri>
</import>
</space>
</mapper>
You can see that the mapper only specifies one logical endpoint in its configuration. The <grammar> looks quite complex, since we're using the full-fat power of the Standard Grammar, but hopefully it's not too difficult to see that this matches all requests beginning with http://download.netkernel.org. It captures the path as a named argument.
The logical endpoint constructs a declarative <request> and relays the grammar path by mapping it to the res:/ scheme.
The upshot is that only requests to download.netkernel.org get mapped to this mapper's inner <space> and, on the way, they are normalised to the res:/ address space.
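To make that concrete (the path is purely illustrative): an incoming identifier such as http://download.netkernel.org/dist/NKSE.jar matches the grammar, the path argument captures dist/NKSE.jar, and the declarative request issued into the inner <space> becomes res:/dist/NKSE.jar.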
Why do we normalize to res:/? Well, who knows what applications we might want to put inside here - they have no need to know anything about the virtual host - they can be general normalized solutions that, if we choose, we can relocate to other virtual hosts with zero impact and zero reconfiguration. Or actually, and more significantly, an application developed on the default FEF can be deployed to a Virtual FEF with zero changes.
Looking at the inside of the mapper <space> we see that it has a Dynamic Import and the discovery is looking for spaces that provide an import hook that matches the type "download.netkernel.org". So any module that has a Dynamic Import Hook that says it wants to be associated with this hook name will be automatically sucked into this space.
So when I said we can move our applications to whichever virtual host we like - it's as simple as changing their dynamic import hook res:/etc/system/SimpleDynamicImportHook.xml. Here's what our download application uses...
<connection>
<type>download.netkernel.org</type>
</connection>
So instead of attaching itself to the generic "HTTPFulcrum" dynamic import (which you'll probably have used exclusively until now), we simply tell our download application to attach itself to this Virtual Fulcrum.
You can see that a second mapper is used to implement the same pattern but for "wiki.netkernel.org". Guess what, when you requested this page you were routed to this virtual fulcrum. Again it sets up a specifically named dynamic import - so that all wiki.netkernel.org related applications can attach themselves here.
I've kept this example simple, but you can see that the pattern scales nicely to arbitrarily large numbers of virtual hosts.
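As a minimal sketch of scaling out (shop.example.com is a hypothetical third host), adding another virtual host is just one more mapper following exactly the same template...

<!-- ************************** shop.example.com (hypothetical) ************************** -->
<mapper>
<config>
<endpoint>
<grammar>http://shop.example.com
<regex>.*?/</regex>
<group name="path">
<regex type="anything" />
</group>
</grammar>
<request>
<identifier>res:/[[arg:path]]</identifier>
</request>
</endpoint>
</config>
<space>
<endpoint>
<prototype>SimpleImportDiscovery</prototype>
<grammar>active:ImportDiscovery</grammar>
<type>shop.example.com</type>
</endpoint>
<endpoint>
<prototype>DynamicImport</prototype>
<config>active:ImportDiscovery</config>
</endpoint>
<import>
<uri>urn:com:1060research:virtual:hosts:common</uri>
</import>
</space>
</mapper>

Any module declaring <type>shop.example.com</type> in its dynamic import hook is then automatically attached to this new virtual fulcrum.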
Virtual Host to FEF
Finally, the Virtual Host rootspace itself gets imported into our customized FEF by declaring its res:/etc/system/SimpleDynamicImportHook.xml like this...
<connection>
<type>PrimaryHTTPFulcrum</type>
</connection>
So that's how this space gets the host-qualified requests coming in from the customized HTTPBridge endpoint.
Localhost
Finally, we should discuss the special case of the undifferentiated applications which we want to carry on running but which don't know anything about Virtual Fulcrums. Simple: we create a mapping for "everything else" and give it the standard "HTTPFulcrum" dynamic import endpoint...
<!-- ************************** localhost ************************** -->
<mapper>
<config>
<endpoint>
<grammar>
<regex>https?://(localhost|ip6-localhost|127.0.0.1|192.168.200.130|XN--RC-39Z.ws)</regex>
<regex>.*?/</regex>
<group name="path">
<regex type="anything" />
</group>
</grammar>
<request>
<identifier>res:/[[arg:path]]</identifier>
<representation>org.netkernel.layer0.representation.IReadableBinaryStreamRepresentation</representation>
</request>
</endpoint>
</config>
<space>
<endpoint>
<prototype>SimpleImportDiscovery</prototype>
<grammar>active:ImportDiscovery</grammar>
<type>HTTPFulcrum</type>
</endpoint>
<endpoint>
<prototype>DynamicImport</prototype>
<config>active:ImportDiscovery</config>
</endpoint>
<import>
<uri>urn:com:1060research:virtual:hosts:common</uri>
</import>
</space>
</mapper>
You could make the grammar a wildcard match and basically route all unspecified hosts here (a sketch follows below). However, I like to use the virtual host architecture when I'm developing, so here I specifically have localhost and ip6-localhost (my IPv6 binding), as well as a local-network IPv4 address so I can share demos with other people on our VPN etc. I'll let you figure out what I'm mucking about with by having this XN--RC-39Z.ws virtual host.
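If you did want the catch-all behaviour, a minimal sketch (this regex is an assumption on my part - test it before relying on it) would simply drop the host alternation and lazily match everything up to the first path slash...

<grammar>
<regex>https?://</regex>
<regex>.*?/</regex>
<group name="path">
<regex type="anything" />
</group>
</grammar>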
Conclusion
The Virtual Host pattern in ROC is really just an architectural routing pattern based upon aliasing and normalizing identifiers. The embodiment I discussed above can be easily adapted. For example, you can readily move the throttle down inside each individual virtual host - so that you can have differentiated load shaping.
Of course, if you then instrument the load to each host (using a transparent pluggable-overlay for example), you can also easily make the throttles adaptive to provide automatic virtual host load prioritization.
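As a minimal sketch of the per-host throttle (the concurrency and queue values are illustrative and would need tuning for each host), each virtual host's mapper can simply be wrapped in its own throttle overlay...

<overlay>
<prototype>Throttle</prototype>
<config>
<concurrency>4</concurrency>
<queue>100</queue>
</config>
<space>
<!--the wiki.netkernel.org <mapper> from the listing above goes here, unchanged-->
</space>
</overlay>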
If all of this seems like too much configuration for you, then NKEE provides a complete virtual host endpoint that constructs exactly the same spatial architecture as I've discussed, but uses a simple declarative configuration. The documentation is provided here...
http://docs.netkernel.org/book/view/book:mod:architecture/doc:mod:architecture:virtualhost
A final observation: nothing in this discussion is specific to HTTP - all the requests are internal and within the ROC domain. It follows that this pattern is completely general - you can take the same architecture and apply it to email-processing virtual hosts, JMS message vhosts, XMPP, etc.
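For example, here's what the same demux might look like for a hypothetical mail-processing architecture (the mailto: identifier structure and the res:/mail/ mapping are purely illustrative)...

<mapper>
<config>
<endpoint>
<grammar>mailto:
<group name="user">
<regex>[^@]+</regex>
</group>@wiki.netkernel.org</grammar>
<request>
<identifier>res:/mail/[[arg:user]]</identifier>
</request>
</endpoint>
</config>
<space>
<!--import this virtual host's mail-handling modules here-->
</space>
</mapper>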
The reason this is so generalisable is because virtual hosting is an instance of a very old technique that has been in practice in electronics and opto-electronics for half a century. A virtual host system is really just a de-multiplexer (demux).
Have a great weekend. No news next week - travelling.
Comments
Please feel free to comment on the NetKernel Forum
Follow on Twitter:
@pjr1060 for day-to-day NK/ROC updates
@netkernel for announcements
@tab1060 for the hard-core stuff
To subscribe for news and alerts
Join the NetKernel Portal to get news, announcements and extra features.