NetKernel News Volume 2 Issue 3

November 5th 2010

What's new this week?

Catch up on last week's news here

Repository Updates

The following packages are available in both NKEE and NKSE repositories...

We recommend you update any production servers to get a JVM bug workaround in layer0 (see below)

  • coremeta 1.7.1
    • added documentation for PrivateFilterEndpoint, Prototype and Representation declarations
  • layer0 1.46.1
    • JVM bug workaround in PropertyConfiguration
    • grammar implementation updates to allow extracting details for full text index
    • optimisation of hds serialisation
    • fix to method="from-string" in decl request
  • layer1 1.21.1
    • added some argument and response representation metadata (this is going to be rolled out to more libraries soon)
  • module-standard 1.32.1
    • updates to branch-merge to expose structure of overlays into static structure diagrams
    • updates to endpoint base classes to allow declaration of argument and response representation types
    • various minor updates to improve naming and visualisation in new explorer
  • nkse-control-panel 1.16.1
    • added jquery 1.4.2 and jquery-ui 1.8.5
    • various style updates
  • nkse-dev-tools 1.24.1
    • removal of old space explorer
    • new space explorer II
  • nkse-doc-content 1.25.1
    • added more documentation around use of private filters in root spaces
  • nkse-docs 1.13.1
    • various updates to allow docs to be rendered inside space explorer
    • added javadoc into representation reference boilerplate
  • nkse-search 1.11.1
    • added support for custom analyzers when searching
  • system-core 0.18.1
  • web-core 1.3.1
    • improvements to keywords in documentation to aid searching

*New* Space Explorer II

Space Explorer II is now available on the repositories. Also included are a whole raft of updates to other tools to provide integration with the new framework. Please synchronize apposite and accept the updates to get the new features and enhancements.

The new space explorer has integrated views of endpoint documentation, but it needs the system index to be rebuilt. This happens every 3 days when apposite auto-synchronizes, but you can do it straight after updating with this link: http://localhost:1060/tools/search/fullIndex

Popcorn Time

It would take way too long to describe and I'd surely miss some details if I did - so instead here is a video tour of the new tool...

Recommend you view at 720p and fullscreen.

Space Explorer II is an extensible framework and today's release incorporates only the first set of tools. Expect a series of cool add-on tools over the next few weeks.

Please let us know your ideas and any suggestions for features, usability enhancements etc.

Oh yes, one more thing. Sorry, but it was not possible to make this release of the tool backwards compatible with IE6-8 due to the need for dynamic SVGs etc. If you have to use IE then please use IE9.

File Handle Limits: A tragic tale with a happy ending

We introduced a workaround for a potential JVM bug in layer0. You may know that the Kernel is dynamically reconfigurable. One way to modify its configuration is to modify the etc/ file which is monitored by the kernel through org.netkernel.layer0.util.PropertyConfiguration.

This utility lives outside the ROC abstraction, but for reasons of keeping things cleanly isolated from the metal, it is written to use a and does not care if the kernel properties are coming from a file or jar. By default an installed system uses a file:/ URI to etc/ in your install location.

In order to detect if the kernel needs reconfiguring we use URLConnection.getLastModified()

Long story, but it turns out that when the underlying internal implementation of the URLConnection is a File then when calling getLastModified the Java Virtual Machine (versions 5 and 6, at least) leaves an open file handle to the underlying operating system file. There's no obvious way to release it and the OS-level file handle stays bound to the URLConnection object until it gets Garbage Collected.

Ordinarily you might think this a little sloppy of the JVM impl, but who cares it all comes out in the GC wash. Well yes, but it leaves open a nasty corner case...

JVM meets Operating System

PropertyConfiguration is monitored about every five seconds to look for kernel property changes. So, due to the JVM URLConnection impl, every five seconds it was inadvertently opening an OS level file handle to the file and leaving it open until the system GC cleaned it up.
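To make the polling pattern concrete, here is a minimal sketch of a change detector of the kind described above. It is not the actual PropertyConfiguration source, which is not shown in this newsletter; the class name ConfigPoller and the method checkOnce are hypothetical, and a real system would call checkOnce from a timer roughly every five seconds.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical illustration only: not the real PropertyConfiguration.
public class ConfigPoller {
    private final URL source;
    private long lastSeen = -1;   // -1 means "no baseline recorded yet"

    ConfigPoller(URL source) {
        this.source = source;
    }

    /** Returns true when the modification time differs from the previous check. */
    boolean checkOnce() throws IOException {
        // A fresh connection is opened on every poll. On the JVM versions
        // discussed here, a file-backed connection may keep an OS file handle
        // open after this call until the connection object is garbage collected.
        URLConnection con = source.openConnection();
        long lastmod = con.getLastModified();
        boolean changed = lastSeen != -1 && lastmod != lastSeen;
        lastSeen = lastmod;
        return changed;
    }

    public static void main(String[] args) throws IOException {
        // A temp file stands in for the kernel properties file.
        File f = File.createTempFile("kernel", ".properties");
        f.deleteOnExit();
        ConfigPoller poller = new ConfigPoller(f.toURI().toURL());
        System.out.println("changed on first check: " + poller.checkOnce());
    }
}
```

Nothing in this code looks like it opens a file, which is why the handle build-up is so easy to miss.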

The nasty corner is that your operating system may have a per user hard limit on how many file handles are permitted. Most Linux distros have a default maximum of 1024 file handles per user.

To make matters really tragic, when a Java process is prevented from opening file handles by the Linux kernel the JVM gets really upset and can segfault!

Thanks for the Memory

Unfortunately (fortunately!) ROC/NK is very favourable to memory use:

  • Since endpoints are stateless we only ever create one instance of each endpoint - they're not expressly singletons, it's just that NK naturally normalizes how many instances are necessary.
  • Endpoints live in spaces, and again even though spaces are not explicitly singletons, the import relations only require a single instance. So NK minimizes the working set (computational runtime state).
  • NetKernel caches everything and has both representation and resolution caches (systemically memoizes computational state). It discovers and constantly tracks the local minima for the transient operating state.
  • State means memory footprint.
  • NK minimizes state so it minimizes memory use.

If your JVM has a large allocation of heap memory then, since NetKernel does not churn memory like a regular fat application server, it can be a long time between GCs - a surprisingly long time!

The Tragedy of Circumstance

Ordinarily, light-memory use is a great thing and is another of the reasons NK is fast and efficient compared with classical alternatives. But do you see the problem?

This set of circumstances can conspire to mean the JVM had an inbuilt time bomb bug. If the GC didn't happen before the growing set of file handles reached the limit it might be goodnight Vienna. Fortunately this is pretty unlikely for most operating conditions and so a narrow corner case, but nevertheless nasty.

Beginning and End

Last week we got a report of a production system where the JVM suddenly fell over. We really really hate that.

Fortunately the Linux logs showed that the JVM was reaching the max file handles and using the lsof sysadmin tool we found several handles to the file...

lsof -p XXXX (where XXXX is the PID of your NK process)

From there the whole story was revealed. The issue was a narrow but real JVM corner case. We just had to figure out a workaround to put the JVM in its box. With a quick poke about with the URLConnection's API here's the answer...

    long lastmod=con.getLastModified();
    con.getInputStream().close();   // releases the OS file handle

Even though we never wanted or touched the input stream, it seems URLConnection opens a file handle as a side effect of calling getLastModified and hangs on to it, no matter what the underlying impl is able to do (a File doesn't have to be read to look at the last modified time). The solution: if you have a URLConnection backed by a File, get the input stream and immediately close it, which releases the OS file handle.

After that no more dangling file handles.
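For anyone wanting to apply the same trick in their own code, the two-line workaround above can be wrapped up as a small helper. This is a sketch under assumptions rather than the actual layer0 code: the class name LastModifiedCheck and the method lastModified are made up for illustration, and a temp file stands in for the kernel properties file.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

// Hypothetical helper: not the actual layer0 implementation.
public class LastModifiedCheck {

    /**
     * Reads the last-modified time of the resource behind the URL, then opens
     * and immediately closes the input stream so that a file-backed
     * URLConnection lets go of its OS file handle now rather than at GC time.
     */
    static long lastModified(URL url) throws IOException {
        URLConnection con = url.openConnection();
        long lastmod = con.getLastModified();
        con.getInputStream().close();
        return lastmod;
    }

    public static void main(String[] args) throws IOException {
        File f = File.createTempFile("kernel", ".properties");
        f.deleteOnExit();
        System.out.println("last modified: " + lastModified(f.toURI().toURL()));
    }
}
```

Note that getInputStream() throws an IOException if the resource does not exist, so a caller polling a file that may be missing would need to handle that case.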

This workaround is in the layer0 update in the repositories - we recommend you update your production systems and also maybe think about the max file handles of your OS (see below).

Production Considerations

With the JVM corner case now trapped, it's unlikely you'll find that any default OS file limits are an issue with a typical working set of applications on NetKernel. But it's worth keeping in mind if your application is doing large numbers of file operations.

It's one of those "production things" that can bite when the underlying OS has some predetermined limits you never considered. It turns out that on Linux this comes under a sysadmin's firefighting roles and is common for Oracle or Apache httpd amongst others.

Here's a useful guide to increase Linux's per user file limits should you need it.
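If you would rather watch file handle headroom from inside the JVM than with lsof, something like the following sketch may help. It assumes an OpenJDK or Oracle style JDK on a Unix-like OS, where the platform MXBean implements the JDK-specific com.sun.management.UnixOperatingSystemMXBean interface; on other platforms the sketch simply reports that the counts are unavailable. The class name FileHandleHeadroom is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import com.sun.management.UnixOperatingSystemMXBean;

// Hypothetical monitoring sketch: relies on a JDK-specific MXBean interface.
public class FileHandleHeadroom {

    /** Returns {open, max} file descriptor counts, or null if not exposed. */
    static long[] openAndMax() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof UnixOperatingSystemMXBean) {
            UnixOperatingSystemMXBean unix = (UnixOperatingSystemMXBean) os;
            return new long[] {
                unix.getOpenFileDescriptorCount(),
                unix.getMaxFileDescriptorCount()
            };
        }
        return null;   // not a Unix-like JVM, or the interface is not available
    }

    public static void main(String[] args) {
        long[] counts = openAndMax();
        if (counts == null) {
            System.out.println("file descriptor counts not available on this JVM");
        } else {
            System.out.println("open=" + counts[0] + " max=" + counts[1]);
        }
    }
}
```

Logging these two numbers periodically would show a slow climb towards the limit long before the OS starts refusing new handles.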

NetKernel West 2011 - Denver Area, April 2011

It's been two years since the last NetKernel conference, NK4 is finished and the NKEE release is out too. There are a ton of things to talk about. We don't have a precise space-time coordinate yet, but this much is certain: Denver Area USA, April 2011...

NetKernel West 2011
Location: Denver Area, USA
Time: April 2011

Details on Venue, Dates, Logistics coming soon...

Have a great weekend.


Please feel free to comment on the NetKernel Forum

Follow on Twitter:

@pjr1060 for day-to-day NK/ROC updates
@netkernel for announcements
@tab1060 for the hard-core stuff

NetKernel, ROC, Resource Oriented Computing are registered trademarks of 1060 Research

© 2008-2011, 1060 Research Limited