NetKernel News Volume 5 Issue 6 - Asynchronous HTTP Client, Groovy

April 11th 2014

Catch up on last week's news here, or see full volume index.

Repository Updates

The following update is available in the NKEE and NKSE repositories:

  • twitter 1.6.1
    • Updated to accommodate changes in the twitter API

Please note updates are published in the 5.2.1 repository.

Heartbleed - Nothing to see here

Heartbleed has been a serious issue for the IT community this week. This is just a formal notice that NetKernel is completely unaffected since it has no dependencies on OpenSSL.

If you're using SSL on NetKernel, for example with Jetty or NKP, then rest assured that you are using the Java SSL stack which is a clean-room stack running in the JVM and not a native library vulnerable to memory overrun attacks.

To reassure yourself, here's a tool that you can use to check your server... http://filippo.io/Heartbleed/

Here is the result of a test of one of our SSL servers, showing that NK is not affected: Test Result

If you're responsible for other servers using technologies that are affected by Heartbleed then you have our sympathies - it won't have been a good week.

NEW: Asynchronous HTTP Client
(or, how to do 20,000 simultaneous requests on 8 threads in 2.5 seconds)

HTTP is slow. CPUs are fast. When you make an HTTP request you go outside to interact with networked resources, so from an engineering perspective you are going from a super-fast world to a dead slow world. So you would think it would be a good idea to be getting on with something else while you wait for the slow world to come back with a response.

NetKernel has always made this dead easy. As I discussed last time, you can very easily do map-reduce-style patterns by making your requests asynchronous. For example, here's how to fan out three HTTP requests...

urls = [ "http://foo.com/", "http://baa/com", "http://baz.com"]
handles=new Object[urls.size()]
for( i in 1..handles.size())
{   req=context.createRequest("active:httpGet")
    req.addArgument("url", urls[i-1])
    req.setRepresentationClass(java.lang.String.class)
    handles[i-1]=context.issueAsyncRequest(req)
}
//Do something useful here while we wait for the slow lane...

//OK, now we can get back the results
for ( i in 1..handles.size())
{   
    rep=handles[i-1].join();
    println("Received {$rep} for url {$urls[i-1]}")
}

This code is pretty easy to understand - we delegate the HTTP GETs to the kernel, do something on the dispatching thread while we wait, then go to see if our results have come back yet.

The problem is that active:httpGet uses the Apache client library, which by default uses blocking network-level sockets. So while NetKernel happily accepts the async requests and schedules them on threads from the kernel, each kernel thread goes into the HTTP client and becomes blocked on the HTTP network IO.

So the net effect is that while one particular part of the system (the dispatching thread) feels like it's doing pretty well, the net concurrency of the system has been reduced. If we have 8 kernel threads and we issue 7 HTTP requests asynchronously, we only have one thread (the dispatcher) doing anything while all the others are blocked. The net throughput of our system has been coupled to the responsiveness of the remote HTTP servers! This is really bad!!

It gets even worse though - because if we try to issue more than 8 requests then the kernel will schedule 8 of them and queue the remainder (no more threads to use). So the net result is that you are effectively defaulting to batching 8 HTTP requests at a time. Each batch is gated by the response time of its slowest request - so the "ensemble response" time for N requests will be roughly (N / batch size) x the mean batch response time.
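To make that concrete, here's a quick back-of-the-envelope calculation. All the numbers are invented for illustration...

N=64                    //total async HTTP requests issued
batchSize=8             //kernel threads, so 8 requests in flight per batch
meanBatchResponse=0.5   //seconds - each batch is gated by its slowest response
ensembleTime=(N/batchSize)*meanBatchResponse
println("Approximate ensemble response time: ${ensembleTime}s")   //~4.0s, for work each remote server did in ~0.5s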

Now, there's a naive answer to this problem. Throw more threads into the kernel.

It's very easy just to configure your system with hundreds (or thousands) of kernel threads.

And friends, that is what your regular free-running application server or servlet engine does - since it has a thread per request.

Now think about this. Every thread is consuming resources on the system. For highly concurrent HTTP fan-outs you are consuming huge amounts of system resources just sitting doing nothing (blocking on the HTTP response). Furthermore your CPU can't do more than 8 things at once (if you have 8 cores), so you are a hostage to the fortunes of the OS scheduler.

In short, from an engineering perspective, you are being incredibly wasteful and have completely lost control.

There must be a better way? Well, I'm pleased to say that as of now there is - and it requires precisely five characters of new code. In the example above all we need to do is change this...

req=context.createRequest("active:httpGet")

to this...

req=context.createRequest("active:httpAsyncGet")

This new async client does not block. It immediately returns the requesting thread back to the kernel. It doesn't block any threads on the network IO; only when a response comes back from the "slow" remote HTTP server does a kernel thread get assigned to deal with the response again. The full details are discussed below. But the upshot is that, at the engineering level, your system is now 100% decoupled from the remote HTTP endpoints you request and can continue doing good fast stuff locally for anyone else making requests. No blocking anywhere. Furthermore we only need the default set of 8 kernel threads - so the engineering footprint of the system is tiny (compared with thread-per-request systems).

There's one more consideration too. Now, if you have a set of related requests to make (a batch), the maximum response time is guaranteed to be limited only by the longest single response in the ensemble. The engineering envelope is quantifiable and deterministic - every response will be available within the window of the longest request.

Download

To try this out you need to use the new HTTP client module. Although we've tested this thoroughly, we're not going to release it as an update via Apposite just yet since we want to hear your feedback and get some mileage under our belts.

You can get it here:

Please note, you must issue your NK requests to active:httpAsyncXXXX (Get, Post, Put, etc.) using an async request

context.issueAsyncRequest(req)

since otherwise NetKernel won't know that it is free to carry on using the kernel threads that go into the http client and are immediately returned (see below).
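For example, here's a minimal sketch of the pattern (the URL is a placeholder)...

req=context.createRequest("active:httpAsyncGet")
req.addArgument("url", "http://example.com/")
req.setRepresentationClass(java.lang.String.class)
handle=context.issueAsyncRequest(req)   //the thread comes straight back
//...do other useful work here...
rep=handle.join()                       //collect the response when you need it

If you issued the same request synchronously, the dispatching thread would be held for the full network round trip - which defeats the purpose.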

Real Engineering: 20,000 concurrent outbound requests, 20,000 inbound requests, 8 threads, no blocking, 2.5 seconds total time

Here's one of our unit tests showing what this new capability delivers. This test concurrently issues 20,000 requests. You can see that it actually issues them to a test HTTP endpoint (ping) on the same host. My test machine has 8 cores, so I have 8 kernel threads. The test runs in 2.5 seconds, meaning I'm averaging 0.125ms (125 microseconds) per request...

handles=new Object[20000]
for( i in 1..handles.size())
{   //fan out 20,000 non-blocking GETs to the local ping endpoint
    req=context.createRequest("active:httpAsyncGet")
    req.addArgument("url", "http://localhost:1060/test/http-client/ping")
    req.setRepresentationClass(java.lang.String.class)
    handles[i-1]=context.issueAsyncRequest(req)
}
result=""
for ( i in 1..handles.size())
{   //gather all 20,000 responses - no kernel thread was blocked while we waited
    result+=handles[i-1].join()
}
context.createResponseFrom(result)

Here's the killer part. I have 8 threads. None of my threads are blocking. How do I know this? Because our test is making "re-entrant" requests back to the same instance (the ping endpoint). So not only am I sending 20,000 requests concurrently with no blocking - I'm receiving and processing 20,000 requests at the same time on the same system!

Anyone with another platform want to arm wrestle?

Incidentally, it's unlikely you'll be able to try this by default on your operating system, since OSes have default limits on how many sockets you can open simultaneously. On most Linux distros the limit is 1024 open files (a socket is a file on Unix). I had to change my security policy to be able to do this test. Here's a guide for how to do that...

http://posidev.com/blog/2009/06/04/set-ulimit-parameters-on-ubuntu/
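As a rough illustration, on Ubuntu-style systems this typically means adding entries like the following to /etc/security/limits.conf - the username and values here are placeholders, so choose limits appropriate for your deployment...

#/etc/security/limits.conf - raise the open-file (and hence socket) limit
youruser  soft  nofile  65536
youruser  hard  nofile  65536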

How it works

Maybe there is an official name for the pattern - but I call it "switching horses mid-stream". Here's how it's done (referring to the steps in this diagram)...

0. The initial request for active:httpAsyncGet is issued asynchronously. The requesting thread (denoted as red) is immediately returned to the requesting endpoint which can do whatever it wants while it waits for a response. Typically this "dispatcher endpoint" might asynchronously issue a set of related requests (as with the test cases shown above).

1. The kernel (the circle with the blue hour-glass shape - depicting the kernel's thread pool) assigns a thread, shown in blue, and issues the request for execution by the active:httpAsyncGet endpoint. Internally, the HTTP client endpoint uses an NIO client to construct and issue the HTTP request.

2. The HTTP endpoint returns the kernel thread to the kernel by calling context.setNoResponse(). The kernel can now re-use that thread. The blue horse is back in the blue stable, ready for another rider.

3. At some future point the HTTP request gets a response from the remote server. The NIO socket acceptor assigns a thread from the HTTP client pool (denoted green). The pending NetKernel request is reinstated and a response for it is sent to the kernel. The green HTTP thread goes back into the HTTP client pool. The green horse is back in the green stable ready for another response from a remote server.

4. The original requestor (step 0.) finally decides they want to know what happened to the HTTP request and calls join() on the asynchronous NetKernel request (like in our examples above). The response is instantly received.

It should be clear that the requestor endpoint can issue a set of requests concurrently and be 100% sure that all the responses will be received within the envelope of the single longest remote server response time.
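As a rough illustration (the URLs are placeholders), a sketch like this should report the whole batch completing in roughly the time of the single slowest response, not the sum of all the response times...

urls=["http://foo.com/", "http://baa.com/", "http://baz.com/"]
start=System.currentTimeMillis()
handles=urls.collect { url ->
    req=context.createRequest("active:httpAsyncGet")
    req.addArgument("url", url)
    req.setRepresentationClass(java.lang.String.class)
    context.issueAsyncRequest(req)   //the handle is the closure's return value
}
handles.each { it.join() }   //every response arrives within the longest request's window
println("Batch completed in ${System.currentTimeMillis()-start}ms")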

For the physicists out there - I think of this as being like the group velocity of a wave.

Groovy 2.3 Beta 2

The Groovy team is very close to releasing version 2.3 which, according to Mr Sletten, has some significant performance enhancements. It's been a while since we touched lang-groovy, so it seemed like a good idea to try out the latest and greatest version.

All our tests pass and all of the system tools (that use Groovy) work fine. I've been using it for everything for a week with no issues. So it looks like it will be a side-effect-free update when the final release comes out. In the meantime, if you want to give it a go here it is - we'd love to hear your feedback...


Have a great weekend.

Comments

Please feel free to comment on the NetKernel Forum

Follow on Twitter:

@pjr1060 for day-to-day NK/ROC updates
@netkernel for announcements
@tab1060 for the hard-core stuff

To subscribe for news and alerts

Join the NetKernel Portal to get news, announcements and extra features.
