Weeks ago i performed some benchmarks to test Monkey project againts other webservers, i have included the test for one closed source web server which claims to be (or not to be?) the fastest on earth: GWan. My benchmark shows that Monkey is faster than GWan under certain conditions: number of request per second for a file >= 200KB.
One person around our community started to perform his own bencharks and shared his findings with Pierre the author of GWan. Looks like Pierre’s feelings were hurt and his ironic words expressed a strongly disagree because there is a project that performs better than G-Wan. He started to put in doubt my reputation and abused of the usage of my employer’s name:
What makes an expert “cheat”? http://gwan.ch/blog/20120728.html
Oracle and I
Pierre, let’s make the things clear here. Oracle is my employer and it have not any relationship with the Monkey open source project. It looks like a lack of professionalism to abuse of Oracle name to increase visibility and try to pseudo-expose your startup in a different manner, why to mention the ‘Oracle‘ word 12 times without need it ?, there are other ways to positionate your blog with the search engines.
The G-Wan author decided the best way to bencharmk his software, so he went straigforward to benchmark under two specific static conditions:
- KeepAlive : all test are made under keep alive
- Small file sizes: the tests are performed with a file of 100 bytes
If the benchmark method is static as the used here, its useful when measuring current version againts development version, and see how it behave each other, and this is good. But you cannot use the same method to measure other web servers, let’s perform a brief analysis:
The KeepAlive HTTP feature (default in HTTP/1.1), aims to keep the communication channel open between the client and the server to perform multiple request in a FIFO way, this approach reduces latency due to the avoidance of the TCP handshake and less network traffic. In a HTTP KeepAlive session for example, you can perform 1000000 request over the same persistent channel if the server allows that.
As stated before, using HTTP KeepAlive the transfer of user-level data is faster. But there is an important point that we cannot omit to mention when we are under a *benchmark* context: if the web server use threads to balance the work due to SMP or other need, using KeepAlive sessions we will *hide* the overhead of the connection balancing between workers and for hence the internal scheduling and time to start processing each request. So use KeepAlive for benchmarking is useful to measure just a part of the web
server core and *not* a fundament to determinate which is the fastest solution in the world.
From a real-life perspective, an HTTP Client (browser) using KeepAlive, rarely will perform more than 25 request over the same channel. So the tests under KeepAlive does not reflect how the Internet/HTTP world behave.
Small file sizes
G-Wan perform caching over requested files in memory when they are pretty small, so when testing a 100 bytes file, this is not hitting I/O and is not requiring to perform extra expensive system call. This is the common approach over almost all web servers, a few KB of extra memory is not bad to avoid I/O.
Caching is good, but testing againts the same small cached file, it’s just trying to determinate how fast is accessing memory buffer and sending out the same data.
Now having describe how the static benchmark is done, sounds like a good plan to determinate that G-Wan is the fastest solution available testing *only* under KeepAlive mode plus the same small cached file for all requests ?, not really.
Every person who wants to benchmark G-WAN and break one of those rules (KeepAlive/small file), will realize than Monkey and NginX performs faster than G-Wan.
G-Wan author claims that testing in non-keepalive mode is equal to “test the TCP IP stack rather than the user-mode code of the server”, which is totally wrong, a web server depends on TCP/IP stack and meassuring the performance of a web server is more than test a simple access to a memory buffer in user space.
The G-Wan benchmarks are done using a wrapper utility called ‘ab wrapper’, described by Pierre as the “most capable tool” (http://gwan.ch/source/ab.c), the sad part is that it cannot perform well with files of a few KB, it get stuck when used with Weighttp backend or when is used without keepalive mode. It’s a good idea of tool as it takes snapshots of user-mode/kernel-mode stats once a concurrent round of request is finished.
As the code of ab.c it’s not legible and is spawning third party utilities to perform it’s job, i wrote a similar tool but based in proc filesystem, the tool name is ‘watch resources‘ (aka wr) and i have published the code on GitHUB:
So with this new tool i have performed a new set of tests of benchmark using concurrency as ab.c does and having more accurate memory usage for the results.
Benchmark: Monkey v/s G-Wan
This test have been made in the following way:
- Wrapper by Watch Resources tool
- Weighttp backend to stress the server
- The URL tested hits a file of 200KB
- Concurrency tests from 100 to 1000 (100 concurrents to 1000), increasing the level with 10 concurrents.
- KeepAlive enable
- Backend stress tool with 10 workers
- Each round did 500.000 requests
Requests per second
- Monkey did an average of 29231 requests per second
- G-Wan did an average of 17222 request per second
Monkey did 42% more requests than G-Wan under the same conditions
Its expected to see high level of memory consumption as the concurrency hits a major load in the server so the goal is to optimize the resources and reduce memory allocations when they are not necessary:
- Monkey consumed an average of 3.4MB along the whole test
- G-Wan consumed an average of 7.09MB along the whole test
G-Wan consumed 52% more memory than Monkey for the same test
User and system time
In the Linux Kernel (OS) design which comes from Unix family, exist two executions context or virtual address spaces: User space and Kernel Space (or System space). The User space belongs to the container of user applications and related resources plus an interface to communicate to the Kernel, as well the Kernel Space and its roles covers I/O, Memory allocations, Scheduling of user space tasks and others.
All job that occurred in a user process/task and which do not require a direct Kernel intervention is called user space, everything else is Kernel Space. We call User-time to the CPU cycles consumed by the user user task, and we call Kernel/System time to the CPU cycles consumed by the user space task through a system call to the Kernel.
So this metric shows how much CPU time is spent in user and system space, and depending of the point of view it could be good or not so good:
- Monkey user time: 5267 milliseconds
- G-Wan user time: 3830 milliseconds
- Monkey Kernel/System time 36159 milliseconds
- G-Wan Kernel/System time 50999 milliseconds
G-Wan have focused too much into reduce user-time being a non-friendly program running under the Linux Kernel. Monkey is a project devoted to run over a Linux Kernel and that’s the reason about why it runs pretty optimized. Having a basic knowledge of Linux system calls is not hard to achieve a great performance.
In the blog post mentioned earlier in the graphics comments it states:
As opposed to Monkey, with G-Wan, the Kernel is using CPU resources faster than G-Wan user mode CPU usage. As a result, the Linux Kernel is the bottleneck far before G-Wan
The previous comment, denotes a lack of understanding about how the Linux Kernel works internally, there is no knowledge of user/system spaces or scheduling. Referring to the G-Wan project history it comes from Windows so we can excuse the lack of knowledge on Linux. Rarely the Linux Kernel at this operation level is a bottleneck.
For short, in order to avoid misunderstandings i already have learned my lesson and i will provide more detailed benchmarks from now. In the meanwhile G-Wan have a couple of things to fix.. starting from it design ?.
Other points that i forgot to mention:
- The test in Monkey took 25 minutes and 58 seconds. G-Wan took 44 minutes and 23 seconds
- If you want to validate the information on this post, you can download the source reports and graphics from here.
- I am emaling Pierre about this blog post