Note: This is part two, following my previous post on proxies.
When I first started at OpenDNS, my first task was to figure out how Nginx works and write a custom C module for it to handle some business logic. Nginx was going to reverse proxy to Apache Traffic Server (ATS), which would do the actual forward proxying. Here is a simplified diagram:
Nginx turned out to be easy to understand and work with. This was in contrast to ATS, which is bigger, more complex, and just plain not fun. As a result, “Why don’t we just use Nginx for the whole thing?” became a popular question, especially after it was decided that the proxy would not be doing any caching.
Forward Proxy
Nginx is a reverse proxy, designed to be used with explicitly defined upstreams:
http {
    upstream myapp1 {
        server srv1.example.com;
        server srv2.example.com;
        server srv3.example.com;
    }

    server {
        listen 80;

        location / {
            proxy_pass http://myapp1;
        }
    }
}
However, it’s also possible to configure it to use an upstream based on some variable, like the Host header:
http {
    server {
        listen 80;

        location / {
            proxy_pass http://$http_host$request_uri;
        }
    }
}
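One practical detail: when the destination comes from a variable and does not match a defined upstream, Nginx has to resolve the hostname at request time, which requires a resolver directive. A minimal sketch (the resolver address is just a placeholder):

http {
    resolver 8.8.8.8;    # placeholder; any reachable DNS server

    server {
        listen 80;

        location / {
            proxy_pass http://$http_host$request_uri;
        }
    }
}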
This actually works just fine. The main caveat is that the Host header can match a pre-defined upstream{} block in the config, if any exist:
http {
    ...

    upstream foo {
        server bar;
    }

    ...

    server {
        listen 80;

        location / {
            proxy_pass http://$http_host$request_uri;
        }
    }
}
Then a request like this will match foo and be proxied to bar:
GET / HTTP/1.1
Accept: */*
Host: foo
The approach can be extended a bit by defining new variables within a custom module and using them in place of the built-in $http_host and $request_uri, for better destination control, error handling, etc.
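As a rough sketch of what registering such a variable looks like (the variable name $proxy_dest and its fallback logic here are purely illustrative, and the usual module boilerplate is omitted):

#include <ngx_config.h>
#include <ngx_core.h>
#include <ngx_http.h>

/* Illustrative get handler: derive the destination from the Host header,
 * falling back to a placeholder address when it is missing. */
static ngx_int_t
ngx_http_proxy_dest_variable(ngx_http_request_t *r,
    ngx_http_variable_value_t *v, uintptr_t data)
{
    ngx_str_t  dest;

    if (r->headers_in.host != NULL) {
        dest = r->headers_in.host->value;
    } else {
        ngx_str_set(&dest, "127.0.0.1");
    }

    v->len = dest.len;
    v->data = dest.data;
    v->valid = 1;
    v->no_cacheable = 1;
    v->not_found = 0;

    return NGX_OK;
}

/* Wired into the module's preconfiguration hook (boilerplate omitted). */
static ngx_int_t
ngx_http_my_module_add_variables(ngx_conf_t *cf)
{
    ngx_str_t             name = ngx_string("proxy_dest");
    ngx_http_variable_t  *var;

    var = ngx_http_add_variable(cf, &name, NGX_HTTP_VAR_NOCACHEABLE);
    if (var == NULL) {
        return NGX_ERROR;
    }

    var->get_handler = ngx_http_proxy_dest_variable;

    return NGX_OK;
}

The config would then use proxy_pass http://$proxy_dest$request_uri; instead of the built-in variables.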
That all works wonderfully. Note that this is an HTTP (port 80) proxy, and we are not considering the HTTPS case here; for one thing, Nginx does not recognize the CONNECT method used in explicit HTTPS proxying, so that would never work. As I mentioned in my previous blog post, our Intelligent Proxy takes a more unconventional approach in general.
A big question is performance. Our initial load tests with ATS resulted in less-than-ideal numbers. Does this Nginx ‘hack’ have any effect on how well it performs?
Load Test
Skipping over the finer details, our setup uses wrk as the load generator and a custom C program as the upstream. The custom upstream is very basic: all it does is accept connections and reply with a static binary blob to any request that looks like HTTP. Connections are never closed explicitly, to remove any potential skew in the results from unnecessary extra TCP sessions.
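To give an idea, a minimal sketch of an upstream along these lines (illustrative only, not the actual test program; the port and the naive request check are arbitrary):

#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Canned keep-alive response sent for anything that looks like HTTP. */
static const char response[] =
    "HTTP/1.1 200 OK\r\n"
    "Content-Length: 4\r\n"
    "Connection: keep-alive\r\n"
    "\r\n"
    "blob";

static void *serve(void *arg)
{
    int      fd = (int)(intptr_t) arg;
    char     buf[4096];
    ssize_t  n;

    /* Answer every request on the connection; never close it ourselves. */
    while ((n = read(fd, buf, sizeof(buf))) > 0) {
        if (n >= 4 && memcmp(buf, "GET ", 4) == 0) {   /* naive "looks like HTTP" check */
            if (write(fd, response, sizeof(response) - 1) < 0) {
                break;
            }
        }
    }

    close(fd);
    return NULL;
}

int main(void)
{
    int                 listener = socket(AF_INET, SOCK_STREAM, 0);
    int                 one = 1;
    struct sockaddr_in  addr;

    setsockopt(listener, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(8080);               /* arbitrary port */

    if (bind(listener, (struct sockaddr *) &addr, sizeof(addr)) < 0
        || listen(listener, 1024) < 0)
    {
        perror("bind/listen");
        return 1;
    }

    for ( ;; ) {
        int        fd = accept(listener, NULL, NULL);
        pthread_t  tid;

        if (fd < 0) {
            continue;
        }

        /* One thread per connection keeps the sketch simple; the real
         * program also counts connects, closes, bytes, and packets. */
        pthread_create(&tid, NULL, serve, (void *)(intptr_t) fd);
        pthread_detach(tid);
    }
}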
We first establish a benchmark by loading the upstream server directly:
Running 30s test
  10 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     3.27ms  680.48us   5.04ms   71.95%
    Req/Sec     3.21k   350.69     4.33k    69.67%
  911723 requests in 30.00s, 3.19GB read
  100 total connects (of which 0 were reconnects)
Requests/sec:  30393.62
Transfer/sec:  108.78MB
Everything looks good: wrk created 100 connections as expected and managed to squeeze out 30k requests per second.
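For reference, a run with these parameters corresponds to a wrk invocation roughly like this (the upstream address is a placeholder; the connect/reconnect counts above appear to come from local wrk instrumentation rather than stock wrk):

wrk -t10 -c100 -d30s http://<upstream-address>/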
Now let’s repeat that while going through our Nginx forward proxy (2 workers):
Running 30s test
  10 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     6.42ms   14.37ms  211.84ms   99.50%
    Req/Sec     1.91k   245.53     2.63k    83.75%
  552173 requests in 30.00s, 1.95GB read
  5570 total connects (of which 5470 were reconnects)
Requests/sec:  18406.39
Transfer/sec:  66.53MB
This almost halves the possible throughput. Something is not right.
Doing a few manual requests, we see that going through Nginx doesn’t really add any significant latency. The Nginx workers got close to 100% CPU usage during the test, but bumping the worker count doesn’t help much.
What about the upstream? What does it see in the two cases?
After a quick update to print some stats, everything looks good in the direct case — the numbers reported by wrk and the upstream server match up as expected. But we find something startling in the proxy case when looking at the upstream server stats:
status: 552263 connects, 552263 closes, 30926728 bytes, 552263 packets
Looks like Nginx created a new connection for every single request going upstream, even though wrk only made 100 connections downstream…
After diving into the Nginx core and reading the documentation more thoroughly, things start to make sense. Nginx is a load balancer, where “load” means requests, not connections. A connection can issue an arbitrary number of requests, and it’s important to distribute these equally among the backends. As it stands, Nginx closes upstream connections after each request. The upstream keepalive module tries to remedy this somewhat by keeping a certain minimum number of persistent connections open at all times. Nginx Plus offers extra features like Session Persistence (an equivalent open source module exists as well), enabling requests to be routed to the same upstreams more consistently.
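For reference, the stock keepalive module is enabled per upstream block, along these lines (the names and numbers here are illustrative):

upstream backend {
    server 127.0.0.1:8080;
    keepalive 16;                      # idle connections kept per worker
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_http_version 1.1;        # keepalive to upstreams needs HTTP/1.1
        proxy_set_header Connection "";
    }
}

Note that this maintains a shared pool of idle upstream connections rather than tying an upstream connection to a particular client connection.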
What we really want is a 1-to-1 persistent connection mapping between clients and their respective upstreams. In our case, the upstreams are completely arbitrary; we want to avoid creating unnecessary connections and, more importantly, avoid “sharing” upstream connections in any way. Our session is the whole client connection itself.
The Patch
The solution is fairly straightforward, and we’ve made it available on GitHub*.
Re-running the load test with this change, we get much better results, underscoring the importance of keeping TCP connections persistent and avoiding those costly opens and closes:
Running 30s test
  10 threads and 100 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    10.82ms   48.67ms  332.65ms   97.72%
    Req/Sec     3.00k   505.22     4.46k    95.81%
  854946 requests in 30.00s, 3.02GB read
  8600 total connects (of which 8500 were reconnects)
Requests/sec:  28498.99
Transfer/sec:  103.01MB
The numbers on the upstream match up with those from wrk:
status: 8600 connects, 8600 closes, 47882016 bytes, 855036 packets
There is still a problem, however: there are 8,600 connections instead of just 100, and Nginx decided to close a lot of connections, both downstream and upstream. Debugging to see why, we end up tracing it back to “lingering_close_handler”:
...
nginx: _ngx_http_close_request(r=0000000000C260D0) from ngx_http_lingering_close_handler, L: 3218
nginx: ngx_http_close_connection(00007FD41B057A48) from _ngx_http_close_request, L: 3358
...
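For those curious, lingering close is governed by a handful of directives (defaults shown below):

http {
    lingering_close   on;     # default: close lazily when the client may still send data
    lingering_time    30s;    # default: total time to keep reading and discarding client data
    lingering_timeout 5s;     # default: how long to wait for more data before closing
}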
Since the overall performance even with this behavior is satisfactory, that’s where I left it for the time being.
In Closing
We’ve been running Nginx as a forward HTTP proxy in production for some time now, with virtually no issues. We hope to continue expanding Nginx’s capabilities and pushing new boundaries going forward. Keep an eye out for future blog posts and code snippets/patches.
*This is a rewritten patch (the original was a bit hacky); the new code went out to production only recently. If any issues crop up, I’ll update the public patch with any adjustments.