I am experimenting with simple HTTP servers and performance testing, using the http module. More specifically, I am using ab (Apache Bench) to measure the number of requests per second with various threaded (ThreadingHTTPServer) and non-threaded implementations.
The benchmarking tool sends GET / requests. My web server runs forever and handles each request by invoking the do_GET(self) method.
When I set up the do_GET(self) method to actually do something productive, like incrementing a counter in a loop before returning an HTTP response, I clearly see the benefits of threading.
HTTPServer handles nearly 10 requests per second (a single request, dominated by the for loop, takes approximately 0.1 s), which confirms it is processing one request at a time. With ThreadingHTTPServer, performance is much better, reaching as many as 300 requests per second.
However, when I use sleep(0.1) before returning an HTTP response, the number of requests per second seems to be essentially the same regardless of threading, which makes me suspect that sleep blocks more than a single thread.
What could be the reason behind this behavior?
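One way to check this suspicion in isolation, outside the HTTP server, is to time several threads that each call time.sleep. The following is only a minimal sketch: the function name timed_parallel_sleep and its default values are hypothetical and are not part of the server code in this question.

```python
import threading
import time


def timed_parallel_sleep(n_threads=10, delay=0.1):
    """Run n_threads threads that each sleep for `delay` seconds.

    Returns the wall-clock time until all threads have finished.
    (Hypothetical helper, used only to test the sleep behavior.)
    """
    threads = [
        threading.Thread(target=time.sleep, args=(delay,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = timed_parallel_sleep()
    print(f"10 threads sleeping 0.1 s each finished in {elapsed:.3f} s")
```

An elapsed time close to 0.1 s would indicate that the sleeps overlap, so sleep is not blocking the other threads. A time close to 1 s would indicate that they run one after another.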
Some additional details
I'm running these tests on a Linux virtual machine with 2 vCPUs, 1 thread per core and 1 core per socket. Python version: 3.7.2
The ab tool is running on the same virtual machine hosting the server, so there is no additional network latency.
Code snippets:
With sleep
    def do_GET(self):
        time.sleep(0.1)
        self.respond({'status':200})

Without sleep
    def do_GET(self):
        dummy = 0
        for i in range(0,10000):
            dummy = dummy + 2 * i
        self.respond({'status':200})

I have measured that, on my setup, a single request without sleep (dominated by the for loop) takes on average 0.12s without threading and 0.03s with threading.
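How these per-request figures were obtained is not shown here. As a rough illustration only, the loop from do_GET could be timed on its own as sketched below; busy_loop and average_seconds are hypothetical helper names, and this times only the loop, not the full request handling.

```python
import time


def busy_loop(n=10000):
    # Same arithmetic as the loop in do_GET, without the HTTP response.
    dummy = 0
    for i in range(0, n):
        dummy = dummy + 2 * i
    return dummy


def average_seconds(func, repeats=20):
    """Average wall-clock time of one func() call over `repeats` calls."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    print(f"average loop time: {average_seconds(busy_loop):.6f} s")
```

Comparing this figure with the per-request time reported by ab would show how much of each request is spent in the loop and how much in the server itself.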
Full code with sleep - single threaded web server
    import time
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class TestHandler(BaseHTTPRequestHandler):
        def respond(self, message):
            http_code = message['status']
            self.send_response(http_code)
            self.wfile.write(bytes(str(message),'UTF-8'))

        def do_GET(self):
            time.sleep(0.1)
            self.respond({'status': 200})

    if __name__ == '__main__':
        try:
            print(time.asctime(), 'Starting')
            HTTPServer(('localhost',8000),TestHandler).serve_forever()
        except:
            print(time.asctime(),'An error was detected')

Full code with sleep - multithreaded web server
    import time
    from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
    import threading
    import asyncio

    class TestHandler(BaseHTTPRequestHandler):
        async def respond(self, message):
            await asyncio.sleep(0.1)
            http_code = message['status']
            self.send_response(http_code)
            self.wfile.write(bytes(str(message),'UTF-8'))

        def do_GET(self):
            asyncio.run(self.respond({'status': 200}))

    if __name__ == '__main__':
        try:
            print(time.asctime(), 'Starting')
            ThreadingHTTPServer(('localhost',8000),TestHandler).serve_forever()
        except:
            print(time.asctime(),'An error was detected')

Test: ab -n 1000 -c 500 http://localhost:8000/
Test results (sleep):
Single threaded: 8.72 requests per second
Multi threaded: 8.93 requests per second