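This program creates 500 client processes and one Tornado HTTP server. Each client sends one POST to the server. Sometimes I get a requests.exceptions.ConnectionError exception. I added retry code to compensate. Staggering the times at which the clients start also helps.
I would expect the requests and tornado modules to handle this without extra coding. I am not sure which of the two is responsible.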
#! /usr/bin/env python3

import time
import multiprocessing
import random
import requests
import tornado.ioloop
import tornado.web

class Client(multiprocessing.Process):
    def __init__(self, client_id):
        multiprocessing.Process.__init__(self)
        self.client_id = client_id
        self.host = 'http://localhost:8888/log'
        self.sample_record = 'x'*300
        self.start()

    def run(self):
        # Wait until top of current 5 sec interval so all clients start together.
        time.sleep(5 - time.time()%5)

        #time.sleep(random.random())     # This seems to eliminate the problem.

        # requests.Session will do keep-alive by default.
        session = requests.Session()

        payload = { 'record': 'x'*300 }
        attempt_count = 0
        while True:
            try:
                response = session.post(self.host, data=payload, timeout=10)
                break
            except requests.exceptions.ConnectionError:
                print('Retry: id=', self.client_id)
                attempt_count += 1
                assert attempt_count < 10
                continue

        assert response.status_code == 200 and response.text == 'Success'

post_received_count = 0

class LogHandler(tornado.web.RequestHandler):
    def post(self):
        global post_received_count
        post_received_count += 1
        if post_received_count%100 == 0:
            print('post count=', post_received_count)
        self.write('Success')

def make_app():
    return tornado.web.Application(
        [
            (r"/log", LogHandler),  # http://localhost:8888/log
        ],
        cookie_secret = "__TODO:_GENERATE_YOUR_OWN_RANDOM_VALUE_HERE__",
    )

if __name__ == "__main__":
    time.sleep(6 - time.time()%5)   # Wait until one sec into next 5 sec interval.
    print('Start clients...')
    client_list = [Client(i) for i in range(500)]
    print('Done.')
    app = make_app()
    app.listen(8888)
    try:
        tornado.ioloop.IOLoop.current().start()
    except KeyboardInterrupt:
        pass

The output from two runs on Linux appears below. The first run had no problems. The program prints the number of POSTs handled and goes quiet after 500. I then used Ctrl-\ to kill the server. The second run hit multiple ConnectionError exceptions that had to be retried.
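If I uncomment the time.sleep(random.random()) statement, it runs without exceptions. This is a random sleep of between 0 and 1 second. Apparently the problem occurs only when too many clients submit a request at the same time.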
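The retry loop and the random sleep described above can be folded into one helper. The following is only a sketch under stated assumptions: the name call_with_retry and its parameters are invented for illustration and are not part of requests or tornado, and the backoff constants are arbitrary. It retries a callable on a chosen exception type and sleeps a random, exponentially growing interval between attempts, so that clients that failed together do not retry together.

```python
import random
import time


def call_with_retry(func, retry_on, max_attempts=10, base_delay=0.05,
                    max_delay=1.0, sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is exhausted.

    Hypothetical helper, not part of requests or tornado.
    retry_on is the exception class (or tuple of classes) that triggers a
    retry; any other exception propagates immediately.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the last error.
            # Exponential backoff capped at max_delay, with full jitter:
            # the actual sleep is drawn uniformly from [0, ceiling].
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, ceiling))


if __name__ == "__main__":
    # Stand-in for session.post(...): fails twice, then succeeds.
    attempts = []

    def flaky_post():
        attempts.append(1)
        if len(attempts) < 3:
            raise ConnectionError("Connection reset by peer")
        return "Success"

    print(call_with_retry(flaky_post, ConnectionError), "after",
          len(attempts), "attempts")
```

In the program above, the body of the while loop could then become a single call such as call_with_retry(lambda: session.post(self.host, data=payload, timeout=10), requests.exceptions.ConnectionError). Note that a POST whose connection was reset may or may not have reached the handler, so blind retries are only appropriate when a duplicate record is acceptable.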
$ ulimit -n 10000
$ test_log_server.py
Start clients...
Done.
post count= 100
post count= 200
post count= 300
post count= 400
post count= 500
^\Quit (core dumped)
$ test_log_server.py
Start clients...
Done.
post count= 100
Retry: id= 223
Retry: id= 340
Retry: id= 116
Retry: id= 164
Retry: id= 258
Retry: id= 150
Retry: id= 290
Retry: id= 16
Retry: id= 40
Retry: id= 5
post count= 200
post count= 300
post count= 400
post count= 500
^\Quit (core dumped)
$

If I remove the try block, I get the following exception. This one is from one of the 500 clients. Other clients reported the same exception.
Process Client-102:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.6/http/client.py", line 1373, in getresponse
    response.begin()
  File "/usr/lib/python3.6/http/client.py", line 311, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.6/http/client.py", line 272, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
ConnectionResetError: [Errno 104] Connection reset by peer

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/lib/python3/dist-packages/urllib3/util/retry.py", line 367, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/lib/python3/dist-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "<string>", line 3, in raise_from
  File "/usr/lib/python3/dist-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/lib/python3.6/http/client.py", line 1373, in getresponse
    response.begin()
  File "/usr/lib/python3.6/http/client.py", line 311, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python3.6/http/client.py", line 272, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
urllib3.exceptions.ProtocolError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "./test_log_server.py", line 31, in run
    response = session.post(self.host, data=payload, timeout=10)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 567, in post
    return self.request('POST', url, data=data, json=json, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 520, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/lib/python3/dist-packages/requests/sessions.py", line 630, in send
    r = adapter.send(request, **kwargs)
  File "/usr/lib/python3/dist-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))

Answered on 2021-09-12 19:10:30
"Apparently the problem occurs only when too many clients submit a request at the same time."
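500 client processes is a lot. One possibility is the socket listen backlog, which is set to 128 when you use the app.listen interface. In older kernel versions 128 was the maximum allowed here, but the limit has been raised more recently (depending on kernel version and other configuration). Try replacing app.listen(8888) with: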
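# Requires `import tornado.httpserver` alongside the other imports.
server = tornado.httpserver.HTTPServer(app)
server.bind(8888, backlog=4096)
server.start()

I'd also suggest that if you really need to support this many connections starting at once, you may want to run multiple server processes rather than just increasing the socket backlog of a single process.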
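Whether backlog=4096 takes full effect depends on the kernel-side limit mentioned above. As a rough check, the sketch below reads /proc/sys/net/core/somaxconn and reports the backlog the kernel would honor. It assumes Linux, where the listen(2) man page says a larger backlog is silently truncated to that value; the helper names kernel_backlog_cap and effective_backlog are invented for illustration.

```python
# Linux-only location of the kernel's cap on the listen backlog.
SOMAXCONN_PATH = "/proc/sys/net/core/somaxconn"


def kernel_backlog_cap(path=SOMAXCONN_PATH):
    """Return the kernel's listen-backlog cap, or None if it cannot be read.

    Hypothetical helper; returns None on non-Linux systems or on any
    read/parse failure.
    """
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def effective_backlog(requested, cap):
    """Backlog the kernel is expected to honor for listen(requested).

    Assumes the Linux behavior of silently truncating to the cap; with an
    unknown cap (None) the requested value is returned unchanged.
    """
    return requested if cap is None else min(requested, cap)


if __name__ == "__main__":
    cap = kernel_backlog_cap()
    print("somaxconn:", cap)
    print("effective backlog for 4096:", effective_backlog(4096, cap))
```

If the reported cap is still 128, the larger backlog passed to server.bind would not help until the net.core.somaxconn sysctl is raised.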
https://stackoverflow.com/questions/68899331