Context
A microservice built on the FastAPI framework, with a SocketIO server wrapped as an ASGI application and run on the Uvicorn web server.
The service has several socket event handlers for a single feature, plus some supporting RESTful APIs (which are rarely used).
- There are SocketIO event handlers that invoke async tasks containing await calls, so that a single worker can orchestrate all events sent to the service.
- There is a keep-alive async function that emits a keep-alive message every 15 seconds to each connected sid, so that idle connections are not disconnected automatically. This is needed because some events invoke long-running tasks (up to 15 minutes) and would otherwise not be able to emit messages once processing completes; a rough sketch of this loop is shown below.
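The loop looks roughly like this (the event name, payload and sid tracking are illustrative, not the exact code):

import asyncio
import socketio

sio = socketio.AsyncServer(async_mode="asgi")
connected_sids: set[str] = set()

@sio.event
async def connect(sid, environ):
    connected_sids.add(sid)

@sio.event
async def disconnect(sid):
    connected_sids.discard(sid)

async def keep_alive():
    # Emit a heartbeat to every connected sid every 15 seconds so idle
    # connections are not dropped while a long-running task is still in progress.
    while True:
        for sid in list(connected_sids):
            await sio.emit("keep_alive", {"status": "alive"}, to=sid)
        await asyncio.sleep(15)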
We run it on Kubernetes with memory (request 8 GB, limit 16 GB) and CPU (request 800m, limit 4). The event loop used is asyncio.
Uvicorn is configured with port=8000, workers=1, limit_max_requests=1000, limit_concurrency=100, timeout_graceful_shutdown=30:
uvicorn.run(socket_app, host="0.0.0.0", port=8000, workers=1, limit_max_requests=1000, limit_concurrency=100, timeout_graceful_shutdown=30)
Kubernetes has a readiness probe that curls the health check API, with an initial delay of 60s, a timeout of 30s, a period of 30s, a success threshold of 1 and a failure threshold of 4:
Readiness: exec [curl localhost:8000/healthz] delay=60s timeout=30s period=30s #success=1 #failure=4
Problem
As load on the service increases, the readiness probe fails and Kubernetes marks the pods as unhealthy, so clients connected via SocketIO can no longer send or receive messages for the feature's events. Further investigation showed that TCP connections were stuck in CLOSE_WAIT, increasing linearly until the pod could no longer respond to existing connections. Most of these CLOSE_WAIT connections are between local address 127.0.0.1:8000 and foreign address 127.0.0.1:xxxxx, with no PID attached. There are also some connections in FIN_WAIT2 between 127.0.0.1:xxxxx and 127.0.0.1:8000, and some in CLOSE_WAIT between PodIP:8000 and PodIP:xxxxx of a different service; these could be API calls between services. My understanding is that the application (probably Uvicorn) has to close these connections, since the client side has already sent its FIN packets.
What configurations or implementation changes can I add, for both Uvicorn and SocketIO, to ensure these connections are handled properly? Or is there another solution I should look into to make the service function properly?
Edit1: Adding the libraries used.
fastapi==0.112.0
uvicorn==0.30.5
python-socketio==5.11.3
asyncio==3.4.3
1 Answer
Below is an approach that has helped us resolve similar issues. The key is to ensure that both Uvicorn and SocketIO properly detect and clean up idle or disconnected clients, especially when long-running tasks might block the event loop.
Modify Uvicorn’s connection settings
If idle HTTP connections aren’t needed (e.g. for your health check or non-streaming endpoints), you can shorten the keep-alive period so that the server closes idle connections more quickly. Also, Uvicorn supports multiple HTTP protocol implementations (e.g. h11 and httptools). Depending on the version and your workload, switching or upgrading might affect how FIN packets are handled.

Tip: Try explicitly setting the HTTP protocol if you suspect differences in connection handling.
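As an illustration, both settings can be passed straight to uvicorn.run alongside your existing options (the values below are examples, not recommendations; socket_app is the ASGI app from your question):

import uvicorn

uvicorn.run(
    socket_app,
    host="0.0.0.0",
    port=8000,
    workers=1,
    limit_max_requests=1000,
    limit_concurrency=100,
    timeout_graceful_shutdown=30,
    timeout_keep_alive=5,   # close idle keep-alive connections after 5 seconds
    http="httptools",       # or "h11"; pin the HTTP protocol implementation explicitly
)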
Configure SocketIO
SocketIO uses heartbeat messages (ping/pong) to check that the connection is still alive. Make sure these intervals are configured aggressively enough so that stale or half-closed connections are detected quickly:
import socketio

sio = socketio.AsyncServer(
    ping_interval=10,  # seconds between pings
    ping_timeout=5,    # wait time for a pong before disconnecting
)
This ensures that even if the underlying TCP connection lingers in CLOSE_WAIT for a bit, your application isn’t holding onto extra resources.
Offload long-running tasks
When you have tasks that may run for up to 15 minutes, they can block the event loop and delay the processing of disconnects and other events. Consider offloading them, either as background tasks via asyncio.create_task() or to an external task queue.
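As a rough sketch (the event names and heavy_computation below are placeholders, not from your code), the handler can schedule the work as a background task and emit the result when it finishes, keeping the event loop free for pings, disconnects and other events:

import asyncio
import socketio

sio = socketio.AsyncServer(async_mode="asgi")

def heavy_computation(data):
    ...  # placeholder for the blocking work that can take up to 15 minutes

async def process_job(sid, data):
    # run_in_executor keeps CPU-bound or blocking work off the event loop
    loop = asyncio.get_running_loop()
    result = await loop.run_in_executor(None, heavy_computation, data)
    await sio.emit("job_done", {"result": result}, to=sid)

@sio.event
async def start_job(sid, data):
    asyncio.create_task(process_job(sid, data))  # don't await the long task here
    return {"status": "accepted"}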
Adjust the health check endpoint and readiness probe
Your Kubernetes readiness probe is calling /healthz. Ensure that this endpoint does not maintain keep-alive connections by sending a Connection: close header:

from fastapi import FastAPI
from fastapi.responses import JSONResponse

app = FastAPI()

@app.get("/healthz")
async def healthz():
    response = JSONResponse({"status": "ok"})
    response.headers["Connection"] = "close"  # force the connection to close
    return response
If possible, consider serving health checks on a dedicated port or endpoint so that they do not clash with the SocketIO traffic.
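One way to do that (the second port and the app split here are assumptions, not part of your current setup) is to serve a minimal health-only app on its own port within the same process and point the readiness probe at that port instead:

import asyncio
import socketio
import uvicorn
from fastapi import FastAPI

api = FastAPI()
sio = socketio.AsyncServer(async_mode="asgi")
socket_app = socketio.ASGIApp(sio, other_asgi_app=api)

# Tiny health-only app on a separate port so probe traffic never competes
# with SocketIO connections on port 8000.
health_app = FastAPI()

@health_app.get("/healthz")
async def healthz():
    return {"status": "ok"}

async def main():
    servers = [
        uvicorn.Server(uvicorn.Config(socket_app, host="0.0.0.0", port=8000)),
        uvicorn.Server(uvicorn.Config(health_app, host="0.0.0.0", port=8001)),
    ]
    await asyncio.gather(*(s.serve() for s in servers))

if __name__ == "__main__":
    asyncio.run(main())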
By making these changes, you ensure that once a client sends a FIN packet, your server promptly completes the teardown, preventing the gradual accumulation of half-closed sockets that eventually overwhelms your pod.