
python - How to have a (FastAPI) GKE deployment handle multiple requests? - Stack Overflow


I have a FastAPI deployment in GKE with an endpoint /execute that reads and parses a file, something like below:

import re

from fastapi import FastAPI

app = FastAPI()

@app.post("/execute")
def execute(filepath: str):
    res = 0
    with open(filepath, "r") as fo:
        for line in fo:
            if re.search("Hello", line):
                res += 1
    return {"message": f"Number of Hello lines = {res}."}
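For reference, the parsing logic can be pulled out into a plain function and exercised locally before worrying about the deployment (the helper name and sample file below are illustrative):

```python
import re
import tempfile

def count_hello_lines(filepath: str) -> int:
    """Count the lines in a file that contain the substring 'Hello'."""
    res = 0
    with open(filepath, "r") as fo:
        for line in fo:
            if re.search("Hello", line):
                res += 1
    return res

# Quick local check against a temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hello world\ngoodbye\nHello again\n")
    path = tmp.name

print(count_hello_lines(path))  # → 2
```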

The GKE deployment has 10 pods with a load balancer and service exposing the deployment.

Now, I would like to send 100 different file paths to this deployment. In my mind, I have the following options, and related questions:

  1. Send all 100 requests at the same time without waiting for each response, either using threading, asyncio with aiohttp, or something hacky like this:
for filepath in filepaths:
    try:
        # params interpolates the actual filepath; a plain literal string would not
        requests.post(
            "http://127.0.0.1:8000/execute",
            params={"filepath": filepath},
            timeout=0.0000000001,
        )
    except requests.exceptions.ReadTimeout:
        pass

Ref:

In this case, what does the GKE load balancer do when it receives 100 requests: does it deliver 10 requests to each pod at the same time (in which case I would need to make sure each pod has enough resources to handle all of its incoming requests at once), OR does it queue the requests, delivering one to a pod only when that pod is available?
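Whatever the load balancer does, all 100 requests can be fired concurrently while still collecting every response by using a thread pool instead of the tiny-timeout hack. A minimal sketch, where `post_execute` is a stand-in for the real `requests.post` call and the file paths are made up:

```python
from concurrent.futures import ThreadPoolExecutor

def post_execute(filepath: str) -> dict:
    # Stand-in for the real HTTP call, e.g.:
    #   requests.post("http://127.0.0.1:8000/execute",
    #                 params={"filepath": filepath}).json()
    return {"filepath": filepath, "status": "ok"}

filepaths = [f"/data/file_{i}.txt" for i in range(100)]

# One worker per request: all 100 calls are in flight at the same time,
# and pool.map still returns the responses in request order.
with ThreadPoolExecutor(max_workers=100) as pool:
    results = list(pool.map(post_execute, filepaths))
```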

  2. Send 10 requests at a time, so that no pod is working on more than 1 request at any given time. That way, I get predictable resource usage in each pod and avoid crashing it. But how do I accomplish this in Python? And do I need to change anything in my FastAPI application or GKE deployment configuration?
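The second option (at most 10 in-flight requests) can be sketched on the client side with an `asyncio.Semaphore`; here `call_execute` is a placeholder for a real async HTTP call (e.g. via aiohttp), and the sleep stands in for network plus parsing latency:

```python
import asyncio

CONCURRENCY = 10  # at most 10 requests in flight, roughly one per pod

async def call_execute(sem: asyncio.Semaphore, filepath: str) -> dict:
    async with sem:
        # Placeholder for the real async HTTP request, e.g. an
        # aiohttp session.post(...) awaited here.
        await asyncio.sleep(0.01)
        return {"filepath": filepath, "status": "ok"}

async def main(filepaths: list) -> list:
    sem = asyncio.Semaphore(CONCURRENCY)
    # gather preserves input order even though completion order varies.
    return await asyncio.gather(*(call_execute(sem, fp) for fp in filepaths))

results = asyncio.run(main([f"/data/file_{i}.txt" for i in range(100)]))
```

Note that this bounds concurrency entirely on the client; nothing in the FastAPI app needs to change for it.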

Any help would be greatly appreciated!
