I want to execute a long-running task on one pod from another. I started with `subprocess.Popen`, which does what I need. However, `subprocess.Popen` is not asynchronous, so it blocks the event loop and prevents other endpoints from executing while the process is running. To make the task asynchronous, I switched to `asyncio.create_subprocess_exec`, but I am getting errors that I do not understand how to fix.
Here is the command I am executing, which works with `subprocess.Popen`:
kubectl exec -n ns1 devpod-654fcd84dd-r7spk -- /bin/sh -c llamafactory-cli train /home/llama-factory-runtime/.tmpfiles/e2ece994-7f10-442c-aae4-e938381fb67a.yaml > .tmpfiles/e2ece994-7f10-442c-aae4-e938381fb67a.log 2>&1
This is the error I am getting. Can someone help me understand the cause of this error and a possible solution? I suspect it has to do with where the command or kubectl is executed, but I am not fully certain.
No such file or directory
'Traceback (most recent call last):
File "/home/devacc/git/fine-tuning-service/app/k8s/k8s_client.py", line 140, in exec_command_in_pod_asyncio
process = await asyncio.create_subprocess_exec(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/subprocess.py", line 224, in create_subprocess_exec
transport, protocol = await loop.subprocess_exec(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/base_events.py", line 1744, in subprocess_exec
transport = await self._make_subprocess_transport(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/unix_events.py", line 211, in _make_subprocess_transport
transp = _UnixSubprocessTransport(self, protocol, args, shell,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/asyncio/base_subprocess.py", line 36, in __init__
self._start(args=args, shell=shell, stdin=stdin, stdout=stdout,
File "/usr/lib/python3.12/asyncio/unix_events.py", line 820, in _start
self._proc = subprocess.Popen(
^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.12/subprocess.py", line 1026, in __init__
self._execute_child(args, executable, preexec_fn, close_fds,
File "/usr/lib/python3.12/subprocess.py", line 1955, in _execute_child
raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: \'kubectl exec -n dsx-genai-finetuned-models-api dsx-ai-accelerators-llama-factory-654fcd84dd-r7spk -- /bin/sh -c llamafactory-cli train /home/llama-factory-runtime/.tmpfiles/e2ece994-7f10-442c-aae4-e938381fb67a.yaml > .tmpfiles/e2ece994-7f10-442c-aae4-e938381fb67a.log 2>&1\'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/devacc/.vscode-server/extensions/ms-python.debugpy-2024.8.0/bundled/libs/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_resolver.py", line 189, in _get_py_dictionary
attr = getattr(var, name)
^^^^^^^^^^^^^^^^^^
AttributeError: characters_written
'
I tried debugging the code, but I haven't found a tangible solution.
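For reference, the `FileNotFoundError` above reports the entire command line as the missing file name, which is what happens when the whole command is passed to `asyncio.create_subprocess_exec` as a single string: the first argument is treated as the program to run, and no shell splits it. The code at `k8s_client.py` line 140 is not shown, so the following is only a minimal sketch of that behavior under this assumption. The helper `run_exec` is hypothetical, and the Python interpreter stands in for `kubectl` so the sketch runs without a cluster.

```python
import asyncio
import shlex
import sys


async def run_exec(argv: list[str]) -> tuple[int, str]:
    """Run argv without a shell and return (exit code, combined output)."""
    process = await asyncio.create_subprocess_exec(
        *argv,  # program first, then each argument as its own string
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    output, _ = await process.communicate()
    return process.returncode, output.decode()


async def main() -> None:
    # Stand-in command line (assumption: replaces the real kubectl call).
    command = f"{shlex.quote(sys.executable)} -c 'print(42)'"

    try:
        # Whole command as one argument: it is looked up as a file name.
        await run_exec([command])
    except FileNotFoundError as exc:
        print("single string failed:", exc.strerror)

    # Split into separate arguments: the program is found and executed.
    code, output = await run_exec(shlex.split(command))
    print(code, output.strip())


asyncio.run(main())
```

Note that with the exec variant no local shell is involved, so a redirection such as `> ... 2>&1` is not interpreted locally. If it is meant to run inside the pod, it would have to be part of the single string passed after `/bin/sh -c`.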