I want to build a local Docker container with the Azure CLI and Azure SDK v2, which I then want to replicate in Azure ML. My objective is to have a container that can run YOLO models.
The following script reproduces the error:
from azure.ai.ml.entities import (
    CodeConfiguration,
    Environment,
    ManagedOnlineDeployment,
    ManagedOnlineEndpoint,
    Model,
)

# ml_client is assumed to be an already-authenticated azure.ai.ml.MLClient
endpoint_name = "web-svc-local-test"

# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description="Test a local endpoint",
    auth_mode="key",
)
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

model = Model(path="model.pt")
env = Environment(
    conda_file="environment/conda.yaml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

# create a local deployment under the endpoint
test_deployment = ManagedOnlineDeployment(
    name="websvclocalvideo",
    endpoint_name=endpoint_name,
    model=model,
    environment=env,
    code_configuration=CodeConfiguration(
        # code=".", scoring_script="failure_2_score_remote_model.py"
        # code=".", scoring_script="failure_3_score_invalid_script.py"
        # code=".", scoring_script="failure_4_score_script_error.py"
        code=".", scoring_script="YOLO11 Finetuned Model/score-copy.py"
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(test_deployment, local=True)
The conda.yaml file I use to build on top of the Docker image:
name: model-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - numpy=1.23.5
  - pip=23.0.1
  - scikit-learn=1.2.2
  - scipy=1.10.1
  - pip:
      - numpy
      - pandas==1.1.5
      - azureml-defaults==1.53.0
      - inference-schema[numpy-support]==1.5.1
      - joblib==1.2.0
      - torch==2.6.0
      - requests==2.32.3
      - pillow==10.4.0
      - supervision==0.25.1
      - opencv-python-headless==4.7.0.68
      - ultralytics==8.3.85
This is the error log from the container:
---------------
Liveness Probe: GET 127.0.0.1:31311/
Score: POST 127.0.0.1:31311/score
2025-03-14 15:58:55,845 I [21] gunicorn.error - Starting gunicorn 20.1.0
2025-03-14 15:58:55,845 I [21] gunicorn.error - Listening at: :31311 (21)
2025-03-14 15:58:55,846 I [21] gunicorn.error - Using worker: sync
2025-03-14 15:58:55,847 I [104] gunicorn.error - Booting worker with pid: 104
2025-03-14 15:58:56,365 W [104] azmlinfsrv - AML_FLASK_ONE_COMPATIBILITY is set. However, compatibility patch for Flask 1 has failed. This is only a problem if you use @rawhttp and relies on deprecated methods such as has_key().
Traceback (most recent call last):
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/create_app.py", line 58, in <module>
patch_flask()
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/create_app.py", line 33, in patch_flask
patch_werkzeug = LooseVersion(werkzeug.__version__) >= LooseVersion("2.1")
AttributeError: module 'werkzeug' has no attribute '__version__'
Initializing logger
2025-03-14 15:58:56,367 I [104] azmlinfsrv - Starting up app insights client
2025-03-14 15:58:59,093 E [104] azmlinfsrv - Traceback (most recent call last):
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 77, in load_script
main_module_spec.loader.exec_module(user_module)
File "<frozen importlib._bootstrap_external>", line 850, in exec_module
File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
File "/var/azureml-app/YOLO11/YOLO11 Finetuned Model/score-copy.py", line 1, in <module>
from ultralytics import YOLO
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/__init__.py", line 11, in <module>
from ultralytics.models import NAS, RTDETR, SAM, YOLO, FastSAM, YOLOWorld
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/models/__init__.py", line 3, in <module>
from .fastsam import FastSAM
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/models/fastsam/__init__.py", line 3, in <module>
from .model import FastSAM
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/models/fastsam/model.py", line 5, in <module>
from ultralytics.engine.model import Model
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/engine/model.py", line 11, in <module>
from ultralytics.cfg import TASK2DATA, get_cfg, get_save_dir
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/cfg/__init__.py", line 10, in <module>
import cv2
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/cv2/__init__.py", line 181, in <module>
bootstrap()
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/cv2/__init__.py", line 153, in bootstrap
native_module = importlib.import_module("cv2")
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 91, in setup
self.user_script.load_script(config.app_root)
File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 79, in load_script
raise UserScriptImportException(ex) from ex
azureml_inference_server_http.server.user_script.UserScriptImportException: Failed to import user script because it raised an unhandled exception
2025-03-14 15:58:59,093 I [104] gunicorn.error - Worker exiting (pid: 104)
2025-03-14 15:58:59,625 I [21] gunicorn.error - Shutting down: Master
2025-03-14 15:58:59,626 I [21] gunicorn.error - Reason: Worker failed to boot.
2025-03-14T15:58:59,658862454+00:00 - gunicorn/finish 3 0
2025-03-14T15:58:59,659978615+00:00 - Exit code 3 is not normal. Killing image.
ERROR conda.cli.main_run:execute(125): `conda run runsvdir /var/runit` failed. (See above for error)
2025-03-14T15:58:59,668900102+00:00 - rsyslog/finish 0 0
2025-03-14T15:58:59,668900902+00:00 - nginx/finish 0 0
2025-03-14T15:58:59,670339080+00:00 - Exit code 0 is not normal. Restarting rsyslog.
2025-03-14T15:58:59,670354181+00:00 - Exit code 0 is not normal. Killing image.
runsvdir: no process found
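The log is long, so here is a small sketch for isolating the line that appears to kill the worker. The first traceback (the werkzeug `__version__` one) is logged at warning level and the server keeps starting afterwards; the worker only exits after the `ImportError` for `libGL.so.1`, raised while `cv2` is imported by `ultralytics`. The helper name `find_missing_shared_libs` and the regular expression are my own illustration, not part of any Azure ML tooling, and the pattern assumes the loader message has the same wording as in the log above.

```python
import re

# Assumed pattern: matches the "cannot open shared object file" message
# exactly as it is worded in the container log above.
_MISSING_LIB = re.compile(
    r"ImportError: (?P<lib>\S+): cannot open shared object file"
)


def find_missing_shared_libs(log_text: str) -> list[str]:
    """Return shared libraries reported as missing, in order, without duplicates."""
    seen: list[str] = []
    for match in _MISSING_LIB.finditer(log_text):
        lib = match.group("lib")
        if lib not in seen:
            seen.append(lib)
    return seen


# Two lines copied from the container log above.
sample = (
    'native_module = importlib.import_module("cv2")\n'
    "ImportError: libGL.so.1: cannot open shared object file: "
    "No such file or directory\n"
)
print(find_missing_shared_libs(sample))  # ['libGL.so.1']
```

Running this over the full log text returns only `libGL.so.1`, so as far as I can tell that is the single native library the container is missing when the scoring script is imported.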