python - Azure AI ML - can't build local Docker container - Stack Overflow

I want to build a local Docker container with the Azure CLI and the Azure SDK v2, which I then want to replicate in Azure ML. My objective is to have a container that can run YOLO models.

With the following script you can replicate the error:

from azure.ai.ml.entities import (
    CodeConfiguration,
    Environment,
    ManagedOnlineDeployment,
    ManagedOnlineEndpoint,
    Model,
)

# ml_client is assumed to be an already authenticated azure.ai.ml.MLClient;
# its creation is not shown here.
endpoint_name = "web-svc-local-test"

# Create an online endpoint (local=True targets the local Docker daemon)
endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description="Test a local endpoint",
    auth_mode="key",
)
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

model = Model(path="model.pt")

# Base image plus the conda environment shown below
env = Environment(
    conda_file="environment/conda.yaml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

test_deployment = ManagedOnlineDeployment(
    name="websvclocalvideo",
    endpoint_name=endpoint_name,
    model=model,
    environment=env,
    code_configuration=CodeConfiguration(
        #code=".", scoring_script="failure_2_score_remote_model.py"
        #code=".", scoring_script="failure_3_score_invalid_script.py"
        #code=".", scoring_script="failure_4_score_script_error.py"
        code=".", scoring_script="YOLO11 Finetuned Model/score-copy.py"
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)
ml_client.online_deployments.begin_create_or_update(test_deployment, local=True)

This is the conda.yaml file that I use on top of the Docker image:

name: model-env
channels:
  - conda-forge
dependencies:
  - python=3.9
  - numpy=1.23.5
  - pip=23.0.1 
  - scikit-learn=1.2.2
  - scipy=1.10.1
  - pip:
    - numpy
    - pandas==1.1.5
    - azureml-defaults==1.53.0
    - inference-schema[numpy-support]==1.5.1
    - joblib==1.2.0
    - torch==2.6.0
    - requests==2.32.3
    - pillow==10.4.0
    - supervision==0.25.1
    - opencv-python-headless==4.7.0.68
    - ultralytics==8.3.85

This is the error log from the container:

---------------
Liveness Probe: GET   127.0.0.1:31311/
Score:          POST  127.0.0.1:31311/score

2025-03-14 15:58:55,845 I [21] gunicorn.error - Starting gunicorn 20.1.0
2025-03-14 15:58:55,845 I [21] gunicorn.error - Listening at: :31311 (21)
2025-03-14 15:58:55,846 I [21] gunicorn.error - Using worker: sync
2025-03-14 15:58:55,847 I [104] gunicorn.error - Booting worker with pid: 104
2025-03-14 15:58:56,365 W [104] azmlinfsrv - AML_FLASK_ONE_COMPATIBILITY is set. However, compatibility patch for Flask 1 has failed. This is only a problem if you use @rawhttp and relies on deprecated methods such as has_key().
Traceback (most recent call last):
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/create_app.py", line 58, in <module>
    patch_flask()
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/create_app.py", line 33, in patch_flask
    patch_werkzeug = LooseVersion(werkzeug.__version__) >= LooseVersion("2.1")
AttributeError: module 'werkzeug' has no attribute '__version__'

Initializing logger
2025-03-14 15:58:56,367 I [104] azmlinfsrv - Starting up app insights client
2025-03-14 15:58:59,093 E [104] azmlinfsrv - Traceback (most recent call last):
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 77, in load_script
    main_module_spec.loader.exec_module(user_module)
  File "<frozen importlib._bootstrap_external>", line 850, in exec_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
  File "/var/azureml-app/YOLO11/YOLO11 Finetuned Model/score-copy.py", line 1, in <module>
    from ultralytics import YOLO
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/__init__.py", line 11, in <module>
    from ultralytics.models import NAS, RTDETR, SAM, YOLO, FastSAM, YOLOWorld
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/models/__init__.py", line 3, in <module>
    from .fastsam import FastSAM
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/models/fastsam/__init__.py", line 3, in <module>
    from .model import FastSAM
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/models/fastsam/model.py", line 5, in <module>
    from ultralytics.engine.model import Model
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/engine/model.py", line 11, in <module>
    from ultralytics.cfg import TASK2DATA, get_cfg, get_save_dir
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/ultralytics/cfg/__init__.py", line 10, in <module>
    import cv2
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/cv2/__init__.py", line 181, in <module>
    bootstrap()
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/cv2/__init__.py", line 153, in bootstrap
    native_module = importlib.import_module("cv2")
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/aml_blueprint.py", line 91, in setup
    self.user_script.load_script(config.app_root)
  File "/opt/miniconda/envs/inf-conda-env/lib/python3.9/site-packages/azureml_inference_server_http/server/user_script.py", line 79, in load_script
    raise UserScriptImportException(ex) from ex
azureml_inference_server_http.server.user_script.UserScriptImportException: Failed to import user script because it raised an unhandled exception

2025-03-14 15:58:59,093 I [104] gunicorn.error - Worker exiting (pid: 104)
2025-03-14 15:58:59,625 I [21] gunicorn.error - Shutting down: Master
2025-03-14 15:58:59,626 I [21] gunicorn.error - Reason: Worker failed to boot.
2025-03-14T15:58:59,658862454+00:00 - gunicorn/finish 3 0
2025-03-14T15:58:59,659978615+00:00 - Exit code 3 is not normal. Killing image.
ERROR conda.cli.main_run:execute(125): `conda run runsvdir /var/runit` failed. (See above for error)
2025-03-14T15:58:59,668900102+00:00 - rsyslog/finish 0 0
2025-03-14T15:58:59,668900902+00:00 - nginx/finish 0 0
2025-03-14T15:58:59,670339080+00:00 - Exit code 0 is not normal. Restarting rsyslog.
2025-03-14T15:58:59,670354181+00:00 - Exit code 0 is not normal. Killing image.
runsvdir: no process found
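
The decisive line in the log is the ImportError for libGL.so.1, raised while cv2 is being imported by ultralytics. To narrow this down, a small diagnostic like the one below could be run inside the container's conda environment. It is only a sketch: the helper names `installed_opencv_variants` and `can_load_shared_library` are hypothetical and not part of any Azure or OpenCV API. The log does not confirm it, but one possibility is that a non-headless OpenCV wheel ends up installed next to opencv-python-headless.

```python
import ctypes
from importlib import metadata

# PyPI distributions that all provide the same importable "cv2" package.
OPENCV_DISTRIBUTIONS = (
    "opencv-python",
    "opencv-python-headless",
    "opencv-contrib-python",
    "opencv-contrib-python-headless",
)


def installed_opencv_variants():
    """Return {distribution name: version} for each OpenCV wheel that is installed."""
    found = {}
    for name in OPENCV_DISTRIBUTIONS:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This variant is not installed in the current environment.
            pass
    return found


def can_load_shared_library(soname):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    print("OpenCV distributions:", installed_opencv_variants())
    print("libGL.so.1 loadable:", can_load_shared_library("libGL.so.1"))
```

If more than one distribution is listed, or libGL.so.1 is reported as not loadable, that would be consistent with the traceback above, although it does not prove which package pulled in the non-headless build.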