
airflow - Trigger scripts on Cloudera CDP using MWAA - Stack Overflow


I need to run scripts on Cloudera CDP, triggered from MWAA (Amazon Managed Workflows for Apache Airflow) on AWS. Here are the details of the MWAA environment:

  1. Airflow version 2.10.1
  2. Python Version 3.11
  3. Private Webserver
  4. Connected to VPC
  5. Library used: apache-airflow-providers-ssh

Attempt 1:

I created the following SSHOperator:

# This import is the line that fails when the SSH provider is not installed.
from airflow.providers.ssh.operators.ssh import SSHOperator

run_script = SSHOperator(
    task_id="run_test_script",
    ssh_conn_id=None,  # No predefined connection, using retrieved values
    command="sudo -u etl_app bash /path/to/test.sh",
    remote_host=hostname,
    username=username,
    password=password,  # Required for password authentication
    dag=dag,
)

Here the hostname, username and password are retrieved from AWS Secrets Manager.
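Separately from the import error: as far as I know, SSHOperator does not take username or password arguments itself; those belong to SSHHook, which is handed to the operator through ssh_hook. Below is a minimal sketch of that wiring, assuming the secret is a JSON string. The key names hostname, username and password and both helper names are hypothetical, not taken from the question.

```python
import json


def parse_ssh_secret(secret_string: str) -> dict:
    """Map a Secrets Manager SecretString (JSON) to SSHHook keyword arguments."""
    data = json.loads(secret_string)
    # Hypothetical key names; adjust them to the real layout of the secret.
    return {
        "remote_host": data["hostname"],
        "username": data["username"],
        "password": data["password"],
    }


def build_ssh_task(secret_string: str, dag=None):
    """Create the SSH task, passing the credentials through an SSHHook."""
    # Imported lazily so parse_ssh_secret also works without Airflow installed.
    from airflow.providers.ssh.hooks.ssh import SSHHook
    from airflow.providers.ssh.operators.ssh import SSHOperator

    hook = SSHHook(**parse_ssh_secret(secret_string))
    return SSHOperator(
        task_id="run_test_script",
        ssh_hook=hook,
        # Trailing space: the operator treats strings ending in ".sh" as
        # template files to load, so the space keeps this a literal command.
        command="sudo -u etl_app bash /path/to/test.sh ",
        dag=dag,
    )
```

build_ssh_task only works once the provider is installed, so it does not replace fixing requirements.txt.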

MWAA did not import this DAG and reported a module-not-found error for airflow.providers.ssh.

Fix attempted:

I created a requirements.txt with the following content and placed it in the requirements folder of the MWAA bucket, but got the same error. The CloudWatch logs showed the following for the requirements install:

The install retries several times, then fails with NoConnectionError and a message about conflicting library versions.

Attempt 2:

I then changed requirements.txt as follows:

-c .10.1/constraints-3.11.txt
apache-airflow-providers-ssh

I still got the same error.
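For reference, the Airflow installation docs publish constraint files under a fixed URL pattern keyed by Airflow version and Python version. The sketch below builds that URL and a matching requirements.txt body; the helper names are hypothetical, and this is not necessarily the exact file used above. If the VPC has no outbound internet route (an assumption; the question does not say), pip cannot reach this URL or PyPI, which would be consistent with the NoConnectionError.

```python
def constraints_url(airflow_version: str, python_version: str) -> str:
    """Return the published Airflow constraints URL for the given versions."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def requirements_text(airflow_version: str, python_version: str, packages: list) -> str:
    """Build requirements.txt content: one constraint line, then the packages."""
    lines = [f"--constraint {constraints_url(airflow_version, python_version)}"]
    lines.extend(packages)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Versions taken from the environment described in the question.
    print(requirements_text("2.10.1", "3.11", ["apache-airflow-providers-ssh"]))
```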

Attempt 3:

I packaged the .whl files into plugin.zip, uploaded it to the S3 bucket for MWAA, and referenced it in the MWAA environment. I still got the same error.
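One thing worth checking for the wheel approach: per the AWS documentation on installing dependencies from plugins.zip, wheels placed there are only installed if requirements.txt also points pip at the extracted plugins folder with --find-links /usr/local/airflow/plugins and disables the index with --no-index. A minimal sketch follows, assuming the wheels sit in one local folder; the function names are hypothetical.

```python
import zipfile
from pathlib import Path


def build_plugins_zip(wheel_dir: str, zip_path: str) -> list:
    """Pack every .whl file in wheel_dir at the root of the archive."""
    wheels = sorted(Path(wheel_dir).glob("*.whl"))
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for wheel in wheels:
            # arcname drops parent folders so each wheel sits at the zip root.
            archive.write(wheel, arcname=wheel.name)
    return [wheel.name for wheel in wheels]


def offline_requirements(packages: list) -> str:
    """Build requirements.txt content that installs only from the local wheels."""
    lines = ["--find-links /usr/local/airflow/plugins", "--no-index", *packages]
    return "\n".join(lines) + "\n"
```

The wheel set has to include the provider's transitive dependencies that are not already in the MWAA base image, since --no-index stops pip from fetching anything else.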

How can I fix this?
