
tensorflow - Pytorch in Azure Synapse causing problems - Stack Overflow


I have a notebook in Azure Synapse that uses these libraries:

import pandas as pd
import numpy as np
from sqlalchemy import create_engine, text
import sqlalchemy as sa
from azure.core.credentials import AzureKeyCredential
from azure.ai.textanalytics import TextAnalyticsClient
from sentence_transformers import SentenceTransformer, util
import time
import torch

from notebookutils import mssparkutils

Since this month, the session stops after the cell with the imports above and shows the warning below. I can't find a solution on my own.

/home/trusted-service-user/cluster-env/env/lib/python3.10/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libc10_cuda.so: cannot open shared object file: No such file or directory
  warn(f"Failed to load image Python extension: {e}")
2024-11-19 11:31:17.000433: I tensorflow/core/platform/cpu_feature_guard:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

I tried to roll back to older versions of Torch and also to force CPU usage, but without success.

The input is just the cell with the imports. Then I get the warning. After the warning, the Spark session stops with this message:

Session failed. Run the notebook to start a new session.

This notebook was working fine 2-3 weeks ago, so something outside my control must have changed. I also believe the warning was already there back then and everything still worked.

Additionally I can provide the logs from Monitoring > Apache Spark applications > Driver (stderr) > Latest, but I think everything there is unrelated to the problem (maybe):

    WARN TokenLibrary [pool-43-thread-2]: Access token cache miss or expired
    2024-11-19 11:24:16,395 ERROR TokenLibrary [pool-43-thread-2]: Unable to determine host value from URI = tokenservice2.westeurope.azuresynapse:443. Using localhost as header value

WARN SQLConf [spark-listener-group-shared]: The SQL config 'spark.sql.legacy.replaceDatabricksSparkAvro.enabled' has been deprecated in Spark v3.2 and may be removed in the future. Use `.format("avro")` in `DataFrameWriter` or `DataFrameReader` instead.

2024-11-19 11:23:32,622 WARN AzureBlobFileSystemStore [Thread-32]: checkDnsEntry: blabla.dfs.core.windows not found in the file /etc/hosts.

Asked Nov 19, 2024 at 12:45 by Dimitar Petrov; edited Nov 19, 2024 at 13:38 by halfer.
  • What is your requirement? Could you please provide sample input and output? – Bhavani Commented Nov 19, 2024 at 12:54
  • Post edited, sorry. – Dimitar Petrov Commented Nov 19, 2024 at 13:33

1 Answer

I tried installing the libraries as shown below:

%pip install pandas numpy azure-core sqlalchemy azure-ai-textanalytics torch==2.0.1 tensorflow==2.13.0 sentence-transformers==2.2.2

Regarding this message:

I tensorflow/core/platform/cpu_feature_guard:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

TensorFlow is designed for high performance, leveraging hardware capabilities to run computations efficiently. It can work with CPUs, GPUs, or TPUs, adapting its code to the hardware available. Some CPUs support advanced operations, like vectorized addition (processing multiple variables simultaneously), which others may not. TensorFlow notifies you that the installed version can utilize AVX and AVX2 instructions—Advanced Vector Extensions that accelerate tasks such as matrix multiplication during forward or backward propagation. This is not an error; it is simply informing you that TensorFlow will optimize for your CPU's capabilities to enhance performance.
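One way to confirm that this line is informational rather than an error is to look at the severity letter in its prefix: TensorFlow's C++ logs mark each line with `I` (INFO), `W` (WARNING), `E` (ERROR) or `F` (FATAL). The following is a minimal sketch that reads that letter from a log line; the helper name `classify_tf_log_line` and the regular expression are illustrative assumptions, not part of TensorFlow.

```python
import re

# Severity letters used in the prefix of TensorFlow's C++ log lines.
SEVERITY = {"I": "INFO", "W": "WARNING", "E": "ERROR", "F": "FATAL"}

# Matches "<date> <time>: <letter> <source>:<line>] <message>".
LOG_LINE = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+: ([IWEF]) (\S+?):(\d+)\] (.*)$"
)


def classify_tf_log_line(line):
    """Return (severity, source, message) for a TF C++ log line, or None."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    letter, source, _lineno, message = match.groups()
    return SEVERITY[letter], source, message


# The log line from the question, shortened after the first clause.
line = (
    "2024-11-19 11:31:17.000433: I tensorflow/core/platform/"
    "cpu_feature_guard:193] This TensorFlow binary is optimized with oneAPI"
)
result = classify_tf_log_line(line)
print(result[0])  # severity of the cpu_feature_guard line
```

Applied to the line from the question, the severity comes back as INFO, which is consistent with the message not being the cause of the failed session.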

If you want, you can disable these messages using:

import os
# Must be set before TensorFlow is imported; '3' hides INFO, WARNING and ERROR logs.
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
import tensorflow as tf

Results:

print("Available devices:", tf.config.list_physical_devices())

Available devices: [PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
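The torchvision warning about `libc10_cuda.so` is a separate message. That library ships with CUDA builds of PyTorch, so one possible cause (an assumption, not something confirmed for this Synapse pool) is a torchvision build that expects a CUDA-enabled torch while a CPU-only torch is installed. Since importing the libraries is what ends the session, the sketch below reads the installed versions from package metadata without importing them. The helper names `installed_versions` and `build_tag` are hypothetical, and the `+cpu` / `+cu118` style suffix is only present on some wheels, so a missing tag proves nothing by itself.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def build_tag(version):
    """Return the local build label after '+', e.g. 'cpu' or 'cu118', if any."""
    if version is None or "+" not in version:
        return None
    return version.split("+", 1)[1]


# Reads metadata only; torch, torchvision and tensorflow are never imported here.
versions = installed_versions(["torch", "torchvision", "tensorflow"])
for name, version in versions.items():
    print(f"{name}: version={version} build={build_tag(version)}")
```

If torch and torchvision report different build tags, or versions that were not released as a matching pair, reinstalling them together as a matching pair would be a reasonable next thing to try.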