amazon web services - Deploy TPU TF Serving Model to AWS SageMaker - Stack Overflow

I have a couple of pre-trained and tested TensorFlow LSTM models, which have been trained on Google Colab. I want to deploy these models with AWS as our entire application is deployed there.

I've tried deploying with Docker containers using TensorFlow Serving, but my models were trained on either TPUs or NVIDIA GPUs. As a result, my EC2 instances and local environments fail to run predictions with OpKernel errors (different GPUs, or plain CPUs).

So we started looking into the SageMaker documentation and found that we need to deploy a Notebook Instance on SageMaker, but the available instance types do not include TPUs, and the available GPU instances do not seem to be NVIDIA compatible. As far as I can tell, the term TPU does not appear anywhere in the AWS documentation, so, simply put, my questions are:

Can AWS SageMaker instances run TF Serving models trained on a TPU? If so, how do I select the instance type for this use case?

Similarly, can they run models trained on NVIDIA GPUs? Which GPU instance type is suitable?

Is there a way to run either of these models in a Docker container with TF Serving on any of AWS's services (ECS, EC2, SageMaker, etc.)? Or are TPUs exclusive to Google Cloud?

asked Feb 26 at 15:51 by Manu Sisko

1 Answer

AWS does not support TPUs, as they are specific to Google Cloud. However, AWS provides alternative purpose-built chipsets like Inferentia and Trainium, both of which can be used for TensorFlow models. Additionally, SageMaker supports NVIDIA GPUs, allowing you to deploy TensorFlow Serving models trained on a GPU.

If your model was trained on a TPU, you can still deploy it on SageMaker by converting it to a format compatible with AWS-supported hardware or NVIDIA GPUs. For Inferentia or Trainium, AWS provides the Neuron SDK, which allows you to compile TensorFlow models to run on these instances.
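
For Inferentia (Inf1, for instance), a rough compilation sketch with the tensorflow-neuron package from the Neuron SDK might look like the following; the model path and the example input shape are placeholders you would adapt to your own LSTM:

import tensorflow as tf
import tensorflow.neuron as tfn  # provided by the tensorflow-neuron package

# Load the trained Keras model (placeholder path)
model = tf.keras.models.load_model("my_lstm_model")

# Compile (trace) the model for Inferentia using a representative input:
# here, a batch of 1 sequence of length 50 with 10 features
example_input = tf.zeros([1, 50, 10], dtype=tf.float32)
model_neuron = tfn.trace(model, example_input)

# Save the compiled model as a SavedModel the Neuron runtime can serve
model_neuron.save("neuron_model/1")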

TensorFlow trained on NVIDIA GPUs

For deploying on NVIDIA GPUs, SageMaker offers various instance types, including ml.p5, ml.p4, ml.g5 and ml.g6, all of which support TensorFlow Serving.

If the model was originally trained on NVIDIA GPUs, it still requires some preparation before being hosted on SageMaker. The model must be exported from TensorFlow, saved to the file system, and structured in a SageMaker-compatible format. Specifically, the model should be placed inside a directory named export/Servo/ and compressed into a .tar.gz file. SageMaker recognizes this format as a valid TensorFlow model and can load it for inference.

import tensorflow as tf
import tarfile

# "model" is your trained tf.keras model, e.g.:
# model = tf.keras.models.load_model("my_lstm_model")

# Save the model in SavedModel format under export/Servo/<version>
model.save("export/Servo/1")

# Compress the export directory into the model.tar.gz archive SageMaker expects
with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("export")
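
Once the archive is uploaded to S3, a minimal deployment sketch with the SageMaker Python SDK could look like this; the bucket, role ARN, and framework version are placeholders, and instance_type is where a GPU instance such as ml.g5.xlarge is selected:

from sagemaker.tensorflow import TensorFlowModel

# model_data points at the model.tar.gz uploaded to S3 (placeholder path)
model = TensorFlowModel(
    model_data="s3://my-bucket/model.tar.gz",
    role="arn:aws:iam::123456789012:role/MySageMakerRole",  # placeholder role ARN
    framework_version="2.12",  # match the TensorFlow version used in training
)

# instance_type selects the hardware; ml.g5.xlarge is an NVIDIA GPU instance
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g5.xlarge",
)

# Invoke the endpoint with a dummy batch: 1 sequence of 50 steps x 10 features
print(predictor.predict({"instances": [[[0.0] * 10] * 50]}))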

For a full walkthrough, consider this article: https://saturncloud.io/blog/tensorflow-serving-on-amazon-sagemaker-a-comprehensive-guide/

TensorFlow trained on TPUs

Models trained on TPUs often contain TPU-specific constructs that must be removed or replaced before the model can run on CPUs or GPUs; common examples are tf.function decorated with jit_compile=True and TPU-only operations such as tf.tpu.experimental.outside_compilation. Once those are stripped out, export the model to TensorFlow's SavedModel format, the standard format for TensorFlow Serving, and deploy it on SageMaker exactly as described for GPU-trained models above.
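
One common approach, sketched here under the assumption that the architecture can be rebuilt in plain Keras, is to reconstruct the model outside any TPUStrategy scope, load the trained weights, and re-save it as a device-agnostic SavedModel; build_model and the checkpoint path are hypothetical:

import tensorflow as tf

# Hypothetical: rebuild the same architecture outside any tf.distribute.TPUStrategy scope
def build_model():
    return tf.keras.Sequential([
        tf.keras.layers.LSTM(64, input_shape=(50, 10)),
        tf.keras.layers.Dense(1),
    ])

cpu_model = build_model()

# Load the weights saved during the TPU training run (hypothetical path)
cpu_model.load_weights("tpu_checkpoint/weights")

# Re-save as a portable SavedModel for TensorFlow Serving
cpu_model.save("export/Servo/1")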

After saving, you can check whether the model contains TPU-specific operations that may not work on CPUs or GPUs by using tf.debugging.set_log_device_placement. This function logs the device (CPU, GPU, or TPU) on which each operation executes.

tf.debugging.set_log_device_placement(True)
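
As a quick smoke test, assuming the re-saved SavedModel from the previous step, you can enable placement logging and run a dummy prediction; if every operation resolves to a CPU or GPU device, the model should be portable:

import tensorflow as tf

# Print the device each operation is placed on
tf.debugging.set_log_device_placement(True)

# Load the re-saved model and run a dummy batch (shape is a placeholder)
model = tf.keras.models.load_model("export/Servo/1")
dummy_input = tf.zeros([1, 50, 10], dtype=tf.float32)
print(model(dummy_input))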

If the model fails due to missing operations, you may need to replace TPU-specific layers with CPU/GPU-compatible alternatives.
