How to properly connect an AWS SageMaker Studio instance to an OpenSearch database - Stack Overflow

I am trying to connect to an OpenSearch domain from the Code Editor running in SageMaker AI. I created the OpenSearch domain in the same VPC and subnet as the SageMaker domain. I then added both to the same security group and allowed all inbound traffic, including HTTPS, from within the group:

Security Group Rule ID  IP Version  Type         Protocol  Port Range  Source              Description
sgr-***************     IPv4        All traffic  All       All         0.0.0.0/16          -
sgr-***************     IPv4        HTTPS        TCP       443         172.31.0.0/16       -
sgr-***************     -           All traffic  All       All         sg-***************  -

The settings of the SageMaker domain are as follows (the VPC and subnet IDs are redacted, but they match those of the OpenSearch domain):

{
  "DomainId": "d-************",
  "DomainName": "QuickSetupDomain-20250320T091943",
  "Status": "InService",
  "AuthMode": "IAM",
  "DefaultUserSettings": {
    "ExecutionRole": "arn:aws:iam::************:role/service-role/AmazonSageMaker-ExecutionRole-************",
    "SharingSettings": {
      "NotebookOutputOption": "Allowed",
      "S3OutputPath": "s3://sagemaker-studio-************-************/sharing"
    },
    "JupyterServerAppSettings": {
      "DefaultResourceSpec": {
        "SageMakerImageArn": "arn:aws:sagemaker:eu-north-1:************:image/jupyter-server-3",
        "InstanceType": "system"
      }
    },
    "CanvasAppSettings": {
      "EmrServerlessSettings": {
        "ExecutionRoleArn": "arn:aws:iam::************:role/service-role/AmazonSageMakerCanvasEMRSExecutionAccess-************",
        "Status": "ENABLED"
      }
    },
    "DefaultLandingUri": "studio::",
    "StudioWebPortal": "ENABLED",
    "StudioWebPortalSettings": {
      "HiddenAppTypes": ["JupyterServer"]
    },
    "AutoMountHomeEFS": "Enabled"
  },
  "DomainSettings": {
    "SecurityGroupIds": ["sg-************"]
  },
  "AppNetworkAccessType": "PublicInternetOnly",
  "SubnetIds": ["subnet-************"],
  "Url": "https://d-************.studio.eu-north-1.sagemaker.aws",
  "VpcId": "vpc-************"
}

Next, the OpenSearch setup is as follows:

{
  "DomainId": "************/my-vector-db",
  "DomainName": "my-vector-db",
  "Endpoints": {
    "vpc": "vpc-my-vector-db-************.eu-north-1.es.amazonaws.com"
  },
  "EngineVersion": "OpenSearch_2.17",
  "ClusterConfig": {
    "InstanceType": "or1.medium.search",
    "InstanceCount": 1
  },
  "VPCOptions": {
    "VPCId": "vpc-************",
    "SubnetIds": ["subnet-************"],
    "AvailabilityZones": ["eu-north-1a"],
    "SecurityGroupIds": ["sg-************"]
  },
  "AccessPolicies": {
    "Statement": [
      {
        "Effect": "Allow",
        "Principal": {
          "AWS": "arn:aws:iam::************:role/service-role/AmazonSageMaker-ExecutionRole-************"
        },
        "Action": "es:*",
        "Resource": "arn:aws:es:eu-north-1:************:domain/my-vector-db/*"
      }
    ]
  },
  "IPAddressType": "ipv4",
  "EncryptionAtRestOptions": {
    "Enabled": true
  },
  "NodeToNodeEncryptionOptions": {
    "Enabled": true
  },
  "DomainEndpointOptions": {
    "EnforceHTTPS": true
  },
  "AdvancedSecurityOptions": {
    "Enabled": true
  },
  "DomainProcessingStatus": "Active"
}

I attached the OpenSearch full-access policy to the SageMaker execution role and can query the domain's configuration via boto3:

import boto3
# Control-plane call to the OpenSearch Service API, not to the domain's VPC endpoint
client = boto3.client("opensearch")
response = client.describe_domain(DomainName="my-vector-db")

However, when I create the client:

import boto3
from opensearchpy import OpenSearch, RequestsHttpConnection
from requests_aws4auth import AWS4Auth

# Set OpenSearch VPC endpoint
host = "vpc-my-vector-*********************************.eu-north-1.es.amazonaws.com"

# Get AWS credentials
session = boto3.Session()
credentials = session.get_credentials()
region = "eu-north-1"

# Setup AWS authentication
auth = AWS4Auth(credentials.access_key, credentials.secret_key, region, "es", session_token=credentials.token)

# Connect to OpenSearch
client = OpenSearch(
    hosts=[{"host": host, "port": 443}],
    http_auth=auth,
    use_ssl=True,
    verify_certs=True,
    connection_class=RequestsHttpConnection,
)

# Test connection
print(client.info())

The call fails with a connection timeout. How can I fix this connectivity issue?

Asked Mar 21 at 13:20 by Amadou cisse

1 Answer

The timeout most likely comes from the SageMaker domain's network setting, AppNetworkAccessType: PublicInternetOnly. In this mode, Studio app traffic is routed through a SageMaker-managed VPC with direct internet access rather than through your own VPC, so the app cannot reach private resources such as your VPC-only OpenSearch domain.

Set AppNetworkAccessType to VpcOnly so that Studio traffic goes through the subnets you specified and can reach VPC resources like OpenSearch.
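To confirm which mode a domain is in, you can inspect the describe_domain response. The following is a minimal sketch: the helper name explain_network_mode is hypothetical, and it assumes a missing AppNetworkAccessType key means the CreateDomain default of PublicInternetOnly.

```python
def explain_network_mode(domain: dict) -> str:
    """Summarise whether Studio apps in this domain can reach VPC-only resources.

    `domain` is the dict returned by the SageMaker describe_domain call.
    Assumption: a missing key is treated as the default, PublicInternetOnly.
    """
    mode = domain.get("AppNetworkAccessType", "PublicInternetOnly")
    if mode == "VpcOnly":
        return "VpcOnly: app traffic uses the domain's subnets, so VPC endpoints are reachable."
    return (
        f"{mode}: app traffic leaves through a SageMaker-managed VPC, "
        "so a VPC-only OpenSearch endpoint will time out."
    )


# Usage (requires AWS credentials; the domain ID below is a placeholder):
# import boto3
# domain = boto3.client("sagemaker").describe_domain(DomainId="d-xxxxxxxxxxxx")
# print(explain_network_mode(domain))
```

For the domain shown in the question, this would report the PublicInternetOnly case.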

You may not need to recreate the domain: the UpdateDomain API accepts an AppNetworkAccessType parameter, so try switching the existing domain first. If that is not possible in your setup, recreate the SageMaker domain with AppNetworkAccessType=VpcOnly. Either way, VpcOnly disables direct internet access by default; to allow internet access, route the subnet's outbound traffic through a NAT gateway.
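If you recreate the domain, the request could look roughly like the sketch below. The helper name vpc_only_domain_request and all argument values are hypothetical placeholders; the keys are CreateDomain parameters. Note that in VpcOnly mode the default user settings need at least one security group.

```python
def vpc_only_domain_request(domain_name, vpc_id, subnet_ids,
                            security_group_ids, execution_role_arn):
    """Build keyword arguments for sagemaker.create_domain in VpcOnly mode."""
    return {
        "DomainName": domain_name,
        "AuthMode": "IAM",
        "VpcId": vpc_id,
        "SubnetIds": list(subnet_ids),
        # Route Studio app traffic through the subnets above
        "AppNetworkAccessType": "VpcOnly",
        "DefaultUserSettings": {
            "ExecutionRole": execution_role_arn,
            # Security groups are required for VpcOnly domains
            "SecurityGroups": list(security_group_ids),
        },
    }


# Usage (requires AWS credentials; every value here is a placeholder):
# import boto3
# request = vpc_only_domain_request(
#     "my-vpc-only-domain", "vpc-xxxx", ["subnet-xxxx"], ["sg-xxxx"],
#     "arn:aws:iam::123456789012:role/my-sagemaker-execution-role",
# )
# boto3.client("sagemaker").create_domain(**request)
```

Reusing the VPC, subnet, and security group of the OpenSearch domain keeps the existing security group rules applicable.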

For details, see the VPC configuration notes in the AWS documentation for CreateDomain:
https://docs.aws.amazon.com/sagemaker/latest/APIReference/API_CreateDomain.html
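Once the domain runs in VpcOnly mode, a plain TCP probe from the Studio terminal or notebook can separate network problems from authentication problems before you involve opensearch-py. This is a sketch using only the standard library; the function name can_open_tcp is hypothetical.

```python
import socket


def can_open_tcp(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, and DNS resolution failures
        return False


# Usage (replace the placeholder with your domain's VPC endpoint):
# print(can_open_tcp("<your-opensearch-vpc-endpoint>"))
```

If the probe returns False, the problem is still in routing or security groups; if it returns True but client.info() fails, look at the access policy and request signing instead.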
