Error: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : com.amazonaws.SdkClientException: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))

```python
        .config("spark.hadoop.fs.s3a.aws.credentials.provider",
                "com.amazonaws.auth.profile.DefaultAWSCredentialsProviderChain")
        .config("spark.hadoop.fs.s3a.access.key",
                AWSHandler.get_session(Constant.aws_sso_profile).get_credentials().access_key)
        .config("spark.hadoop.fs.s3a.secret.key",
                AWSHandler.get_session(Constant.aws_sso_profile).get_credentials().secret_key)
        .config("spark.hadoop.fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem")
        .config("spark.executor.instances", 4)
        .getOrCreate()
    )
    return spark
```

In production, hard-coding the access and secret key is not allowed, which leaves me with this approach of getting the credentials from `.aws`.
2 Answers
> In production, hard coding the access and secret key is not allowed

Exactly. This is why you'd use the ENVIRONMENT VARIABLES mentioned in the error (`AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`); read the error message and actually understand it before posting here. Also read the AWS documentation.
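For example, a minimal sketch of that approach, assuming `AWSHandler` and `Constant` are your own helpers returning a boto3-style session: export the SSO session's short-lived credentials as environment variables before the JVM starts, so the chain's environment-variable provider can pick them up.

```python
import os

# Hypothetical: AWSHandler and Constant are the asker's own helpers,
# assumed to return a boto3-style session for the SSO profile.
creds = AWSHandler.get_session(Constant.aws_sso_profile).get_credentials()

# Export before SparkSession.builder runs so the JVM inherits them.
os.environ["AWS_ACCESS_KEY_ID"] = creds.access_key
os.environ["AWS_SECRET_ACCESS_KEY"] = creds.secret_key
# SSO credentials are temporary, so the session token is required as well.
os.environ["AWS_SESSION_TOKEN"] = creds.token
```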
There is no need for explicit credentials; using the `DefaultCredentialsProvider` is enough. The AWS Java SDK way:

`DefaultCredentialsProvider.create()` uses `~/.aws/credentials` or the environment.
```java
import software.amazon.awssdk.auth.credentials.DefaultCredentialsProvider;
import software.amazon.awssdk.services.s3.S3Client;
import software.amazon.awssdk.services.s3.model.S3Exception;

// Uses ~/.aws/credentials or environment variables
try (S3Client s3 = S3Client.builder()
        .credentialsProvider(DefaultCredentialsProvider.create())
        .build()) {
    // Use the client here, e.g. list the buckets these credentials can see
    s3.listBuckets().buckets().forEach(b -> System.out.println(b.name()));
} catch (S3Exception e) {
    System.err.println("Error occurred: " + e.awsErrorDetails().errorMessage());
}
```
Now ... translating into the SparkSession way:
```python
spark = SparkSession.builder \
    .appName("S3A Example Without Explicit Credentials") \
    .config("spark.hadoop.fs.s3a.aws.credentials.provider", "com.amazonaws.auth.DefaultAWSCredentialsProviderChain") \
    .config("spark.hadoop.fs.s3a.endpoint", "s3.amazonaws.com") \
    .config("spark.hadoop.fs.s3a.fast.upload", "true") \
    .config("spark.hadoop.fs.s3a.multipart.size", "104857600") \
    .config("spark.hadoop.fs.s3a.threads.max", "10") \
    .getOrCreate()
```
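To confirm it works, a quick usage sketch (the bucket and path are placeholders, not from the question):

```python
# Hypothetical bucket/prefix, purely for illustration
df = spark.read.parquet("s3a://my-bucket/path/to/data/")
df.show(5)
```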
Sequence of resolution:

1. Environment variables: `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY`.
2. Java system properties: `aws.accessKeyId` and `aws.secretKey`.
3. AWS credentials file: by default, `~/.aws/credentials`.
4. Instance profile credentials: automatically populated on EC2 or in AWS containers.
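As a sanity check before starting Spark, you can print which identity the machine actually resolves to. This is a sketch assuming boto3 is available; its default chain resolves credentials in a very similar order.

```python
import boto3

# boto3 follows a comparable default resolution chain, so this shows
# which identity (if any) the environment resolves to before Spark starts.
session = boto3.Session()
print(session.client("sts").get_caller_identity()["Arn"])
```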