amazon web services - Is there a way to drop file extensions when using AWS CLI with --recursive?

I am trying to recursively upload parquet files to an AWS S3 bucket using AWS CLI. I want to drop the .parquet and use the file name as the target table name.

So in a directory containing table1.parquet and table2.parquet, I want to run something like this:

aws s3 cp ./MyDir s3://mybucket/ --recursive

I get the error below, which makes sense because the expected table is table1, not table1.parquet:

s3://mybucket/table1.parquet is not found

Ideally I would be able to write something like the following in my CLI statement, where {filename} becomes table1, table2, etc.:

aws s3 cp ./MyDir s3://mybucket/{filename} --recursive

asked Nov 19, 2024 at 15:23 by clueless; edited Nov 19, 2024 at 20:51 by John Rotenstein

1 Are you saying that you get the "is not found" error while running the aws s3 cp command? That's very strange, since it shouldn't be looking for any particular table names in the destination. What do you mean by "which makes sense"? Can you explain more? (Oh, and perhaps try changing ./MyDir into ./MyDir/?) – John Rotenstein Commented Nov 19, 2024 at 20:54


1 Answer


The AWS CLI has no built-in option to rename files during a recursive upload. However, you can achieve your goal with a script that uploads each file individually. Here's a simple Bash script that uploads the Parquet files to S3 and drops the .parquet extension from each key:

#!/bin/bash

# Directory containing the Parquet files
SOURCE_DIR="./MyDir"
# Target S3 bucket
S3_BUCKET="s3://mybucket/"

# Loop through all .parquet files in the directory
# (nullglob makes the loop a no-op when nothing matches)
shopt -s nullglob
for filepath in "$SOURCE_DIR"/*.parquet; do
  # Extract the filename without the path
  filename=$(basename "$filepath")
  
  # Remove the .parquet extension
  target_name="${filename%.parquet}"

  # Upload the file to S3 with the new name and report the result
  if aws s3 cp "$filepath" "$S3_BUCKET$target_name"; then
    echo "Uploaded $filepath as $target_name"
  else
    echo "Failed to upload $filepath" >&2
  fi
done

This will upload every .parquet file in the top level of the ./MyDir directory to your S3 bucket, using the filename (without .parquet) as the key. Note that the loop does not descend into subdirectories.
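
Since the question asks about --recursive, here is a rough sketch of how the same idea might be extended to subdirectories with find. The function names target_key and upload_tree and the DRY_RUN variable are made up for this example and are not part of the AWS CLI; the sketch also assumes file names contain no newline characters.

```shell
#!/bin/bash

# Hypothetical helper: turn a local path into an S3 key by stripping
# the source directory prefix and the .parquet extension.
#   $1 = source directory, $2 = path of a file under that directory
target_key() {
  local source_dir="${1%/}"
  local filepath="$2"
  local relative="${filepath#"$source_dir"/}"
  printf '%s\n' "${relative%.parquet}"
}

# Hypothetical helper: upload every .parquet file under a directory tree.
#   $1 = source directory, $2 = S3 prefix such as s3://mybucket/
# If DRY_RUN is set to a non-empty value, the aws commands are printed
# instead of executed.
upload_tree() {
  local source_dir="${1%/}"
  local s3_prefix="${2%/}"
  local filepath key
  find "$source_dir" -type f -name '*.parquet' |
    while IFS= read -r filepath; do
      key=$(target_key "$source_dir" "$filepath")
      ${DRY_RUN:+echo} aws s3 cp "$filepath" "$s3_prefix/$key"
    done
}
```

Running DRY_RUN=1 upload_tree ./MyDir s3://mybucket/ first prints the commands without uploading anything, which makes it easy to check the resulting keys; upload_tree ./MyDir s3://mybucket/ then performs the uploads. Files in subdirectories keep their relative path in the key, so ./MyDir/sub/table2.parquet would be uploaded as sub/table2.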
