I'm trying to export table data (9M rows with 40GB data) on vanilla MySQL RDS to S3 using Glue. I'm testing it out on a smaller instance t3.medium. This is the only job/query running on the DB instance. I see WriteOps happening. On further investigating I see that temporary tables are being created and the count keeps increasing. One of the observation is that the Command that Glue executed is of Execute type and not Query. The query doesn't have any where/groupby/order clause. I'm trying to figure out why this is happening and what the performance impact would If I run it on a production instance.
I also tried just running the query as is
mysql -u admin -p -e "SELECT * FROM <table>" <db> \
--batch --silent --quick | sed 's/\t/,/g' > output.csv
This didn't trigger any Write Ops. I was thinking of a Glue Job as it would be simpler to write in parquet format to s3 for later use.
But the unnecessary WriteOps worry me. Can you help me understand why the Glue job query results in Writes and also am I taking the right approach for archiving the data to S3?