
python - Does Snowflake ODBC Driver Support fast_executemany - Issue with varchar(max) columns - Stack Overflow


I would appreciate help and guidance on the following questions.

Scenario:

I am trying to write a simple Python script that uses pyodbc and works with various data sources (SQL Server, Azure SQL, Snowflake, and so on), basically any source that supports ODBC connections. We hit a problem when loading from SQL Server into Snowflake: the source table contains a column whose datatype is varchar(max). Here are the issues and questions we ran into.

Questions:

  1. Does the Snowflake ODBC driver support fast_executemany? I have not been able to find any documentation that confirms it.
  2. If fast_executemany is set to True, I get a MemoryError. I have read various issues and articles that discuss this, but none of the suggested approaches fixes it. For example, I have tried both snow_cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 0, 0)]) and snow_cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 16777216, 0)]). Both fail.
  3. If fast_executemany is set to False, the records are inserted one by one, which is painfully slow.

What would be the right approach to fix this issue?
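
For context on the MemoryError in question 2, here is a rough back-of-the-envelope sketch. It assumes (this is an assumption drawn from how fast_executemany is commonly described, not something confirmed for the Snowflake driver) that pyodbc binds one fixed-width buffer per row, sized from the declared column width. The function name estimated_buffer_bytes and the 2 bytes per character figure are illustrative only.

```python
# Rough estimate only. Assumes one fixed-width buffer per row, sized from the
# declared column width (an assumption, not verified against the Snowflake driver).
def estimated_buffer_bytes(rows, declared_chars, bytes_per_char=2):
    """Bytes needed to bind a single wide-character column for a whole batch."""
    return rows * declared_chars * bytes_per_char


# Batch size from the script (1000) and the 16777216-character size passed
# to setinputsizes in the question.
estimate = estimated_buffer_bytes(rows=1000, declared_chars=16_777_216)
print(f"{estimate / 1024**3:.2f} GiB for one column")
```

If that assumption holds, a single varchar(max) column at this batch size would already ask for far more memory than a typical machine has, which would be consistent with the error regardless of the actual data length.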

Sample Code:

import pyodbc

print("Starting script execution...")

# Snowflake connection parameters
conn_params = {
    'DRIVER': 'SnowflakeDSIIDriver',
    'SERVER': '<account>.snowflakecomputing.com',
    'DATABASE': '<database>',
    'SCHEMA': '<schema>',
    'WAREHOUSE': '<warehouse>',
    'ROLE': '<role>',
    'AUTHENTICATOR': 'snowflake_jwt',
    'PRIV_KEY_FILE': '<key_file_path>',
    'PRIV_KEY_FILE_PWD': '<key_password>',
    'UID': '<username>',
    'CLIENT_SESSION_KEEP_ALIVE': 'TRUE'
}

print("Connection parameters defined...")

# SQL Server connection parameters
sql_params = {
    'DRIVER': '{ODBC Driver 18 for SQL Server}',
    'SERVER': '<server>',
    'DATABASE': '<database>',
    'INSTANCE': '<instance>',
    'ENCRYPT': 'yes',
    'TRUSTSERVERCERTIFICATE': 'yes',
    'CONNECTION_TIMEOUT': '30',
    'UID': '<username>',
    'PWD': '<password>'
}

# Create connection strings
snow_conn_str = ';'.join([f"{k}={v}" for k, v in conn_params.items()])
sql_conn_str = ';'.join([f"{k}={v}" for k, v in sql_params.items()])

try:
    # Connect to SQL Server
    sql_conn = pyodbc.connect(sql_conn_str)
    sql_cursor = sql_conn.cursor()

    # Connect to Snowflake
    snow_conn = pyodbc.connect(snow_conn_str)
    snow_cursor = snow_conn.cursor()
    snow_cursor.fast_executemany = False  # True raises MemoryError (question 2)
    # Also tried, with the same failure: setinputsizes([(pyodbc.SQL_WVARCHAR, 0, 0)])
    snow_cursor.setinputsizes([(pyodbc.SQL_WVARCHAR, 16777216, 0)])

    # Prepare insert query
    insert_query = """
    INSERT INTO SNOWFLAKE_TABLE
    (COL_01, COL_02, COL_03, COL_04,
    COL_05, COL_06, COL_07, COL_08, COL_09)
    VALUES (?,?,?,?,?,?,?,?,?)
    """

    # Source query
    source_query = "SELECT top 1000 * FROM <source_table> with (nolock)"
    sql_cursor.execute(source_query)
    print("SQL query executed successfully")

    batch_size = 1000
    total_rows = 0

    while True:
        # Fetch batch of rows
        rows = sql_cursor.fetchmany(batch_size)
        print(f"Fetched {len(rows) if rows else 0} rows from SQL Server")
        if not rows:
            break

        # Insert batch into Snowflake
        snow_cursor.executemany(insert_query, rows)
        print(f"Executed batch insert of {len(rows)} rows to Snowflake")
        snow_conn.commit()
        print("Committed changes to Snowflake")

        total_rows += len(rows)
        print(f"Inserted {len(rows)} rows. Total rows processed: {total_rows}")

    print(f"Successfully completed. Total rows inserted: {total_rows}")

except pyodbc.Error as e:
    print(f"ODBC Error: {str(e)}")
    import traceback
    print(traceback.format_exc())
    raise  # Re-raise to see full error chain
except Exception as f:
    print(f"Unexpected error: {str(f)}")
    import traceback
    print(traceback.format_exc())
    raise  # Re-raise to see full error chain

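finally:
    # Close whatever was actually opened. Looking names up by string avoids a
    # NameError when a connect() call failed before the variable was assigned.
    for name in ("sql_cursor", "snow_cursor", "sql_conn", "snow_conn"):
        resource = locals().get(name)
        if resource is not None:
            resource.close()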
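
One alternative to row-by-row executemany that could be tried here is to send several rows per statement using a multi-row VALUES clause. The sketch below only builds the statement and the flattened parameter list. The helper names (chunked, build_multirow_insert, flatten) are hypothetical, it has not been run against Snowflake, and drivers may limit the number of bind parameters per statement, so the chunk size would need tuning.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` rows."""
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def build_multirow_insert(table, columns, n_rows):
    """Build INSERT ... VALUES (?,..),(?,..) with one placeholder group per row."""
    group = "(" + ",".join("?" * len(columns)) + ")"
    values = ",".join([group] * n_rows)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values}"


def flatten(rows):
    """Turn [(a, b), (c, d)] into [a, b, c, d] to match the placeholder order."""
    return [value for row in rows for value in row]


# Hypothetical usage with the cursors from the script above:
#   for chunk in chunked(rows, 50):
#       sql = build_multirow_insert("SNOWFLAKE_TABLE", column_names, len(chunk))
#       snow_cursor.execute(sql, flatten(chunk))
print(build_multirow_insert("SNOWFLAKE_TABLE", ["COL_01", "COL_02"], 2))
```

The statement is rebuilt per chunk because the last chunk is usually shorter than the others.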

Thanks. Cheers.
