I am writing a stored procedure with Snowflake's Snowpark API, using Python as the language. I want my procedure to log every time it executes a DML statement. When I write procedures that construct the SQL strings and then execute them via session.call(SQLString), this is easy: I can log the SQL string I constructed. This approach looks something like this in a JavaScript procedure:
const EXAMPLE_QUERY = `CALL EXAMPLE_PROCEDURE('${ARG1}', '${ARG2}', '${ARG3}');`;
let exampleStatement = snowflake.createStatement({sqlText: EXAMPLE_QUERY});
log(EXAMPLE_QUERY, "info", `Exit code: ${exitCode}`);
The EXAMPLE_QUERY is itself a string that I can log directly, and it tells me exactly which DML statement was executed when I read through the logs. Very informative and convenient.
Right now I am trying to use the Snowpark session object and its methods, so I don't have a SQL string that I construct in the procedure. What can I get from my session object to put in my log table? It does not need to be a SQL statement exactly, but I would like to know what is happening to the data. The last line of the example procedure below is the problem I am trying to solve.
CREATE OR REPLACE PROCEDURE EXAMPLE()
RETURNS VARIANT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.9'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'run'
AS
$$
def run(session):
    PROCEDURE_NAME = 'EXAMPLE'
    def log(execution, level, message):
        try:
            logging_function = 'EXAMPLE_DB.EXAMPLE_SCHEMA.LOGGING' # This UDF already lives in the database
            session.call(logging_function, PROCEDURE_NAME, execution, level, message)
        except Exception as e:
            return {"status": "ERROR", "message": "Error in the logging function: "+str(e)}
    try:
        log(f'CALL {PROCEDURE_NAME}({ADMIN_DB}, {BUILD_DB}, {BUILD_SCHEMA}, {BUILDS_TO_KEEP})', 'info', 'Begin run')
        example_table = "EXAMPLE_DB.EXAMPLE_SCHEMA.EXAMPLE_TABLE"
        get_tables_query = session.table(example_table).select('EXAMPLE_COLUMN').distinct()
        example_output = [row['EXAMPLE_COLUMN'] for row in get_tables_query.collect()]
        log(get_tables_query, 'info', 'Unique tables returned: '+str(len(target_tables))) # Does not work!
Asked yesterday by MackM (edited 23 hours ago)
1 Answer
You can use QueryHistory as a context manager to record the queries that Snowpark runs.
From the documentation:
Create an instance of QueryHistory as a context manager to record queries that are pushed down to the Snowflake database.
You can then read details such as sql_text and query_id from each QueryRecord.
With that information, you can collect the statements like this:
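# Record the queries Snowpark pushes down while this block is active
with session.query_history(True) as query_history:
    example_table = "TEST.TEST.TABLE1"
    df = session.table(example_table).select('X')
# Each entry in query_history.queries is a QueryRecord
for query in query_history.queries:
    sql_text = query.sql_text
    query_id = query.query_id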
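To connect this back to the log function in the question, here is a minimal sketch of a helper that forwards each recorded query to a logger with the same (execution, level, message) signature. The names log_recorded_queries and FakeQueryRecord are hypothetical and not part of Snowpark; the fake record only mimics the query_id and sql_text attributes so the snippet can run without a Snowflake connection.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class FakeQueryRecord:
    """Hypothetical stand-in for a Snowpark QueryRecord, for local testing only."""
    query_id: str
    sql_text: str


def log_recorded_queries(
    records: Iterable,
    log: Callable[[str, str, str], None],
    level: str = "info",
) -> int:
    """Pass each recorded query to log(execution, level, message); return the count."""
    count = 0
    for record in records:
        # Only sql_text and query_id are read, the two fields used above.
        log(record.sql_text, level, f"Query ID: {record.query_id}")
        count += 1
    return count


# Local demonstration: a list-backed logger instead of the LOGGING procedure.
captured: List[Tuple[str, str, str]] = []


def capture_log(execution: str, level: str, message: str) -> None:
    captured.append((execution, level, message))


records = [
    FakeQueryRecord("fake-query-id-1", 'SELECT DISTINCT "X" FROM TEST.TEST.TABLE1'),
]
logged = log_recorded_queries(records, capture_log)
```

Inside the procedure, the equivalent call would be log_recorded_queries(query_history.queries, log) after the with block, assuming log is the function defined in the question. Snowpark DataFrames are evaluated lazily, so the statement you care about shows up in the history once an action such as collect() runs inside the with block.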
I have simplified your procedure for demonstration purposes:
CREATE OR REPLACE PROCEDURE EXAMPLE()
RETURNS VARIANT
LANGUAGE PYTHON
RUNTIME_VERSION = '3.9'
PACKAGES = ('snowflake-snowpark-python')
HANDLER = 'run'
AS
$$
def run(session):
    PROCEDURE_NAME = 'EXAMPLE'
    try:
        with session.query_history(True) as query_history:
            example_table = "TEST.TEST.TABLE1"
            df = session.table(example_table).select('X')
        for query in query_history.queries:
            sql_text = query.sql_text
            query_id = query.query_id
        return (sql_text,query_id)
    except Exception as ex:
        raise
$$;
This returns