最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

python - Trouble constructing left join query - Stack Overflow

programmeradmin1浏览0评论

The error is

Unknown database 'second_query'

The variable stats_group can contain one or more comma-separated field names. The query works when entered directly into MySQL Workbench with the variables replaced with literal values. The generated query looks right.

sql = "WITH first_query AS (SELECT " + stats_group + ", COUNT(*) AS count_all FROM " + table_name + " GROUP BY " + stats_group + "), second_query AS (SELECT " + stats_group + ", COUNT(*) AS count_filtered FROM " + table_name + " WHERE latitude <> '' AND longitude <> '' GROUP BY " + stats_group + ") SELECT "
stats_fields = stats_group.split(",")
for stat_field in stats_fields:
    sql = sql + "COALESCE(first_query." + stat_field + ", second_query." + stat_field + ") AS " + stat_field + ", "
sql = sql + "first_query.count_all, second_query.count_filtered FROM first_query LEFT JOIN second_query ON "
and_flag = 0
for stat_field in stats_fields:
    if and_flag == 1:
        sql = sql + "AND "
    sql = sql + "first_query." + stat_field + ", second_query." + stat_field + " "
    and_flag = 1
sql = sql + "ORDER BY " + stats_group + ";"

The error is

Unknown database 'second_query'

The variable stats_group can contain one or more comma-separated field names. The query works when entered directly into MySQL Workbench with the variables replaced with literal values. The generated query looks right.

sql = "WITH first_query AS (SELECT " + stats_group + ", COUNT(*) AS count_all FROM " + table_name + " GROUP BY " + stats_group + "), second_query AS (SELECT " + stats_group + ", COUNT(*) AS count_filtered FROM " + table_name + " WHERE latitude <> '' AND longitude <> '' GROUP BY " + stats_group + ") SELECT "
stats_fields = stats_group.split(",")
for stat_field in stats_fields:
    sql = sql + "COALESCE(first_query." + stat_field + ", second_query." + stat_field + ") AS " + stat_field + ", "
sql = sql + "first_query.count_all, second_query.count_filtered FROM first_query LEFT JOIN second_query ON "
and_flag = 0
for stat_field in stats_fields:
    if and_flag == 1:
        sql = sql + "AND "
    sql = sql + "first_query." + stat_field + ", second_query." + stat_field + " "
    and_flag = 1
sql = sql + "ORDER BY " + stats_group + ";"
Share Improve this question edited 12 hours ago philipxy 15.2k6 gold badges43 silver badges97 bronze badges asked 22 hours ago Bob BurleyBob Burley 92 bronze badges New contributor Bob Burley is a new contributor to this site. Take care in asking for clarification, commenting, and answering. Check out our Code of Conduct. 8
  • 1 you should mention it in question, not in comment. – furas Commented 21 hours ago
  • 3 maybe first you should use print( sql ) to see if you have expected query. And if it doesn't work in code then it should show some error message which you should show in question (not in comments). We can't run your code, we can't see your database, and we can't read in your mind - you have to show all details in question. – furas Commented 21 hours ago
  • did you copy this string (displayed with print( sql )) to workbench to test it? Do you get any errro message when your code in console? Without error we have no idea what is the problem. – furas Commented 21 hours ago
  • Yes, I did print(sql) and it looks correct Provide it. – Akina Commented 21 hours ago
  • 2 Thank you @furas for your comment. Pasting it into workbench allowed me to spot the one wrong character that I couldn't see by looking at it. The corrected line is sql = sql + "first_query." + stat_field + " = second_query." + stat_field + " " – Bob Burley Commented 21 hours ago
 |  Show 3 more comments

3 Answers 3

Reset to default 1

If compatibility with older MySQL versions is required, rewrite the query without CTEs by using subqueries

sql = (
    "SELECT "
)

stats_fields = stats_group.split(",")
for stat_field in stats_fields:
    sql += f"COALESCE(first_query.{stat_field}, second_query.{stat_field}) AS {stat_field}, "

sql += (
    "first_query.count_all, second_query.count_filtered "
    "FROM (SELECT " + stats_group + ", COUNT(*) AS count_all "
    "FROM " + table_name + " GROUP BY " + stats_group + ") AS first_query "
    "LEFT JOIN (SELECT " + stats_group + ", COUNT(*) AS count_filtered "
    "FROM " + table_name + " WHERE latitude <> '' AND longitude <> '' GROUP BY " + stats_group + ") AS second_query "
    "ON "
)

# Add JOIN conditions
and_flag = 0
for stat_field in stats_fields:
    if and_flag == 1:
        sql += "AND "
    sql += f"first_query.{stat_field} = second_query.{stat_field} "
    and_flag = 1

sql += f"ORDER BY {stats_group};"

print(sql) 

Thanks to a suggestion from @furas, I finally spotted the bad character. The comma near the middle should have been an equal sign.

The original line was

sql = sql + "first_query." + stat_field + ", second_query." + stat_field + " "

The corrected line is

sql = sql + "first_query." + stat_field + " = second_query." + stat_field + " "

Others, particularly the answer by szymon-roziewski have highlighted the slight error in your join criteria. As others have suggested, use of an f-string here would likely make your sql easier to read and debug. Just a demo one way you might approach that:

table_name = "geo_data"
stats_group = "country,region,city"
stats_fields = stats_group.split(",")

## --------------
## Create a nicely formatted string representing our COALESCE statements
## --------------
coalesce_statements = [
    f"COALESCE(first_query.{stat_field}, second_query.{stat_field}) AS {stat_field}"
    for stat_field in stats_fields
]
coalesce_statements = ", \n    ".join(coalesce_statements)
## --------------

## --------------
## Create a nicely formatted string representing our JOIN criteria
## --------------
join_conditions = [
    f"first_query.{stat_field} = second_query.{stat_field}"
    for stat_field in stats_fields
]
join_conditions = " AND \n    ".join(join_conditions)
## --------------

## --------------
## Our SQL statement
## --------------
sql = f"""
WITH
    first_query AS (
        SELECT {stats_group}, COUNT(*) AS count_all
        FROM {table_name}
        GROUP BY {stats_group}
    ),
    second_query AS (
        SELECT {stats_group}, COUNT(*) AS count_filtered
        FROM {table_name}
        WHERE latitude <> '' AND longitude <> ''
        GROUP BY {stats_group}
    )

SELECT
    {coalesce_statements},
    first_query.count_all,
    second_query.count_filtered
FROM first_query LEFT JOIN second_query ON
    {join_conditions}
ORDER BY
    {stats_group};
"""
## --------------

print(sql)

That should give you back:

WITH
    first_query AS (
        SELECT country,region,city, COUNT(*) AS count_all
        FROM geo_data
        GROUP BY country,region,city
    ),
    second_query AS (
        SELECT country,region,city, COUNT(*) AS count_filtered
        FROM geo_data
        WHERE latitude <> '' AND longitude <> ''
        GROUP BY country,region,city
    )

SELECT
    COALESCE(first_query.country, second_query.country) AS country, 
    COALESCE(first_query.region, second_query.region) AS region, 
    COALESCE(first_query.city, second_query.city) AS city,
    first_query.count_all,
    second_query.count_filtered
FROM first_query LEFT JOIN second_query ON
    first_query.country = second_query.country AND 
    first_query.region = second_query.region AND 
    first_query.city = second_query.city
ORDER BY
    country,region,city;
发布评论

评论列表(0)

  1. 暂无评论