sql - BigQuery - Sorting a Datetime string - Stack Overflow

I have a column of datatype String, e.g. 2025-01-20T23:38:31.8223598Z

If I apply an ORDER BY on this column inside a window function, as below:

ROW_NUMBER() OVER (PARTITION BY id ORDER BY modifiedOn DESC) AS rank

Will the sorting actually be based on the DateTime value or on the String? Do I need to explicitly convert it to a DateTime/Timestamp before ordering?

Any insights on this please.

Asked Feb 7 at 18:28 by Safiyur. Edited Feb 8 at 3:37 by Dale K.

Comments:
  • If you don't cast, it's going to sort by however it's stored. Assuming your data is always in that format, it will sort the same. The real question is, why are you storing a timestamp as a string? – Andrew, Feb 7 at 18:50
  • Surely you could just try it and see? You should always store your data using the correct datatype, as using functions to convert it can result in poor performance, and/or issues and exceptions if you ever have a row with invalid data. – Dale K, Feb 8 at 3:37

1 Answer

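If every value in the column uses the same format as your example (YYYY-MM-DDTHH:MM:SS...), sorting it as a string also sorts it chronologically. The components run from the largest unit to the smallest (year, month, day, hour, minute, second) and each is zero-padded to a fixed width, so comparing character by character gives the same result as comparing the dates: a string for 2020 sorts before a string for 2022. This relies on every value having the same number of fractional-second digits and the same Z (UTC) suffix.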
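
To see why identical formatting matters, here is a minimal Python sketch (Python rather than BigQuery SQL, purely so it can be run anywhere). It uses plain code-point string comparison, which I am assuming matches how a STRING column without a collation is ordered. Only the first sample value comes from the question; the other values and the pad_fraction helper are made up for illustration.

```python
# Hypothetical sample values: only the first one appears in the question.
same_width = [
    "2025-01-20T23:38:31.8223598Z",
    "2024-12-31T23:59:59.0000000Z",
    "2025-01-20T23:38:31.8223599Z",
]

# Plain string sort, newest first (analogous to ORDER BY modifiedOn DESC
# on a string column). All values share one width, so this is also
# chronological order.
newest_first = sorted(same_width, reverse=True)

# Two values with different numbers of fractional digits.
mixed = [
    "2025-01-20T23:38:31.82Z",        # earlier instant (.8200000)
    "2025-01-20T23:38:31.8223598Z",   # later instant
]

# String comparison reaches 'Z' vs '2' after the shared prefix "...31.82";
# 'Z' has the higher code point, so the earlier instant is picked as "newest".
naive_newest = max(mixed)


def pad_fraction(ts: str, digits: int = 7) -> str:
    """Hypothetical helper: zero-pad fractional seconds to a fixed width."""
    head, _, frac = ts.rstrip("Z").partition(".")
    return f"{head}.{frac.ljust(digits, '0')}Z"


# Comparing the padded forms restores chronological order.
padded_newest = max(mixed, key=pad_fraction)

print(newest_first[0])   # 2025-01-20T23:38:31.8223599Z
print(naive_newest)      # 2025-01-20T23:38:31.82Z
print(padded_newest)     # 2025-01-20T23:38:31.8223598Z
```

The second case is the kind of silent mis-ordering that comparing real timestamps instead of strings avoids.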

I still recommend converting explicitly, for the sake of data quality, consistency, optimization, etc. You could use PARSE_TIMESTAMP:

ROW_NUMBER() OVER (
  PARTITION BY id 
  ORDER BY PARSE_TIMESTAMP('%Y-%m-%dT%H:%M:%E*SZ', modifiedOn) DESC
) AS rank

Let me know if you need more details or have any doubts!

Resources: https://www.ionos.com/digitalguide/websites/web-development/iso-8601/, https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions#parse_timestamp
