最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

javascript - JS UDF that returns a struct for Standard SQLBigQuery and creates two columns - Stack Overflow

programmeradmin1浏览0评论

I’m trying to write a user defined function for BigQuery using Javascript that returns a struct and generates two columns:

CREATE TEMP FUNCTION exampleFunction(exampleString STRING)
  RETURNS STRUCT<index INT64, latency INT64> LANGUAGE js AS
  LANGUAGE js AS 
"""
    var exampleStruct = {1:100, 2:200}
    return exampleStruct;
""";

My query would be something like this:

SELECT
exampleCol,
exampleFunction(stringCol) -- use SELECT AS STRUCT somewhere here? with the aliases “First” and “Second”
FROM
[SOME DATATBASE HERE]

I want the output of exampleFunction(stringCol) to generate two columns (three columns total if we include exampleCol). For example, if exampleCol gave us “SOMETHING”, I’d want to return the columns: “SOMETHING” for exampleCol, 1 for “First”, and 2 for “Second”. It this something that is possible?

I’m having trouble returning a STRUCT from the JS function (unsure if my syntax is off) and getting the query right. For the query, I want to avoid running the JavaScript function twice. Thanks!

I’m trying to write a user defined function for BigQuery using Javascript that returns a struct and generates two columns:

CREATE TEMP FUNCTION exampleFunction(exampleString STRING)
  RETURNS STRUCT<index INT64, latency INT64> LANGUAGE js AS
  LANGUAGE js AS 
"""
    var exampleStruct = {1:100, 2:200}
    return exampleStruct;
""";

My query would be something like this:

SELECT
exampleCol,
exampleFunction(stringCol) -- use SELECT AS STRUCT somewhere here? with the aliases “First” and “Second”
FROM
[SOME DATATBASE HERE]

I want the output of exampleFunction(stringCol) to generate two columns (three columns total if we include exampleCol). For example, if exampleCol gave us “SOMETHING”, I’d want to return the columns: “SOMETHING” for exampleCol, 1 for “First”, and 2 for “Second”. It this something that is possible?

I’m having trouble returning a STRUCT from the JS function (unsure if my syntax is off) and getting the query right. For the query, I want to avoid running the JavaScript function twice. Thanks!

Share Improve this question edited Apr 3, 2019 at 18:07 asked Apr 3, 2019 at 17:39 user7281939user7281939
Add a ment  | 

2 Answers 2

Reset to default 5

I'd like to add to Mikhail Berlyant's answer, which works fine in this situation, but I've run into a problem in slightly different situation.

Instead of using "this" in JavaScript which retains data across rows, I propose using a new object to do it.

In my example, I want to return one more column, with this column value based on value of another existing column. I'll add one more column named "latencyUnder150" which will have value of 'Y' if latency field has value lower than 150, otherwise just leave the field blank.

#standardSQL
CREATE TEMP FUNCTION exampleFunction(exampleString STRING)
  RETURNS STRUCT<index INT64, latency INT64, latencyUnder150 STRING> 
  LANGUAGE js AS 
"""
    arr = exampleString.split(':');
    this.index = arr[0];
    this.latency = arr[1];
    if (this.latency < 150) {
        this.latencyUnder150 = 'Y'
    }
    return this;
""";
WITH `project.dataset.table` AS (
  SELECT 1 exampleCol, '10:100' stringCol UNION ALL
  SELECT 2, '20:200' UNION ALL
  SELECT 3, '30:456'
)
SELECT exampleCol, exampleFunction(stringCol).*
FROM `project.dataset.table`
-- ORDER BY exampleCol   

Using "this" variable in JS, you'll get this answer.

| Row | exampleCol | index | latency | latencyUnder150 |
|-----|------------|-------|---------|-----------------|
| 1   | 1          | 10    | 100     | Y               |
| 2   | 2          | 20    | 200     | Y               |
| 3   | 3          | 30    | 456     | Y               |

You can see that the field latencyUnder150 retains value of "Y" from the first record.

By slightly changing code to use a new object, each row starts without value from previous row.

#standardSQL
CREATE TEMP FUNCTION exampleFunction(exampleString STRING)
  RETURNS STRUCT<index INT64, latency INT64, latencyUnder150 STRING> 
  LANGUAGE js AS 
"""
    var outObj = {}
    arr = exampleString.split(':');
    outObj.index = arr[0];
    outObj.latency = arr[1];
    if (outObj.latency < 150) {
        outObj.latencyUnder150 = 'Y'
    }
    return outObj;
""";
WITH `project.dataset.table` AS (
  SELECT 1 exampleCol, '10:100' stringCol UNION ALL
  SELECT 2, '20:200' UNION ALL
  SELECT 3, '30:456'
)
SELECT exampleCol, exampleFunction(stringCol).*
FROM `project.dataset.table`
-- ORDER BY exampleCol   
| Row | exampleCol | index | latency | latencyUnder150 |
|-----|------------|-------|---------|-----------------|
| 1   | 1          | 10    | 100     | Y               |
| 2   | 2          | 20    | 200     | (null)          |
| 3   | 3          | 30    | 456     | (null)          |

Below example is for BigQuery Standard SQL

#standardSQL
CREATE TEMP FUNCTION exampleFunction(exampleString STRING)
  RETURNS STRUCT<index INT64, latency INT64> 
  LANGUAGE js AS 
"""
    arr = exampleString.split(':');
    this.index = arr[0];
    this.latency = arr[1];
    return this;
""";
WITH `project.dataset.table` AS (
  SELECT 1 exampleCol, '10:100' stringCol UNION ALL
  SELECT 2, '20:200' UNION ALL
  SELECT 3, '30:456'
)
SELECT exampleCol, exampleFunction(stringCol).*
FROM `project.dataset.table`
-- ORDER BY exampleCol   

with result

Row exampleCol  index   latency  
1   1           10      100  
2   2           20      200  
3   3           30      456   

Note: if you want columns to be aliased with First, Second - you can either replace index and latency with respectively first, second like in below example

#standardSQL
CREATE TEMP FUNCTION exampleFunction(exampleString STRING)
  RETURNS STRUCT<first INT64, second INT64> 
  LANGUAGE js AS 
"""
    arr = exampleString.split(':');
    this.first = arr[0];
    this.second = arr[1];
    return this;
""";
SELECT exampleCol, exampleFunction(stringCol).*
FROM `project.dataset.table`  

or you can use below approach

#standardSQL
CREATE TEMP FUNCTION exampleFunction(exampleString STRING)
  RETURNS STRUCT<index INT64, latency INT64> 
  LANGUAGE js AS 
"""
    arr = exampleString.split(':');
    this.index = arr[0];
    this.latency = arr[1];
    return this;
""";
SELECT exampleCol, index AS first, latency AS second   
FROM (
  SELECT exampleCol, exampleFunction(stringCol).*
  FROM `project.dataset.table`
)

with below result in both cases

Row exampleCol  first   second   
1   1           10      100  
2   2           20      200  
3   3           30      456  
发布评论

评论列表(0)

  1. 暂无评论