最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

Translate excel formulas into Redshift SQL query - Stack Overflow

programmeradmin1浏览0评论

I am looking for help to translate excel formulas into Redshift SQL . With given fixed input values for rows 1 (Month ascending) and 2 (SU - some number) I have to calculate through particular cells to get value for D8.

Complexity is, that formulas reference values in recurring levels/previous column/earlier months

Let me provide calculations for one column, as an example let's take column D

D1 - Month
D2 - SU 
D3 - C7 
D4 - C4 + B7
D5 - D3 * 1.54 + D4
D6 - D2 - D5
D7 - D6 / 2.38
D8 - D3 + D4 + D7

Below queries to build and populate exemplary table and populate it with data.


  CREATE TABLE TESTOSS
   (    PERIOD_MONTH VARCHAR2(20 CHAR), 
    SU NUMBER
   );
   
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-09',503);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-10',1109);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-11',1537);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-12',2327);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-01',3000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-02',4000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-03',5000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-04',3292);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-05',3721);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-06',4149);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-07',4578);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-08',5006);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-09',5435);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-10',5863);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-11',6292);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-12',6720);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2022-02',7577);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2022-01',7149);

This one was once solved in: Translate excel formulas into SQL query using an ORACLE function MODEL, which is not present in Redshift. So far I didn't came up with the idea how to translate it properly.

I am looking for help to translate excel formulas into Redshift SQL . With given fixed input values for rows 1 (Month ascending) and 2 (SU - some number) I have to calculate through particular cells to get value for D8.

Complexity is, that formulas reference values in recurring levels/previous column/earlier months

Let me provide calculations for one column, as an example let's take column D

D1 - Month
D2 - SU 
D3 - C7 
D4 - C4 + B7
D5 - D3 * 1.54 + D4
D6 - D2 - D5
D7 - D6 / 2.38
D8 - D3 + D4 + D7

Below queries to build and populate exemplary table and populate it with data.


  CREATE TABLE TESTOSS
   (    PERIOD_MONTH VARCHAR2(20 CHAR), 
    SU NUMBER
   );
   
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-09',503);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-10',1109);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-11',1537);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2020-12',2327);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-01',3000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-02',4000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-03',5000);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-04',3292);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-05',3721);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-06',4149);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-07',4578);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-08',5006);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-09',5435);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-10',5863);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-11',6292);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2021-12',6720);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2022-02',7577);
Insert into TESTOSS (PERIOD_MONTH,SU) values ('2022-01',7149);

This one was once solved in: Translate excel formulas into SQL query using an ORACLE function MODEL, which is not present in Redshift. So far I didn't came up with the idea how to translate it properly.

Share Improve this question asked Nov 17, 2024 at 13:45 KondjitsuKondjitsu 211 gold badge1 silver badge6 bronze badges 2
  • ` WITH RECURSIVE cte AS ( SELECT PERIOD_MONTH, SU, ROW_NUMBER() OVER (ORDER BY PERIOD_MONTH) AS rn, 0 AS t1, 0 AS t2, 0 AS rp_su, 0 AS n_su, 0 AS np, 0 AS tpe FROM TESTOSS UNION ALL SELECT t.PERIOD_MONTH, t.SU, t.rn, c.np AS t1, c.t2 + c.np AS t2, c.t1 * 1.54 + c.t2 AS rp_su, t.SU - c.t1 * 1.54 - c.t2 AS n_su, (t.SU - c.t1 * 1.54 - c.t2) / 2.38 AS np, c.t1 + c.t2 + (t.SU - c.t1 * 1.54 - c.t2) / 2.38 AS tpe FROM TESTOSS t JOIN cte c ON t.rn = c.rn + 1 ) SELECT * FROM cte; ` – samhita Commented Nov 17, 2024 at 16:01
  • I could not test it as I dont have redshift, so putting it here in the comment. I tried converting the oracle logic to redshift, let me know if it returns the expected output – samhita Commented Nov 17, 2024 at 16:03
Add a comment  | 

2 Answers 2

Reset to default 1

This is a recursive problem so a recursive CTE is what you need.

SET UP:

CREATE TABLE testoss(    
   period_month VARCHAR(20), 
   su INT
);
   
INSERT INTO testoss (period_month, su) VALUES 
('2020-09',503),
('2020-10',1109),
('2020-11',1537),
('2020-12',2327),
('2021-01',3000),
('2021-02',4000),
('2021-03',5000),
('2021-04',3292),
('2021-05',3721),
('2021-06',4149),
('2021-07',4578),
('2021-08',5006),
('2021-09',5435),
('2021-10',5863),
('2021-11',6292),
('2021-12',6720),
('2022-02',7577),
('2022-01',7149);

QUERY:

WITH RECURSIVE build(period_month, su, t1, t2, rp_su, n_su, np, tpe, cur_rn, max_rn) AS (
   SELECT period_month, su, t1, t2, rp_su, n_su, np, tpe, cur_rn, max_rn
   from (SELECT period_month, su, 0::float AS t1, 0::float AS t2, 0::float AS rp_su, 
      su::float AS n_su, n_su / 2.38 AS np, np AS tpe, 
      ROW_NUMBER() OVER (ORDER BY PERIOD_MONTH) AS rn, 2::int as cur_rn, count(1) over() as max_rn
   FROM testoss ) where rn = 1
   UNION ALL 
   SELECT t.period_month, t.su, b.np as t1, b.t2 + b.t1 AS t2,
      b.np * 1.54 + b.t2 + b.t1 AS rp_su, t.su - (b.np * 1.54 + b.t2 + b.t1) AS n_su, 
      (t.su - (b.np * 1.54 + b.t2 + b.t1)) / 2.38 AS np, b.np + b.t2 + b.t1 + (t.su - (b.np * 1.54 + b.t2 + b.t1)) / 2.38 AS tpe, 
      b.cur_rn + 1 AS cur_rn, b.max_rn
   FROM build b 
   JOIN (SELECT period_month, su, lag(period_month) over(order by period_month) prev_period, 
      ROW_NUMBER() OVER (ORDER BY PERIOD_MONTH) AS rn from testoss) t 
   ON t.prev_period = b.period_month
   WHERE t.rn = b.cur_rn AND b.cur_rn <= b.max_rn) 
SELECT period_month, su, t1, t2, rp_su, n_su, np, tpe FROM build order by period_month, cur_rn;

RESULT (CSV)

period_month,su,t1,t2,rp_su,n_su,np,tpe
2020-09,503,0,0,0,503,211.34453781512607,211.34453781512607
2020-10,1109,211.34453781512607,0,325.47058823529414,783.5294117647059,329.2140385565991,540.5585763717252
2020-11,1537,329.2140385565991,211.34453781512607,718.3341571922888,818.6658428077112,343.9772448771896,884.5358212489148
2020-12,2327,343.9772448771896,540.5585763717252,1070.2835334825972,1256.7164665174028,528.0321287888247,1412.5679500377396
2021-01,3000,528.0321287888247,884.5358212489148,1697.7052995837048,1302.2947004162952,547.1826472337375,1959.750597271477
2021-02,4000,547.1826472337375,1412.5679500377396,2255.229226777695,1744.7707732223048,733.0969635387836,2692.847560810261
2021-03,5000,733.0969635387836,1959.750597271477,3088.719921121204,1911.2800788787958,803.058856671763,3495.9064174820237
2021-04,3292,803.058856671763,2692.847560810261,3929.5582000847758,-637.5582000847758,-267.88159667427556,3228.0248208077483
2021-05,3721,-267.88159667427556,3495.9064174820237,3083.3687586036394,637.6312413963606,267.91228630099187,3495.9371071087403
2021-06,4149,267.91228630099187,3228.0248208077483,3640.6097417112755,508.39025828872445,213.60935222215315,3709.5464593308934
2021-07,4578,213.60935222215315,3495.9371071087403,3824.895509530856,753.1044904691439,316.4304581803126,4025.976917511206
2021-08,5006,316.4304581803126,3709.5464593308934,4196.849364928575,809.1506350714253,339.97925843337197,4365.956175944578
2021-09,5435,339.97925843337197,4025.976917511206,4549.544975498599,885.4550245014007,372.0399262610927,4737.9961022056705
2021-10,5863,372.0399262610927,4365.956175944578,4938.89766238666,924.1023376133398,388.2782931148487,5126.274395320519
2021-11,6292,388.2782931148487,4737.9961022056705,5335.9446736025375,956.0553263974625,401.70391865439603,5527.978313974915
2021-12,6720,401.70391865439603,5126.274395320519,5744.898430048289,975.1015699517111,409.7065419965173,5937.684855971433
2022-01,7149,409.7065419965173,5527.978313974915,6158.926388649552,990.0736113504481,415.9973156934656,6353.682171664898
2022-02,7577,415.9973156934656,5937.684855971433,6578.3207221393695,998.6792778606305,419.6131419582481,6773.295313623146

EXPLANATION

  1. This recursive CTE plucks the first row off the test table and calculates the few values that don't need previous rows' data.
  2. Then this joined with the next row from the input table using the cur_rn column and row numbers of the source table. The join is done by using a lag() to get the previous period_month and joining this with the previously created results.
  3. Calculations only need t1 and t2 from the previous row which has been joined.
  4. Calculate values of current row and iterate.

If you have questions just ask in a comment.

With the sample data provided ...

period_month su
2020-09 503
2020-10 1109
2020-11 1537
2020-12 2327
2021-01 3000
2021-02 4000
2021-03 5000
2021-04 3292
2021-05 3721
2021-06 4149
2021-07 4578
2021-08 5006
2021-09 5435
2021-10 5863
2021-11 6292
2021-12 6720
2022-02 7577
2022-01 7149

... create a recursive cte with your table selfjoined on two consecutive rows so you can have together the values from both rows ---

WITH 
  Recursive grid ( RN, KEY, PERIOD_MONTH, SU, rownum,  
                   t1_rownum, t1_period_month, t1_su
                 ) AS
    ( SELECT t.RN, t.KEY, t.PERIOD_MONTH, t.SU, t.rownum,  
             t1.rownum, t1.PERIOD_MONTH, t1.SU
      FROM ( Select       1 as RN, 'Month' as KEY, PERIOD_MONTH, SU, 
                          Row_Number() Over(Partition By 1 Order By PERIOD_MONTH) as rownum
             From       TESTOSS ) t
      INNER JOIN ( Select     1 as RN, 'Month' as KEY, PERIOD_MONTH, SU, 
                              Row_Number() Over(Partition By 1 Order By PERIOD_MONTH) as rownum, 
                              PERIOD_MONTH as "C"
                   From       TESTOSS ) t1 ON(t1.RN = t.RN And t1.rownum - 1 = t.rownum )
     Union All
      Select      RN + 1, Case When RN + 1 = 2 Then 'SU'
                               When RN + 1 = 3 Then 'T1'
                               When RN + 1 = 4 Then 'T2'
                               When RN + 1 = 5 Then 'RP_SU'
                               When RN + 1 = 6 Then 'N_SU'
                               When RN + 1 = 7 Then 'NP'
                               When RN + 1 = 8 Then 'TPE'
                         End, 
                 PERIOD_MONTH, SU, rownum, 
                 t1_rownum, t1_period_month, t1_su
      From       grid 
      Where      RN < 8
    ) 

... use Case expressions to "populate cells"

--      M a i n    S Q L :
Select    RN, KEY as "A", 
          Case When rownum = 1 And RN = 1 Then 'Dummy'::Varchar(32) Else 0::Varchar(32) End as "B",
          Case When rownum = 1 And RN = 1 Then PERIOD_MONTH::Varchar(32)
               When rownum = 1 And RN = 2 Then SU::Varchar(32)
               When rownum = 1 And RN = 3 Then  0::Varchar(32)
               When rownum = 1 And RN = 4 Then 0::Varchar(32)
               When rownum = 1 And RN = 5 Then 0::Varchar(32)
               When rownum = 1 And RN = 6 Then SU::Varchar(32)
               When rownum = 1 And RN = 7 Then Round(SU / 2.38, 4)::Varchar(32)
               When rownum = 1 And RN = 8 Then Round(SU / 2.38, 4)::Varchar(32)
          End as "C", 
            --
          Case When t1_rownum = 2 And RN = 1 Then t1_PERIOD_MONTH::Varchar(32)
               When t1_rownum = 2 And RN = 2 Then t1_SU::Varchar(32)
               When t1_rownum = 2 And RN = 3 Then Round(SU / 2.38, 4)::Varchar(32)    
               When t1_rownum = 2 And RN = 4 Then 0::Varchar(32)
               When t1_rownum = 2 And RN = 5 Then Round(Round(SU / 2.38, 4) * 1.54, 4)::Varchar(32)
               When t1_rownum = 2 And RN = 6 Then (t1_SU - Round(Round(SU / 2.38, 4) * 1.54, 4) )::Varchar(32)
               When t1_rownum = 2 And RN = 7 Then ( Round(( t1_SU - Round(Round(SU / 2.38, 4) * 1.54, 4) ) / 2.38, 4) )::Varchar(32)
               When t1_rownum = 2 And RN = 8 Then ( Round(SU / 2.38, 4) + 0 + (( Round(( t1_SU - Round(Round(SU / 2.38, 4) * 1.54, 4) ) / 2.38, 4) )) )::Varchar(32)
          End as "D"
From      grid
Where     rownum = 1
Order By  RN, rownum

R e s u l t :

rn A B C D
1 Month Dummy 2020-09 2020-10
2 SU 0 503 1109
3 T1 0 0 211.3445
4 T2 0 0 0
5 RP_SU 0 0 325.4705
6 N_SU 0 503 783.5295
7 NP 0 211.3445 329.2141
8 TPE 0 211.3445 540.5586

fiddle

发布评论

评论列表(0)

  1. 暂无评论