How to update my column Col2 based on sequence from other column Col1?
create table test
(
id int identity(1,1),
col1 int,
col2 int
)
--drop table test
insert into test(col1) values (1)
insert into test(col1) values (2)
insert into test(col1) values (3)
insert into test(col1) values (4)
--COl2 Should have 1
insert into test(col1) values (1)
insert into test(col1) values (2)
--COl2 Should have 2
insert into test(col1) values (1)
insert into test(col1) values (2)
insert into test(col1) values (3)
--COl2 Should have 3
insert into test(col1) values (1)
insert into test(col1) values (2)
--COl2 Should have 4
select * from test
My Expected Final Output Should be
id | col1 | col2 |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 4 | 1 |
5 | 1 | 2 |
6 | 2 | 2 |
7 | 1 | 3 |
8 | 2 | 3 |
9 | 3 | 3 |
10 | 1 | 4 |
11 | 2 | 4 |
This is my sample scenario, but in practice I need to update more than 20K records. I need to update Col2 programmatically, based on the sequence pattern in Col1, rather than with a direct, hand-written update script.
How can I do this?
asked Mar 28 at 13:19 by Annapoorani R, edited Mar 28 at 14:12 by Joel Coehoorn
3 Answers
You got lucky here: every group starts with the same constant value, 1, and the id column gives a well-defined order in which to process the rows. A conditional sum that yields 1 where col1 = 1 and 0 otherwise, windowed with ORDER BY id and the ROWS UNBOUNDED PRECEDING option, produces a running count of group starts. That count stays constant until the next 1 appears, so every row receives the number of the group it belongs to.
--fix the data
; with normalized as (
select
id,
sum (case when col1 = 1 then 1 else 0 end) over (order by id ROWS UNBOUNDED PRECEDING) as col2
from test
)
--update part
update t
set t.col2 = n.col2
from test t
inner join normalized n on n.id = t.id
select * from test
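Before running the update, it can help to preview the computed group numbers next to the existing column. The query below is a read-only sketch that reuses the same conditional running sum; the aliases current_col2 and computed_col2 are introduced here for illustration and are not part of the answer.

```sql
-- Read-only preview: nothing is modified.
-- computed_col2 is an illustrative alias for the value the update would write.
select
    id,
    col1,
    col2 as current_col2,
    sum(case when col1 = 1 then 1 else 0 end)
        over (order by id rows unbounded preceding) as computed_col2
from test
order by id;
```

For the eleven sample rows, computed_col2 comes out as 1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, which matches the expected output in the question.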
I have often wished for this as well, but the truth is there is nothing built into any relational database to do this for you. Real relational schema design expects you to have two independent sequences for col1 and col2, where the col1 sequence increments more rapidly than col2. So instead of the expected result, it would look like this:
id | col1 | col2 |
---|---|---|
1 | 1 | 1 |
2 | 2 | 1 |
3 | 3 | 1 |
4 | 4 | 1 |
5 | 5 | 2 |
6 | 6 | 2 |
7 | 7 | 3 |
8 | 8 | 3 |
... |
Then if you need to show data like your expected output, you use row_number() or similar to show a sequence from col1 relative to col2 at output time. But proper schema design does not store the data that way.
In other words, you'll need to do this work manually, and always specify both values as part of your inserts. But you can use NEXT VALUE FOR a sequence as part of the insert to avoid needing to look for the max value in the table, which hurts performance and/or leads to race conditions.
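As a rough sketch of that idea, assuming SQL Server 2012 or later: the sequence name dbo.col2_seq and the variable @grp are hypothetical, and the sketch assumes the table starts out empty so that the sequence and the stored col2 values agree.

```sql
-- Hypothetical sequence that hands out group numbers for col2.
create sequence dbo.col2_seq as int start with 1 increment by 1;

-- Draw one group number, then reuse it for every row of the group.
declare @grp int = next value for dbo.col2_seq;

-- Both col1 and col2 are specified explicitly on insert.
insert into test (col1, col2)
values (1, @grp), (2, @grp), (3, @grp);
```

Drawing the value once into a variable keeps all rows of a group on the same number without querying max(col2) from the table.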
in real time I need to update more then 20K records. I need to update Col2 programmatically based on Col1 sequence Pattern
This is generally backwards. Typically, col2, with the repeating identifier, is the parent and col1 is the dependent record. In this order, you can produce col1 values for a given col2 using a window function like row_number() over (partition by col2 order by id).
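A minimal sketch of that query, assuming col2 is already populated; the alias position_in_group is made up for illustration.

```sql
-- Derive the per-group position at query time instead of storing it.
-- position_in_group is an illustrative alias.
select
    id,
    col2,
    row_number() over (partition by col2 order by id) as position_in_group
from test
order by id;
```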
If you really only have col1, and need to recreate col2, what you have is a sort of gaps and islands problem.
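One common way to approach a gaps-and-islands problem of this shape is to flag each row that starts a new island and then take a running sum of the flags. The sketch below is not taken from the answer: it assumes SQL Server 2012 or later (for LAG), and it assumes a new group begins whenever col1 fails to increase relative to the previous row by id. The names flagged and is_start are hypothetical.

```sql
-- Step 1: flag the first row of every island.
-- For the very first row LAG returns NULL, the comparison is unknown,
-- and the ELSE branch marks it as a start.
with flagged as (
    select
        id,
        col1,
        case when col1 > lag(col1) over (order by id) then 0 else 1 end as is_start
    from test
)
-- Step 2: a running sum of the flags numbers the islands.
select
    id,
    col1,
    sum(is_start) over (order by id rows unbounded preceding) as col2
from flagged
order by id;
```

Unlike a test for col1 = 1, this variant does not rely on every group starting at the same constant value, only on col1 increasing within a group.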
You can try https://dbfiddle.uk/uVpGMqGA
create view merge_view as
with cte as (
select t.*,
count(*) over(partition by t.col1, t.col2 order by t.id rows between unbounded preceding and current row) +
coalesce(max(t.col2) over(order by t.id rows between unbounded preceding and current row),0) as new_col2,
case when t.col2 IS NULL then 'Y' else 'N' end as updatable,
t1.id as t1_id
from test t
left join test t1
on t1.col1 = t.col1 and t1.col2 is null and t1.id < t.id
)
select t.id, t.col1, t.new_col2 from cte t
where updatable = 'Y' and t1_id is null
and not exists(
select 1 from cte t1 where t1.id < t.id and t1.new_col2 > t.new_col2
)
;
and then loop on
merge into test dst
using (
select * from merge_view
) src
on (
dst.id = src.id
)
when matched then
update set col2 = new_col2
;
until all col2 are NOT NULL or you have enough data for your tests.
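The loop itself is not spelled out in the answer. A minimal sketch, assuming T-SQL and that merge_view has been created as shown above, could look like the following; the @@rowcount guard is an addition made here so the loop cannot spin forever if a pass updates nothing.

```sql
-- Repeat the merge while any row still lacks a col2 value.
while exists (select 1 from test where col2 is null)
begin
    merge into test as dst
    using (select id, new_col2 from merge_view) as src
        on dst.id = src.id
    when matched then
        update set col2 = src.new_col2;

    -- Safety guard (not part of the answer): stop if this pass changed nothing.
    if @@rowcount = 0
        break;
end;
```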
"Since I don't have any backup data": that's arguably a more important problem. – Panagiotis Kanavos, commented Mar 28 at 13:50