I expect the SQL statement to update a limited amounts of rows, instead of every row:
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_1 AS ST1
INNER JOIN Join_Test_2 AS ST2 ON ST1.PKID = ST2.PKID;
I expect the statement to behave the same as the statements:
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_2
WHERE Join_Test_1.PKID = Join_Test_2.PKID;
UPDATE Join_Test_1
SET ANumber = 3
WHERE EXISTS (SELECT * FROM Join_Test_2
WHERE Join_Test_2.PKID = Join_Test_1.PKID);
UPDATE Join_Test_1
SET ANumber = 3
WHERE PKID IN (SELECT PKID FROM Join_Test_2);
The statements using the WHERE
clause only update the ANumber
column where the PKID
matches.
The statement using the INNER JOIN
updates all rows in the table.
Why does the INNER JOIN
not limit the number of rows updated?
Can the statement with the INNER JOIN
be rewritten to use the JOIN
to limit the number of rows updated?
/* Expansive Example */
CREATE TABLE Join_Test_1 (PKID SERIAL, ANumber INTEGER);
CREATE TABLE Join_Test_2 (PKID SERIAL, ANumber INTEGER);
INSERT INTO Join_Test_1 (ANumber) VALUES (1), (1);
INSERT INTO Join_Test_2 (ANumber) VALUES (2);
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_1 AS ST1
INNER JOIN Join_Test_2 AS ST2 ON ST1.PKID = ST2.PKID; -- Updates 2
SELECT *
FROM Join_Test_1
ORDER BY PKID;
-- 1, 3
-- 2, 3
UPDATE Join_Test_1
SET ANumber = 1; -- Update 2
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_2
WHERE Join_Test_1.PKID = Join_Test_2.PKID;
SELECT *
FROM Join_Test_1
ORDER BY PKID;
-- 1, 1
-- 2, 3
UPDATE Join_Test_1 SET ANumber = 1; -- Update 2
UPDATE Join_Test_1
SET ANumber = 3
WHERE PKID IN (SELECT PKID FROM Join_Test_2);
SELECT *
FROM Join_Test_1
ORDER BY PKID;
-- 1, 1
-- 2, 3
DROP TABLE IF EXISTS Join_Test_1;
DROP TABLE IF EXISTS Join_Test_2;
I expect the SQL statement to update a limited amounts of rows, instead of every row:
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_1 AS ST1
INNER JOIN Join_Test_2 AS ST2 ON ST1.PKID = ST2.PKID;
I expect the statement to behave the same as the statements:
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_2
WHERE Join_Test_1.PKID = Join_Test_2.PKID;
UPDATE Join_Test_1
SET ANumber = 3
WHERE EXISTS (SELECT * FROM Join_Test_2
WHERE Join_Test_2.PKID = Join_Test_1.PKID);
UPDATE Join_Test_1
SET ANumber = 3
WHERE PKID IN (SELECT PKID FROM Join_Test_2);
The statements using the WHERE
clause only update the ANumber
column where the PKID
matches.
The statement using the INNER JOIN
updates all rows in the table.
Why does the INNER JOIN
not limit the number of rows updated?
Can the statement with the INNER JOIN
be rewritten to use the JOIN
to limit the number of rows updated?
/* Expansive Example */
CREATE TABLE Join_Test_1 (PKID SERIAL, ANumber INTEGER);
CREATE TABLE Join_Test_2 (PKID SERIAL, ANumber INTEGER);
INSERT INTO Join_Test_1 (ANumber) VALUES (1), (1);
INSERT INTO Join_Test_2 (ANumber) VALUES (2);
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_1 AS ST1
INNER JOIN Join_Test_2 AS ST2 ON ST1.PKID = ST2.PKID; -- Updates 2
SELECT *
FROM Join_Test_1
ORDER BY PKID;
-- 1, 3
-- 2, 3
UPDATE Join_Test_1
SET ANumber = 1; -- Update 2
UPDATE Join_Test_1
SET ANumber = 3
FROM Join_Test_2
WHERE Join_Test_1.PKID = Join_Test_2.PKID;
SELECT *
FROM Join_Test_1
ORDER BY PKID;
-- 1, 1
-- 2, 3
UPDATE Join_Test_1 SET ANumber = 1; -- Update 2
UPDATE Join_Test_1
SET ANumber = 3
WHERE PKID IN (SELECT PKID FROM Join_Test_2);
SELECT *
FROM Join_Test_1
ORDER BY PKID;
-- 1, 1
-- 2, 3
DROP TABLE IF EXISTS Join_Test_1;
DROP TABLE IF EXISTS Join_Test_2;
Share
Improve this question
edited Nov 21, 2024 at 8:51
Jennifer
asked Nov 20, 2024 at 15:39
JenniferJennifer
492 silver badges9 bronze badges
10
|
Show 5 more comments
4 Answers
Reset to default 1In plpgSQL
there is an way to update a single table during joing as follows:
CREATE TABLE Join_Test_1 (PKID SERIAL,ANumber INTEGER);
CREATE TABLE Join_Test_2 (PKID SERIAL,ANumber INTEGER);
INSERT INTO Join_Test_1 (ANumber) VALUES (1),(1);
INSERT INTO Join_Test_2 (ANumber) VALUES (2);
select * from Join_Test_1;
-- 1 1
-- 2 1
select * from Join_Test_2;
-- 1 2
WITH TT AS (
UPDATE Join_Test_1
SET ANumber = 3
RETURNING *
)
SELECT *
FROM TT AS ST1
INNER JOIN
Join_Test_2 AS ST2
ON ST1.PKID = ST2.PKID;
-- 1 3 1 2
SELECT * FROM Join_Test_1 ORDER BY PKID;
-- 1 3
-- 2 3
Process-01: db<>fiddle - Update During joining
Process-02: db<>fiddle - Update During joining
Process-03: db<>fiddle - Normal way
Arguably, when dealing with updates that involve multiple or complex joins and/or conditions - the most readable and maintainable way to do it is using the MERGE INTO
syntax.
MERGE INTO Join_Test_1 target
USING ( Select t1.pkid
From Join_Test_1 t1
Inner Join Join_Test_2 t2 ON(t2.pkid = t1.pkid)
) src ON(src.pkid = target.pkid)
WHEN MATCHED THEN
UPDATE SET anumber = 3;
SELECT * FROM Join_Test_1 ORDER BY PKID;
Result:
pkid | anumber |
---|---|
1 | 3 |
2 | 1 |
fiddle
Just make sure that your USING query fetches unique rows to be matched...
Your UPDATE
statement is simply wrong. No connection is established between the target table Join_Test_1
and the second instance Join_Test_1 AS ST1
. You don't need that second instance of the target table in Postgres to begin with - except for the rare case of emulating a LEFT JOIN
. See:
- UPDATE statement with multiple joins in PostgreSQL
This would work:
UPDATE join_test_1
SET anumber = 3
FROM join_test_2 st2 USING (pkid);
I also simplified and did away with unhelpful CaMeL-case spelling. See:
- Are PostgreSQL column names case-sensitive?
But this is still a dubious statement. It all comes down to bare existence of a matching row in join_test_2
. With the join you run the risk of multiple matches, which produces arbitrary results. Does not show in the simplistic example with a constant assignment (unhelpful test!), but should be avoided in any case. Don't update the same row multiple times.
The right query for the purpose is your EXISTS
variant, which can be simplified to:
UPDATE join_test_1 t1
SET anumber = 3
WHERE EXISTS (SELECT FROM join_test_2 t2 WHERE t2.pkid = t1.pkid);
You alias the table Join_Test_1
, but you don't call the alias in the update. That means the FROM
statement is ignored. You statement is the same as UPDATE Join_Test_1 SET ANumber = 3
.
UPDATE ST1
SET ANumber = 3
FROM Join_Test_1 AS ST1
INNER JOIN Join_Test_2 AS ST2
ON ST1.PKID = ST2.PKID;
UPDATE Join_Test_1 AS ST1 SET ANumber = 3 FROM Join_Test_2 AS ST2 WHERE ST1.PKID = ST2.PKID;
. This works as expected, at least in this sample fiddle – Jonas Metzler Commented Nov 20, 2024 at 15:57UPDATE Join_Test_1 SET ANumber = 3 FROM Join_Test_1 AS ST1 INNER JOIN Join_Test_2 AS ST2 ON ST1.PKID = ST2.PKID;
Are you trying to updateJoin_Test_1
? or are you trying to showJoin_Test_1 INNER JOIN Join_Test_2
data withJoin_Test_1
modified data ? There are no sense of your question. @Jennifer – ArtBindu Commented Nov 20, 2024 at 16:11EXAMPLES
section at the above link for how you can get around this doing a sub-select. – Adrian Klaver Commented Nov 20, 2024 at 16:20