We trigger a stored procedure that inserts into table1
, but then we want to remove extra rows, so there are no duplicates. The problem is when we try to remove these dups, it locks the table until it times out.
We perform the following insert:
INSERT INTO table1 (momt, message_log_id, sender, receiver, msgdata, smsc_id, sms_type, coding, dlr_mask, dlr_url, validity, boxc_id, carrier_id, destination)
SELECT 'mt', @messageBatchKey, @sender, rtrim(ltrim(e.country_code))+e.sub_user_number, msgdata =
CASE
WHEN s.lang = 'english' AND A.sms_limit = '1' THEN LEFT(@messageValuesms, 160)
WHEN s.lang = 'english' AND A.sms_limit = '2' THEN LEFT(@messageValuesms, 360)
WHEN s.lang = 'spanish' THEN (SELECT message from message_log_lang where message_log_id = @messageBatchKey and language = 'ES')
/***WHEN s.lang = 'spanish' THEN LEFT(@messageValuesp, 160)***/
WHEN s.lang = 'creole' THEN LEFT(@messageValuecr, 160)
ELSE LEFT(@messageValuesms, 160)
END
, @smscID, '2', '0', '3',
@dlr_url + 'sub_id=' + CAST(s.sub_id AS varchar(50)) + '&carrierID=' + ltrim(rtrim(CAST(e.sub_carrier_id AS varchar(10)))), NULL,
'inspironMT', c.carrier_id,
CASE
WHEN @accountDestination <> 3 THEN @accountDestination
ELSE 0
END
FROM
SUBSCRIPTION s
INNER JOIN
#distinctSubscriptionKeys d ON s.sub_id = d.distinctSubscriptionKey
INNER JOIN
PHONENUMBERS e ON s.sub_id = e.sub_id
INNER JOIN
CARRIERS c ON e.sub_carrier_id = c.carrier_id
INNER JOIN
ACCOUNTS A ON s.account_id = a.account_id
WHERE
em_type = 't'
AND s.active = '1'
AND a.sms_send_method = '0'
AND s.send_email IN ('0', '2')
AND c.active = 1
AND (rtrim(ltrim(e.country_code)) = 1 OR rtrim(ltrim(e.country_code)) = '')
/*AND c.carrier_id IN
(
SELECT carrier_id
FROM CarrierDestinationProperties
WHERE send_method = 1
AND (destination = @accountDestination OR @accountDestination = 3)
)
*/
AND ((e.sub_carrier_id = '5' OR e.sub_carrier_id = '4')
OR
((NOT 0 = (SELECT COUNT(*) FROM #subgroups)
AND 0 = (SELECT COUNT(a.sub_group_id)
FROM Wens.dbo.ACCOUNT_SUB_GROUPS AS a
INNER JOIN #subgroups AS b ON a.sub_group_id = b.subgroupKey
WHERE (a.account_id = @accountKey)
AND (a.active = '1')
AND (a.sms_send_method = '1')))
))
ORDER BY
d.distinctSubscriptionKey
Then right after that we perform the following to remove the dups:
DELETE FROM table1
WHERE sql_id NOT IN (SELECT MIN(sql_id)
FROM table1
WHERE message_log_id = @messageBatchKey
GROUP BY receiver);
It's this second call that locks. And it only happens when we are dealing with 20K+ records. Anything below that works.
Is there a way to group by receiver within the select statement in the insert? I have been trying to find ways to figure this out and each road I go down locks.
Anyone that has a suggestion would be greatly appreciated.
We trigger a stored procedure that inserts into table1
, but then we want to remove extra rows, so there are no duplicates. The problem is when we try to remove these dups, it locks the table until it times out.
We perform the following insert:
INSERT INTO table1 (momt, message_log_id, sender, receiver, msgdata, smsc_id, sms_type, coding, dlr_mask, dlr_url, validity, boxc_id, carrier_id, destination)
SELECT 'mt', @messageBatchKey, @sender, rtrim(ltrim(e.country_code))+e.sub_user_number, msgdata =
CASE
WHEN s.lang = 'english' AND A.sms_limit = '1' THEN LEFT(@messageValuesms, 160)
WHEN s.lang = 'english' AND A.sms_limit = '2' THEN LEFT(@messageValuesms, 360)
WHEN s.lang = 'spanish' THEN (SELECT message from message_log_lang where message_log_id = @messageBatchKey and language = 'ES')
/***WHEN s.lang = 'spanish' THEN LEFT(@messageValuesp, 160)***/
WHEN s.lang = 'creole' THEN LEFT(@messageValuecr, 160)
ELSE LEFT(@messageValuesms, 160)
END
, @smscID, '2', '0', '3',
@dlr_url + 'sub_id=' + CAST(s.sub_id AS varchar(50)) + '&carrierID=' + ltrim(rtrim(CAST(e.sub_carrier_id AS varchar(10)))), NULL,
'inspironMT', c.carrier_id,
CASE
WHEN @accountDestination <> 3 THEN @accountDestination
ELSE 0
END
FROM
SUBSCRIPTION s
INNER JOIN
#distinctSubscriptionKeys d ON s.sub_id = d.distinctSubscriptionKey
INNER JOIN
PHONENUMBERS e ON s.sub_id = e.sub_id
INNER JOIN
CARRIERS c ON e.sub_carrier_id = c.carrier_id
INNER JOIN
ACCOUNTS A ON s.account_id = a.account_id
WHERE
em_type = 't'
AND s.active = '1'
AND a.sms_send_method = '0'
AND s.send_email IN ('0', '2')
AND c.active = 1
AND (rtrim(ltrim(e.country_code)) = 1 OR rtrim(ltrim(e.country_code)) = '')
/*AND c.carrier_id IN
(
SELECT carrier_id
FROM CarrierDestinationProperties
WHERE send_method = 1
AND (destination = @accountDestination OR @accountDestination = 3)
)
*/
AND ((e.sub_carrier_id = '5' OR e.sub_carrier_id = '4')
OR
((NOT 0 = (SELECT COUNT(*) FROM #subgroups)
AND 0 = (SELECT COUNT(a.sub_group_id)
FROM Wens.dbo.ACCOUNT_SUB_GROUPS AS a
INNER JOIN #subgroups AS b ON a.sub_group_id = b.subgroupKey
WHERE (a.account_id = @accountKey)
AND (a.active = '1')
AND (a.sms_send_method = '1')))
))
ORDER BY
d.distinctSubscriptionKey
Then right after that we perform the following to remove the dups:
DELETE FROM table1
WHERE sql_id NOT IN (SELECT MIN(sql_id)
FROM table1
WHERE message_log_id = @messageBatchKey
GROUP BY receiver);
It's this second call that locks. And it only happens when we are dealing with 20K+ records. Anything below that works.
Is there a way to group by receiver within the select statement in the insert? I have been trying to find ways to figure this out and each road I go down locks.
Anyone that has a suggestion would be greatly appreciated.
Share Improve this question edited Feb 17 at 21:04 marc_s 755k184 gold badges1.4k silver badges1.5k bronze badges asked Feb 17 at 20:44 ScottScott 91 bronze badge 13 | Show 8 more comments2 Answers
Reset to default 0If the problem is that there are dups of phone numbers maybe this will help:
change
JOIN PHONENUMBERS e ON s.sub_id = e.sub_id
to
JOIN (SELECT sub_id,
MAX(comm_type) as comm_type,
MAX(country_code) as country_code,
MAX(sub_carrier_id) as sub_carrier_id,
MAX(sub_user_number) as sub_user_number
FROM PHONENUMBERS
GROUP BY sub_id) as e ON s.sub_id = e.sub_id
NOTE: This assumes that every row with the same sub_id also has the same values for all the over fields selected. IF IT DOES NOT this will create hard to find bugs.... since I don't use any criteria for selecting the rows. If you had a criteria (like last modified date) you could select the data based on that criteria. That would look like this:
JOIN (SELECT sub_id, comm_type, country_code, sub_carrier_id, sub_user_number
ROW_NUMBER() OVER PARTITION BY sub_id ORDER BY modified_date DESC) AS RN
FROM PHONENUMBERS) as e ON s.sub_id = e.sub_id AND e.RN = 1
This is much more likely to give valid results as it uses all fields from the same row. In the first one using GROUP BY with MAX can have values from different rows since it is the max of all rows with that id.
I would suggest that the delete is causing issues.
Firstly, the delete statement itself may be doing much more than you expect?
DELETE FROM table1
WHERE sql_id NOT IN (SELECT MIN(sql_id)
FROM table1
WHERE message_log_id = @messageBatchKey
GROUP BY receiver);
Say you have 20k rows, with mixed message_log_id
values. The SELECT MIN(sql_Id)
subquery will only find rows with message_log_id
= @messageBatchKey
. Therefore, the delete
statement will delete all rows in table1 where message_log_id
is different from @messageBatchKey, whether or not they are the minimum value (i.e., they won't appear in the subquery).
Therefore, at a minimum, I would suggest changing the above to exlicitly filter the deleted rows to only be those where message_log_id = @messageBatchKey.
DELETE FROM table1
WHERE message_log_id = @messageBatchKey
AND sql_id NOT IN (SELECT MIN(sql_id)
FROM table1
WHERE message_log_id = @messageBatchKey
GROUP BY receiver);
Secondly, if you don't have a decent index on there, then every time you run the delete statement, it will need to read the whole table to identify the relevant rows - and then sort them.
I would suggest adding an index to make identification of the rows to delete easier
CREATE NONCLUSTERED INDEX IX_table1_messagelogid ON table1
(message_log_id, receiver, sql_id)
This would help identification of rows in both parts of the statement - in the subquery (identifying the MIN(sql_id)
values) and limiting the rows in table1 to delete via the message_log_id
.
rtrim(ltrim(e.country_code)) = 1
is going to fail in interesting ways, better to dortrim(ltrim(e.country_code)) = '1'
. Also thoseSELECT COUNT
subqueries should beEXISTS
subqueries. Also theORDER BY
is pointless in anINSERT
unless you are inserting millions of rows and want to insert in clustered index order, or you need anIDENTITY
column in the correct order (why?). – Charlieface Commented 2 days ago