We have a requirement to convert a column value that holds multiple semicolon-separated country names into the corresponding two-letter ISO codes.
For example:
CountryList
Denmark;Germany;United Kingdom
should be turned into:
DK;DE;UK
We have achieved this by normalizing the list with CONNECT BY, left outer joining to the country_code table, and then aggregating the output with XMLAGG. However, we are looking for a simpler way to achieve the same result.
Thank you in advance.
asked Mar 28 at 10:17 by Sanjay Dubey, edited Mar 28 at 10:22 by marc_s
3 Answers
It is possible to do this without CONNECT BY (or a recursive CTE) if you join your list of country names to the country codes table using the INSTR() function. Whether this works depends on the consistency of your data: every name in the list has to match a name in the codes table exactly.
WITH -- S a m p l e D a t a :
country_list (country_names_list) AS
( Select 'Denmark;France;Germany;Spain;United Kingdom'
From Dual
),
-- S a m p l e C o d e s T a b l e :
country_codes (country_code, country_name) AS
( Select 'DK', 'Denmark' From Dual UNION ALL
Select 'DE', 'Germany' From Dual UNION ALL
Select 'ES', 'Spain' From Dual UNION ALL
Select 'FR', 'France' From Dual UNION ALL
Select 'HR', 'Croatia' From Dual UNION ALL
Select 'UK', 'United Kingdom' From Dual
)
-- S Q L :
Select LISTAGG(DISTINCT country_code, ';') WITHIN GROUP (Order By country_name) Over() as country_codes_list
From ( Select cc.country_code, cc.country_name, cl.country_names_list
From country_list cl
Inner Join country_codes cc ON( Instr(';' || cl.country_names_list || ';', ';' || cc.country_name || ';') > 0 )
)
FETCH FIRST ROW ONLY
R e s u l t :
COUNTRY_CODES_LIST |
---|
DK;FR;DE;ES;UK |
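Note that the query above orders the codes by country name, not by their position in the input list. If the original order matters, one option is to order by the position that INSTR() finds. The following is only a sketch under assumptions: the id column and the reordered sample list are hypothetical additions for illustration, and it relies on the same exact-match requirement as above.

```sql
WITH
  -- Hypothetical sample data: an id column is added so that several rows could be grouped
  country_list (id, country_names_list) AS
    ( SELECT 1, 'United Kingdom;Denmark;Germany' FROM dual ),
  country_codes (country_code, country_name) AS
    ( SELECT 'DK', 'Denmark'        FROM dual UNION ALL
      SELECT 'DE', 'Germany'        FROM dual UNION ALL
      SELECT 'UK', 'United Kingdom' FROM dual )
SELECT cl.id,
       -- order the codes by where each name was found in the delimited list
       LISTAGG(cc.country_code, ';')
         WITHIN GROUP (ORDER BY INSTR(';' || cl.country_names_list || ';',
                                      ';' || cc.country_name || ';')) AS country_codes_list
FROM   country_list cl
JOIN   country_codes cc
  ON   INSTR(';' || cl.country_names_list || ';', ';' || cc.country_name || ';') > 0
GROUP  BY cl.id, cl.country_names_list;
```

Because it uses a plain GROUP BY instead of Over() with FETCH FIRST ROW ONLY, this form also returns one result row per input row. A name that appears twice in the list would still produce only one code, since INSTR() reports just the first match.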
You most probably did something like the query below, which I think is the better way to do it. Here I just used LISTAGG() instead of XMLAGG.
WITH
country_names (ID, country_name) AS
( Select LEVEL as rn,
SubStr( ';' || cl.country_names_list || ';',
Instr(';' || cl.country_names_list || ';', ';', 1, LEVEL) + 1,
Instr(';' || cl.country_names_list || ';', ';', 1, LEVEL + 1)
- Instr(';' || cl.country_names_list || ';', ';', 1, LEVEL) - 1
) as country_name
From country_list cl
Connect By LEVEL <= Length(cl.country_names_list) - Length(Replace(cl.country_names_list, ';', '')) + 1
)
Select LISTAGG(cc.country_code, ';') WITHIN GROUP (Order By cn.ID) Over() as country_codes_list
From country_names cn
Inner Join country_codes cc ON(cc.country_name = cn.country_name)
FETCH FIRST ROW ONLY
R e s u l t :
COUNTRY_CODES_LIST |
---|
DK;FR;DE;ES;UK |
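The CONNECT BY query above splits a single list. If the source table holds several rows, the hierarchy has to be kept within each row. A commonly used idiom for that is sketched below, assuming a hypothetical id column that uniquely identifies each row; REGEXP_SUBSTR and REGEXP_COUNT are used here in place of the SubStr/Instr arithmetic purely for brevity.

```sql
WITH
  -- Hypothetical multi-row sample data; id is assumed to be unique per row
  country_list (id, country_names_list) AS
    ( SELECT 1, 'Denmark;Germany;United Kingdom' FROM dual UNION ALL
      SELECT 2, 'Spain;Denmark'                  FROM dual ),
  country_codes (country_code, country_name) AS
    ( SELECT 'DK', 'Denmark'        FROM dual UNION ALL
      SELECT 'DE', 'Germany'        FROM dual UNION ALL
      SELECT 'ES', 'Spain'          FROM dual UNION ALL
      SELECT 'UK', 'United Kingdom' FROM dual ),
  country_names AS
    ( SELECT cl.id,
             LEVEL AS rn,
             REGEXP_SUBSTR(cl.country_names_list, '[^;]+', 1, LEVEL) AS country_name
      FROM   country_list cl
      CONNECT BY LEVEL <= REGEXP_COUNT(cl.country_names_list, ';') + 1
             AND PRIOR cl.id = cl.id              -- stay within the same source row
             AND PRIOR SYS_GUID() IS NOT NULL )   -- avoids a false cycle error
SELECT cn.id,
       LISTAGG(cc.country_code, ';') WITHIN GROUP (ORDER BY cn.rn) AS country_codes_list
FROM   country_names cn
JOIN   country_codes cc ON cc.country_name = cn.country_name
GROUP  BY cn.id
ORDER  BY cn.id;
```

One caveat of this sketch: the pattern '[^;]+' skips empty elements, so a list containing two adjacent semicolons would be numbered differently than with the SubStr/Instr version.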
I suggest using JSON_TABLE and LISTAGG.
First, convert the country list to JSON format:
Denmark;Germany;United Kingdom
to
{"countries":["Denmark","Germany","United Kingdom"]}
Then expand the array with JSON_TABLE.
Finally, JOIN the country_codes table and aggregate back with LISTAGG.
See example:
CREATE TABLE country_codes (country_name,A2,A3,ISO_Num) AS (
SELECT 'Denmark' ,'DK','DNK',208 from dual
UNION ALL SELECT 'Germany' ,'DE','DEU',276 from dual
UNION ALL SELECT 'United Kingdom','GB','GBR',826 from dual
);
COUNTRY_NAME | A2 | A3 | ISO_NUM |
---|---|---|---|
Denmark | DK | DNK | 208 |
Germany | DE | DEU | 276 |
United Kingdom | GB | GBR | 826 |
Test data source
ID | COUNTRY_LIST |
---|---|
1 | Denmark;Germany;United Kingdom |
Query
select t.*, j.*
from test t
cross join lateral(
select listagg(cc.a2,';')within group (order by rn) country_codes
from json_table('{"countries":["' || replace(country_list,';','","') || '"]}'
,'$.countries[*]'
columns (rn for ordinality,
cn varchar2(50) path '$' )
) jt
inner join country_codes cc on cc.country_name=jt.cn
)j
Output is
ID | COUNTRY_LIST | country_codes |
---|---|---|
1 | Denmark;Germany;United Kingdom | DK;DE;GB |
For reference, before aggregation we have:
ID | COUNTRY_LIST | RN | CN | COUNTRY_NAME | A2 | A3 | ISO_NUM |
---|---|---|---|---|---|---|---|
1 | Denmark;Germany;United Kingdom | 1 | Denmark | Denmark | DK | DNK | 208 |
1 | Denmark;Germany;United Kingdom | 2 | Germany | Germany | DE | DEU | 276 |
1 | Denmark;Germany;United Kingdom | 3 | United Kingdom | United Kingdom | GB | GBR | 826 |
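With an inner join, a name that has no match in country_codes is silently dropped from the result. The question mentions a left outer join, so here is a sketch of the same JSON_TABLE approach that keeps unmatched names visible. The '??' placeholder and the sample name 'Atlantis' are assumptions made up for illustration, and the JSON string is built with || so that it does not depend on CONCAT accepting more than two arguments.

```sql
WITH
  -- Hypothetical sample data: 'Atlantis' has no entry in country_codes
  test (id, country_list) AS
    ( SELECT 1, 'Denmark;Atlantis;United Kingdom' FROM dual ),
  country_codes (country_name, a2) AS
    ( SELECT 'Denmark',        'DK' FROM dual UNION ALL
      SELECT 'Germany',        'DE' FROM dual UNION ALL
      SELECT 'United Kingdom', 'GB' FROM dual )
SELECT t.id, t.country_list, j.country_codes
FROM   test t
CROSS JOIN LATERAL (
         -- NVL substitutes a placeholder when the left join finds no code
         SELECT LISTAGG(NVL(cc.a2, '??'), ';') WITHIN GROUP (ORDER BY jt.rn) AS country_codes
         FROM   JSON_TABLE(
                  '{"countries":["' || REPLACE(t.country_list, ';', '","') || '"]}',
                  '$.countries[*]'
                  COLUMNS ( rn FOR ORDINALITY,
                            cn VARCHAR2(50) PATH '$' )
                ) jt
         LEFT JOIN country_codes cc ON cc.country_name = jt.cn
       ) j;
```

The intent is that the unknown name shows up as ?? in its original position instead of disappearing. Note that a country name containing a double quote or a backslash would break the hand-built JSON string, so this relies on clean input.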
Mine is similar to the one above. It verticalises the semicolon-delimited list, then uses a lookup table (taken from the internet, here:
https://www.cia.gov/the-world-factbook/references/country-data-codes/ )
to convert each country name to its abbreviation.
The ISO standard still uses 'GB' for the United Kingdom, but the file behind the link above at least has a column internet with the value '.uk' for the United Kingdom, so I took that one.
In the example below, I just create a three-row table with the relevant data, as a cutout from that CSV file.
I verticalise by cross joining with a series of integers, i(i), a few more than necessary, searching for the i-th occurrence of any consecutive string that does not contain a semicolon, and filtering for that REGEXP_SUBSTR() result to not be null.
Finally, I use LISTAGG() to re-horizontalise.
CREATE TABLE ctry_dat_codes (nam,genc,iso_3166,stanag,internet) AS (
SELECT 'Denmark' ,'DNK','DK|DNK|208','DNK','.dk' FROM dual
UNION ALL SELECT 'Germany' ,'DEU','DE|DEU|276','DEU','.de' FROM dual
UNION ALL SELECT 'United Kingdom','GBR','GB|GBR|826','GBR','.uk' FROM dual
);
WITH
indata(ctrylist) AS (
SELECT 'Denmark;Germany;United Kingdom' FROM dual
)
,
i(i) AS (
SELECT 1 FROM dual
UNION ALL SELECT 2 FROM dual
UNION ALL SELECT 3 FROM dual
UNION ALL SELECT 4 FROM dual
)
,
vertical_orig AS (
SELECT
i
, REGEXP_SUBSTR(ctrylist,'[^;]+',1,i) AS ctry
FROM indata CROSS JOIN i
WHERE REGEXP_SUBSTR(ctrylist,'[^;]+',1,i) IS NOT NULL
)
,
vertical_chg AS (
SELECT
i
, UPPER(SUBSTR(internet,2)) AS cc_internet
FROM vertical_orig
JOIN ctry_dat_codes ON ctry = nam
)
SELECT
LISTAGG(cc_internet,'; ') WITHIN GROUP (ORDER BY i) AS ctrylist
FROM vertical_chg;
CTRYLIST |
---|
DK; DE; UK |
LISTAGG rather than XMLAGG. – Paul W, commented Mar 28 at 12:26