I have a string of key-value pairs with _ as their delimiter. The problem is that there can be underscores within keys too.
What regex can I use to split these into key-value pairs?
Input:
C_OPS_CITY=SFO_C_OPS_SITE_INDICATOR=Office_IDENTITYCATEGORY=Employee_C_OPS_COMPANY=My Company_
Expected Output:
- C_OPS_CITY=SFO
- C_OPS_SITE_INDICATOR=Office
- IDENTITYCATEGORY=Employee
- C_OPS_COMPANY=My Company
If not a regex, what other logic can I use to split this string into an array?
Asked Feb 6 at 3:03 by Sudhik; edited Feb 6 at 8:08 by bobble bubble.
2 Answers
The values can be unambiguously identified only if they do not contain underscores. If that is indeed the case for you, you can use a negated character class that excludes underscores to match the values:
[^=_][^=]*=[^_]+
Demo: https://regex101.com/r/7W1WmX/4
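As a rough sketch of how that pattern might be applied, here it is in Python with re.findall. Python is an assumption, since the question does not name a language, and the names PAIR_PATTERN and split_pairs are only illustrative.

```python
import re

# Pattern from the answer above: a key starts with a character that is neither
# "=" nor "_", runs up to the "=", and the value is everything up to the next "_".
PAIR_PATTERN = re.compile(r"[^=_][^=]*=[^_]+")


def split_pairs(text):
    """Return the 'key=value' substrings found in text (hypothetical helper)."""
    return PAIR_PATTERN.findall(text)


my_str = (
    "C_OPS_CITY=SFO_C_OPS_SITE_INDICATOR=Office_"
    "IDENTITYCATEGORY=Employee_C_OPS_COMPANY=My Company_"
)
pairs = split_pairs(my_str)
print(pairs)
```

This only works under the stated condition that values never contain an underscore. Each element can then be split once on "=" if separate keys and values are needed.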
Here is something I conjured up. It's not the most elegant code, but it does the trick so far:
my_str = "C_OPS_CITY=SFO_C_OPS_SITE_INDICATOR=Office_IDENTITYCATEGORY=Employee_C_OPS_COMPANY=My Company_"
# Split on "=": each piece after the first holds a value followed by the next key.
response = my_str.split("=")
for x in range(1, len(response)):
    # The value ends at the first underscore; the remainder is the next key.
    split = response[x].split("_", 1)
    response[x-1] = [response[x-1], split[0]]
    response[x] = split[1]
print(response)
The above code produces the result:
[['C_OPS_CITY', 'SFO'], ['C_OPS_SITE_INDICATOR', 'Office'], ['IDENTITYCATEGORY', 'Employee'], ['C_OPS_COMPANY', 'My Company'], '']
The idea is that we first split on the equals signs, since we know those will always exist to separate keys from values. We then split each value on the first underscore ('_') that we see; any text after that underscore is the key for the next pair. We go through each element doing this.
(EDIT) I also noticed that I used Python without knowing which language you are using, but this solution should be easy to reproduce in other languages with minor adjustments.
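As a small variation on the split-based idea above, the same steps could be wrapped in a function that returns a dictionary and does not keep the trailing empty string. This is only a sketch: parse_pairs is a hypothetical name, and it still assumes that values contain no underscores.

```python
def parse_pairs(text):
    """Split 'KEY=value_KEY=value_' into a dict (illustrative helper)."""
    pieces = text.split("=")
    key = pieces[0]  # the first piece is always the first key
    result = {}
    for piece in pieces[1:]:
        # The value ends at the first underscore; the rest is the next key.
        value, _, next_key = piece.partition("_")
        result[key] = value
        key = next_key
    return result


my_str = (
    "C_OPS_CITY=SFO_C_OPS_SITE_INDICATOR=Office_"
    "IDENTITYCATEGORY=Employee_C_OPS_COMPANY=My Company_"
)
parsed = parse_pairs(my_str)
print(parsed)
```

Using partition instead of split("_", 1) means a final value with no trailing underscore does not raise an error.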
You could extract the key and value with (.+?)=([^_]*)_
– jhnc Commented Feb 6 at 4:59
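A minimal sketch of the pattern from this comment, again assuming Python; the names KEY_VALUE and extract are made up for illustration. The pattern expects an underscore after every value, so it relies on the trailing underscore present in the sample input.

```python
import re

# Pattern from the comment: lazy key up to "=", then a value with no
# underscores, then the delimiting underscore.
KEY_VALUE = re.compile(r"(.+?)=([^_]*)_")


def extract(text):
    """Return a dict of key/value pairs captured by the two groups."""
    return dict(KEY_VALUE.findall(text))


my_str = (
    "C_OPS_CITY=SFO_C_OPS_SITE_INDICATOR=Office_"
    "IDENTITYCATEGORY=Employee_C_OPS_COMPANY=My Company_"
)
extracted = extract(my_str)
print(extracted)
```

If the input lacks the final underscore, the last pair is not matched by this pattern.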