I have a nested json
, but I can't understand how to work with them.
{
"return": {
"status_processing": "3",
"status": "OK",
"order": {
"id": "872102042",
"number": "123831",
"date_order": "dd/mm/yyyy",
"items": [
{
"item": {
"id_product": "684451795",
"code": "VPOR",
"description": "Product 1",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
},
{
"item": {
"id_product": "684451091",
"code": "VSAP",
"description": "Product 2",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
}
]
}
}
}
I searched on stackoverflow questions, and try some resolutions that people passed, but don't work for me.
Here an sample that I used to accessing the data from json:
df = pd.json_normalize(
order_list,
record_path=["return", "order", "itens"],
meta=[
["return", "order", "id"],
["return", "order", "date_order"],
["return", "order", "number"],
],
)
But don't work, they duplicating the data when I send to dataframe.
Anyone can help me?
EDIT
Here an example that I used:
Convert nested JSON to pandas DataFrame
And what I expected:
I have a nested json
, but I can't understand how to work with them.
{
"return": {
"status_processing": "3",
"status": "OK",
"order": {
"id": "872102042",
"number": "123831",
"date_order": "dd/mm/yyyy",
"items": [
{
"item": {
"id_product": "684451795",
"code": "VPOR",
"description": "Product 1",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
},
{
"item": {
"id_product": "684451091",
"code": "VSAP",
"description": "Product 2",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
}
]
}
}
}
I searched on stackoverflow questions, and try some resolutions that people passed, but don't work for me.
Here an sample that I used to accessing the data from json:
df = pd.json_normalize(
order_list,
record_path=["return", "order", "itens"],
meta=[
["return", "order", "id"],
["return", "order", "date_order"],
["return", "order", "number"],
],
)
But don't work, they duplicating the data when I send to dataframe.
Anyone can help me?
EDIT
Here an example that I used:
Convert nested JSON to pandas DataFrame
And what I expected:
Share Improve this question edited Mar 31 at 16:53 Blaztix 1,3001 gold badge21 silver badges29 bronze badges asked Mar 30 at 14:04 Cesar AugustoCesar Augusto 256 bronze badges 7 | Show 2 more comments2 Answers
Reset to default 1You're code is fine. You are getting the data, perhaps you just wanted to specify which columns to keep (or maybe rename)?
import json
import pandas as pd
data = ''' { "return": {
"status_processing": "3",
"status": "OK",
"order": {
"id": "872102042",
"number": "123831",
"date_order": "dd/mm/yyyy",
"itens": [
{
"item": {
"id_product": "684451795",
"code": "VPOR",
"description": "Product 1",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
},
{
"item": {
"id_product": "684451091",
"code": "VSAP",
"description": "Product 2",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
}
]
}
}
}'''
order_list = json.loads(data)
df = pd.json_normalize(order_list,
record_path= ["return", "order", "itens"],
meta=[["return", "order", "id"], ["return", "order", "number"], ["return", "order", "date_order"]])
df = df[['return.order.id', 'return.order.number', 'return.order.date_order', 'item.id_product']]
Output:
print(df)
return.order.id return.order.number return.order.date_order item.description
0 872102042 123831 dd/mm/yyyy Product 1
1 872102042 123831 dd/mm/yyyy Product 2
I don't know what exactly you expect in output but if you want every item in new row then you could use normal code with for
-loop for this.
order_list = {
"return": {
"status_processing": "3",
"status": "OK",
"order": {
"id": "872102042",
"number": "123831",
"date_order": "dd/mm/yyyy",
"itens": [
{
"item": {
"id_product": "684451795",
"code": "VPOR",
"description": "Product 1",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
},
{
"item": {
"id_product": "684451091",
"code": "VSAP",
"description": "Product 2",
"unit": "Un",
"quantity": "1.00",
"value": "31.76"
}
}
]
}
}
}
import pandas as pd
data = []
order = order_list['return']['order']
for iten in order['itens']:
for key, val in iten.items():
row = {
#'key': key,
'id': order['id'],
'date_order': order['date_order'],
'number': order['number'],
'id_product': val['id_product'],
#'code': val['code'],
#'description': val['description'],
#'quantity': val['quantity'],
#'value': val['value'],
}
data.append(row)
df = pd.DataFrame(data)
print(df)
Result:
id date_order number id_product
0 872102042 dd/mm/yyyy 123831 684451795
1 872102042 dd/mm/yyyy 123831 684451091
If you need other information in rows then you should show it in question.
expand
insteadnormalize
– furas Commented Mar 30 at 14:22