I'm making a deneb visual using vega about daily values for my hospital of up to 20 patients.
The current dataset being passed to deneb has one row per day and 30 columns of daily values about the hospital, and I'm trying to figure out the best way to add the patient values without increasing the row count or adding a column for each patient.
I could concatenate all the patient names into a string with commas and have that as a patient_names column. Then do the same for their temperatures, blood pressures, etc. Each day there can be different patients.
Here's a simplified dataset, hospital columns in blue, patient columns in yellow
If using straight vega I would import two datasets, one for hospital data, one for patient data, with the patient data having one row per date/patient combination. Power BI won't allow this, so I have to combine the data into a single dataset.
This keeps the dataset to one row per day. I can then use vega transform to select only the date and concatenated patient columns.
{
"data": [
{
"name": "dataset"
},
{
"name": "patient_data",
"source": "dataset",
"transform":[
{
"type": "project",
"fields": ["date", "patient_names", "temperatures", "pressure_top", "pressure_bottom"]
}
]
}
],
"marks": []
}
With the result:
How can I then transform the concatenated columns into normal unconcatenated columns? Like using Text-to-columns in excel.
Finally I'd like to transform this to a narrow dataset that has one row per date/patient name.
Attempts
To create the 'text-to-columns' one idea is using the split
formula to transform each concatenated string into an array, then using another formula to create each column. This assumes I have a fixed number of patients (and makes the code very long). Here's a sample of this approach that creates the patient1 and temp1 columns:
{
"data": [
{
"name": "dataset"
},
{
"name": "patient_data",
"source": "dataset",
"transform":[
{
"type": "project",
"fields": ["date", "patient_names", "temperatures", "pressure_top", "pressure_bottom"]
},
{
"type": "formula",
"expr": "split(datum.patient_names, ',')",
"as": "patient_names"
},
{
"type": "formula",
"expr": "split(datum.temperatures, ',')",
"as": "temperatures"
},
{
"type": "formula",
"expr": "datum.patient_names[0]",
"as": "patient1"
},
{
"type": "formula",
"expr": "datum.temperatures[0]",
"as": "temp1"
}
]
}
],
"marks": []
}