I am trying to add a chart to my web app that plots tens of millions of points. I started out with Dash, and to render large datasets efficiently I use HoloViews combined with Datashader.
If I follow the Dash tutorials, I get a graph that is interactive and re-rasterizes on zoom. This, however, assumes that the dataframe is available before the Dash app starts. In my case I need to choose the displayed data on the fly: on one tab I would send a request to show the contents of a file, and on another tab I could upload a new file and display that instead. When I update the Dash layout from a callback by calling to_dash() and returning the resulting graph, the new graph does appear, but I lose all the interactions that worked before, i.e. zooming only enlarges the already rasterized image. I also compared the two approaches: once I generate the data in the callback and overwrite the original graph, I no longer see any requests in my browser's Network tab.
I tried playing around with the graph-id and store-id that Dash attaches to the components it creates, with little success. I suspect I am breaking callbacks that were attached to the originally created components and that the app handles on its own. The code below reproduces the problem: on load it works fine, but after you click the Generate New Data button the interactions are gone.

    import dash
    from dash import dcc, html
    import holoviews as hv
    import pandas as pd
    import numpy as np
    import plotly.graph_objects as go
    from dash.dependencies import Input, Output
    from holoviews.operation.datashader import datashade
    from holoviews.plotting.plotly.dash import to_dash

    # Enable Holoviews Plotly extension
    hv.extension('plotly')

    # Generate synthetic dataset
    def first_dataset():
        np.random.seed(42)
        n = 1_000_000  # Number of points
        return pd.DataFrame({
            'x': np.random.normal(size=n),
            'y': np.random.normal(size=n),
            'value': np.random.random(size=n),
        })

    def second_dataset():
        np.random.seed(42)
        n = 1_000_000  # Number of points
        return pd.DataFrame({
            'x': np.random.uniform(size=n),
            'y': np.random.normal(size=n),
            'value': np.random.random(size=n),
        })

    # Create Holoviews + Datashader plot
    def create_hv_plot(data):
        points = hv.Points(data, ['x', 'y'])
        shaded = datashade(points, cmap=['lightblue', 'blue', 'purple'])
        return shaded.opts(width=800, height=400)

    # Initialize Dash app
    app = dash.Dash(__name__)

    # Convert Holoviews plot to Plotly
    def holoviews_to_plotly(hv_plot):
        # Render the Holoviews plot to a Plotly figure
        plotly_fig = to_dash(app, [hv_plot], responsive=False)
        return plotly_fig

    # Generate the data and create the plot
    data = first_dataset()
    hv_plot = create_hv_plot(data)
    plotly_fig = holoviews_to_plotly(hv_plot)

    # Layout for Dash app
    app.layout = html.Div([
        html.H1("Dash + Plotly + Holoviews + Datashader"),
        html.Button('Generate New Data', id='generate-button', n_clicks=0),
        html.Div(plotly_fig.children, id='graph'),
    ])

    @app.callback(
        Output('graph', 'children'),
        [Input('generate-button', 'n_clicks')]
    )
    def update_graph(n_clicks):
        if n_clicks > 0:
            # Generate new data on button click
            new_data = second_dataset()
            # Create a new plot with the new data
            new_hv_plot = create_hv_plot(new_data)
            # Convert the new Holoviews plot to Plotly figure
            new_plotly_fig = holoviews_to_plotly(new_hv_plot)
            return new_plotly_fig.children  # Return the updated Plotly figure
        return plotly_fig.children  # Return the initial plot if no click yet

    # Run the app
    if __name__ == "__main__":
        app.run_server(debug=True)

I have seen a couple of people asking about exactly this on forums, but their questions usually go unanswered.
I am also open to any alternatives. One thing I tried is maintaining the layout for each session myself. This somewhat worked; it is a lot of work, but possible. The headache is that Dash is stateless while I am still trying to work with sessions. This way I can insert my plot using an <iframe>, although I cannot really embed it in a real application because of how cookie settings work over a non-HTTPS connection (which is a separate problem).
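To make the kind of alternative in question more concrete: one possible pattern is to keep a single graph in the layout and handle re-rasterization in one callback that is registered before the app starts, instead of creating new components inside a callback. The block below is only a dependency-free sketch of the two pieces such a callback would need. The helper names `viewport_from_relayout` and `rasterize` are hypothetical, the pure-Python binning is a stand-in for Datashader's aggregation, and none of this has been checked against what to_dash() does internally.

```python
from typing import List, Optional, Sequence, Tuple

Range = Tuple[float, float]


def viewport_from_relayout(
    relayout: Optional[dict], full_x: Range, full_y: Range
) -> Tuple[Range, Range]:
    """Return the (x_range, y_range) currently visible in a Plotly graph.

    Plotly reports zoom/pan through keys such as 'xaxis.range[0]'. A
    double-click reset arrives as 'xaxis.autorange': True, and the value
    can be None before any interaction. In those cases, and for any axis
    that was not changed, the full data extent is used.
    """
    relayout = relayout or {}

    def axis(name: str, full: Range) -> Range:
        lo = relayout.get(f"{name}.range[0]")
        hi = relayout.get(f"{name}.range[1]")
        if lo is None or hi is None:
            return full
        return (min(lo, hi), max(lo, hi))

    return axis("xaxis", full_x), axis("yaxis", full_y)


def rasterize(
    xs: Sequence[float],
    ys: Sequence[float],
    x_range: Range,
    y_range: Range,
    width: int,
    height: int,
) -> List[List[int]]:
    """Count points per pixel inside the viewport.

    Stand-in for a Datashader aggregation (assumption): a real app would
    aggregate the dataframe with Datashader instead of looping in Python.
    """
    (x0, x1), (y0, y1) = x_range, y_range
    grid = [[0] * width for _ in range(height)]
    if x1 <= x0 or y1 <= y0:
        return grid  # degenerate viewport, nothing to draw
    for x, y in zip(xs, ys):
        if x0 <= x <= x1 and y0 <= y <= y1:
            # Clamp so points on the upper edge land in the last pixel.
            col = min(int((x - x0) / (x1 - x0) * width), width - 1)
            row = min(int((y - y0) / (y1 - y0) * height), height - 1)
            grid[row][col] += 1
    return grid
```

In a Dash app these helpers would presumably be called from one callback declared up front, taking the graph's relayoutData plus whatever input selects the file, and returning a figure built from the new grid. Because that callback exists when the page loads, swapping the dataset would not require registering anything new.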