I have a data frame with products and their variances that I want to plot as a histogram:
import pandas as pd
data = {'product_name': ['link', 'zelda', 'impa', 'rauru', 'saria', 'darunia'], 'variance': [0.95, -0.85, 0.3, 0.2, 0.03, 0.02]}
df = pd.DataFrame(data)
I can use plotly to create a histogram:
import plotly.express as px
px.histogram(df, 'variance')
But I want to create five bins with custom start and end points (like in the graph below) rather than plotly's defaults. This seems weirdly difficult to do using plotly express. Is there a way to customize the bin sizes at all?
Thanks
asked Mar 27 at 21:34 by Tyler Moore
1 Answer
A way to deal with this is to use pandas' cut method to bin the variance column into user-defined categories:
bins = [-1, -0.25, -.05, 0.05, 0.25, 1]
names = ['-25%+', '-5 to -25%', '-5 to 5%', '5 to 25%', '25%+']
df['MyCategories'] = pd.cut(df['variance'], bins, labels=names)
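To check that the bin edges do what you expect before plotting, it can help to count the rows per category. The sketch below rebuilds the example data from the question; the `counts` variable is only an illustrative name, and it relies on `pd.cut` using right-closed intervals by default (so 0.25 would land in '5 to 25%', not '25%+').

```python
import pandas as pd

# Example data from the question
data = {'product_name': ['link', 'zelda', 'impa', 'rauru', 'saria', 'darunia'],
        'variance': [0.95, -0.85, 0.3, 0.2, 0.03, 0.02]}
df = pd.DataFrame(data)

# Six edges define five bins; each bin is (left, right] by default
bins = [-1, -0.25, -.05, 0.05, 0.25, 1]
names = ['-25%+', '-5 to -25%', '-5 to 5%', '5 to 25%', '25%+']
df['MyCategories'] = pd.cut(df['variance'], bins, labels=names)

# Count rows per category, keeping the bin order and any empty bins
counts = df['MyCategories'].value_counts().reindex(names, fill_value=0)
print(counts)
```

With this data the '-5 to -25%' bin is empty, which is worth knowing when reading the resulting chart.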
You may then create your histogram as follows:
px.histogram(df, x='MyCategories', category_orders=dict(MyCategories=names))
...which returns a histogram of the binned categories.