r plot of NIH grantee data
Hello!
I am an R beginner and really enjoying it. I was hoping to create a plot of National Institutes of Health (NIH) grants to research organizations in Massachusetts.
Everything works except for one thing: my plot keeps getting ordered alphabetically by the organization's name, not by the total value of the grants received. Here's the code I'm using. What am I doing wrong?
#let's filter out grants under $3M
mass_nih_big_grants <- filter(mass_NIH_data_2023_desc, TOTAL_COST >= 3000000)
# let's group them by organization.
# many organizations have more than 1 NIH grant, we want to see the total value
# of grants to an organization.
mass_nih_big_grantees <- mass_nih_big_grants %>%
group_by(ORG_NAME)
# let's put that in descending order before we plot it
mass_nih_big_grantees_desc <- mass_nih_big_grantees[order(-mass_nih_big_grantees$TOTAL_COST),]
# do we have to group them again?
mass_nih_big_grantees_desc %>%
group_by(ORG_NAME)
# let's plot what we've got.
plot_mass_nih_big_grantees_desc <- ggplot(mass_nih_big_grantees_desc, aes(x=ORG_NAME, y=TOTAL_COST)) +
geom_bar(stat="identity", fill="dodgerblue") +
coord_flip()
plot_mass_nih_big_grantees_desc + theme_light()
# this works; it creates a plot but the reorder is not working. It's sorting
# by name descending (so, Z-A).
asked Nov 17, 2024 at 4:35 by Lisa Williams; edited Mar 30 at 5:29 by M--
1 Answer
Reordering the data will not reorder the plot, and group_by() only groups data for summary calculations. The plot is ordered by the levels of the factor on the x axis. By default the levels are alphabetical, but you can either set the order manually with the levels argument of factor(), or use fct_reorder() from forcats to order the factor by the value of a function applied to another variable. The default function is the median, which works if the data are already summarized.
Compare these two plots where I reorder the factor in the second case.
library(forcats)
library(ggplot2)
DF <- data.frame(Org = c("A","C","B"), Cost = c(4,3,6))
ggplot(DF, aes(x = Org, y = Cost)) + geom_col()
#Now reorder the Org by the Cost
ggplot(DF, aes(x = fct_reorder(Org, Cost), y = Cost)) + geom_col()
Created on 2024-11-16 with reprex v2.1.1
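Applied to the question's situation, where an organization can have several grants, one possible approach is to sum the grants per organization first and then reorder the factor by that total. This is only a sketch: the `grants` data frame and its values are made up to stand in for the filtered NIH data, and only the column names `ORG_NAME` and `TOTAL_COST` come from the question.

```r
library(dplyr)
library(forcats)
library(ggplot2)

# Hypothetical stand-in for the filtered grant data (values are invented)
grants <- data.frame(
  ORG_NAME   = c("Org A", "Org B", "Org A", "Org C", "Org B"),
  TOTAL_COST = c(4e6, 3e6, 5e6, 10e6, 3.5e6)
)

# Collapse to one row per organization, summing the grant values
totals <- grants %>%
  group_by(ORG_NAME) %>%
  summarise(TOTAL_COST = sum(TOTAL_COST), .groups = "drop")

# Order the factor levels by the summed cost (ascending);
# after coord_flip() the largest total ends up at the top of the plot
p <- ggplot(totals, aes(x = fct_reorder(ORG_NAME, TOTAL_COST), y = TOTAL_COST)) +
  geom_col(fill = "dodgerblue") +
  coord_flip() +
  labs(x = "ORG_NAME") +
  theme_light()
p
```

If you would rather keep one row per grant, `fct_reorder(ORG_NAME, TOTAL_COST, .fun = sum)` should give the same ordering, since `.fun` replaces the default median with the per-organization sum.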