I need some help with a figure. I need a boxplot in which the y axis is the total number of brain cells and the x axis is mean body mass, with one box per species, coloured according to sociality.
Please find the R code I am using and the data below:
library(ggplot2)
library(scales)  # provides comma() for the axis labels
ggplot(Data, aes(x = mean_bodymass, y = Total.brain.cell, group = Species)) +
  geom_boxplot(aes(fill = Sociality), width = 5) +
  geom_point(aes(fill = Sociality), shape = 21, size = 4) +
  scale_y_continuous(trans = "log10", n.breaks = 8, labels = comma) +
  scale_x_continuous(trans = "log10", labels = comma) +
  theme_classic() +
  scale_fill_manual(values = c("#FF1493", "#BF3EFF", "#FF3030", "#FF7F00", "#0066ff")) +
  guides(fill = "none") +
  # base_size is an argument of the theme_*() functions, not a theme() element, so it is dropped here
  theme(text = element_text(size = 30)) +
  labs(x = "body mass(g)(log)", y = "total brain cells(log)")
Small sample of my data:
Data <- structure(list(Sociality = c("Semi-social", "Semi-social", "Semi-social",
"Solitary", "Solitary", "Solitary", "Solitary", "Solitary", "Solitary",
"Solitary", "Solitary", "Solitary", "Solitary", "Solitary", "Solitary",
"Solitary", "Solitary", "Solitary", "Kleptoparasitism", "Kleptoparasitism",
"Kleptoparasitism"), Species = c("Lipotriches australica", "Lipotriches australica",
"Lipotriches australica", "Hylaeus euxanthus", "Hylaeus euxanthus",
"Hylaeus euxanthus", "Leioproctus amabilis", "Leioproctus amabilis",
"Leioproctus amabilis", "Leioproctus recusus", "Leioproctus recusus",
"Leioproctus recusus", "Megachile (Hacteriapis)", "Megachile (Hacteriapis)",
"Megachile (Hacteriapis)", "Pseudoanthidium repetitum", "Pseudoanthidium repetitum",
"Pseudoanthidium repetitum", "Thyreus nitidulus", "Thyreus nitidulus",
"Thyreus nitidulus"), Total.brain.cell = c(1468750, 1441250,
1496250, 906250, 853750, 870000, 961250, 975000, 947500, 855000,
958750, 988750, 1952500, 1926250, 1993750, 1411250, 1075000,
1450000, 1781250, 1757500, 1742500), mean_bodymass = c(0.065933333,
0.065933333, 0.065933333, 0.008333333, 0.008333333, 0.008333333,
0.0377, 0.0377, 0.0377, 0.027433333, 0.027433333, 0.027433333,
0.053966667, 0.053966667, 0.053966667, 0.043, 0.043, 0.043, 0.084533333,
0.084533333, 0.084533333)), row.names = 4:24, class = "data.frame")
My issue is the width of the boxplots. This is the result without adjusting the width of the boxplots:
And this is the result when adjusting the boxplot width (width=5, as in the code above), where the points are not in line with the boxplots.
Is there any way I can adjust the width of the boxplots without affecting the points? Or do I need to make two separate plots and combine them? Any help would be appreciated.
Asked Mar 27 at 7:45 by FumblebeeFae; edited Mar 27 at 13:31 by stefan.
1 Answer
Note that the default is geom_boxplot(position="dodge2"), which means it uses position_dodge2(). I couldn't find a width= greater than width=0.05 that kept the points aligned with the boxplots; it always nudged them out of the way.
One alternative is to use position_dodge() instead.
Original:
geom_boxplot(aes(fill=Sociality), width=5) +
Shifting to position_dodge() and reducing the width:
geom_boxplot(aes(fill=Sociality), position=position_dodge(), width=0.1) +
position_dodge() allows for overlaps, so you can get fairly crazy with it. Because your boxplots are positioned on a continuous scale and you have unique values that are relatively close to each other, you may have to play with the width= to get them exactly how you like.
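For reference, here is a minimal sketch of the whole plot with that change applied, assuming the sample data from the question (rebuilt here with rep(), so the row names differ from the original dput() output). The last few lines are my own addition, not part of the original code: they use layer_data() to compare the x position of each box with the x position of its points, which is one way to check the alignment numerically instead of by eye. The names boxes, pts and max_offset are only illustrative, and only three of the five fill colours are kept because the sample has three sociality levels.

```r
library(ggplot2)

# Sample data from the question, rebuilt compactly (three rows per species).
Data <- data.frame(
  Sociality = rep(c("Semi-social", "Solitary", "Kleptoparasitism"),
                  times = c(3, 15, 3)),
  Species = rep(c("Lipotriches australica", "Hylaeus euxanthus",
                  "Leioproctus amabilis", "Leioproctus recusus",
                  "Megachile (Hacteriapis)", "Pseudoanthidium repetitum",
                  "Thyreus nitidulus"), each = 3),
  Total.brain.cell = c(1468750, 1441250, 1496250, 906250, 853750, 870000,
                       961250, 975000, 947500, 855000, 958750, 988750,
                       1952500, 1926250, 1993750, 1411250, 1075000, 1450000,
                       1781250, 1757500, 1742500),
  mean_bodymass = rep(c(0.065933333, 0.008333333, 0.0377, 0.027433333,
                        0.053966667, 0.043, 0.084533333), each = 3)
)

p <- ggplot(Data, aes(x = mean_bodymass, y = Total.brain.cell, group = Species)) +
  # position_dodge() instead of the default "dodge2"; width is in log10 units of x
  geom_boxplot(aes(fill = Sociality), position = position_dodge(), width = 0.1) +
  geom_point(aes(fill = Sociality), shape = 21, size = 4) +
  scale_y_continuous(trans = "log10", n.breaks = 8, labels = scales::comma) +
  scale_x_continuous(trans = "log10", labels = scales::comma) +
  theme_classic() +
  scale_fill_manual(values = c("#FF1493", "#BF3EFF", "#FF3030")) +
  guides(fill = "none") +
  labs(x = "body mass(g)(log)", y = "total brain cells(log)")

# Alignment check (illustrative): computed x of each box vs. x of its points,
# both on the transformed (log10) scale.
boxes <- layer_data(p, 1)  # one row per boxplot
pts   <- layer_data(p, 2)  # one row per point
max_offset <- max(abs(pts$x - boxes$x[match(pts$group, boxes$group)]))
print(max_offset)

print(p)
```

If max_offset comes out at (or very near) zero, the boxes are centred on their points. Running the same check with the original width=5 and the default position should show how far position_dodge2() moves the boxes.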
dput(Data) would be ideal. – Limey, commented Mar 27 at 9:44
dput(head(x,20)) for unambiguous representative data makes it the easiest and most direct way for somebody to play with your actual data, not how it may look on the console (the two can be very different). – r2evans, commented Mar 27 at 14:14