
graph - Plotting average Z-Score of samples or analytes - Stack Overflow


I am trying to plot a panel of analytes that span different orders of magnitude, measured in 5 different solid malignancies. To put these analytes on a common scale so they can be combined into panels, I calculated the z-score of each sample using the mean level of expression across all samples.
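For reference, the standardization step could be sketched as below. This is a minimal illustration, not the exact calculation used for the figures: it assumes z-scores are computed separately for each analyte, across all samples of all cancer types, and that the population standard deviation is used. The function name `zscores_per_analyte` and the numbers are hypothetical.

```python
from statistics import mean, pstdev

def zscores_per_analyte(values):
    """Standardize one analyte's values across all samples."""
    mu = mean(values)
    sigma = pstdev(values)  # population SD (assumption); stdev() gives the sample SD
    return [(v - mu) / sigma for v in values]

# Hypothetical expression values for two analytes on very different scales.
analyte_x = [1200.0, 1500.0, 900.0, 2400.0]
analyte_y = [0.02, 0.05, 0.03, 0.10]

z_x = zscores_per_analyte(analyte_x)
z_y = zscores_per_analyte(analyte_y)
```

After this step both analytes have mean 0 and standard deviation 1 across samples, which is what makes them comparable within one panel.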

I now want to plot these values with the different cancer types on the x-axis and the average z-score on the y-axis.

I think I have 2 options, and I am not sure which is correct or answers my question better. Option 1: take the average z-score of all Ovarian samples for analyte X and use this value as a data point in my figure, then repeat for all malignancies and analytes. This gives each group a number of points equal to the number of analytes (which is identical for all groups).

[Figure: graph created using Option 1]

Here I see the risk that one outlier sample with high values across the board would drive all analytes' averages to be higher.

Option 2: calculate the average z-score of all analytes for sample A and use this value as a data point in my figure, then repeat for all samples and malignancies. This gives each malignancy a different number of points, as each malignancy has a different n.

[Figure: graph created using Option 2]

Here an individual analyte that is greatly increased in all of the samples of one group could skew the data and defeat the point of looking at only certain analytes.
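To make the two options concrete, here is a small sketch of both aggregations on the same data. The long-format records, the sample IDs, and the helper `aggregate` are all hypothetical and only serve to show which dimension is averaged away in each case.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (cancer_type, sample_id, analyte, z_score)
records = [
    ("Ovarian", "S1", "X", 0.5), ("Ovarian", "S1", "Y", 1.5),
    ("Ovarian", "S2", "X", -0.5), ("Ovarian", "S2", "Y", 0.5),
    ("Lung", "S3", "X", -1.0), ("Lung", "S3", "Y", -2.0),
]

def aggregate(records, key_index):
    """Average z-scores within each (cancer_type, key) group.

    key_index=2 groups by analyte (Option 1);
    key_index=1 groups by sample (Option 2).
    """
    groups = defaultdict(list)
    for rec in records:
        groups[(rec[0], rec[key_index])].append(rec[3])
    return {key: mean(vals) for key, vals in groups.items()}

option1 = aggregate(records, key_index=2)  # one point per analyte per cancer type
option2 = aggregate(records, key_index=1)  # one point per sample per cancer type
```

If every sample has a value for every analyte, the overall mean per cancer type is the same under both options. What changes is what each point, and therefore the spread of points, represents: variation between analytes in Option 1, variation between samples in Option 2.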

Obviously Option 1 gives me the "nicer" graph but I just want to confirm that I am not showing some weird artifact or something. I am really not sure which way visualizes my question "Is there a difference between the cancer types in this panel of analytes?" better.

Bonus question: is it more common to use the mean or the median z-score in a situation like this? My data is somewhat skewed, so I am leaning towards the median, as this would mitigate the potential risks I have described underneath each image.
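The difference between the two summaries can be illustrated with a toy example. The z-scores below are made up, with one deliberately extreme sample; they are not taken from the actual data.

```python
from statistics import mean, median

# Hypothetical z-scores of one analyte in one cancer type;
# the last sample is an outlier with a very high value.
z_scores = [-0.4, -0.1, 0.0, 0.2, 0.3, 4.0]

avg = mean(z_scores)    # pulled upwards by the single outlier (about 0.67)
med = median(z_scores)  # stays close to the bulk of the samples (about 0.1)
```

In this sketch the single outlier moves the mean well away from where most samples sit, while the median barely moves, which is the behaviour the bonus question is asking about.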

Thank you for your feedback. If I need to change something about this question, let me know. I can also prepare a reprex, but I don't need any actual coding help; it is more about confirming that what I am doing isn't completely wrong.

I tried both Options and I am not sure which one is the right one to use in this case.
