export to csv - My .txt file can't be read by R for Gene Ontology analysis

I have three .txt files, one for each of the three gene enrichment categories, that I downloaded from the GO platform. R can't read them, and I think it's because of inconsistent columns.

First I tried using skip:

BP_results <- read.table("Data/analysisBP.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE, skip = 10, fill = TRUE)

It didn't work, so I tried converting the file to .csv and splitting the data into columns, but that put each word of the category names in its own column. I think the problem lies in the inconsistent columns of the .txt files I downloaded from GO. I also checked whether GO offers this data in another file format, but I'm unfamiliar with the XML and JSON options. How can I fix this? Do I have to edit the files manually?

Any help is appreciated, thanks.

Asked Feb 5 at 6:10 by Julieta González; edited Feb 5 at 6:59 by jay.sf

2 If it's tab-separated as it looks like, there's nothing inconsistent, it's just not really human readable. In your image you didn't use skip= to skip the meta data lines. If it still doesn't work maybe you skip the wrong number of lines, try with different parameters, e.g. skip=11. – jay.sf Commented Feb 5 at 7:01
2 The file seems corrupted, you can see some rows have the newline in the wrong place. The header is at 7th row. Please paste the first 20 rows of file as text, not image. Or if it is public data, provide web link. – zx8754 Commented Feb 5 at 8:01
@zx8754 I think you've been misled and it's just a line wrap. – jay.sf Commented Feb 5 at 9:38
@jay.sf check the screenshot, after read.table, it is a newline. In any case, OP must give us example text file. – zx8754 Commented Feb 5 at 9:52
skip=10 or skip=11? Why? It seems to me that it should be skip=7. And that the 7th text line is the column headers line, after some parsing. (in which case it should be header=FALSE). – Rui Barradas Commented Feb 5 at 10:47

1 Answer

The file itself is fine. I just changed the skip argument to 11 and it worked. I used the example file from the Gene Ontology webpage:

read.table("DATA/analysis.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE, skip = 11, fill = TRUE)

You can also keep it simpler with read.delim, which already defaults to header = TRUE, sep = "\t" and fill = TRUE:

read.delim("DATA/analysis.txt", stringsAsFactors = FALSE, skip = 11)

Or, if you have readr (part of the tidyverse) installed:

readr::read_tsv("DATA/analysis.txt", skip = 11)
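If you are unsure how many metadata lines to skip (the comments suggest 7, 10 and 11), one option is to locate the header row instead of hard-coding skip. This is only a minimal sketch: it assumes the header is the first line with the most tab characters, and the file written below is a made-up stand-in, not the real GO export. The names path, n_tabs, header_line, go and csv_path are illustrative.

```r
# Hypothetical stand-in for a GO enrichment export: a few metadata lines,
# then a tab-separated table. The real file's contents will differ.
path <- tempfile(fileext = ".txt")
writeLines(c(
  "Analysis Type:\texample",
  "Annotation Version:\texample",
  "",
  "term\tcount\tfold_enrichment\tp_value",
  "process A\t12\t2.5\t0.001",
  "process B\t7\t1.8\t0.020"
), path)

# Count the tab characters on every physical line of the file.
lines  <- readLines(path)
n_tabs <- lengths(regmatches(lines, gregexpr("\t", lines, fixed = TRUE)))

# Assumption: the header is the first line with the maximum number of tabs.
header_line <- which(n_tabs == max(n_tabs))[1]

# Skip everything before the header and read the table.
go <- read.delim(path, skip = header_line - 1, stringsAsFactors = FALSE)
str(go)

# Once the table is read correctly, writing a proper CSV is straightforward.
csv_path <- tempfile(fileext = ".csv")
write.csv(go, csv_path, row.names = FALSE)
```

With a real file, replace path with something like "Data/analysisBP.txt" and check header_line against what you see when you open the file in a text editor before trusting the result.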
