For CSV files, use read.csv() as shown below. By default it treats the first line of the file as the header.

read.csv("test.csv")

V1 V2

1 F -0.5786439

2 E 0.2472908

3 U 0.2748309

4 R 1.1791559

5 K -0.1258598

6 X -0.8898289

7 L 0.4627274

8 C -0.7088007
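The post does not show how test.csv was made. To follow along, a file of the same shape can be created roughly as below. This is only a sketch: the seed, the data frame `df` and the temporary path are assumptions, so the values will not match the output shown above.

```r
# Build a small two-column data frame: 8 random letters and 8 random numbers.
set.seed(1)  # assumed seed, only to make the sketch repeatable
df <- data.frame(V1 = sample(LETTERS, 8), V2 = rnorm(8))

# Write it without row names so the file has just the V1 and V2 columns.
path <- file.path(tempdir(), "test.csv")
write.csv(df, path, row.names = FALSE)

# Read it back to confirm the shape.
check <- read.csv(path)
print(dim(check))
```

Writing to tempdir() keeps the sketch from touching the working directory; use "test.csv" directly to match the calls in this post.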

To read only the first 3 rows, set nrows:

read.csv("test.csv", nrows = 3)

V1 V2

1 F -0.5786439

2 E 0.2472908

3 U 0.2748309

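To skip the first 2 lines and read the next 3 rows, add skip. Note that skip counts lines of the file, header included, so the header and the F row are skipped and the E line is then consumed as the new header (hence the column names E and X0.247290774914572 below).

read.csv("test.csv", nrows = 3, skip = 2)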

E X0.247290774914572

1 U 0.2748309

2 R 1.1791559

3 K -0.1258598
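If the goal is to skip data rows but keep the real column names, one option is to read the header separately and then read the body with header = FALSE. The sketch below assumes a stand-in file with rounded values written to a temporary path; `col_names` and `rows_3_to_5` are illustrative names.

```r
# Recreate a small file shaped like test.csv (values rounded for brevity).
path <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2"', '"F",-0.58', '"E",0.25', '"U",0.27',
             '"R",1.18', '"K",-0.13'), path)

# Read the header plus one row, only to capture the column names.
col_names <- names(read.csv(path, nrows = 1))

# Skip the header and the first 2 data rows (3 lines in total), then read
# 3 rows with header = FALSE so no data line is consumed as a header.
rows_3_to_5 <- read.csv(path, header = FALSE, skip = 3, nrows = 3,
                        col.names = col_names)
print(rows_3_to_5)
```

Here skip is 3 rather than 2 because the header line has to be counted as well.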

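If the file is not comma-separated, use read.table() and specify the separator with sep. Unlike read.csv(), read.table() defaults to header = FALSE, so after skipping 2 lines the E row is kept as data and the columns get the default names V1 and V2.

read.table("test.csv", sep = ",", nrows = 3, skip = 2)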

V1 V2

1 E 0.2472908

2 U 0.2748309

3 R 1.1791559
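For a file that really is not comma-separated, the same call works with a different sep. A minimal sketch, assuming a tab-delimited stand-in file written to a temporary path:

```r
# Write a small tab-delimited file (values rounded for brevity).
path <- tempfile(fileext = ".tsv")
writeLines(c("V1\tV2", "F\t-0.58", "E\t0.25", "U\t0.27"), path)

# sep = "\t" splits on tabs; header = TRUE is needed because read.table()
# does not assume a header line by default.
tab_data <- read.table(path, sep = "\t", header = TRUE)
print(tab_data)
```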

For large files, use the fread() function from the data.table package, which imports considerably faster.

data.table::fread("test.csv", sep = ",")

V1 V2

1: F -0.5786439

2: E 0.2472908

3: U 0.2748309

4: R 1.1791559

5: K -0.1258598

6: X -0.8898289

7: L 0.4627274

8: C -0.7088007

To read only the first 3 rows,

data.table::fread("test.csv", sep = ",", nrows = 3)

V1 V2

1: F -0.5786439

2: E 0.2472908

3: U 0.2748309

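To skip the first 2 lines and read the next 3 rows, do as below. Note that fread() counts the header as one of the skipped lines; since no header is left, the columns get the default names V1 and V2.

data.table::fread("test.csv", sep = ",", nrows = 3, skip = 2)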

V1 V2

1: E 0.2472908

2: U 0.2748309

3: R 1.1791559
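To skip data rows with fread() and still label the columns, the names can be supplied through col.names. The sketch below assumes data.table is installed (it is guarded so it does nothing otherwise) and uses a stand-in file with rounded values; the column names are typed in by hand here rather than read from the file.

```r
# Recreate a small file shaped like test.csv (values rounded for brevity).
path <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2"', '"F",-0.58', '"E",0.25', '"U",0.27',
             '"R",1.18', '"K",-0.13'), path)

if (requireNamespace("data.table", quietly = TRUE)) {
  # Skip the header and 2 data rows (3 lines), read the next 3 rows,
  # and name the columns explicitly.
  rows_3_to_5 <- data.table::fread(path, sep = ",", header = FALSE,
                                   skip = 3, nrows = 3,
                                   col.names = c("V1", "V2"))
  print(rows_3_to_5)
} else {
  message("data.table is not installed; skipping the fread() example")
}
```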

readLines() is useful for checking the contents and delimiter of a file before importing it: it returns each line as a plain character string and does no parsing of delimiters or column types.

readLines("test.csv")

[1] "\"V1\",\"V2\"" "\"F\",-0.578643919152124"

[3] "\"E\",0.247290774914572" "\"U\",0.274830888945797"

[5] "\"R\",1.179155856395" "\"K\",-0.125859842900427"

[7] "\"X\",-0.889828858494609" "\"L\",0.462727351834403"

[9] "\"C\",-0.708800746374982"

To read only the first 4 lines,

readLines("test.csv", n = 4)

[1] "\"V1\",\"V2\"" "\"F\",-0.578643919152124"

[3] "\"E\",0.247290774914572" "\"U\",0.274830888945797"
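The preview from readLines() can also be used to guess the delimiter programmatically. The following is a crude heuristic, not a robust detector: it counts a few candidate characters in the first lines and picks the most frequent. The candidate list and the names `totals` and `likely` are assumptions for illustration.

```r
# Recreate a small file shaped like test.csv (values rounded for brevity).
path <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2"', '"F",-0.58', '"E",0.25', '"U",0.27'), path)

# Preview the first 3 lines as raw text.
first_lines <- readLines(path, n = 3)

# Count the total occurrences of each candidate delimiter in the preview.
candidates <- c(comma = ",", tab = "\t", semicolon = ";")
totals <- sapply(candidates, function(d) {
  sum(lengths(regmatches(first_lines, gregexpr(d, first_lines, fixed = TRUE))))
})
print(totals)

# The candidate with the highest count is the most likely delimiter.
likely <- names(which.max(totals))
print(likely)
```

A count-based guess can be fooled by delimiters that appear inside quoted text, so it is worth looking at the preview itself as well.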

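scan() is similar to readLines() but returns individual items rather than whole lines, so the output is not grouped by rows. By default it splits on whitespace and quotes, not on commas, which is why the comma stays attached to the second item of each line below; pass sep = "," to split on commas. Any character value for what (here "list") simply tells scan() to read character data.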

scan("test.csv", what = "list", nlines = 4)

Read 8 items

[1] "V1" ",\"V2\"" "F"

[4] ",-0.578643919152124" "E" ",0.247290774914572"

[7] "U" ",0.274830888945797"

To skip the first 2 lines and read the next 4 lines (note that the header counts as line 1 when skipping),

scan("test.csv", what = "list", nlines = 4, skip = 2)

Read 8 items

[1] "E" ",0.247290774914572" "U"

[4] ",0.274830888945797" "R" ",1.179155856395"

[7] "K" ",-0.125859842900427"
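With sep = "," scan() splits each line into its actual cells, and passing a list to what reads the file column by column. A minimal sketch, assuming a stand-in file with rounded values; `cells` and `cols` are illustrative names.

```r
# Recreate a small file shaped like test.csv (values rounded for brevity).
path <- tempfile(fileext = ".csv")
writeLines(c('"V1","V2"', '"F",-0.58', '"E",0.25', '"U",0.27'), path)

# With sep = ",", every comma-separated field becomes one character item.
cells <- scan(path, what = character(), sep = ",", nlines = 3, quiet = TRUE)
print(cells)

# A list for `what` gives one vector per column; skip = 1 drops the header
# so the second column can be read as numeric.
cols <- scan(path, what = list(V1 = character(), V2 = numeric()),
             sep = ",", skip = 1, quiet = TRUE)
print(cols$V2)
```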

If the file is compressed, e.g. with gzip, wrap it in gzfile().

To read the file in,

read.csv(gzfile("test.csv.gz", "r"))

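To write to the gz file (opening with "w" overwrites any existing content; close the connection afterwards so the data is flushed to disk),

a <- gzfile("test.csv.gz", "w")

cat("New1,1111\nNew2,22222\n", file = a)

close(a)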
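The write and read steps can be checked together with a small round trip. This is a sketch that assumes a temporary path instead of test.csv.gz so nothing existing is overwritten; `con` and `back` are illustrative names.

```r
# Use a temporary file so no existing archive is touched.
gz_path <- tempfile(fileext = ".csv.gz")

# Open a gzip connection for writing, write two lines, then close it.
con <- gzfile(gz_path, "w")
cat("New1,1111\nNew2,22222\n", file = con)
close(con)  # closing the connection flushes the compressed data to disk

# Read the compressed file back; there is no header line in this file.
back <- read.csv(gzfile(gz_path), header = FALSE)
print(back)
```

read.csv() opens and closes the unopened gzfile() connection itself, so no explicit close() is needed on the read side.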
