Sunday, January 23, 2022

ItsMyCode: How to Import CSV Files into R?

A comma-separated values (CSV) file is a delimited text file that uses commas to separate values. CSV is a popular format for storing tabular data, i.e., data organized in rows and columns.

In this article, we will learn how to import CSV files into R with the help of examples.

Importing CSV Files in R

There are 3 popular methods available to import CSV files into R. 

  • Using read.csv() method
  • Using read_csv() method
  • Using fread() method

In this tutorial, we will explore all the 3 methods and see how we can import the CSV file.

Method 1: Using read.csv() method

The read.csv() method ships with R, so no extra package is needed. It imports a CSV file and is best suited to small CSV files.

The contents of the CSV file are stored in a variable for further manipulation. We can even import multiple CSV files and store them in different variables.

The output is returned as a data frame, with rows numbered by consecutive integers.

Syntax: 

read.csv(path, header = TRUE, sep = ",")

Arguments: 

  • path: Path of the CSV file to import.
  • header: Indicates whether the first line of the file contains the column names. By default, it is set to TRUE.
  • sep: The field separator character. By default, it is a comma.

R can re-encode strings as factors. In R versions before 4.0.0, read.csv() did this by default, so it was common to set stringsAsFactors = FALSE to keep character columns as plain strings. Since R 4.0.0, stringsAsFactors = FALSE is the default.
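As a small illustration of the sep, header and stringsAsFactors arguments, the sketch below writes a semicolon-delimited file to a temporary location and reads it back twice. The file and its rows are made up for this example and are not the cricket_points.csv file used in this article.

```r
# Write a small semicolon-delimited file to a temporary path
# (hypothetical sample data, only for illustration).
tmp <- tempfile(fileext = ".csv")
writeLines(c("Teams;Wins;Lose;Points",
             "India;5;2;10",
             "Australia;4;2;8"), tmp)

# sep = ";" tells read.csv() that fields are separated by semicolons.
# stringsAsFactors = FALSE keeps the Teams column as character.
as_text <- read.csv(tmp, header = TRUE, sep = ";", stringsAsFactors = FALSE)

# stringsAsFactors = TRUE converts the Teams column to a factor instead.
as_factor <- read.csv(tmp, header = TRUE, sep = ";", stringsAsFactors = TRUE)

class(as_text$Teams)    # "character"
class(as_factor$Teams)  # "factor"

unlink(tmp)  # remove the temporary file
```

Passing stringsAsFactors explicitly, as done here, makes the result independent of the R version in use.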

# read the data from the CSV file
data <- read.csv("C:\\Personal\\IMS\\cricket_points.csv", header=TRUE)

# print the data variable (outputs as DataFrame)
data

Output

      ï..Teams Wins Lose Points
1        India    5     2     10
2 South Africa    3     4      6
3  West Indies    1     6      2
4      England    2     4      4
5    Australia    4     2      8
6  New Zealand    2     5      4
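The ï.. prefix on the first column name in the output above is typically the sign of a UTF-8 byte order mark (BOM) at the start of the file, which some spreadsheet programs write when saving as CSV. One common fix, assuming that is indeed the cause here, is to pass fileEncoding = "UTF-8-BOM" to read.csv(). The sketch below recreates such a file at a temporary path (the sample rows are made up) so the effect can be checked.

```r
# Create a small CSV file that starts with a UTF-8 byte order mark.
# The rows are hypothetical sample data, only for illustration.
tmp <- tempfile(fileext = ".csv")
con <- file(tmp, open = "wb")
writeBin(as.raw(c(0xEF, 0xBB, 0xBF)), con)  # the three BOM bytes
writeBin(charToRaw("Teams,Wins,Lose,Points\nIndia,5,2,10\nAustralia,4,2,8\n"), con)
close(con)

# fileEncoding = "UTF-8-BOM" makes R strip the BOM while reading,
# so the first column name comes through cleanly.
data_bom <- read.csv(tmp, header = TRUE, fileEncoding = "UTF-8-BOM")

names(data_bom)  # "Teams" "Wins" "Lose" "Points"

unlink(tmp)  # remove the temporary file
```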

Method 2: Using read_csv() method

The read_csv() method from the readr package (part of the tidyverse) is a widely recommended way of reading CSV files in R.

The data is read in as a tibble. When a tibble is printed, only the first 10 rows are displayed by default, along with the type of each column.

Unlike the read.csv() method, it can also display a progress bar while the file is being read, which is useful feedback for large files.

If you are working with large CSV files, it’s recommended to use the read_csv() method. 

Syntax:

read_csv(path, col_names, n_max, col_types, progress)

Arguments:

  • path: Path of the CSV file to import.
  • col_names: Indicates whether the first line of the file contains the column names. By default, it is set to TRUE.
  • n_max: The maximum number of rows to read.
  • col_types: The column types. If it is NULL (the default), the types are guessed from the data. They can also be specified in a compact string format, with one character per column.
  • progress: Whether to display a progress bar showing how much of the file has been read.

# import readr library
library(readr)

# import data
data2 <- read_csv("C:\\Personal\\IMS\\cricket_points.csv")

# print the data variable (outputs as tibble)
data2

Output

# A tibble: 6 × 4
  Teams         Wins  Lose Points
  <chr>        <dbl> <dbl>  <dbl>
1 India            5     2     10
2 South Africa     3     4      6
3 West Indies      1     6      2
4 England          2     4      4
5 Australia        4     2      8
6 New Zealand      2     5      4
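To make the col_types, n_max and progress arguments more concrete, here is a minimal sketch that assumes the readr package is installed. It reads a small temporary file whose rows are made up for this example.

```r
library(readr)

# Write a small CSV file to a temporary path
# (hypothetical sample data, only for illustration).
tmp <- tempfile(fileext = ".csv")
writeLines(c("Teams,Wins,Lose,Points",
             "India,5,2,10",
             "South Africa,3,4,6",
             "Australia,4,2,8"), tmp)

# col_types = "ciii": one letter per column, c = character, i = integer.
# n_max = 2 reads only the first two data rows.
# progress = FALSE suppresses the progress bar.
top2 <- read_csv(tmp, col_names = TRUE, col_types = "ciii",
                 n_max = 2, progress = FALSE)

# Printing shows a tibble with 2 rows and 4 columns.
top2
```

Spelling out col_types avoids relying on type guessing, which can matter when a numeric-looking column should stay as text.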

Method 3: Using fread() method

If the CSV files are extremely large, the best way to import them into R is the fread() method from the data.table package.

In this case, the data is returned as a data.table.

# import data.table library
library(data.table)

# read the CSV file
data3 <- fread("C:\\Personal\\IMS\\cricket_points.csv")

# print the data variable (outputs as data.table)
data3

Output

          Teams Wins Lose Points
1:        India    5     2     10
2: South Africa    3     4      6
3:  West Indies    1     6      2
4:      England    2     4      4
5:    Australia    4     2      8
6:  New Zealand    2     5      4
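With very large files it often helps to read only what is needed. The sketch below, which assumes the data.table package is installed, uses fread()'s select and nrows arguments on a small temporary file. The rows are made up for this example.

```r
library(data.table)

# Write a small CSV file to a temporary path
# (hypothetical sample data, only for illustration).
tmp <- tempfile(fileext = ".csv")
writeLines(c("Teams,Wins,Lose,Points",
             "India,5,2,10",
             "South Africa,3,4,6",
             "Australia,4,2,8"), tmp)

# select keeps only the named columns;
# nrows limits the number of data rows that are read.
subset_dt <- fread(tmp, select = c("Teams", "Points"), nrows = 2)

class(subset_dt)  # "data.table" "data.frame"
subset_dt
```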

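Note: On Windows, use double backslashes (\\) or forward slashes (/) when providing the file path, because a single backslash starts an escape sequence in an R string. Otherwise you may get an error like the one below, which is raised for a path beginning with C:\U.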

Error: '\U' used without hex digits in character string starting ""C:\U"
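The following sketch shows a few ways of writing the article's example path that avoid this error. The raw-string form assumes R 4.0.0 or later. The path is only compared as text here; no file is opened.

```r
# Escaped backslashes, as used throughout this article.
p_double <- "C:\\Personal\\IMS\\cricket_points.csv"

# Forward slashes are also accepted by R on Windows.
p_forward <- "C:/Personal/IMS/cricket_points.csv"

# Raw string (R >= 4.0.0): backslashes are taken literally.
p_raw <- r"(C:\Personal\IMS\cricket_points.csv)"

# file.path() joins the components with "/" on every platform.
p_built <- file.path("C:", "Personal", "IMS", "cricket_points.csv")

identical(p_double, p_raw)     # TRUE
identical(p_forward, p_built)  # TRUE
```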


from Planet Python