The following packages are used in this section: readr, tibble and here. readr and tibble are part of the tidyverse family.
To make sure that the packages used in this section are installed and loaded, run the following code in your console.
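A minimal sketch of what that code might look like is given below. It assumes you install packages from CRAN with install.packages(); the helper names pkgs and missing_pkgs are made up for illustration.

```r
# Packages used in this section
pkgs <- c("readr", "tibble", "here")

# Install only the packages that are not already on this computer
missing_pkgs <- pkgs[!pkgs %in% rownames(installed.packages())]
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs)
}

# Load the packages for the current session
library(readr)
library(tibble)
library(here)
```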
Since you’ll be using R for data analysis you’re likely to have data files that have been generated by, for example, your experiment or survey. Before you can analyse these data, you’ll need to import them into R.
There are two ways to do this. The first is to use the RStudio menu by going to File > Import Dataset. If you use the menu, RStudio will automatically write the appropriate code for you. While this will “work”, the code that RStudio generates will refer to your data file using the absolute file path unless the data file is located in a subfolder in your project folder. You never want to use absolute file paths, because they’re specific to your computer, which means that if you move your project to another computer your code won’t be able to find the correct file.
Although you can use the menu, a better way is just to write the code yourself. Writing the code yourself is easy, and if you’re going to load several data files it will turn out to be much more efficient than all the pointing and clicking.
Many of you will have your data organised into .csv files. For example, if you’re running surveys on Qualtrics, you can export the data as a .csv file, or if you’re collecting experimental data with Psychtoolbox or PsychoPy, you’ll be able to save the data as a .csv file.
To read in .csv files, we’ll use the readr::read_csv() function from the readr package.
To see how to do this, we’ll first need a file. You can download a file for us to practise on: disaster.csv
Once you’ve clicked the link then, depending on the browser you’re using, you’ll either be shown a dialog box to select where to save the file, or the file will automatically download to your downloads folder.
If you’re shown a dialog box, then save the file to the data subfolder in your project folder. If it’s saved to your downloads folder, then open both your downloads folder and your project folder in Finder or Windows Explorer and move the file disaster.csv from your downloads folder to the data subfolder of your project folder.
Now, using the Files pane in RStudio, you can navigate to the data subfolder and you should see the new disaster.csv file.
To read in this file, we’ll need to tell R the location of the file on your computer. The location of the file might be something like c:\users\lc663\documents\r_project\data. A path like this is a full, or absolute, path, and it’s specific to your computer. We don’t want to use the full path, because if we move our project to another computer, the full path will change. We need to use the relative path. To help us specify the relative path, we’ll use the here package.
The here package contains a function called here() which we’ll use for specifying the location of our file. The here() function takes a number of arguments, with each argument being a step along the path to your file. All directions given to here() start at the top level of your project folder. For example, if we wanted to give directions to our file called disaster.csv located in the data subfolder of our project folder, the directions would be specified as follows:
here::here("data","disaster.csv")
The here() function will re-write these instructions to the correct path, so that R can always find the correct file.
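To see what here() does with these directions, you can inspect its output in the console. The lines below are a small sketch: the variable names csv_path and project_root are made up for illustration, and the actual paths printed will depend on where your project lives.

```r
# Build the path to the file from the pieces, starting at the project folder
csv_path <- here::here("data", "disaster.csv")

# Called with no arguments, here() returns the project folder itself
project_root <- here::here()

# Print both to see the full paths that here() has worked out on this computer
print(project_root)
print(csv_path)

# Check that the file really is where we expect it to be before reading it
file.exists(csv_path)
```

If file.exists() returns FALSE, the file is probably still in your downloads folder, or the project was not opened from its project folder.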
To actually read in the file, we’ll just use the output of this function as the input to the read_csv() function from the readr package.
readr::read_csv(file = here::here("data","disaster.csv")) # our file
Parsed with column specification:
cols(
  id = col_double(),
  frame = col_double(),
  donate = col_double(),
  justify = col_double(),
  skeptic = col_double()
)
# A tibble: 211 x 5
      id frame donate justify skeptic
   <dbl> <dbl>  <dbl>   <dbl>   <dbl>
 1     1     1    5.6    2.95     1.8
 2     2     1    4.2    2.85     5.2
 3     3     1    4.2    3        3.2
 4     4     1    4.6    3.3      1
 5     5     1    3      5        7.6
 6     6     0    5      3.2      4.2
 7     7     0    4.8    2.9      4.2
 8     8     1    6      1.4      1.2
 9     9     0    4.2    3.25     1.8
10    10     0    4.4    3.55     8.8
# … with 201 more rows
The data contained in the file will automatically print out to the console. We can also save it to a variable for later use.
our_data <- readr::read_csv(file = here::here("data","disaster.csv"))
Now the data is contained in a data table (of type tibble) and we can use it in our analyses.
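As a small sketch of how you might check what you’ve read in, the example below writes a tiny stand-in file (using the first two rows and three columns printed above) to a temporary location, reads it back, and inspects the result. The names tmp and small_data are made up for illustration. Supplying col_types is optional, but it documents the column types you expect and stops read_csv() from printing the column specification message.

```r
library(readr)

# Write a tiny stand-in .csv file so that this example runs on its own
tmp <- tempfile(fileext = ".csv")
writeLines(c("id,frame,donate", "1,1,5.6", "2,1,4.2"), tmp)

# Read it in, stating the type of each column explicitly
small_data <- read_csv(
  file = tmp,
  col_types = cols(
    id = col_double(),
    frame = col_double(),
    donate = col_double()
  )
)

# Confirm that the result is a tibble and look at its size and columns
tibble::is_tibble(small_data)
dim(small_data)
tibble::glimpse(small_data)
```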
To read in files from SPSS you use the same general approach as you do for reading in .csv files. The only difference is that you use the haven::read_spss() function from the haven package. The haven package is also part of the tidyverse family.
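The sketch below shows the pattern, assuming the haven package is installed. So that it runs on its own, it first writes a small .sav file to a temporary location with haven::write_sav(); the names tmp_sav and spss_data, and the file name survey.sav in the comment, are hypothetical.

```r
# Create a small SPSS file in a temporary location so the example is self-contained
tmp_sav <- tempfile(fileext = ".sav")
haven::write_sav(data.frame(id = c(1, 2), donate = c(5.6, 4.2)), tmp_sav)

# Read the SPSS file in; the result is a tibble, just as with read_csv()
spss_data <- haven::read_spss(file = tmp_sav)
spss_data

# In a project, the call would follow the same pattern as for .csv files, e.g.
# spss_data <- haven::read_spss(file = here::here("data", "survey.sav"))
```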
When it comes to reading in files from Matlab you have a few options. If you can save the files as .csv files then you can just use read_csv(). You can also read .mat files directly using the R.matlab package. However, this is very slow and you can encounter problems with files saved with the -v7.3 flag. The final option, which I recommend particularly if you want to save Matlab struct objects rather than tabular data (like tables or matrices), is to save them in a .json file. However, this is a topic for the advanced course.