R Programming Notes on the Working Directory and Files
(c) 2013 kevin languedoc (klanguedoc)
R is a programming language for statistical computing and visualization. R is based on S, another statistical language that was developed at Bell Laboratories. R is an open source project and is free to use under the GNU General Public License. You can download R from the R Project web site.
R provides an extensive API for statistical data analysis and graphical representation. You can perform linear and nonlinear modelling, classical statistical tests, time series analysis, classification and clustering. Data can be cleansed and subsetted from a variety of sources, including open data sets. R is highly extensible, allowing developers to create new statistical functions. Finally, R can present your data and the results of your analysis as plots and graphs.
How To Install R
R and its add-on packages are distributed through the Comprehensive R Archive Network, or CRAN. From the R Project site at www.r-project.org, click on the download R link or the CRAN link and select an appropriate CRAN mirror.
Select the Windows or Mac binaries depending on your operating system. Launch the installer and follow the instructions as you would for any other program. The R platform comes with an editor that can be used to run commands and to write functions that can be saved, distributed or used in your scripts. You can perform all R operations from this built-in environment, or you can download one of the R development IDEs available on the market.
How To Install RStudio
RStudio is also available as a free download from http://www.rstudio.com/. RStudio is a comprehensive development environment. The IDE offers some interesting features, such as being able to integrate R with C/C++ programs and to create Sweave scripts, which make it possible to embed R in LaTeX documents.
How To Set Your Working Directory
The working directory is the directory that R uses to load its data and other files, and to save its scripts and analysis results. You can see your current working directory by typing the getwd() command in the R Console or by selecting the “Get Working Directory” command under the Misc menu on a Mac and Windows. By default this is the user's home path on both Mac and Windows. A directory must exist before R can use it as the working directory, so to use a new path or a subdirectory of the default, first create it with the operating system’s file manager (Windows Explorer or Finder) or with the dir.create() function described later in these notes.
To set the working directory, select the “Set Working Directory” command under the Misc menu, or call setwd("/path/to/your/directory") on a Mac or setwd("c:\\path\\to\\your\\working\\directory") on Windows (forward slashes also work on Windows). Note that the path must be a quoted string. You can also use a relative path, such as setwd("./directory"), which is resolved against the current working directory. You will probably notice that these commands are similar to Unix and DOS commands.
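As a minimal sketch of the commands above, the following switches to a new working directory and back again. The folder name my_r_project is hypothetical, and tempdir() is used only so that the example runs on any machine without touching your own files.

```r
# Remember the starting directory so it can be restored afterwards.
old_wd <- getwd()

# Hypothetical project folder inside R's per-session temporary directory.
project_dir <- file.path(tempdir(), "my_r_project")
dir.create(project_dir, showWarnings = FALSE)

setwd(project_dir)      # absolute path
in_project <- getwd()

setwd("..")             # relative path: move up one level
in_parent <- getwd()

setwd(old_wd)           # restore the original working directory
```

Saving the result of getwd() before calling setwd() is a common habit in scripts, because it lets the script leave the session where it found it.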
It is important to create a specific folder (directory) for your work rather than storing everything in the default working directory; otherwise you will lose track of your files as they get mixed in with files from other programs.
List of Files in your Working Directory
Use dir() to get a listing of the files in a directory. If the directory is empty, the console will output character(0), meaning that there are no files.
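The two cases can be reproduced with a short sketch like the one below. The folder dir_demo and the file names are made up for illustration, and the folder lives under tempdir() so the listing is predictable.

```r
# Hypothetical scratch folder; remove any leftover copy first.
scratch <- file.path(tempdir(), "dir_demo")
unlink(scratch, recursive = TRUE)
dir.create(scratch)

# Empty directory: dir() returns character(0).
empty_listing <- dir(scratch)
print(empty_listing)

# Create two empty files, then list the directory again.
file.create(file.path(scratch, c("data.csv", "notes.txt")))
full_listing <- dir(scratch)
print(full_listing)     # now shows the two file names
```

Called with no argument, dir() lists the current working directory instead.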
Another useful function for listing files is list.files(), for which dir() is simply an alias. Likewise, list.dirs() lists the sub-directories at a given path. The parameters of both functions are described later in these notes.
Create Directories and Sub-Directories
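To create a directory you can use the dir.create() function. Its main arguments are path, showWarnings and recursive (a fourth argument, mode, sets the permissions on Unix-like systems). The argument definitions are provided below:
- path: The location where you want to create the directory
- showWarnings: A logical value (TRUE or FALSE) indicating whether a warning should be shown when the directory cannot be created, for example because it already exists
- recursive: Also a logical value, TRUE or FALSE. It indicates whether elements of the path other than the last should be created as well, that is, whether missing parent directories are created.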
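The effect of the recursive argument is easiest to see with a nested path. This is a small sketch; create_demo, data and raw are hypothetical folder names placed under tempdir().

```r
base <- file.path(tempdir(), "create_demo")   # hypothetical location
unlink(base, recursive = TRUE)                # start from a clean state

nested <- file.path(base, "data", "raw")

# The parent folders do not exist yet, so this fails and returns FALSE.
# showWarnings = FALSE suppresses the warning that would be printed.
first_try <- dir.create(nested, showWarnings = FALSE)

# recursive = TRUE creates the missing parent folders as well.
second_try <- dir.create(nested, recursive = TRUE)

# The directory now exists, so creating it again returns FALSE.
third_try <- dir.create(nested, showWarnings = FALSE)
```

dir.create() returns its TRUE or FALSE result invisibly, so assign it to a variable, as here, if you want to check whether the call succeeded.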
Returning to listing files: the list.files() and list.dirs() functions take several parameters, which are described below. list.files() accepts all of them, while list.dirs() accepts only path, full.names and recursive:
- path = The relative or absolute path to list
- pattern = A regular expression used to filter the file names that are returned
- all.files = FALSE or TRUE. If TRUE, all files are listed, including hidden files
- full.names = TRUE to prepend the directory path to each name, FALSE to return the bare names
- recursive = TRUE or FALSE, to list files or directories recursively
- ignore.case = TRUE to ignore case when matching the pattern, or FALSE to distinguish between upper and lowercase
- include.dirs = Specify TRUE to include sub-directory names in a recursive listing (the “.” and “..” entries are controlled by a separate argument, no..)
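The parameters above can be tried out with a sketch such as the following. The folder list_demo and every file name in it are invented for the example, and the folder is created under tempdir().

```r
root <- file.path(tempdir(), "list_demo")     # hypothetical folder
unlink(root, recursive = TRUE)
dir.create(file.path(root, "sub"), recursive = TRUE)
file.create(file.path(root, c("a.csv", "B.CSV", "notes.txt", ".hidden")))
file.create(file.path(root, "sub", "c.csv"))

# pattern is a regular expression; ignore.case = TRUE also matches "B.CSV".
csv_top <- list.files(root, pattern = "\\.csv$", ignore.case = TRUE)

# recursive = TRUE descends into sub-directories (case-sensitive here).
csv_all <- list.files(root, pattern = "\\.csv$", recursive = TRUE)

# all.files = TRUE shows hidden files; no.. = TRUE drops "." and "..".
everything <- list.files(root, all.files = TRUE, no.. = TRUE)

# list.dirs() returns directories only; these settings give the
# immediate sub-directories as bare names.
subdirs <- list.dirs(root, full.names = FALSE, recursive = FALSE)
```

The order in which names come back can depend on the locale, so scripts that rely on a particular order are safer sorting the result explicitly.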
These are some of the functions that I use frequently when working with R. However, there are hundreds if not thousands of functions. R has a very rich API that is worth learning, and I expect it to become a mainstream language as we move deeper and deeper into advanced analytics and data analysis.