B Setting Up the R Environment

This is a slightly older (distributed in the hope that it will be useful) version of the forthcoming textbook (ETA 2022) preliminarily entitled Machine Learning in R from Scratch by Marek Gagolewski, which is now undergoing a major revision (when I am not busy with other projects). There will not be much ongoing work in this repository anymore, as its sources have moved elsewhere; however, if you happen to find any bugs or typos, please drop me an email. I will share a new draft once it’s ripe. Stay tuned.

B.1 Installing R

R and Python are the languages of modern data science. The former is slightly more oriented towards data modelling, analysis and visualisation as well as statistical computing. It has a gentle learning curve, which makes it very suitable even for beginners – just like us!

R is available for Windows as well as MacOS, Linux and other Unix-like operating systems. It can be downloaded from the R project website, see https://www.r-project.org/ (or installed through system-specific package repositories).


From now on we assume that you have installed the R environment.
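To confirm that the installation works, one can start R and ask the interpreter which version is running. The following is only a minimal sketch; the variable name `v` and the version threshold are illustrative assumptions, not requirements of this book:

```r
# Report which version of R is running
print(R.version.string)

# getRversion() returns the version as an object that can be compared
v <- getRversion()
print(v >= "3.0.0")  # TRUE on any reasonably recent installation
```

If both lines print without an error, the interpreter is ready to use.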

B.2 Installing an IDE

As we wish to make our first steps with the R language as stress- and hassle-free as possible, let’s stick to a very user-friendly development environment called RStudio, which can be downloaded from https://rstudio.com/products/rstudio/ (choose RStudio Desktop Open Source Edition).


There are of course many other options for working with R, both interactive and non-interactive, including Jupyter Notebooks (see https://irkernel.github.io/), dynamically generated reports (see https://yihui.org/knitr/options/) and plain shell scripts executed from a terminal. However, for now let’s leave that to more advanced users.

B.4 First R Script in RStudio

Let’s open RStudio and perform the following steps:

  1. Create a New Project where we will store all the scripts related to this book. Click File → New Project and then choose to start in a brand new working directory, in any location you like. Choose New Project as the project type.

    From now on, we are assuming that the project name is LMLCR and the project has been opened. All source files we create will be relative to the project directory.

  2. Create a new R source file: File → New File → R Script. Save the file as, for example, sandbox_01.R.

    The source editor (top-left pane) behaves just like any other text editor. Standard keyboard shortcuts are available, such as Ctrl+C and Ctrl+V (Cmd+C and Cmd+V on MacOS) for copy and paste, respectively.

    A list of keyboard shortcuts is available at https://support.rstudio.com/hc/en-us/articles/200711853-Keyboard-Shortcuts

  3. Input the following R code into the editor:

    # My first R script
    # This is a comment
    # Another comment
    # Everything from '#' to the end of the line
    #     is ignored by the R interpreter
    print("Hello world") # prints a given character string
    print(2+2) # evaluates the expression and prints the result
    x <- seq(0, 10, length.out=100) # a new numeric vector
    y <- x^2 # squares every element in x
    plot(x, y, las=1, type="l") # plots y as a function of x

  4. Execute the five commands above, line by line, by positioning the keyboard cursor on each line in turn and pressing Ctrl+Enter (Cmd+Return on MacOS).

    Each time, the command will be copied to the console (bottom-left pane) and evaluated.

    The last line generates a nice plot which will appear in the bottom-right pane.
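To get a feel for what the script has created, the two vectors can also be inspected directly in the console. This is just a sketch that reuses the `x` and `y` defined in the script above; the particular functions chosen for the inspection are our own pick:

```r
x <- seq(0, 10, length.out=100)  # 100 equally spaced points from 0 to 10
y <- x^2                         # element-wise squaring

print(length(x))   # number of elements in x: 100
print(head(x, 3))  # the first three elements of x
print(range(y))    # the smallest and the largest value in y: 0 and 100
```

Note that `x^2` acts on all the elements at once; there is no need for an explicit loop.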

While you learn, we recommend that you get used to writing your code in an R script and executing it just as we did above.

On a side note, you can execute (source) the whole script by pressing Ctrl+Shift+S (Cmd+Shift+S on MacOS).
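Sourcing a file can also be done from the console with R's `source()` function, which is, as far as we can tell, what the above shortcut relies on. In the sketch below, a tiny stand-in script is written to a temporary file so that the example is self-contained; the file contents and the names `msg` and `z` are hypothetical:

```r
# Write a tiny stand-in script to a temporary file
script <- tempfile(fileext=".R")
writeLines(c('msg <- "Hello world"', 'print(msg)', 'z <- 2+2'), script)

# Run the whole file; echo=TRUE also prints each command as it is evaluated
source(script, echo=TRUE)

print(z)  # objects created by the script are now available
```

With our own project, we would call `source("sandbox_01.R")` instead, assuming that the file sits in the current working directory.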