Introduction to R: Functions and control structures

Malte Bonart

if-else conditions

Introduction

if(cond) {true.expr} else {false.expr}

  • Binary case switch: If cond is TRUE then run {true.expr} else run {false.expr}.
  • Applications
    • control random events
    • call special handlers in case of missing values, wrong data type, …
    • inside functions

Example

if (4 < 6){
  val <- 4
} else {
  val <- 4 + 2
}
val
[1] 4
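
The applications listed above (special handling of missing values or a wrong data type, use inside functions) can be combined in one small sketch. The function name safeMean and its messages are hypothetical and only serve as illustration.

```r
# Hypothetical helper: mean of x with special handling
# for a wrong data type and for missing values.
safeMean <- function(x) {
  if (!is.numeric(x)) {
    # wrong data type: stop with an informative error
    stop("x must be numeric")
  } else if (anyNA(x)) {
    # missing values: warn and drop them before averaging
    warning("x contains missing values; they are ignored")
    return(mean(x, na.rm = TRUE))
  } else {
    return(mean(x))
  }
}

safeMean(c(1, 2, 3))   # 2
```

Several conditions can be chained with else if, as in the middle branch.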

Functions

Introduction

  • data structures represent information, functions ‘do’ something with information
  • practically no difference between user-defined and built-in functions
  • functions lead to cleaner and less error-prone code
  • user-defined functions should be defined at the beginning of a script file or in a separate file

Example

hello <- function(name){
  words <- paste("Hello", name)
  return(words)
}
hello("Lisa")
[1] "Hello Lisa"
hello("Ben")
[1] "Hello Ben"

Definition

  • Keyword: function
  • Formal function arguments: (arg1, arg2, ...)
  • Function body: {...}
  • A return statement (optional: without it, the function returns the value of the last evaluated expression)

Example

mySum <- function(a, b){
  sum <- a + b
  return(sum)
}
mySum(a = 4, b = 10)
[1] 14
mySum(a = 4, b = NA)
[1] NA
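
In R, a function also returns a value without an explicit return(): the result of the last evaluated expression is used. Arguments can be matched by position or by name. A minimal sketch, where mySum2 is a hypothetical variant of mySum:

```r
# Hypothetical variant of mySum without an explicit return():
# the value of the last evaluated expression is returned.
mySum2 <- function(a, b) {
  a + b
}

mySum2(4, 10)          # arguments matched by position: 14
mySum2(b = 10, a = 4)  # arguments matched by name, order does not matter: 14
```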

Visibility of variables

Input and output

  • When an object is passed to a function, the function works on a copy of it (R creates the copy lazily, only when the object is modified)
  • The copy exists only inside the function. The global object is not changed.
  • The return value of a function can be saved as a new variable

Example

addTwo <- function(x) {
  x <- x + 2
  return(x)
}
x <- 6
addTwo(x)
[1] 8
x                 # x is still as before
[1] 6
x <- addTwo(x)    # the result of the function is assigned to the variable x
x                 # now x has changed
[1] 8
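
The same behaviour holds for larger objects such as vectors. A minimal sketch, where the function setFirstToZero and the vector scores are hypothetical names chosen for illustration:

```r
# Hypothetical function: overwrite the first element of a vector
setFirstToZero <- function(v) {
  v[1] <- 0   # modifies only the local copy
  return(v)
}

scores <- c(5, 7, 9)
changed <- setFirstToZero(scores)

scores    # original vector is unchanged: 5 7 9
changed   # returned copy holds the new value: 0 7 9
```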

Variables inside functions

Variables that are defined inside a function are not accessible from outside the function.

f <- function(x, y) {
  z <- x + y
  return(x)
}
f(2, 3)
[1] 2
z
Error in eval(expr, envir, enclos): object 'z' not found

Functions and the global environment I

Variables that were defined outside a function are also visible inside the function.

y <- 5
h <- function(x) {
  return(x*y)
}
h(2)
[1] 10

Functions and the global environment II

But what happens if the object y is removed from the global environment?

rm(y)
h(2)
Error in h(2): object 'y' not found

Never use global variables inside a function!
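
One way to follow this rule is to pass every value the function needs as an argument. A minimal sketch, where h2 is a hypothetical rewrite of h:

```r
# Hypothetical rewrite of h(): y is an argument, not a global variable
h2 <- function(x, y) {
  return(x * y)
}

h2(x = 2, y = 5)   # 10, regardless of what exists in the global environment
```

The function now gives the same result whether or not an object called y exists outside of it.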

The sapply function

Introduction

  • sapply applies a function to each column of a data.frame (more generally, to each element of a list or vector)
  • functional programming: pass a function as function argument to another function
  • family members: lapply, sapply, vapply, …

Example: Check for missing values, the bad way

countNA(titanic$pclass)
[1] 0
countNA(titanic$sex)
[1] 0
countNA(titanic$age)
[1] 263
countNA(titanic$name)
[1] 0

Example: Check for missing values, the sapply way

sapply(titanic, countNA)
     X.1        X   pclass survived     name      sex      age embarked 
       0        0        0        0        0        0      263        0 
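
The slides do not show how countNA is defined. A plausible minimal definition, stated here as an assumption, counts the TRUE values of is.na(). The data frame below is made up for illustration and is not the titanic data.

```r
# Assumed definition of countNA: number of missing values in a vector
countNA <- function(x) {
  return(sum(is.na(x)))
}

# Small made-up data frame for illustration (not the titanic data)
df <- data.frame(
  age  = c(22, NA, 35, NA),
  sex  = c("f", "m", "m", "f"),
  fare = c(7.25, 71.3, NA, 8.05)
)

# One call replaces countNA(df$age), countNA(df$sex), countNA(df$fare)
naCounts <- sapply(df, countNA)
naCounts   # named vector: age = 2, sex = 0, fare = 1
```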

Advantages

  • avoid repetitions
  • cleaner code and fewer errors
  • becomes powerful with user-defined functions
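
A user-defined function can also be written directly inside the call as an anonymous function. The sketch below uses a made-up data frame and compares three members of the family: sapply simplifies the result to a vector where possible, lapply always returns a list, and vapply additionally checks the type of each result.

```r
# Made-up data frame for illustration
df <- data.frame(a = c(1, NA, 3), b = c(NA, NA, 6))

# Share of missing values per column, using an anonymous function
shareS <- sapply(df, function(col) mean(is.na(col)))

# lapply returns a list with one element per column
shareL <- lapply(df, function(col) mean(is.na(col)))

# vapply requires each result to be a single numeric value
shareV <- vapply(df, function(col) mean(is.na(col)), FUN.VALUE = numeric(1))

shareS   # named numeric vector: a = 1/3, b = 2/3
```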