Common R traps

I will show some R traps here. The purpose of this page is not to tell you how “crappy” R is. R is great indeed, and these kinds of “traps” can happen in any other language.

if(a<-5)

Assume you want to write a condition that checks whether “a is smaller than negative five” and then does something. So you wrote

if (a<-5)
 {
      sin(pi/3)
 }

but R reads this as “assign positive five to a”, since “<-” in R is the assignment operator. The assignment returns the value 5, and any nonzero number counts as TRUE in a condition, so R will always run the calculations inside the braces. It also silently overwrites whatever value a had before.
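A minimal sketch of the trap in action. The variable names a and taken are only for illustration; a starts out as a value that really is smaller than -5, so you can see that it gets overwritten.

```r
a <- -10          # a really is smaller than -5
taken <- FALSE

if (a<-5) {       # parsed as a <- 5: an assignment, not a comparison
  taken <- TRUE
}

print(a)          # 5: the original value -10 is gone
print(taken)      # TRUE: the branch ran because 5 is nonzero
```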

Solutions: use a better coding style, i.e. always put a space between an operator (either an assignment operator or a relational operator) and its values, e.g.,

if (a < -5)
 {
     sin(pi/3)
 }

or use parentheses if you want to be sure about what you are doing.

if (a<(-5))
 {
     sin(pi/3)
 }

break a long line

When you want to break a long expression into several lines in R, you don’t have to put a special notation at the end of each line; R checks for itself whether the expression is complete. This is convenient but can also cause trouble. Assume you have a very long expression and you want to break it into two lines, e.g.

myvalue <- sin(pi/3) + cos(pi/3) + 2*sin(pi/3)*cos(pi/3)

The result should be 2.232051.

But you wrote

myvalue <- sin(pi/3) + cos(pi/3)
           + 2*sin(pi/3)*cos(pi/3)

R will think you have finished the expression at the end of the first line and started a new expression on the second line. You will find the result is 1.366025, since the second part is not included at all; the second line is evaluated on its own and its value is thrown away.

Solutions: you can either wrap the whole expression in a pair of parentheses, like this

myvalue <- (sin(pi/3) + cos(pi/3) 
             + 2*sin(pi/3)*cos(pi/3))

but too many parentheses make the code very hard to read. Alternatively, always break the line after an arithmetic operator, so that R knows the expression is not finished yet

myvalue <- sin(pi/3) + cos(pi/3) +
           2*sin(pi/3)*cos(pi/3)
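One way to see the difference is to ask the parser how many expressions it finds in each version. This is only a sketch: the names broken and fixed are made up here, and the code is held in strings so that parse() can count the expressions.

```r
# The same formula, broken before the "+" and after the "+"
broken <- "myvalue <- sin(pi/3) + cos(pi/3)\n           + 2*sin(pi/3)*cos(pi/3)"
fixed  <- "myvalue <- sin(pi/3) + cos(pi/3) +\n           2*sin(pi/3)*cos(pi/3)"

print(length(parse(text = broken)))   # 2: the second line is its own expression
print(length(parse(text = fixed)))    # 1: the trailing "+" keeps the expression open

invisible(eval(parse(text = broken)))
broken_value <- myvalue               # 1.366025
invisible(eval(parse(text = fixed)))
fixed_value <- myvalue                # 2.232051
print(c(broken_value, fixed_value))
```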

diag() function with a vector

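As described in the R help page for diag, using ‘diag(x)’ can have unexpected effects if ‘x’ is a vector that could be of length one. A single number is not placed on the diagonal; instead R builds an identity matrix of that size, as in this example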

> diag(7.4)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    0    0    0    0    0    0
[2,]    0    1    0    0    0    0    0
[3,]    0    0    1    0    0    0    0
[4,]    0    0    0    1    0    0    0
[5,]    0    0    0    0    1    0    0
[6,]    0    0    0    0    0    1    0
[7,]    0    0    0    0    0    0    1

Solutions: to avoid this, use “diag(x, nrow = length(x))” for consistent behavior when “x” is a vector.

 

> x = c(1,2,3)
> diag(x,length(x))
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    2    0
[3,]    0    0    3
> x = 2.4
> diag(x,length(x))
     [,1]
[1,]  2.4
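If you build diagonal matrices in many places, the fix can be wrapped in a small helper. The name safe_diag is hypothetical and not part of R; it is just a sketch around the documented nrow argument.

```r
# Hypothetical helper: always treat x as the diagonal entries,
# even when x happens to have length one
safe_diag <- function(x) diag(x, nrow = length(x))

print(dim(diag(7.4)))        # 7 7: an identity matrix, not what we wanted
print(dim(safe_diag(7.4)))   # 1 1: a 1 x 1 matrix containing 7.4
print(safe_diag(c(1, 2, 3))) # same result as diag(c(1, 2, 3))
```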

sample(x) when length of x is 1 and x is an integer

The first argument of the sample function behaves inconsistently when x has length 1 and is a positive integer: instead of sampling from the single value x, R samples from 1:x. See this example

sample(x = 3, size = 10, replace = TRUE) # same as sample(x = 1:3, size = 10, replace = TRUE)

If you want to sample “3” ten times with replacement, i.e. obtain a vector of ten 3s, you have to check for the length-one case explicitly.
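One possible way to handle the check is to sample positions instead of values. The sketch below uses a small wrapper named resample, in the spirit of the example given in the help page for sample; the wrapper is not a built-in function, so treat it as an assumption of this example.

```r
# Sample indices with sample.int(), then look the values up in x,
# so a length-one x is never expanded to 1:x
resample <- function(x, ...) x[sample.int(length(x), ...)]

set.seed(1)
surprising <- sample(3, size = 10, replace = TRUE)     # values drawn from 1, 2, 3
intended   <- resample(3, size = 10, replace = TRUE)   # ten 3s

print(surprising)
print(intended)
```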

 

