
How to prevent dropping dimensions in a matrix/array?

When you create a matrix in the usual way like this,

> a <- matrix(rnorm(10),2,5)
> a
           [,1]      [,2]       [,3]       [,4]       [,5]
[1,]  1.3488918 0.6225795 -0.7444514  1.3130491  1.7877849
[2,] -0.2385392 0.5656759  0.9037435 -0.2217444 -0.2656875

the dimensions are dropped when you pick out a single row or column like this,

> b <- a[,1]
> b
[1]  1.3488918 -0.2385392
> dim(b)
NULL

The solution is to pass the argument drop = FALSE when subsetting,

> b <- a[,1,drop = FALSE]
> b
           [,1]
[1,]  1.3488918
[2,] -0.2385392
> dim(b)
[1] 2 1
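
The same argument works for rows as well as columns. Here is a small sketch on an assumed 2 x 5 integer matrix (the values are only for illustration) that compares the dimensions with and without drop = FALSE.

```r
# Assumed example matrix; any 2 x 5 matrix behaves the same way
a <- matrix(1:10, 2, 5)

row_vec <- a[1, ]                 # dimensions dropped: a plain vector
row_mat <- a[1, , drop = FALSE]   # stays a 1 x 5 matrix
col_mat <- a[, 1, drop = FALSE]   # stays a 2 x 1 matrix

dim(row_vec)   # NULL
dim(row_mat)   # 1 5
dim(col_mat)   # 2 1
```

This matters mostly inside functions, where the number of selected rows or columns is not known in advance and later code expects a matrix.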

How to crop an eps file?

When you create a figure (e.g. in eps format) with R, the margins around the main content are often too wide. To save space in the final document, e.g. a pdf file generated by LaTeX, I have figured out a few ways of reducing the margins. If you just want to shrink the white margin of an eps file, try Methods A–C.
If you want to crop an eps file, see Method D.

  • Method A (R users) Before you make your graph in R, use par(mar=c(bottom, left, top, right)) to specify the margins you want to keep. The default value is c(5, 4, 4, 2) + 0.1. Try this example to see the difference.
    # The file names below are only examples; without file=, both calls write to Rplot.eps
    par(mar=c(5,4,4,2)+0.1)                   # The default margins
    plot(rnorm(100))
    dev.copy2eps(file="default_margins.eps")  # Save as eps
    par(mar=c(4,4,0,0)+0.1)                   # Figure with very tight margins
    plot(rnorm(100))
    dev.copy2eps(file="tight_margins.eps")
  • Method B (use epstool) A very handy tool that can compute the optimal bounding box
    epstool --copy --bbox file.eps file_new.eps
  • Method C (use ps2epsi) It automatically calculates the bounding box required for an encapsulated PostScript file, so most of the time it does a pretty good job
    ps2epsi <input.eps> <output.eps>
  • Method D (DIY for any eps) Open your eps file in a text editor and you will find a line like this
    %%BoundingBox: 0 0 503 503

    near the top of the file. The four values are the lower-left x, lower-left y, upper-right x and upper-right y coordinates in PostScript points. Adjust these values to suitable integers, save the file, and check whether the margins are better. When you crop an eps file and include it in LaTeX with the \includegraphics command, you should use \includegraphics* instead. If * is present, then the graphic is ‘clipped’ to the size specified. If * is omitted, then any part of the graphic that is outside the specified ‘bounding box’ will over-print the surrounding text. By the way, the trim, bb and viewport options of \includegraphics can do the same job in a different manner without editing the eps file; see the package documentation for details.
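
If you would rather check the current bounding box from R before editing it by hand, something like the following sketch could work. The helper read_bbox() is a hypothetical name of mine, not part of R or of any package, and the three eps lines are made up for the example; in practice you would pass readLines("file.eps") instead.

```r
# Hypothetical helper: extract the four BoundingBox integers from the lines of an eps file
read_bbox <- function(lines) {
  hit <- grep("^%%BoundingBox:", lines, value = TRUE)
  # Skip the "(atend)" form, where the real values are given at the end of the file
  hit <- hit[!grepl("atend", hit, fixed = TRUE)]
  if (length(hit) == 0) return(NULL)
  values <- trimws(sub("^%%BoundingBox:", "", hit[1]))
  as.integer(strsplit(values, "[[:space:]]+")[[1]])
}

# Made-up header lines standing in for a real eps file
eps_lines <- c("%!PS-Adobe-3.0 EPSF-3.0",
               "%%BoundingBox: 0 0 503 503",
               "%%EndComments")
bbox <- read_bbox(eps_lines)
bbox   # 0 0 503 503
```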

Common R traps

I will show some R traps here. The purpose of this page is not to tell you how “crappy” R is. R is great indeed, and these kinds of “traps” can happen in any other language.

if(a<-5)

Assume you want to write a condition that checks whether “a is smaller than negative five” and then does something. So you wrote

if (a<-5)
 {
     sin(pi/3)
 }

but R will read this as “assign positive five to a”, since “<-” in R is an assignment operator. The assignment returns the value 5, which is treated as TRUE in the condition. As a result, R will always run the calculations inside the condition, and a is overwritten along the way.

Solutions: use a better coding style, i.e. always put a space between the operator (either an assignment operator or a relational operator) and the values, e.g.,

if (a < -5)
 {
     sin(pi/3)
 }

or use parentheses if you want to be sure about what you are doing.

if (a<(-5))
 {
     sin(pi/3)
 }
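
To see the trap in action, here is a small sketch. The starting value a <- 1 and the strings are my own choices for illustration; with a equal to 1 the branch should not be taken.

```r
a <- 1
# Trap: parsed as an assignment of 5 to a, and 5 counts as TRUE
buggy <- if (a<-5) "branch taken" else "branch skipped"
a_after <- a   # a has silently become 5

a <- 1
# With spaces this is the comparison that was intended
fixed <- if (a < -5) "branch taken" else "branch skipped"

buggy     # "branch taken"
a_after   # 5
fixed     # "branch skipped"
```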

break a long line

When you want to break a long expression into several lines in R, you don’t have to put a special notation at the end of each line; R checks whether your expression has finished. This makes things convenient but also brings trouble. Assume you have a very long expression and you want to break it into two lines, e.g.

myvalue <- sin(pi/3) + cos(pi/3) + 2*sin(pi/3)*cos(pi/3)

The result should be 2.232051.

But you wrote

myvalue <- sin(pi/3) + cos(pi/3)
           + 2*sin(pi/3)*cos(pi/3)

R will think you have finished the expression at the end of the first line and started a new expression on the second line. You will find the result is 1.366025, since the second part is not included at all.

Solutions: You can either put a pair of parentheses around your expression like this

myvalue <- (sin(pi/3) + cos(pi/3) 
             + 2*sin(pi/3)*cos(pi/3))

but too many parentheses make the code very hard to read. Alternatively, you can always break the line after an arithmetic operator, so that R knows the expression is not finished yet

myvalue <- sin(pi/3) + cos(pi/3) +
           2*sin(pi/3)*cos(pi/3)
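
The two layouts can be compared side by side. This is just the expression from above run both ways; the variable names broken and fixed are mine.

```r
# Operator at the start of the second line: R treats it as a new expression
broken <- sin(pi/3) + cos(pi/3)
          + 2*sin(pi/3)*cos(pi/3)   # evaluated on its own, never added to broken

# Operator at the end of the first line: R keeps reading
fixed <- sin(pi/3) + cos(pi/3) +
         2*sin(pi/3)*cos(pi/3)

broken   # 1.366025
fixed    # 2.232051
```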

diag() function with a vector

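As described in the R help document, using ‘diag(x)’ can have unexpected effects if ‘x’ is a vector that could be of length one. In that case R does not build a 1 x 1 matrix containing x but an identity matrix whose size is given by x, like in this example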

> diag(7.4)
     [,1] [,2] [,3] [,4] [,5] [,6] [,7]
[1,]    1    0    0    0    0    0    0
[2,]    0    1    0    0    0    0    0
[3,]    0    0    1    0    0    0    0
[4,]    0    0    0    1    0    0    0
[5,]    0    0    0    0    1    0    0
[6,]    0    0    0    0    0    1    0
[7,]    0    0    0    0    0    0    1

Solutions: To avoid this, use “diag(x, nrow = length(x))” for consistent behavior when “x” is a vector

> x = c(1,2,3)
> diag(x,length(x))
     [,1] [,2] [,3]
[1,]    1    0    0
[2,]    0    2    0
[3,]    0    0    3
> x = 2.4
> diag(x,length(x))
     [,1]
[1,]  2.4
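
If you build diagonal matrices in many places, one option is to wrap the fix in a small function. The name safe_diag() is hypothetical (it is not part of R); it only applies the nrow = length(x) advice from above.

```r
# Hypothetical wrapper: always put the values of x on the diagonal
safe_diag <- function(x) diag(x, nrow = length(x))

id7   <- diag(7.4)               # trap: a 7 x 7 identity matrix
one   <- safe_diag(7.4)          # 1 x 1 matrix holding 7.4
three <- safe_diag(c(1, 2, 3))   # 3 x 3 matrix with 1, 2, 3 on the diagonal

dim(id7)   # 7 7
dim(one)   # 1 1
```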

sample(x) when length of x is 1 and x is an integer

The first argument of the sample function behaves inconsistently when x has length 1 and is a number greater than or equal to 1: R then samples from 1:x instead of from x itself, see this example

sample(x=3, size = 10, replace=TRUE) # same as sample(x=1:3, size = 10, replace=TRUE)

If you want to sample “3” ten times with replacement, i.e. obtain a vector of ten 3s, you have to handle the length-one case explicitly.
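
One way to handle that case is to sample the positions of x rather than x itself. The sketch below follows the resample() idea shown in the examples of the R help page for sample; resample() is not a built-in function, so you have to define it yourself.

```r
# Sample indices 1..length(x), then look the values up in x
resample <- function(x, ...) x[sample.int(length(x), ...)]

res <- resample(3, size = 10, replace = TRUE)
res   # ten 3s, whatever the random seed

# For comparison, plain sample() draws from 1:3 here
trap <- sample(3, size = 10, replace = TRUE)
```

For vectors longer than one element, resample() behaves like sample(), so it can be used as a drop-in replacement when the length of x is not known in advance.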