Categories
Default R

Using Intel compiler and Intel MKL in R

Linking R with an external BLAS library can speed up matrix calculations. This topic has been discussed many times in the R Installation and Administration manual, on the R mailing lists, and elsewhere.

Here are two more official documents from Intel that I thought might be useful.

The first document gives examples of how to link MKL with R in different situations. The second offers a very convenient way to work out the correct linking parameters under various conditions, which I found very useful.

For how to compile R with the Intel compiler, please refer to the R Installation and Administration manual.

Below is a simple benchmark on my Linux system showing how much one can gain by linking MKL and/or compiling R with the Intel compiler.

Configure flags

Add the following lines to “config.site”. You may change the configure parameters depending on your own situation.

## Make sure the Intel compiler is installed and loaded; this can be set in .bashrc,
## e.g.
## . /opt/intel/bin/compilervars.sh intel64

MKL_LIB_PATH=/opt/intel/mkl/lib/intel64

## Use intel compiler
CC='icc -std=c99'
CFLAGS='-g -O3 -wd188 -ip '

F77='ifort'
FFLAGS='-g -O3 '

CXX='icpc'
CXXFLAGS='-g -O3 '

FC='ifort'
FCFLAGS='-g -O3 '

## MKL with GNU OpenMP threading, GCC
## (the GNU threading layer, mkl_gnu_thread, is the one that pairs with -lgomp)
# MKL=" -L${MKL_LIB_PATH}                         \
#       -Wl,--start-group                         \
#           -lmkl_gf_lp64                         \
#           -lmkl_gnu_thread                      \
#           -lmkl_core                            \
#       -Wl,--end-group                           \
#       -lgomp -lpthread"

## MKL with Intel OpenMP threading, ICC
# MKL=" -L${MKL_LIB_PATH}                         \
#       -Wl,--start-group                         \
#           -lmkl_intel_lp64                      \
#           -lmkl_intel_thread                    \
#           -lmkl_core                            \
#       -Wl,--end-group                           \
#       -liomp5 -lpthread"

## MKL sequential, ICC
MKL=" -L${MKL_LIB_PATH}                         \
      -Wl,--start-group                         \
          -lmkl_intel_lp64                      \
          -lmkl_sequential                      \
          -lmkl_core                            \
      -Wl,--end-group"

BLAS_LIBS="$MKL"
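
The three MKL variants above differ only in the interface library, the threading layer, and the OpenMP runtime. As a minimal sketch, the choice can be wrapped in a small shell function. The function name mkl_link_line and the mode names are made up for illustration, and the GNU variant assumes the mkl_gnu_thread layer is the right partner for libgomp.

```shell
# mkl_link_line MODE [LIB_PATH]
# Hypothetical helper: prints an MKL link line for one of three modes.
#   sequential  Intel interface, no threading
#   intel       Intel interface, Intel OpenMP runtime (iomp5)
#   gnu         gfortran interface, GNU OpenMP runtime (gomp)
# The body runs in a subshell, so its variables do not leak into config.site.
mkl_link_line() (
    mode="$1"
    lib_path="${2:-/opt/intel/mkl/lib/intel64}"   # same default path as above
    case "$mode" in
        sequential) interface=mkl_intel_lp64; threading=mkl_sequential;   runtime="" ;;
        intel)      interface=mkl_intel_lp64; threading=mkl_intel_thread; runtime=" -liomp5 -lpthread" ;;
        gnu)        interface=mkl_gf_lp64;    threading=mkl_gnu_thread;   runtime=" -lgomp -lpthread" ;;
        *)          echo "unknown mode: $mode" >&2; exit 1 ;;
    esac
    printf '%s\n' "-L${lib_path} -Wl,--start-group -l${interface} -l${threading} -lmkl_core -Wl,--end-group${runtime}"
)
```

With this in place, the assignment could be written as BLAS_LIBS="$(mkl_link_line sequential "$MKL_LIB_PATH")", which makes switching between variants a one-word change.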

Then compile and install R as follows:

./configure --with-blas --with-lapack
make
make install
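
After installation it is worth checking that the build really picked up MKL. One way, sketched below, is to look at the BLAS link flags that R recorded at configure time, which R CMD config BLAS_LIBS prints. The helper name links_mkl is made up for illustration, and the check only inspects the recorded flags; it does not prove which library is loaded at run time.

```shell
# Hypothetical helper: succeed if a link line mentions any MKL library.
links_mkl() {
    case "$1" in
        *-lmkl_*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Only query R when it is actually on the PATH.
if command -v R >/dev/null 2>&1; then
    blas_libs=$(R CMD config BLAS_LIBS)
    if links_mkl "$blas_libs"; then
        echo "R was configured with MKL: $blas_libs"
    else
        echo "R was not configured with MKL: $blas_libs"
    fi
fi
```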

System information

  • Debian Wheezy AMD64
  • Intel(R) Core(TM) i7-2700K CPU @ 3.50GHz
  • 16G RAM

 Matrix calculation benchmark without MKL (gcc 4.7.2)

   R Benchmark 2.5
   ===============
Number of times each test is run__________________________:  3

   I. Matrix calculation
   ---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec):  0.455000000000001 
2400x2400 normal distributed random matrix ^1000____ (sec):  0.383000000000002 
Sorting of 7,000,000 random values__________________ (sec):  0.647666666666667 
2800x2800 cross-product matrix (b = a' * a)_________ (sec):  10.75 
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec):  5.02266666666667 
                      --------------------------------------------
                 Trimmed geom. mean (2 extremes eliminated):  1.13963496799737 

   II. Matrix functions
   --------------------
FFT over 2,400,000 random values____________________ (sec):  0.392666666666666 
Eigenvalues of a 640x640 random matrix______________ (sec):  0.73766666666666 
Determinant of a 2500x2500 random matrix____________ (sec):  3.30266666666667 
Cholesky decomposition of a 3000x3000 matrix________ (sec):  3.872 
Inverse of a 1600x1600 random matrix________________ (sec):  3.04166666666667 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  1.94959995852139 

   III. Programmation
   ------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec):  0.663333333333346 
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec):  0.315333333333323 
Grand common divisors of 400,000 pairs (recursion)__ (sec):  1.74266666666667 
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec):  0.471666666666674 
Escoufier's method on a 45x45 matrix (mixed)________ (sec):  0.381 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  0.492149816487262 

Total time for all 15 tests_________________________ (sec):  32.179 
Overall mean (sum of I, II and III trimmed means/3)_ (sec):  1.03023476346231

 Matrix calculation benchmark with Intel compiler (without MKL)

   R Benchmark 2.5
   ===============
Number of times each test is run__________________________:  3

   I. Matrix calculation
   ---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec):  0.438333333333333 
2400x2400 normal distributed random matrix ^1000____ (sec):  0.362666666666666 
Sorting of 7,000,000 random values__________________ (sec):  0.625666666666666 
2800x2800 cross-product matrix (b = a' * a)_________ (sec):  6.06 
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec):  2.66333333333333 
                      --------------------------------------------
                 Trimmed geom. mean (2 extremes eliminated):  0.900584248749399 

   II. Matrix functions
   --------------------
FFT over 2,400,000 random values____________________ (sec):  0.372 
Eigenvalues of a 640x640 random matrix______________ (sec):  0.456999999999996 
Determinant of a 2500x2500 random matrix____________ (sec):  1.85666666666667 
Cholesky decomposition of a 3000x3000 matrix________ (sec):  1.44933333333334 
Inverse of a 1600x1600 random matrix________________ (sec):  1.85266666666667 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  1.07060004219009 

   III. Programmation
   ------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec):  0.510333333333335 
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec):  0.308666666666667 
Grand common divisors of 400,000 pairs (recursion)__ (sec):  1.581 
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec):  0.408000000000001 
Escoufier's method on a 45x45 matrix (mixed)________ (sec):  0.285000000000011 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  0.400560336912059 

Total time for all 15 tests_________________________ (sec):  19.2306666666667 
Overall mean (sum of I, II and III trimmed means/3)_ (sec):  0.728237740489568

 Matrix calculation benchmark with sequential MKL (gcc 4.7.2)

   R Benchmark 2.5
   ===============
Number of times each test is run__________________________:  3

   I. Matrix calculation
   ---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec):  0.458333333333333 
2400x2400 normal distributed random matrix ^1000____ (sec):  0.378 
Sorting of 7,000,000 random values__________________ (sec):  0.643666666666666 
2800x2800 cross-product matrix (b = a' * a)_________ (sec):  0.922666666666667 
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec):  0.482999999999999 
                      --------------------------------------------
                 Trimmed geom. mean (2 extremes eliminated):  0.522311832408545 

   II. Matrix functions
   --------------------
FFT over 2,400,000 random values____________________ (sec):  0.406666666666666 
Eigenvalues of a 640x640 random matrix______________ (sec):  0.288999999999997 
Determinant of a 2500x2500 random matrix____________ (sec):  0.497 
Cholesky decomposition of a 3000x3000 matrix________ (sec):  0.438000000000002 
Inverse of a 1600x1600 random matrix________________ (sec):  0.37866666666667 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  0.407058274866339 

   III. Programmation
   ------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec):  0.648999999999996 
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec):  0.306000000000002 
Grand common divisors of 400,000 pairs (recursion)__ (sec):  1.785 
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec):  0.455333333333328 
Escoufier's method on a 45x45 matrix (mixed)________ (sec):  0.375 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  0.480324939124224 

Total time for all 15 tests_________________________ (sec):  8.46533333333333 
Overall mean (sum of I, II and III trimmed means/3)_ (sec):  0.467419897853855

 Matrix calculation benchmark with Intel compiler and sequential MKL

   R Benchmark 2.5
   ===============
Number of times each test is run__________________________:  3

   I. Matrix calculation
   ---------------------
Creation, transp., deformation of a 2500x2500 matrix (sec):  0.475333333333333 
2400x2400 normal distributed random matrix ^1000____ (sec):  0.369 
Sorting of 7,000,000 random values__________________ (sec):  0.637000000000002 
2800x2800 cross-product matrix (b = a' * a)_________ (sec):  0.884666666666665 
Linear regr. over a 3000x3000 matrix (c = a \ b')___ (sec):  0.451333333333332 
                      --------------------------------------------
                 Trimmed geom. mean (2 extremes eliminated):  0.515084369178734 

   II. Matrix functions
   --------------------
FFT over 2,400,000 random values____________________ (sec):  0.372666666666667 
Eigenvalues of a 640x640 random matrix______________ (sec):  0.285999999999999 
Determinant of a 2500x2500 random matrix____________ (sec):  0.504 
Cholesky decomposition of a 3000x3000 matrix________ (sec):  0.429 
Inverse of a 1600x1600 random matrix________________ (sec):  0.370333333333332 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  0.389753671609465 

   III. Programmation
   ------------------
3,500,000 Fibonacci numbers calculation (vector calc)(sec):  0.474000000000001 
Creation of a 3000x3000 Hilbert matrix (matrix calc) (sec):  0.309333333333332 
Grand common divisors of 400,000 pairs (recursion)__ (sec):  1.522 
Creation of a 500x500 Toeplitz matrix (loops)_______ (sec):  0.431000000000002 
Escoufier's method on a 45x45 matrix (mixed)________ (sec):  0.267999999999994 
                      --------------------------------------------
                Trimmed geom. mean (2 extremes eliminated):  0.398315717938976 

Total time for all 15 tests_________________________ (sec):  7.78366666666666 
Overall mean (sum of I, II and III trimmed means/3)_ (sec):  0.430822797869348
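
To put the four runs side by side, the short sketch below takes the "Total time for all 15 tests" figures reported above and computes the speedup of each build relative to the first one (gcc without MKL). The function name speedups and the row labels are made up for illustration; the timings are copied from the benchmark output. On these numbers the speedups come out at roughly 1.7x, 3.8x and 4.1x.

```shell
# Hypothetical helper: reads "label<TAB>seconds" rows on stdin and prints
# each row with its speedup relative to the first row.
speedups() {
    awk -F'\t' '
        NR == 1 { base = $2 }
        { printf "%-22s %7.3f s  %5.2fx\n", $1, $2, base / $2 }
    '
}

# Total times (sec) from the four benchmark runs above.
printf '%s\t%s\n' \
    "gcc, no MKL"          32.179 \
    "icc, no MKL"          19.2306666666667 \
    "gcc + sequential MKL" 8.46533333333333 \
    "icc + sequential MKL" 7.78366666666666 | speedups
```

Most of the gain comes from MKL itself: once MKL is linked, switching from gcc to the Intel compiler only shaves a further 0.7 seconds or so off the total.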
Categories
LaTeX

latexdiff with git

latexdiff is a simple program that compares two versions of a TeX file and generates a new TeX file that highlights the changes. It is shipped with most TeX distributions.

There are many discussions on how to integrate latexdiff with version control systems like git. Suppose your TeX documents are under git control and you want to check the changes between two revisions visually: not the standard git diff for text files, but the differences between two TeX files rendered in a PDF. For a more sophisticated solution that integrates with git, you may consider using git-latexdiff.

A simple solution is to run the command

latexdiff <(git show oldcommit:file.tex) file.tex > diff.tex

and then simply run e.g.

pdflatex -interaction=nonstopmode diff.tex

to see the changes in diff.pdf.
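
The two steps above can be wrapped in one small function. This is only a sketch: the name texdiff is made up, and it writes the old revision to a temporary file instead of using process substitution, so it also works in shells that lack the <( ) syntax. Like the command above, it passes the file name to git show as given, so the path is interpreted relative to the top of the repository.

```shell
# texdiff OLD_COMMIT FILE.tex [DIFF.tex]
# Hypothetical helper: diff FILE.tex in the working tree against OLD_COMMIT
# and compile the marked-up result. Runs in a subshell so the temporary
# file is cleaned up and no variables leak into the calling shell.
texdiff() (
    old_commit="$1"
    tex_file="$2"
    diff_tex="${3:-diff.tex}"
    if [ -z "$old_commit" ] || [ -z "$tex_file" ]; then
        echo "usage: texdiff <old-commit> <file.tex> [diff.tex]" >&2
        exit 2
    fi
    old_tex=$(mktemp) || exit 1
    trap 'rm -f "$old_tex"' EXIT
    # Extract the old revision, mark up the differences, then build the PDF.
    git show "${old_commit}:${tex_file}" > "$old_tex" || exit 1
    latexdiff "$old_tex" "$tex_file" > "$diff_tex" || exit 1
    pdflatex -interaction=nonstopmode "$diff_tex"
)
```

For example, texdiff HEAD~1 file.tex compares the working copy with the previous commit and leaves the result in diff.pdf.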

UPDATE: If some TeX environments (e.g. tables) do not diff properly with latexdiff, you can provide a customized config file, e.g. ld.cfg, containing the following line

PICTUREENV=(?:picture|DIFnomarkup|table)[\w\d*@]*

and run latexdiff as

latexdiff -c ld.cfg <(git show oldcommit:file.tex) file.tex > diff.tex
Categories
LaTeX

BibTeX with abbreviations in journal names

Once upon a time, a journal asked me to provide references with abbreviated journal names; e.g., Journal of the American Statistical Association should be J. Amer. Statist. Assoc. This is annoying because the journal entries in my BibTeX database use full names. If I did a simple search and replace, it would become a problem again when another journal asked for full names.

I found a solution (or a workaround, to be precise) with BibTeX. See my BibTeX database page for a detailed explanation.

Others suggest using Biber (a BibTeX replacement for users of BibLaTeX), but I have not spent time on that yet.