{ "cells": [ { "cell_type": "markdown", "id": "c57b2ce5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# R Functions and Packages\n", "\n", "\n", "Feng Li\n", "\n", "School of Statistics and Mathematics\n", "\n", "Central University of Finance and Economics\n", "\n", "[feng.li@cufe.edu.cn](mailto:feng.li@cufe.edu.cn)\n", "\n", "[https://feng.li/statcomp](https://feng.li/statcomp)\n", "\n", "_>>> Link to Python version_ [1](https://feng.li/files/python/P01-Python-from-Scratch/L01.3-Python-Functions-and-Modules.slides.html)" ] }, { "cell_type": "markdown", "id": "a533c945", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Control-flow constructs\n", "\n", "### The `if` condition\n", "\n", "- Binary comparisons\n", "\n", "    x == y    all.equal()    identical()\n", "    x != y\n", "    x > y\n", "    x < y\n", "    x >= y\n", "    x <= y\n", "    x %in% y\n", " \n", "- What would you expect when $x$ and $y$ are vectors or matrices?\n" ] }, { "cell_type": "markdown", "id": "9ca1e3c6", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- The `if` statement in R (at the top level, `else` must be on the same line as the closing `}`)\n", "\n", "    if (condition) {\n", "        do something\n", "    } else {\n", "        do something else\n", "    }" ] }, { "cell_type": "markdown", "id": "57f0dd29", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Lab: Leap year check\n", "\n", "- February 29, known as a leap day in the calendar, is a date that occurs in most years that are evenly divisible by $4$, such as $2004$, $2008$, $2012$ and $2016$. 
Years that are evenly divisible by $100$ do not contain a leap day, with the exception of years that are evenly divisible by $400$, which do contain a leap day; thus $1900$ did not contain a leap day while $2000$ did.\n", "\n", "- Write a function called `is.leapday` to check if a given year has February 29 \\[Hint: you may need the modulo operator, see `?\"%%\"`\\].\n", "\n", "- Test your function for some years.\n", "\n", "- What can you do to improve the error tolerance of the function?\n", "\n", "- Modify your function so that, given a sequence of years, it reports which of them have a leap day." ] }, { "cell_type": "markdown", "id": "a8e135d8", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Loops\n", "\n", "- The `for` loop" ] }, { "cell_type": "code", "execution_count": 2, "id": "4c1a5c36", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "
A matrix: 2 × 5 of type int
101 105 109 113 117
103 107 111 115 119
\n" ], "text/latex": [ "A matrix: 2 × 5 of type int\n", "\\begin{tabular}{lllll}\n", "\t 101 & 105 & 109 & 113 & 117\\\\\n", "\t 103 & 107 & 111 & 115 & 119\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 2 × 5 of type int\n", "\n", "| 101 | 105 | 109 | 113 | 117 |\n", "| 103 | 107 | 111 | 115 | 119 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4] [,5]\n", "[1,] 101 105 109 113 117 \n", "[2,] 103 107 111 115 119 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "B = matrix(1:10,2,5)\n", "C = matrix(100:109,2,5)\n", "A = matrix(NA,2,5)\n", "for(i in 1:length(A))\n", "{\n", " A[i] = B[i] + C[i]\n", "}\n", "A" ] }, { "cell_type": "markdown", "id": "41bcb10d", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- The `while` loop" ] }, { "cell_type": "code", "execution_count": 4, "id": "42c57e02", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "
A matrix: 2 × 5 of type int
101 105 109 113 117
103 107 111 115 119
\n" ], "text/latex": [ "A matrix: 2 × 5 of type int\n", "\\begin{tabular}{lllll}\n", "\t 101 & 105 & 109 & 113 & 117\\\\\n", "\t 103 & 107 & 111 & 115 & 119\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 2 × 5 of type int\n", "\n", "| 101 | 105 | 109 | 113 | 117 |\n", "| 103 | 107 | 111 | 115 | 119 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4] [,5]\n", "[1,] 101 105 109 113 117 \n", "[2,] 103 107 111 115 119 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "i = 0\n", "while(i < length(A)){ \n", " i = i + 1\n", " A[i] = B[i] + C[i]\n", "}\n", "A" ] }, { "cell_type": "markdown", "id": "b4d61a77", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### `apply()` type loops\n", "\n", "- Calculate row sums for a matrix with a loop.\n", "\n", "- Apply `sum()` function to each row of the matrix.\n", "\n", "- `apply()` to an array with higher dimension.\n", "\n", "- Apply your own function to each row of the matrix.\n", "\n", "- `lapply()` Apply a function to a list\n", "\n", "- `mapply()` Apply a function to multiple list or vector arguments.\n", "\n", "- The \\... arguments in a function.\n", "\n", "- Supply more arguments to `apply()` type functions." ] }, { "cell_type": "code", "execution_count": 2, "id": "8932ed0b", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A matrix: 20 × 5 of type int
1 21 41 61 81
2 22 42 62 82
3 23 43 63 83
4 24 44 64 84
5 25 45 65 85
6 26 46 66 86
7 27 47 67 87
8 28 48 68 88
9 29 49 69 89
10 30 50 70 90
11 31 51 71 91
12 32 52 72 92
13 33 53 73 93
14 34 54 74 94
15 35 55 75 95
16 36 56 76 96
17 37 57 77 97
18 38 58 78 98
19 39 59 79 99
20 40 60 80 100
\n" ], "text/latex": [ "A matrix: 20 × 5 of type int\n", "\\begin{tabular}{lllll}\n", "\t 1 & 21 & 41 & 61 & 81\\\\\n", "\t 2 & 22 & 42 & 62 & 82\\\\\n", "\t 3 & 23 & 43 & 63 & 83\\\\\n", "\t 4 & 24 & 44 & 64 & 84\\\\\n", "\t 5 & 25 & 45 & 65 & 85\\\\\n", "\t 6 & 26 & 46 & 66 & 86\\\\\n", "\t 7 & 27 & 47 & 67 & 87\\\\\n", "\t 8 & 28 & 48 & 68 & 88\\\\\n", "\t 9 & 29 & 49 & 69 & 89\\\\\n", "\t 10 & 30 & 50 & 70 & 90\\\\\n", "\t 11 & 31 & 51 & 71 & 91\\\\\n", "\t 12 & 32 & 52 & 72 & 92\\\\\n", "\t 13 & 33 & 53 & 73 & 93\\\\\n", "\t 14 & 34 & 54 & 74 & 94\\\\\n", "\t 15 & 35 & 55 & 75 & 95\\\\\n", "\t 16 & 36 & 56 & 76 & 96\\\\\n", "\t 17 & 37 & 57 & 77 & 97\\\\\n", "\t 18 & 38 & 58 & 78 & 98\\\\\n", "\t 19 & 39 & 59 & 79 & 99\\\\\n", "\t 20 & 40 & 60 & 80 & 100\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 20 × 5 of type int\n", "\n", "| 1 | 21 | 41 | 61 | 81 |\n", "| 2 | 22 | 42 | 62 | 82 |\n", "| 3 | 23 | 43 | 63 | 83 |\n", "| 4 | 24 | 44 | 64 | 84 |\n", "| 5 | 25 | 45 | 65 | 85 |\n", "| 6 | 26 | 46 | 66 | 86 |\n", "| 7 | 27 | 47 | 67 | 87 |\n", "| 8 | 28 | 48 | 68 | 88 |\n", "| 9 | 29 | 49 | 69 | 89 |\n", "| 10 | 30 | 50 | 70 | 90 |\n", "| 11 | 31 | 51 | 71 | 91 |\n", "| 12 | 32 | 52 | 72 | 92 |\n", "| 13 | 33 | 53 | 73 | 93 |\n", "| 14 | 34 | 54 | 74 | 94 |\n", "| 15 | 35 | 55 | 75 | 95 |\n", "| 16 | 36 | 56 | 76 | 96 |\n", "| 17 | 37 | 57 | 77 | 97 |\n", "| 18 | 38 | 58 | 78 | 98 |\n", "| 19 | 39 | 59 | 79 | 99 |\n", "| 20 | 40 | 60 | 80 | 100 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4] [,5]\n", " [1,] 1 21 41 61 81 \n", " [2,] 2 22 42 62 82 \n", " [3,] 3 23 43 63 83 \n", " [4,] 4 24 44 64 84 \n", " [5,] 5 25 45 65 85 \n", " [6,] 6 26 46 66 86 \n", " [7,] 7 27 47 67 87 \n", " [8,] 8 28 48 68 88 \n", " [9,] 9 29 49 69 89 \n", "[10,] 10 30 50 70 90 \n", "[11,] 11 31 51 71 91 \n", "[12,] 12 32 52 72 92 \n", "[13,] 13 33 53 73 93 \n", "[14,] 14 34 54 74 94 \n", "[15,] 15 35 55 75 95 \n", "[16,] 16 36 56 76 96 \n", "[17,] 17 37 57 77 97 
\n", "[18,] 18 38 58 78 98 \n", "[19,] 19 39 59 79 99 \n", "[20,] 20 40 60 80 100 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mat = matrix(1:100,20,5)\n", "mat" ] }, { "cell_type": "code", "execution_count": 5, "id": "b55c3787", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "
  1. 10.5
  2. 30.5
  3. 50.5
  4. 70.5
  5. 90.5
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 10.5\n", "\\item 30.5\n", "\\item 50.5\n", "\\item 70.5\n", "\\item 90.5\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 10.5\n", "2. 30.5\n", "3. 50.5\n", "4. 70.5\n", "5. 90.5\n", "\n", "\n" ], "text/plain": [ "[1] 10.5 30.5 50.5 70.5 90.5" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
  1. 41
  2. 42
  3. 43
  4. 44
  5. 45
  6. 46
  7. 47
  8. 48
  9. 49
  10. 50
  11. 51
  12. 52
  13. 53
  14. 54
  15. 55
  16. 56
  17. 57
  18. 58
  19. 59
  20. 60
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 41\n", "\\item 42\n", "\\item 43\n", "\\item 44\n", "\\item 45\n", "\\item 46\n", "\\item 47\n", "\\item 48\n", "\\item 49\n", "\\item 50\n", "\\item 51\n", "\\item 52\n", "\\item 53\n", "\\item 54\n", "\\item 55\n", "\\item 56\n", "\\item 57\n", "\\item 58\n", "\\item 59\n", "\\item 60\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 41\n", "2. 42\n", "3. 43\n", "4. 44\n", "5. 45\n", "6. 46\n", "7. 47\n", "8. 48\n", "9. 49\n", "10. 50\n", "11. 51\n", "12. 52\n", "13. 53\n", "14. 54\n", "15. 55\n", "16. 56\n", "17. 57\n", "18. 58\n", "19. 59\n", "20. 60\n", "\n", "\n" ], "text/plain": [ " [1] 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
  1. 5.91607978309962
  2. 5.91607978309962
  3. 5.91607978309962
  4. 5.91607978309962
  5. 5.91607978309962
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 5.91607978309962\n", "2. 5.91607978309962\n", "3. 5.91607978309962\n", "4. 5.91607978309962\n", "5. 5.91607978309962\n", "\n", "\n" ], "text/plain": [ "[1] 5.91608 5.91608 5.91608 5.91608 5.91608" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "apply(mat, 2, mean)\n", "apply(mat, 1, mean)\n", "apply(mat, 2, sd)" ] }, { "cell_type": "code", "execution_count": 6, "id": "e821abba", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "arr = array(1:240, c(20,3,4))" ] }, { "cell_type": "code", "execution_count": 9, "id": "b22453d7", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "
  1. 30.5
  2. 90.5
  3. 150.5
  4. 210.5
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 30.5\n", "\\item 90.5\n", "\\item 150.5\n", "\\item 210.5\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 30.5\n", "2. 90.5\n", "3. 150.5\n", "4. 210.5\n", "\n", "\n" ], "text/plain": [ "[1] 30.5 90.5 150.5 210.5" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
A matrix: 20 × 3 of type dbl
91 111 131
92 112 132
93 113 133
94 114 134
95 115 135
96 116 136
97 117 137
98 118 138
99 119 139
100 120 140
101 121 141
102 122 142
103 123 143
104 124 144
105 125 145
106 126 146
107 127 147
108 128 148
109 129 149
110 130 150
\n" ], "text/latex": [ "A matrix: 20 × 3 of type dbl\n", "\\begin{tabular}{lll}\n", "\t 91 & 111 & 131\\\\\n", "\t 92 & 112 & 132\\\\\n", "\t 93 & 113 & 133\\\\\n", "\t 94 & 114 & 134\\\\\n", "\t 95 & 115 & 135\\\\\n", "\t 96 & 116 & 136\\\\\n", "\t 97 & 117 & 137\\\\\n", "\t 98 & 118 & 138\\\\\n", "\t 99 & 119 & 139\\\\\n", "\t 100 & 120 & 140\\\\\n", "\t 101 & 121 & 141\\\\\n", "\t 102 & 122 & 142\\\\\n", "\t 103 & 123 & 143\\\\\n", "\t 104 & 124 & 144\\\\\n", "\t 105 & 125 & 145\\\\\n", "\t 106 & 126 & 146\\\\\n", "\t 107 & 127 & 147\\\\\n", "\t 108 & 128 & 148\\\\\n", "\t 109 & 129 & 149\\\\\n", "\t 110 & 130 & 150\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 20 × 3 of type dbl\n", "\n", "| 91 | 111 | 131 |\n", "| 92 | 112 | 132 |\n", "| 93 | 113 | 133 |\n", "| 94 | 114 | 134 |\n", "| 95 | 115 | 135 |\n", "| 96 | 116 | 136 |\n", "| 97 | 117 | 137 |\n", "| 98 | 118 | 138 |\n", "| 99 | 119 | 139 |\n", "| 100 | 120 | 140 |\n", "| 101 | 121 | 141 |\n", "| 102 | 122 | 142 |\n", "| 103 | 123 | 143 |\n", "| 104 | 124 | 144 |\n", "| 105 | 125 | 145 |\n", "| 106 | 126 | 146 |\n", "| 107 | 127 | 147 |\n", "| 108 | 128 | 148 |\n", "| 109 | 129 | 149 |\n", "| 110 | 130 | 150 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3]\n", " [1,] 91 111 131 \n", " [2,] 92 112 132 \n", " [3,] 93 113 133 \n", " [4,] 94 114 134 \n", " [5,] 95 115 135 \n", " [6,] 96 116 136 \n", " [7,] 97 117 137 \n", " [8,] 98 118 138 \n", " [9,] 99 119 139 \n", "[10,] 100 120 140 \n", "[11,] 101 121 141 \n", "[12,] 102 122 142 \n", "[13,] 103 123 143 \n", "[14,] 104 124 144 \n", "[15,] 105 125 145 \n", "[16,] 106 126 146 \n", "[17,] 107 127 147 \n", "[18,] 108 128 148 \n", "[19,] 109 129 149 \n", "[20,] 110 130 150 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "apply(arr, 3, mean)\n", "apply(arr, c(1, 2), mean)" ] }, { "cell_type": "code", "execution_count": 10, "id": "fe676a69", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { 
"text/html": [ "\n", "
  1. 19
  2. 19
  3. 19
  4. 19
  5. 19
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 19\n", "2. 19\n", "3. 19\n", "4. 19\n", "5. 19\n", "\n", "\n" ], "text/plain": [ "[1] 19 19 19 19 19" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "apply(mat, 2, function (x) max(x)-min(x))" ] }, { "cell_type": "code", "execution_count": 11, "id": "e29dd4be", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "
  1. 19
  2. 19
  3. 19
  4. 19
  5. 19
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 19\n", "2. 19\n", "3. 19\n", "4. 19\n", "5. 19\n", "\n", "\n" ], "text/plain": [ "[1] 19 19 19 19 19" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "maxmin = function (x) {\n", "    max(x)-min(x)\n", "}\n", "\n", "apply(mat, 2, maxmin)" ] }, { "cell_type": "markdown", "id": "5768ca55", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- The advantages of `apply()`-type functions.\n", "\n", "    - Easy to construct\n", "\n", "    - Less code\n", "\n", "    - An `apply()`-type call is essentially a more compact, and often more efficient, way to write a loop in R." ] }, { "cell_type": "markdown", "id": "468f45d0", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Write efficient loops in R\n", "\n", "- Avoid explicit loops where possible; always try to vectorize the calculations first.\n", "\n", "- Use `apply()`-type loops if possible.\n", "\n", "- Watch out for numerical underflow and overflow.\n", "\n", "- Allocate the memory space before looping. 
The loop below is much slower\n", "  because `A` starts as `NULL` and has to grow at every iteration; pre-allocating it with `A = matrix(NA,2,5)`, as in the `for` loop example, avoids this.\n", "\n", "    B = matrix(1:10,2,5)\n", "    C = matrix(100:109,2,5)\n", "    A = NULL\n", "    for(i in 1:length(B))\n", "    {\n", "        A[i] = B[i] + C[i]\n", "    }" ] }, { "cell_type": "markdown", "id": "d1a1d292", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### List Arithmetic\n", "\n", "- Apply a function to the elements of a list\n", "\n", "    lapply(X, FUN, ...)\n", "    rapply(object, f, how = c(\"unlist\",\"replace\", \"list\"), ...)" ] }, { "cell_type": "markdown", "id": "0b58c9f5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- Operations on several lists\n", "\n", "    mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)\n", " \n", "    mapply(\"+\", list1, list2, SIMPLIFY = FALSE)\n", "    mapply(function(x, y) abs(x)*log(abs(y)), list1, list2, SIMPLIFY = FALSE)" ] }, { "cell_type": "code", "execution_count": 13, "id": "0f52f710", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\t
$a
\n", "\t\t
\n", "
  1. -1.26589644825265
  2. -0.068573308896311
  3. 0.658409475454285
  4. -0.221881865339709
  5. 0.58070039713217
  6. 0.630319389017115
  7. -0.242213569358942
  8. 0.97023453146676
  9. 0.351854369172614
  10. -1.33262148487691
\n", "
\n", "\t
$b
\n", "\t\t
\n", "
  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
\n", "
\n", "\t
$c
\n", "\t\t
\n", "
  1. 0.181591403437778
  2. 0.164184056222439
  3. 0.390080413082615
  4. 0.657432082109153
  5. 0.256644821958616
  6. 0.564072602661327
  7. 0.525960478000343
  8. 0.604819754138589
  9. 0.951666688779369
  10. 0.97638466511853
  11. 0.508973858784884
  12. 0.284509904216975
  13. 0.563456729985774
  14. 0.968111847294495
  15. 0.453518448630348
  16. 0.633869176264852
  17. 0.606082778424025
  18. 0.908481978345662
  19. 0.129783247830346
  20. 0.170731674181297
\n", "
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] \\begin{enumerate*}\n", "\\item -1.26589644825265\n", "\\item -0.068573308896311\n", "\\item 0.658409475454285\n", "\\item -0.221881865339709\n", "\\item 0.58070039713217\n", "\\item 0.630319389017115\n", "\\item -0.242213569358942\n", "\\item 0.97023453146676\n", "\\item 0.351854369172614\n", "\\item -1.33262148487691\n", "\\end{enumerate*}\n", "\n", "\\item[\\$b] \\begin{enumerate*}\n", "\\item 1\n", "\\item 2\n", "\\item 3\n", "\\item 4\n", "\\item 5\n", "\\item 6\n", "\\item 7\n", "\\item 8\n", "\\item 9\n", "\\end{enumerate*}\n", "\n", "\\item[\\$c] \\begin{enumerate*}\n", "\\item 0.181591403437778\n", "\\item 0.164184056222439\n", "\\item 0.390080413082615\n", "\\item 0.657432082109153\n", "\\item 0.256644821958616\n", "\\item 0.564072602661327\n", "\\item 0.525960478000343\n", "\\item 0.604819754138589\n", "\\item 0.951666688779369\n", "\\item 0.97638466511853\n", "\\item 0.508973858784884\n", "\\item 0.284509904216975\n", "\\item 0.563456729985774\n", "\\item 0.968111847294495\n", "\\item 0.453518448630348\n", "\\item 0.633869176264852\n", "\\item 0.606082778424025\n", "\\item 0.908481978345662\n", "\\item 0.129783247830346\n", "\\item 0.170731674181297\n", "\\end{enumerate*}\n", "\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": 1. -1.26589644825265\n", "2. -0.068573308896311\n", "3. 0.658409475454285\n", "4. -0.221881865339709\n", "5. 0.58070039713217\n", "6. 0.630319389017115\n", "7. -0.242213569358942\n", "8. 0.97023453146676\n", "9. 0.351854369172614\n", "10. -1.33262148487691\n", "\n", "\n", "\n", "$b\n", ": 1. 1\n", "2. 2\n", "3. 3\n", "4. 4\n", "5. 5\n", "6. 6\n", "7. 7\n", "8. 8\n", "9. 9\n", "\n", "\n", "\n", "$c\n", ": 1. 0.181591403437778\n", "2. 0.164184056222439\n", "3. 0.390080413082615\n", "4. 0.657432082109153\n", "5. 0.256644821958616\n", "6. 0.564072602661327\n", "7. 0.525960478000343\n", "8. 0.604819754138589\n", "9. 0.951666688779369\n", "10. 
0.97638466511853\n", "11. 0.508973858784884\n", "12. 0.284509904216975\n", "13. 0.563456729985774\n", "14. 0.968111847294495\n", "15. 0.453518448630348\n", "16. 0.633869176264852\n", "17. 0.606082778424025\n", "18. 0.908481978345662\n", "19. 0.129783247830346\n", "20. 0.170731674181297\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "$a\n", " [1] -1.26589645 -0.06857331 0.65840948 -0.22188187 0.58070040 0.63031939\n", " [7] -0.24221357 0.97023453 0.35185437 -1.33262148\n", "\n", "$b\n", "[1] 1 2 3 4 5 6 7 8 9\n", "\n", "$c\n", " [1] 0.1815914 0.1641841 0.3900804 0.6574321 0.2566448 0.5640726 0.5259605\n", " [8] 0.6048198 0.9516667 0.9763847 0.5089739 0.2845099 0.5634567 0.9681118\n", "[15] 0.4535184 0.6338692 0.6060828 0.9084820 0.1297832 0.1707317\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "lst = list (a= rnorm(10), b= 1:9, c=runif(20))\n", "lst" ] }, { "cell_type": "code", "execution_count": 16, "id": "ce7ebd9e", "metadata": { "scrolled": true, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "3" ], "text/latex": [ "3" ], "text/markdown": [ "3" ], "text/plain": [ "[1] 3" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\t
$a
\n", "\t\t
10
\n", "\t
$b
\n", "\t\t
9
\n", "\t
$c
\n", "\t\t
20
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] 10\n", "\\item[\\$b] 9\n", "\\item[\\$c] 20\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": 10\n", "$b\n", ": 9\n", "$c\n", ": 20\n", "\n", "\n" ], "text/plain": [ "$a\n", "[1] 10\n", "\n", "$b\n", "[1] 9\n", "\n", "$c\n", "[1] 20\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\t
$a
\n", "\t\t
0.00603314855184166
\n", "\t
$b
\n", "\t\t
5
\n", "\t
$c
\n", "\t\t
0.525017830473371
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] 0.00603314855184166\n", "\\item[\\$b] 5\n", "\\item[\\$c] 0.525017830473371\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": 0.00603314855184166\n", "$b\n", ": 5\n", "$c\n", ": 0.525017830473371\n", "\n", "\n" ], "text/plain": [ "$a\n", "[1] 0.006033149\n", "\n", "$b\n", "[1] 5\n", "\n", "$c\n", "[1] 0.5250178\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "length(lst)\n", "lapply(lst, length)\n", "lapply(lst, mean)" ] }, { "cell_type": "code", "execution_count": 17, "id": "deb15dc4", "metadata": { "scrolled": false, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\t
$a
\n", "\t\t
\n", "\t
$a
\n", "\t\t
\n", "
  1. -1.26589644825265
  2. -0.068573308896311
  3. 0.658409475454285
  4. -0.221881865339709
  5. 0.58070039713217
  6. 0.630319389017115
  7. -0.242213569358942
  8. 0.97023453146676
  9. 0.351854369172614
  10. -1.33262148487691
\n", "
\n", "\t
$b
\n", "\t\t
\n", "
  1. 1
  2. 2
  3. 3
  4. 4
  5. 5
  6. 6
  7. 7
  8. 8
  9. 9
\n", "
\n", "\t
$c
\n", "\t\t
\n", "
  1. 0.181591403437778
  2. 0.164184056222439
  3. 0.390080413082615
  4. 0.657432082109153
  5. 0.256644821958616
  6. 0.564072602661327
  7. 0.525960478000343
  8. 0.604819754138589
  9. 0.951666688779369
  10. 0.97638466511853
  11. 0.508973858784884
  12. 0.284509904216975
  13. 0.563456729985774
  14. 0.968111847294495
  15. 0.453518448630348
  16. 0.633869176264852
  17. 0.606082778424025
  18. 0.908481978345662
  19. 0.129783247830346
  20. 0.170731674181297
\n", "
\n", "
\n", "
\n", "\t
$b
\n", "\t\t
\n", "\t
$a
\n", "\t\t
0.00603314855184166
\n", "\t
$b
\n", "\t\t
5
\n", "\t
$c
\n", "\t\t
0.525017830473371
\n", "
\n", "
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] \\begin{description}\n", "\\item[\\$a] \\begin{enumerate*}\n", "\\item -1.26589644825265\n", "\\item -0.068573308896311\n", "\\item 0.658409475454285\n", "\\item -0.221881865339709\n", "\\item 0.58070039713217\n", "\\item 0.630319389017115\n", "\\item -0.242213569358942\n", "\\item 0.97023453146676\n", "\\item 0.351854369172614\n", "\\item -1.33262148487691\n", "\\end{enumerate*}\n", "\n", "\\item[\\$b] \\begin{enumerate*}\n", "\\item 1\n", "\\item 2\n", "\\item 3\n", "\\item 4\n", "\\item 5\n", "\\item 6\n", "\\item 7\n", "\\item 8\n", "\\item 9\n", "\\end{enumerate*}\n", "\n", "\\item[\\$c] \\begin{enumerate*}\n", "\\item 0.181591403437778\n", "\\item 0.164184056222439\n", "\\item 0.390080413082615\n", "\\item 0.657432082109153\n", "\\item 0.256644821958616\n", "\\item 0.564072602661327\n", "\\item 0.525960478000343\n", "\\item 0.604819754138589\n", "\\item 0.951666688779369\n", "\\item 0.97638466511853\n", "\\item 0.508973858784884\n", "\\item 0.284509904216975\n", "\\item 0.563456729985774\n", "\\item 0.968111847294495\n", "\\item 0.453518448630348\n", "\\item 0.633869176264852\n", "\\item 0.606082778424025\n", "\\item 0.908481978345662\n", "\\item 0.129783247830346\n", "\\item 0.170731674181297\n", "\\end{enumerate*}\n", "\n", "\\end{description}\n", "\n", "\\item[\\$b] \\begin{description}\n", "\\item[\\$a] 0.00603314855184166\n", "\\item[\\$b] 5\n", "\\item[\\$c] 0.525017830473371\n", "\\end{description}\n", "\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": $a\n", ": 1. -1.26589644825265\n", "2. -0.068573308896311\n", "3. 0.658409475454285\n", "4. -0.221881865339709\n", "5. 0.58070039713217\n", "6. 0.630319389017115\n", "7. -0.242213569358942\n", "8. 0.97023453146676\n", "9. 0.351854369172614\n", "10. -1.33262148487691\n", "\n", "\n", "\n", "$b\n", ": 1. 1\n", "2. 2\n", "3. 3\n", "4. 4\n", "5. 5\n", "6. 6\n", "7. 7\n", "8. 8\n", "9. 9\n", "\n", "\n", "\n", "$c\n", ": 1. 
0.181591403437778\n", "2. 0.164184056222439\n", "3. 0.390080413082615\n", "4. 0.657432082109153\n", "5. 0.256644821958616\n", "6. 0.564072602661327\n", "7. 0.525960478000343\n", "8. 0.604819754138589\n", "9. 0.951666688779369\n", "10. 0.97638466511853\n", "11. 0.508973858784884\n", "12. 0.284509904216975\n", "13. 0.563456729985774\n", "14. 0.968111847294495\n", "15. 0.453518448630348\n", "16. 0.633869176264852\n", "17. 0.606082778424025\n", "18. 0.908481978345662\n", "19. 0.129783247830346\n", "20. 0.170731674181297\n", "\n", "\n", "\n", "\n", "\n", "\n", "$b\n", ": $a\n", ": 0.00603314855184166\n", "$b\n", ": 5\n", "$c\n", ": 0.525017830473371\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "$a\n", "$a$a\n", " [1] -1.26589645 -0.06857331 0.65840948 -0.22188187 0.58070040 0.63031939\n", " [7] -0.24221357 0.97023453 0.35185437 -1.33262148\n", "\n", "$a$b\n", "[1] 1 2 3 4 5 6 7 8 9\n", "\n", "$a$c\n", " [1] 0.1815914 0.1641841 0.3900804 0.6574321 0.2566448 0.5640726 0.5259605\n", " [8] 0.6048198 0.9516667 0.9763847 0.5089739 0.2845099 0.5634567 0.9681118\n", "[15] 0.4535184 0.6338692 0.6060828 0.9084820 0.1297832 0.1707317\n", "\n", "\n", "$b\n", "$b$a\n", "[1] 0.006033149\n", "\n", "$b$b\n", "[1] 5\n", "\n", "$b$c\n", "[1] 0.5250178\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "lst2 = list (a = lst, b = lapply(lst, mean))\n", "lst2" ] }, { "cell_type": "code", "execution_count": 20, "id": "a8ca29d9", "metadata": { "scrolled": true, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
a.a
10
a.b
9
a.c
20
b.a
1
b.b
1
b.c
1
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[a.a] 10\n", "\\item[a.b] 9\n", "\\item[a.c] 20\n", "\\item[b.a] 1\n", "\\item[b.b] 1\n", "\\item[b.c] 1\n", "\\end{description*}\n" ], "text/markdown": [ "a.a\n", ": 10a.b\n", ": 9a.c\n", ": 20b.a\n", ": 1b.b\n", ": 1b.c\n", ": 1\n", "\n" ], "text/plain": [ "a.a a.b a.c b.a b.b b.c \n", " 10 9 20 1 1 1 " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\t
$a
\n", "\t\t
\n", "\t
$a
\n", "\t\t
10
\n", "\t
$b
\n", "\t\t
9
\n", "\t
$c
\n", "\t\t
20
\n", "
\n", "
\n", "\t
$b
\n", "\t\t
\n", "\t
$a
\n", "\t\t
1
\n", "\t
$b
\n", "\t\t
1
\n", "\t
$c
\n", "\t\t
1
\n", "
\n", "
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] \\begin{description}\n", "\\item[\\$a] 10\n", "\\item[\\$b] 9\n", "\\item[\\$c] 20\n", "\\end{description}\n", "\n", "\\item[\\$b] \\begin{description}\n", "\\item[\\$a] 1\n", "\\item[\\$b] 1\n", "\\item[\\$c] 1\n", "\\end{description}\n", "\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": $a\n", ": 10\n", "$b\n", ": 9\n", "$c\n", ": 20\n", "\n", "\n", "\n", "$b\n", ": $a\n", ": 1\n", "$b\n", ": 1\n", "$c\n", ": 1\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "$a\n", "$a$a\n", "[1] 10\n", "\n", "$a$b\n", "[1] 9\n", "\n", "$a$c\n", "[1] 20\n", "\n", "\n", "$b\n", "$b$a\n", "[1] 1\n", "\n", "$b$b\n", "[1] 1\n", "\n", "$b$c\n", "[1] 1\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "rapply(lst2, length)\n", "rapply(lst2, length, how=\"list\")" ] }, { "cell_type": "code", "execution_count": 21, "id": "cda044fc", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "lst1 = list (a = 4:6, b = 5:7)\n", "lst2 = list (a = 3:5, b = 8:10)" ] }, { "cell_type": "code", "execution_count": 22, "id": "ef20eed7", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "ERROR", "evalue": "Error in lst1 + lst2: non-numeric argument to binary operator\n", "output_type": "error", "traceback": [ "Error in lst1 + lst2: non-numeric argument to binary operator\nTraceback:\n" ] } ], "source": [ "lst1 + lst2 " ] }, { "cell_type": "code", "execution_count": 25, "id": "d241f4c0", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
A matrix: 3 × 2 of type int
a b
7 13
9 15
11 17
\n" ], "text/latex": [ "A matrix: 3 × 2 of type int\n", "\\begin{tabular}{ll}\n", " a & b\\\\\n", "\\hline\n", "\t 7 & 13\\\\\n", "\t 9 & 15\\\\\n", "\t 11 & 17\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 3 × 2 of type int\n", "\n", "| a | b |\n", "|---|---|\n", "| 7 | 13 |\n", "| 9 | 15 |\n", "| 11 | 17 |\n", "\n" ], "text/plain": [ " a b \n", "[1,] 7 13\n", "[2,] 9 15\n", "[3,] 11 17" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mapply(\"+\", lst1, lst2)" ] }, { "cell_type": "code", "execution_count": 26, "id": "e007d2c3", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
a
27
b
45
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[a] 27\n", "\\item[b] 45\n", "\\end{description*}\n" ], "text/markdown": [ "a\n", ": 27b\n", ": 45\n", "\n" ], "text/plain": [ " a b \n", "27 45 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mapply(sum, lst1, lst2)" ] }, { "cell_type": "markdown", "id": "245b135c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Functions\n", "\n", "- Create a function object\n", "\n", "    myFun = function (par1, par2)\n", "    {\n", "        out = max(par1) - min(par2)\n", "        return(out)\n", "    }\n", "\n", "- Load the function: \n", "    - Copy and paste it into the R console.\n", "    - Save the function to a file and load it with the `source()` function.\n", "\n", "- Execute your function.\n", "\n" ] }, { "cell_type": "code", "execution_count": 28, "id": "9fb114ea", "metadata": {}, "outputs": [ { "data": { "text/html": [ "99" ], "text/latex": [ "99" ], "text/markdown": [ "99" ], "text/plain": [ "[1] 99" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "maxmin = function (x){\n", "    out = max(x)-min(x)\n", "    return(out)\n", "}\n", "\n", "maxmin(1:100)" ] }, { "cell_type": "code", "execution_count": 31, "id": "af028ea6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "99" ], "text/latex": [ "99" ], "text/markdown": [ "99" ], "text/plain": [ "[1] 99" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "rm(list=ls())\n", "\n", "source(\"code/myfun.R\")\n", "maxmin(1:100)" ] }, { "cell_type": "markdown", "id": "41599e82", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## What is a good function?\n", "\n", "- Validate the types of the input arguments\n", "- Keep the logic and implementation simple\n", "- Catch errors\n", "- Use `return()`\n", "- Pay attention to speed and performance" 
] }, { "cell_type": "markdown", "id": "a25ab3a5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Lab: a summary function\n", "\n", "- Write a function `mySummary` whose input argument `x` can be any vector and whose output contains a basic summary (mean, variance, length, maximum and minimum values, type) of that vector.\n", "\n", "- Test your function with some vectors of your own.\n", "\n", "- What happens if the input is not a vector, e.g. the data frame `weekPlanNew` from our previous example?" ] }, { "cell_type": "markdown", "id": "9bdd2d4f", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Lab: Roots function for the quadratic equation\n", "\n", "- The roots of the quadratic equation $ax^2+bx+c=0$ are of the form\n", "  $$\\label{eq:1}\n", "  x_1=\\frac{-b + \\sqrt {b^2-4ac}}{2a} \\quad \\text{and} \\quad\n", "  x_2=\\frac{-b - \\sqrt {b^2-4ac}}{2a}$$\n", "\n", "- Write a function named `quaroot` that finds the roots of a given\n", "  quadratic equation, with `a`, `b` and `c` as input arguments. \\[Hint: you\n", "  may need the `sqrt()` function\\]\n", "\n", "- Test your function on the following equations $$\\label{eq:2}\n", "  \\begin{split}\n", "  x^2+4x-1=0\\\\\n", "  -2x^2+2x=0\\\\\n", "  3x^2-9x+1=0\\\\\n", "  x^2 -4 = 0\\\\\n", "  \\end{split}$$\n", "\n", "- Test your function with the equation $5x^2+2x+1=0$. What are the\n", "  results? Why? 
\\[Hint: check $b^2-4ac$.\\]\n", "\n", "- Modify your function to return `NA` if $b^2-4ac < 0$.\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "cbbffd63", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "quaroot <- function(a, b, c)\n", "{\n", "    x1 <- (-b+sqrt(b^2-4*a*c))/(2*a)\n", "    x2 <- (-b-sqrt(b^2-4*a*c))/(2*a)\n", "\n", "    out <- c(x1, x2)\n", "    return(out)\n", "}" ] }, { "cell_type": "code", "execution_count": 6, "id": "f8d00547", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "quarootm <- function(a, b, c)\n", "{\n", "    d <- b^2-4*a*c  # discriminant\n", "\n", "    if(d<0)\n", "    {\n", "        # no real roots\n", "        x1 <- NA\n", "        x2 <- NA\n", "    }\n", "    else\n", "    {\n", "        x1 <- (-b+sqrt(d))/(2*a)\n", "        x2 <- (-b-sqrt(d))/(2*a)\n", "    }\n", "\n", "    out <- c(x1, x2)\n", "    return(out)\n", "}" ] }, { "cell_type": "markdown", "id": "ebb47936", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Using R packages\n", "\n", "- Load an R package: `library(\"PackageName\")` or\n", "  `require(\"PackageName\")`\n", "\n", "- Install an R package from the Internet (CRAN):\n", "  `install.packages(\"PackageName\")`" ] }, { "cell_type": "markdown", "id": "34f22bcd", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- Install an R package from GitHub with the `devtools` package:\n", " \n", "    install.packages(\"devtools\")\n", "    devtools::install_github(\"ykang/gratis\")\n", " \n", "- On Windows, if an R package needs to be compiled, you need Rtools, the toolchain for building R packages under Microsoft Windows, available at [https://cran.r-project.org/bin/windows/Rtools/](https://cran.r-project.org/bin/windows/Rtools/)." 
] }, { "cell_type": "markdown", "id": "64c9276b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Coding Style\n", "\n", "- Good coding style is like correct punctuation.\n", "- Bottom line: your coding style should make your code easy for other people to understand.\n", "- Suggested reading: the tidyverse style guide, [https://style.tidyverse.org/](https://style.tidyverse.org/)" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "4.3.2" } }, "nbformat": 4, "nbformat_minor": 5 }