{ "cells": [ { "cell_type": "markdown", "id": "c57b2ce5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "# R Functions and Packages\n", "\n", "\n", "Feng Li\n", "\n", "School of Statistics and Mathematics\n", "\n", "Central University of Finance and Economics\n", "\n", "[feng.li@cufe.edu.cn](mailto:feng.li@cufe.edu.cn)\n", "\n", "[https://feng.li/statcomp](https://feng.li/statcomp)\n", "\n", "_>>> Link to Python version_ [1](https://feng.li/files/python/P01-Python-from-Scratch/L01.3-Python-Functions-and-Modules.slides.html)" ] }, { "cell_type": "markdown", "id": "a533c945", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Control-flow constructs\n", "\n", "### The if condition\n", "\n", "- Binary comparison\n", "\n", "    x == y      # see also all.equal() and identical()\n", "    x != y\n", "    x > y\n", "    x < y\n", "    x >= y\n", "    x <= y\n", "    x %in% y\n", "    \n", "- What would you expect when $x$ and $y$ are vectors or matrices?\n" ] }, { "cell_type": "markdown", "id": "9ca1e3c6", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- The if-else statement in R\n", "\n", "    if (condition) {\n", "        do something\n", "    } else {\n", "        do something else\n", "    }" ] }, { "cell_type": "markdown", "id": "57f0dd29", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Lab: Leap year check\n", "\n", "- February 29, known as a leap day in the calendar, is a date that occurs in most years that are evenly divisible by $4$, such as $2004$, $2008$, $2012$ and $2016$. 
Years that are evenly divisible by $100$ do not contain a leap day, with the exception of years that are evenly divisible by $400$, which do contain a leap day; thus $1900$ did not contain a leap day while $2000$ did.\n", "\n", "- Write a function called is.leapday to check whether a given year has a February 29. (Hint: you may need the %% operator.)\n", "\n", "- Test your function for some years.\n", "\n", "- How can you improve the function's error tolerance?\n", "\n", "- Suppose you want to check which years in a given sequence have a leap day. Modify your function to handle this." ] }, { "cell_type": "markdown", "id": "a8e135d8", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Loops\n", "\n", "- The for loop" ] }, { "cell_type": "code", "execution_count": 2, "id": "4c1a5c36", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "
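<table><caption>A matrix: 2 × 5 of type int</caption><tbody><tr><td>101</td><td>105</td><td>109</td><td>113</td><td>117</td></tr><tr><td>103</td><td>107</td><td>111</td><td>115</td><td>119</td></tr></tbody></table>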
\n" ], "text/latex": [ "A matrix: 2 × 5 of type int\n", "\\begin{tabular}{lllll}\n", "\t 101 & 105 & 109 & 113 & 117\\\\\n", "\t 103 & 107 & 111 & 115 & 119\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 2 × 5 of type int\n", "\n", "| 101 | 105 | 109 | 113 | 117 |\n", "| 103 | 107 | 111 | 115 | 119 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4] [,5]\n", "[1,] 101 105 109 113 117 \n", "[2,] 103 107 111 115 119 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "B = matrix(1:10,2,5)\n", "C = matrix(100:109,2,5)\n", "A = matrix(NA,2,5)\n", "for(i in 1:length(A))\n", "{\n", " A[i] = B[i] + C[i]\n", "}\n", "A" ] }, { "cell_type": "markdown", "id": "41bcb10d", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- The while loop" ] }, { "cell_type": "code", "execution_count": 4, "id": "42c57e02", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\n", "
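<table><caption>A matrix: 2 × 5 of type int</caption><tbody><tr><td>101</td><td>105</td><td>109</td><td>113</td><td>117</td></tr><tr><td>103</td><td>107</td><td>111</td><td>115</td><td>119</td></tr></tbody></table>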
\n" ], "text/latex": [ "A matrix: 2 × 5 of type int\n", "\\begin{tabular}{lllll}\n", "\t 101 & 105 & 109 & 113 & 117\\\\\n", "\t 103 & 107 & 111 & 115 & 119\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 2 × 5 of type int\n", "\n", "| 101 | 105 | 109 | 113 | 117 |\n", "| 103 | 107 | 111 | 115 | 119 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4] [,5]\n", "[1,] 101 105 109 113 117 \n", "[2,] 103 107 111 115 119 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "i = 0\n", "while(i < length(A)){ \n", " i = i + 1\n", " A[i] = B[i] + C[i]\n", "}\n", "A" ] }, { "cell_type": "markdown", "id": "b4d61a77", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### apply() type loops\n", "\n", "- Calculate row sums for a matrix with a loop.\n", "\n", "- Apply sum() function to each row of the matrix.\n", "\n", "- apply() to an array with higher dimension.\n", "\n", "- Apply your own function to each row of the matrix.\n", "\n", "- lapply() Apply a function to a list\n", "\n", "- mapply() Apply a function to multiple list or vector arguments.\n", "\n", "- The \\... arguments in a function.\n", "\n", "- Supply more arguments to apply() type functions." ] }, { "cell_type": "code", "execution_count": 2, "id": "8932ed0b", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
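<table><caption>A matrix: 20 × 5 of type int</caption><tbody><tr><td>1</td><td>21</td><td>41</td><td>61</td><td>81</td></tr><tr><td>2</td><td>22</td><td>42</td><td>62</td><td>82</td></tr><tr><td>3</td><td>23</td><td>43</td><td>63</td><td>83</td></tr><tr><td>4</td><td>24</td><td>44</td><td>64</td><td>84</td></tr><tr><td>5</td><td>25</td><td>45</td><td>65</td><td>85</td></tr><tr><td>6</td><td>26</td><td>46</td><td>66</td><td>86</td></tr><tr><td>7</td><td>27</td><td>47</td><td>67</td><td>87</td></tr><tr><td>8</td><td>28</td><td>48</td><td>68</td><td>88</td></tr><tr><td>9</td><td>29</td><td>49</td><td>69</td><td>89</td></tr><tr><td>10</td><td>30</td><td>50</td><td>70</td><td>90</td></tr><tr><td>11</td><td>31</td><td>51</td><td>71</td><td>91</td></tr><tr><td>12</td><td>32</td><td>52</td><td>72</td><td>92</td></tr><tr><td>13</td><td>33</td><td>53</td><td>73</td><td>93</td></tr><tr><td>14</td><td>34</td><td>54</td><td>74</td><td>94</td></tr><tr><td>15</td><td>35</td><td>55</td><td>75</td><td>95</td></tr><tr><td>16</td><td>36</td><td>56</td><td>76</td><td>96</td></tr><tr><td>17</td><td>37</td><td>57</td><td>77</td><td>97</td></tr><tr><td>18</td><td>38</td><td>58</td><td>78</td><td>98</td></tr><tr><td>19</td><td>39</td><td>59</td><td>79</td><td>99</td></tr><tr><td>20</td><td>40</td><td>60</td><td>80</td><td>100</td></tr></tbody></table>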
\n" ], "text/latex": [ "A matrix: 20 × 5 of type int\n", "\\begin{tabular}{lllll}\n", "\t 1 & 21 & 41 & 61 & 81\\\\\n", "\t 2 & 22 & 42 & 62 & 82\\\\\n", "\t 3 & 23 & 43 & 63 & 83\\\\\n", "\t 4 & 24 & 44 & 64 & 84\\\\\n", "\t 5 & 25 & 45 & 65 & 85\\\\\n", "\t 6 & 26 & 46 & 66 & 86\\\\\n", "\t 7 & 27 & 47 & 67 & 87\\\\\n", "\t 8 & 28 & 48 & 68 & 88\\\\\n", "\t 9 & 29 & 49 & 69 & 89\\\\\n", "\t 10 & 30 & 50 & 70 & 90\\\\\n", "\t 11 & 31 & 51 & 71 & 91\\\\\n", "\t 12 & 32 & 52 & 72 & 92\\\\\n", "\t 13 & 33 & 53 & 73 & 93\\\\\n", "\t 14 & 34 & 54 & 74 & 94\\\\\n", "\t 15 & 35 & 55 & 75 & 95\\\\\n", "\t 16 & 36 & 56 & 76 & 96\\\\\n", "\t 17 & 37 & 57 & 77 & 97\\\\\n", "\t 18 & 38 & 58 & 78 & 98\\\\\n", "\t 19 & 39 & 59 & 79 & 99\\\\\n", "\t 20 & 40 & 60 & 80 & 100\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 20 × 5 of type int\n", "\n", "| 1 | 21 | 41 | 61 | 81 |\n", "| 2 | 22 | 42 | 62 | 82 |\n", "| 3 | 23 | 43 | 63 | 83 |\n", "| 4 | 24 | 44 | 64 | 84 |\n", "| 5 | 25 | 45 | 65 | 85 |\n", "| 6 | 26 | 46 | 66 | 86 |\n", "| 7 | 27 | 47 | 67 | 87 |\n", "| 8 | 28 | 48 | 68 | 88 |\n", "| 9 | 29 | 49 | 69 | 89 |\n", "| 10 | 30 | 50 | 70 | 90 |\n", "| 11 | 31 | 51 | 71 | 91 |\n", "| 12 | 32 | 52 | 72 | 92 |\n", "| 13 | 33 | 53 | 73 | 93 |\n", "| 14 | 34 | 54 | 74 | 94 |\n", "| 15 | 35 | 55 | 75 | 95 |\n", "| 16 | 36 | 56 | 76 | 96 |\n", "| 17 | 37 | 57 | 77 | 97 |\n", "| 18 | 38 | 58 | 78 | 98 |\n", "| 19 | 39 | 59 | 79 | 99 |\n", "| 20 | 40 | 60 | 80 | 100 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3] [,4] [,5]\n", " [1,] 1 21 41 61 81 \n", " [2,] 2 22 42 62 82 \n", " [3,] 3 23 43 63 83 \n", " [4,] 4 24 44 64 84 \n", " [5,] 5 25 45 65 85 \n", " [6,] 6 26 46 66 86 \n", " [7,] 7 27 47 67 87 \n", " [8,] 8 28 48 68 88 \n", " [9,] 9 29 49 69 89 \n", "[10,] 10 30 50 70 90 \n", "[11,] 11 31 51 71 91 \n", "[12,] 12 32 52 72 92 \n", "[13,] 13 33 53 73 93 \n", "[14,] 14 34 54 74 94 \n", "[15,] 15 35 55 75 95 \n", "[16,] 16 36 56 76 96 \n", "[17,] 17 37 57 77 97 
\n", "[18,] 18 38 58 78 98 \n", "[19,] 19 39 59 79 99 \n", "[20,] 20 40 60 80 100 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mat = matrix(1:100,20,5)\n", "mat" ] }, { "cell_type": "code", "execution_count": 5, "id": "b55c3787", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "
1. 10.5
2. 30.5
3. 50.5
4. 70.5
5. 90.5
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 10.5\n", "\\item 30.5\n", "\\item 50.5\n", "\\item 70.5\n", "\\item 90.5\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 10.5\n", "2. 30.5\n", "3. 50.5\n", "4. 70.5\n", "5. 90.5\n", "\n", "\n" ], "text/plain": [ "[1] 10.5 30.5 50.5 70.5 90.5" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
1. 41
2. 42
3. 43
4. 44
5. 45
6. 46
7. 47
8. 48
9. 49
10. 50
11. 51
12. 52
13. 53
14. 54
15. 55
16. 56
17. 57
18. 58
19. 59
20. 60
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 41\n", "\\item 42\n", "\\item 43\n", "\\item 44\n", "\\item 45\n", "\\item 46\n", "\\item 47\n", "\\item 48\n", "\\item 49\n", "\\item 50\n", "\\item 51\n", "\\item 52\n", "\\item 53\n", "\\item 54\n", "\\item 55\n", "\\item 56\n", "\\item 57\n", "\\item 58\n", "\\item 59\n", "\\item 60\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 41\n", "2. 42\n", "3. 43\n", "4. 44\n", "5. 45\n", "6. 46\n", "7. 47\n", "8. 48\n", "9. 49\n", "10. 50\n", "11. 51\n", "12. 52\n", "13. 53\n", "14. 54\n", "15. 55\n", "16. 56\n", "17. 57\n", "18. 58\n", "19. 59\n", "20. 60\n", "\n", "\n" ], "text/plain": [ " [1] 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "
1. 5.91607978309962
2. 5.91607978309962
3. 5.91607978309962
4. 5.91607978309962
5. 5.91607978309962
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\item 5.91607978309962\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 5.91607978309962\n", "2. 5.91607978309962\n", "3. 5.91607978309962\n", "4. 5.91607978309962\n", "5. 5.91607978309962\n", "\n", "\n" ], "text/plain": [ "[1] 5.91608 5.91608 5.91608 5.91608 5.91608" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "apply(mat, 2, mean)\n", "apply(mat, 1, mean)\n", "apply(mat, 2, sd)" ] }, { "cell_type": "code", "execution_count": 6, "id": "e821abba", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "arr = array(1:240, c(20,3,4))" ] }, { "cell_type": "code", "execution_count": 9, "id": "b22453d7", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "
1. 30.5
2. 90.5
3. 150.5
4. 210.5
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 30.5\n", "\\item 90.5\n", "\\item 150.5\n", "\\item 210.5\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 30.5\n", "2. 90.5\n", "3. 150.5\n", "4. 210.5\n", "\n", "\n" ], "text/plain": [ "[1] 30.5 90.5 150.5 210.5" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\t\n", "\n", "
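<table><caption>A matrix: 20 × 3 of type dbl</caption><tbody><tr><td>91</td><td>111</td><td>131</td></tr><tr><td>92</td><td>112</td><td>132</td></tr><tr><td>93</td><td>113</td><td>133</td></tr><tr><td>94</td><td>114</td><td>134</td></tr><tr><td>95</td><td>115</td><td>135</td></tr><tr><td>96</td><td>116</td><td>136</td></tr><tr><td>97</td><td>117</td><td>137</td></tr><tr><td>98</td><td>118</td><td>138</td></tr><tr><td>99</td><td>119</td><td>139</td></tr><tr><td>100</td><td>120</td><td>140</td></tr><tr><td>101</td><td>121</td><td>141</td></tr><tr><td>102</td><td>122</td><td>142</td></tr><tr><td>103</td><td>123</td><td>143</td></tr><tr><td>104</td><td>124</td><td>144</td></tr><tr><td>105</td><td>125</td><td>145</td></tr><tr><td>106</td><td>126</td><td>146</td></tr><tr><td>107</td><td>127</td><td>147</td></tr><tr><td>108</td><td>128</td><td>148</td></tr><tr><td>109</td><td>129</td><td>149</td></tr><tr><td>110</td><td>130</td><td>150</td></tr></tbody></table>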
\n" ], "text/latex": [ "A matrix: 20 × 3 of type dbl\n", "\\begin{tabular}{lll}\n", "\t 91 & 111 & 131\\\\\n", "\t 92 & 112 & 132\\\\\n", "\t 93 & 113 & 133\\\\\n", "\t 94 & 114 & 134\\\\\n", "\t 95 & 115 & 135\\\\\n", "\t 96 & 116 & 136\\\\\n", "\t 97 & 117 & 137\\\\\n", "\t 98 & 118 & 138\\\\\n", "\t 99 & 119 & 139\\\\\n", "\t 100 & 120 & 140\\\\\n", "\t 101 & 121 & 141\\\\\n", "\t 102 & 122 & 142\\\\\n", "\t 103 & 123 & 143\\\\\n", "\t 104 & 124 & 144\\\\\n", "\t 105 & 125 & 145\\\\\n", "\t 106 & 126 & 146\\\\\n", "\t 107 & 127 & 147\\\\\n", "\t 108 & 128 & 148\\\\\n", "\t 109 & 129 & 149\\\\\n", "\t 110 & 130 & 150\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 20 × 3 of type dbl\n", "\n", "| 91 | 111 | 131 |\n", "| 92 | 112 | 132 |\n", "| 93 | 113 | 133 |\n", "| 94 | 114 | 134 |\n", "| 95 | 115 | 135 |\n", "| 96 | 116 | 136 |\n", "| 97 | 117 | 137 |\n", "| 98 | 118 | 138 |\n", "| 99 | 119 | 139 |\n", "| 100 | 120 | 140 |\n", "| 101 | 121 | 141 |\n", "| 102 | 122 | 142 |\n", "| 103 | 123 | 143 |\n", "| 104 | 124 | 144 |\n", "| 105 | 125 | 145 |\n", "| 106 | 126 | 146 |\n", "| 107 | 127 | 147 |\n", "| 108 | 128 | 148 |\n", "| 109 | 129 | 149 |\n", "| 110 | 130 | 150 |\n", "\n" ], "text/plain": [ " [,1] [,2] [,3]\n", " [1,] 91 111 131 \n", " [2,] 92 112 132 \n", " [3,] 93 113 133 \n", " [4,] 94 114 134 \n", " [5,] 95 115 135 \n", " [6,] 96 116 136 \n", " [7,] 97 117 137 \n", " [8,] 98 118 138 \n", " [9,] 99 119 139 \n", "[10,] 100 120 140 \n", "[11,] 101 121 141 \n", "[12,] 102 122 142 \n", "[13,] 103 123 143 \n", "[14,] 104 124 144 \n", "[15,] 105 125 145 \n", "[16,] 106 126 146 \n", "[17,] 107 127 147 \n", "[18,] 108 128 148 \n", "[19,] 109 129 149 \n", "[20,] 110 130 150 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "apply(arr, 3, mean)\n", "apply(arr, c(1, 2), mean)" ] }, { "cell_type": "code", "execution_count": 10, "id": "fe676a69", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { 
"text/html": [ "\n", "
1. 19
2. 19
3. 19
4. 19
5. 19
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 19\n", "2. 19\n", "3. 19\n", "4. 19\n", "5. 19\n", "\n", "\n" ], "text/plain": [ "[1] 19 19 19 19 19" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "apply(mat, 2, function (x) max(x)-min(x))" ] }, { "cell_type": "code", "execution_count": 11, "id": "e29dd4be", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "
1. 19
2. 19
3. 19
4. 19
5. 19
\n" ], "text/latex": [ "\\begin{enumerate*}\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\item 19\n", "\\end{enumerate*}\n" ], "text/markdown": [ "1. 19\n", "2. 19\n", "3. 19\n", "4. 19\n", "5. 19\n", "\n", "\n" ], "text/plain": [ "[1] 19 19 19 19 19" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "maxmin = function (x) {\n", "    max(x)-min(x)\n", "}\n", "\n", "apply(mat, 2, maxmin)" ] }, { "cell_type": "markdown", "id": "5768ca55", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- The advantages of apply()-type functions:\n", "\n", "    - Easy to construct\n", "\n", "    - Less coding\n", "\n", "    - apply()-type functions are essentially a more compact, and sometimes more efficient, way to write a loop in R." ] }, { "cell_type": "markdown", "id": "468f45d0", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Write efficient loops in R\n", "\n", "- Avoid loops as much as possible. We should always try to vectorize the calculations.\n", "\n", "- Use apply()-type loops where possible.\n", "\n", "- Watch out for numerical underflow and overflow.\n", "\n", "- Allocate the memory space before looping. The loop below does not:\n", "  it grows A at every iteration, which makes it much slower.\n", "\n", "    B = matrix(1:10,2,5)\n", "    C = matrix(100:109,2,5)\n", "    A = NULL\n", "    for(i in 1:length(B))\n", "    {\n", "        A[i] = B[i] + C[i]\n", "    }" ] }, { "cell_type": "markdown", "id": "d1a1d292", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### List Arithmetic\n", "\n", "- Apply a function to the elements of a list\n", "\n", "    lapply(X, FUN, ...)\n", "    rapply(object, f, how = c(\"unlist\",\"replace\", \"list\"), ...)" ] }, { "cell_type": "markdown", "id": "0b58c9f5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- Operations on several lists\n", "\n", "    mapply(FUN, ..., MoreArgs = NULL, SIMPLIFY = TRUE, USE.NAMES = TRUE)\n", "    \n", "    mapply(\"+\", list1, list2, list3, SIMPLIFY = FALSE)\n", "    mapply(function(x, y) abs(x)*log(abs(y)), list1, list2, SIMPLIFY = FALSE)" ] }, { "cell_type": "code", "execution_count": 13, "id": "0f52f710", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\t
$a \n", "\t\t \n", " 1. -1.26589644825265 2. -0.068573308896311 3. 0.658409475454285 4. -0.221881865339709 5. 0.58070039713217 6. 0.630319389017115 7. -0.242213569358942 8. 0.97023453146676 9. 0.351854369172614 10. -1.33262148487691 \n", " \n", "\t$b
\n", "\t\t
\n", "
1. 1
2. 2
3. 3
4. 4
5. 5
6. 6
7. 7
8. 8
9. 9
\n", "
\n", "\t
$c \n", "\t\t \n", " 1. 0.181591403437778 2. 0.164184056222439 3. 0.390080413082615 4. 0.657432082109153 5. 0.256644821958616 6. 0.564072602661327 7. 0.525960478000343 8. 0.604819754138589 9. 0.951666688779369 10. 0.97638466511853 11. 0.508973858784884 12. 0.284509904216975 13. 0.563456729985774 14. 0.968111847294495 15. 0.453518448630348 16. 0.633869176264852 17. 0.606082778424025 18. 0.908481978345662 19. 0.129783247830346 20. 0.170731674181297 \n", " \n", " \n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] \\begin{enumerate*}\n", "\\item -1.26589644825265\n", "\\item -0.068573308896311\n", "\\item 0.658409475454285\n", "\\item -0.221881865339709\n", "\\item 0.58070039713217\n", "\\item 0.630319389017115\n", "\\item -0.242213569358942\n", "\\item 0.97023453146676\n", "\\item 0.351854369172614\n", "\\item -1.33262148487691\n", "\\end{enumerate*}\n", "\n", "\\item[\\$b] \\begin{enumerate*}\n", "\\item 1\n", "\\item 2\n", "\\item 3\n", "\\item 4\n", "\\item 5\n", "\\item 6\n", "\\item 7\n", "\\item 8\n", "\\item 9\n", "\\end{enumerate*}\n", "\n", "\\item[\\$c] \\begin{enumerate*}\n", "\\item 0.181591403437778\n", "\\item 0.164184056222439\n", "\\item 0.390080413082615\n", "\\item 0.657432082109153\n", "\\item 0.256644821958616\n", "\\item 0.564072602661327\n", "\\item 0.525960478000343\n", "\\item 0.604819754138589\n", "\\item 0.951666688779369\n", "\\item 0.97638466511853\n", "\\item 0.508973858784884\n", "\\item 0.284509904216975\n", "\\item 0.563456729985774\n", "\\item 0.968111847294495\n", "\\item 0.453518448630348\n", "\\item 0.633869176264852\n", "\\item 0.606082778424025\n", "\\item 0.908481978345662\n", "\\item 0.129783247830346\n", "\\item 0.170731674181297\n", "\\end{enumerate*}\n", "\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": 1. -1.26589644825265\n", "2. -0.068573308896311\n", "3. 0.658409475454285\n", "4. -0.221881865339709\n", "5. 0.58070039713217\n", "6. 0.630319389017115\n", "7. -0.242213569358942\n", "8. 
0.97023453146676\n", "9. 0.351854369172614\n", "10. -1.33262148487691\n", "\n", "\n", "\n", "$b\n", ": 1. 1\n", "2. 2\n", "3. 3\n", "4. 4\n", "5. 5\n", "6. 6\n", "7. 7\n", "8. 8\n", "9. 9\n", "\n", "\n", "\n", "$c\n", ": 1. 0.181591403437778\n", "2. 0.164184056222439\n", "3. 0.390080413082615\n", "4. 0.657432082109153\n", "5. 0.256644821958616\n", "6. 0.564072602661327\n", "7. 0.525960478000343\n", "8. 0.604819754138589\n", "9. 0.951666688779369\n", "10. 0.97638466511853\n", "11. 0.508973858784884\n", "12. 0.284509904216975\n", "13. 0.563456729985774\n", "14. 0.968111847294495\n", "15. 0.453518448630348\n", "16. 0.633869176264852\n", "17. 0.606082778424025\n", "18. 0.908481978345662\n", "19. 0.129783247830346\n", "20. 0.170731674181297\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "$a\n", " [1] -1.26589645 -0.06857331 0.65840948 -0.22188187 0.58070040 0.63031939\n", " [7] -0.24221357 0.97023453 0.35185437 -1.33262148\n", "\n", "$b\n", "[1] 1 2 3 4 5 6 7 8 9\n", "\n", "$c\n", " [1] 0.1815914 0.1641841 0.3900804 0.6574321 0.2566448 0.5640726 0.5259605\n", " [8] 0.6048198 0.9516667 0.9763847 0.5089739 0.2845099 0.5634567 0.9681118\n", "[15] 0.4535184 0.6338692 0.6060828 0.9084820 0.1297832 0.1707317\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "lst = list (a= rnorm(10), b= 1:9, c=runif(20))\n", "lst" ] }, { "cell_type": "code", "execution_count": 16, "id": "ce7ebd9e", "metadata": { "scrolled": true, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "3" ], "text/latex": [ "3" ], "text/markdown": [ "3" ], "text/plain": [ "[1] 3" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\t
$a \n", "\t\t 10 \n", "\t$b
\n", "\t\t
9
\n", "\t
$c \n", "\t\t 20 \n", " \n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] 10\n", "\\item[\\$b] 9\n", "\\item[\\$c] 20\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": 10\n", "$b\n", ": 9\n", "$c\n", ": 20\n", "\n", "\n" ], "text/plain": [ "$a\n", "[1] 10\n", "\n", "$b\n", "[1] 9\n", "\n", "$c\n", "[1] 20\n" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\t
$a \n", "\t\t 0.00603314855184166 \n", "\t$b
\n", "\t\t
5
\n", "\t
$c \n", "\t\t 0.525017830473371 \n", " \n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] 0.00603314855184166\n", "\\item[\\$b] 5\n", "\\item[\\$c] 0.525017830473371\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ": 0.00603314855184166\n", "$b\n", ": 5\n", "$c\n", ": 0.525017830473371\n", "\n", "\n" ], "text/plain": [ "$a\n", "[1] 0.006033149\n", "\n", "$b\n", "[1] 5\n", "\n", "$c\n", "[1] 0.5250178\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "length(lst)\n", "lapply(lst, length)\n", "lapply(lst, mean)" ] }, { "cell_type": "code", "execution_count": 17, "id": "deb15dc4", "metadata": { "scrolled": false, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
\n", "\t
$a \n", "\t\t \n", "\t$a
\n", "\t\t
\n", "
1. -1.26589644825265
2. -0.068573308896311
3. 0.658409475454285
4. -0.221881865339709
5. 0.58070039713217
6. 0.630319389017115
7. -0.242213569358942
8. 0.97023453146676
9. 0.351854369172614
10. -1.33262148487691
\n", "
\n", "\t
$b \n", "\t\t \n", " 1. 1 2. 2 3. 3 4. 4 5. 5 6. 6 7. 7 8. 8 9. 9 \n", " \n", "\t$c
\n", "\t\t
\n", "
1. 0.181591403437778
2. 0.164184056222439
3. 0.390080413082615
4. 0.657432082109153
5. 0.256644821958616
6. 0.564072602661327
7. 0.525960478000343
8. 0.604819754138589
9. 0.951666688779369
10. 0.97638466511853
11. 0.508973858784884
12. 0.284509904216975
13. 0.563456729985774
14. 0.968111847294495
15. 0.453518448630348
16. 0.633869176264852
17. 0.606082778424025
18. 0.908481978345662
19. 0.129783247830346
20. 0.170731674181297
\n", "
\n", "
\n", "
\n", "\t
$b \n", "\t\t \n", "\t$a
\n", "\t\t
0.00603314855184166
\n", "\t
$b \n", "\t\t 5 \n", "\t$c
\n", "\t\t
0.525017830473371
\n", "
\n", "
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] \\begin{description}\n", "\\item[\\$a] \\begin{enumerate*}\n", "\\item -1.26589644825265\n", "\\item -0.068573308896311\n", "\\item 0.658409475454285\n", "\\item -0.221881865339709\n", "\\item 0.58070039713217\n", "\\item 0.630319389017115\n", "\\item -0.242213569358942\n", "\\item 0.97023453146676\n", "\\item 0.351854369172614\n", "\\item -1.33262148487691\n", "\\end{enumerate*}\n", "\n", "\\item[\\$b] \\begin{enumerate*}\n", "\\item 1\n", "\\item 2\n", "\\item 3\n", "\\item 4\n", "\\item 5\n", "\\item 6\n", "\\item 7\n", "\\item 8\n", "\\item 9\n", "\\end{enumerate*}\n", "\n", "\\item[\\$c] \\begin{enumerate*}\n", "\\item 0.181591403437778\n", "\\item 0.164184056222439\n", "\\item 0.390080413082615\n", "\\item 0.657432082109153\n", "\\item 0.256644821958616\n", "\\item 0.564072602661327\n", "\\item 0.525960478000343\n", "\\item 0.604819754138589\n", "\\item 0.951666688779369\n", "\\item 0.97638466511853\n", "\\item 0.508973858784884\n", "\\item 0.284509904216975\n", "\\item 0.563456729985774\n", "\\item 0.968111847294495\n", "\\item 0.453518448630348\n", "\\item 0.633869176264852\n", "\\item 0.606082778424025\n", "\\item 0.908481978345662\n", "\\item 0.129783247830346\n", "\\item 0.170731674181297\n", "\\end{enumerate*}\n", "\n", "\\end{description}\n", "\n", "\\item[\\$b] \\begin{description}\n", "\\item[\\$a] 0.00603314855184166\n", "\\item[\\$b] 5\n", "\\item[\\$c] 0.525017830473371\n", "\\end{description}\n", "\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ":$a\n", ": 1. -1.26589644825265\n", "2. -0.068573308896311\n", "3. 0.658409475454285\n", "4. -0.221881865339709\n", "5. 0.58070039713217\n", "6. 0.630319389017115\n", "7. -0.242213569358942\n", "8. 0.97023453146676\n", "9. 0.351854369172614\n", "10. -1.33262148487691\n", "\n", "\n", "\n", "$b\n", ": 1. 1\n", "2. 2\n", "3. 3\n", "4. 4\n", "5. 5\n", "6. 6\n", "7. 7\n", "8. 8\n", "9. 9\n", "\n", "\n", "\n", "$c\n", ": 1. 
0.181591403437778\n", "2. 0.164184056222439\n", "3. 0.390080413082615\n", "4. 0.657432082109153\n", "5. 0.256644821958616\n", "6. 0.564072602661327\n", "7. 0.525960478000343\n", "8. 0.604819754138589\n", "9. 0.951666688779369\n", "10. 0.97638466511853\n", "11. 0.508973858784884\n", "12. 0.284509904216975\n", "13. 0.563456729985774\n", "14. 0.968111847294495\n", "15. 0.453518448630348\n", "16. 0.633869176264852\n", "17. 0.606082778424025\n", "18. 0.908481978345662\n", "19. 0.129783247830346\n", "20. 0.170731674181297\n", "\n", "\n", "\n", "\n", "\n", "\n", "$b\n", ":$a\n", ": 0.00603314855184166\n", "$b\n", ": 5\n", "$c\n", ": 0.525017830473371\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "$a\n", "$a$a\n", " [1] -1.26589645 -0.06857331 0.65840948 -0.22188187 0.58070040 0.63031939\n", " [7] -0.24221357 0.97023453 0.35185437 -1.33262148\n", "\n", "$a$b\n", "[1] 1 2 3 4 5 6 7 8 9\n", "\n", "$a$c\n", " [1] 0.1815914 0.1641841 0.3900804 0.6574321 0.2566448 0.5640726 0.5259605\n", " [8] 0.6048198 0.9516667 0.9763847 0.5089739 0.2845099 0.5634567 0.9681118\n", "[15] 0.4535184 0.6338692 0.6060828 0.9084820 0.1297832 0.1707317\n", "\n", "\n", "$b\n", "$b$a\n", "[1] 0.006033149\n", "\n", "$b$b\n", "[1] 5\n", "\n", "$b$c\n", "[1] 0.5250178\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "lst2 = list (a = lst, b = lapply(lst, mean))\n", "lst2" ] }, { "cell_type": "code", "execution_count": 20, "id": "a8ca29d9", "metadata": { "scrolled": true, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
a.a
10
a.b
9
a.c
20
b.a
1
b.b
1
b.c
1
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[a.a] 10\n", "\\item[a.b] 9\n", "\\item[a.c] 20\n", "\\item[b.a] 1\n", "\\item[b.b] 1\n", "\\item[b.c] 1\n", "\\end{description*}\n" ], "text/markdown": [ "a.a\n", ": 10a.b\n", ": 9a.c\n", ": 20b.a\n", ": 1b.b\n", ": 1b.c\n", ": 1\n", "\n" ], "text/plain": [ "a.a a.b a.c b.a b.b b.c \n", " 10 9 20 1 1 1 " ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/html": [ "
\n", "\t
$a \n", "\t\t \n", "\t$a
\n", "\t\t
10
\n", "\t
$b \n", "\t\t 9 \n", "\t$c
\n", "\t\t
20
\n", "
\n", "
\n", "\t
$b \n", "\t\t \n", "\t$a
\n", "\t\t
1
\n", "\t
$b \n", "\t\t 1 \n", "\t$c
\n", "\t\t
1
\n", "
\n", "
\n", "
\n" ], "text/latex": [ "\\begin{description}\n", "\\item[\\$a] \\begin{description}\n", "\\item[\\$a] 10\n", "\\item[\\$b] 9\n", "\\item[\\$c] 20\n", "\\end{description}\n", "\n", "\\item[\\$b] \\begin{description}\n", "\\item[\\$a] 1\n", "\\item[\\$b] 1\n", "\\item[\\$c] 1\n", "\\end{description}\n", "\n", "\\end{description}\n" ], "text/markdown": [ "$a\n", ":$a\n", ": 10\n", "$b\n", ": 9\n", "$c\n", ": 20\n", "\n", "\n", "\n", "$b\n", ":$a\n", ": 1\n", "$b\n", ": 1\n", "$c\n", ": 1\n", "\n", "\n", "\n", "\n", "\n" ], "text/plain": [ "$a\n", "$a$a\n", "[1] 10\n", "\n", "$a$b\n", "[1] 9\n", "\n", "$a$c\n", "[1] 20\n", "\n", "\n", "$b\n", "$b$a\n", "[1] 1\n", "\n", "$b$b\n", "[1] 1\n", "\n", "$b$c\n", "[1] 1\n", "\n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "rapply(lst2, length)\n", "rapply(lst2, length, how=\"list\")" ] }, { "cell_type": "code", "execution_count": 21, "id": "cda044fc", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "lst1 = list (a = 4:6, b = 5:7)\n", "lst2 = list (a = 3:5, b = 8:10)" ] }, { "cell_type": "code", "execution_count": 22, "id": "ef20eed7", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "ename": "ERROR", "evalue": "Error in lst1 + lst2: non-numeric argument to binary operator\n", "output_type": "error", "traceback": [ "Error in lst1 + lst2: non-numeric argument to binary operator\nTraceback:\n" ] } ], "source": [ "lst1 + lst2 " ] }, { "cell_type": "code", "execution_count": 25, "id": "d241f4c0", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "\n", "\n", "\n", "\t\n", "\n", "\n", "\t\n", "\t\n", "\t\n", "\n", "
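<table><caption>A matrix: 3 × 2 of type int</caption><thead><tr><th>a</th><th>b</th></tr></thead><tbody><tr><td>7</td><td>13</td></tr><tr><td>9</td><td>15</td></tr><tr><td>11</td><td>17</td></tr></tbody></table>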
\n" ], "text/latex": [ "A matrix: 3 × 2 of type int\n", "\\begin{tabular}{ll}\n", " a & b\\\\\n", "\\hline\n", "\t 7 & 13\\\\\n", "\t 9 & 15\\\\\n", "\t 11 & 17\\\\\n", "\\end{tabular}\n" ], "text/markdown": [ "\n", "A matrix: 3 × 2 of type int\n", "\n", "| a | b |\n", "|---|---|\n", "| 7 | 13 |\n", "| 9 | 15 |\n", "| 11 | 17 |\n", "\n" ], "text/plain": [ " a b \n", "[1,] 7 13\n", "[2,] 9 15\n", "[3,] 11 17" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mapply(\"+\", lst1, lst2)" ] }, { "cell_type": "code", "execution_count": 26, "id": "e007d2c3", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/html": [ "
a
27
b
45
\n" ], "text/latex": [ "\\begin{description*}\n", "\\item[a] 27\n", "\\item[b] 45\n", "\\end{description*}\n" ], "text/markdown": [ "a\n", ": 27b\n", ": 45\n", "\n" ], "text/plain": [ " a b \n", "27 45 " ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "mapply(sum, lst1, lst2)" ] }, { "cell_type": "markdown", "id": "245b135c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Functions\n", "\n", "- Create a function object\n", "\n", "    myFun = function (par1, par2)\n", "    {\n", "        out = max(par1) - min(par2)\n", "        return(out)\n", "    }\n", "\n", "- Load the function: \n", "    - Copy and paste it into the R console.\n", "    - Save the function to a file and load it with the source() function.\n", "\n", "- Execute your function.\n", "\n" ] }, { "cell_type": "code", "execution_count": 28, "id": "9fb114ea", "metadata": {}, "outputs": [ { "data": { "text/html": [ "99" ], "text/latex": [ "99" ], "text/markdown": [ "99" ], "text/plain": [ "[1] 99" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "maxmin = function (x){\n", "    out = max(x)-min(x)\n", "    return(out)\n", "}\n", "\n", "maxmin(1:100)" ] }, { "cell_type": "code", "execution_count": 31, "id": "af028ea6", "metadata": {}, "outputs": [ { "data": { "text/html": [ "99" ], "text/latex": [ "99" ], "text/markdown": [ "99" ], "text/plain": [ "[1] 99" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "rm(list=ls())\n", "\n", "source(\"code/myfun.R\")\n", "maxmin(1:100)" ] }, { "cell_type": "markdown", "id": "41599e82", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## What is a good function?\n", "\n", "- Validates the input parameter types\n", "- Is simple in logic and implementation\n", "- Catches errors\n", "- Uses return()\n", "- Is fast: speed and performance matter." 
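, "\n", "\n",
"A minimal sketch of how these points can fit together. The function name `safeRange` and its error messages are illustrative assumptions, not part of the course code.\n", "\n",
"```r\n",
"# Hypothetical example: range of a numeric vector with input validation\n",
"safeRange = function(x, na.rm = FALSE)\n",
"{\n",
"    # Validate the input type before doing any work\n",
"    if (!is.numeric(x)) {\n",
"        stop('x must be numeric')\n",
"    }\n",
"    if (length(x) == 0) {\n",
"        stop('x must have at least one element')\n",
"    }\n",
"\n",
"    out = max(x, na.rm = na.rm) - min(x, na.rm = na.rm)\n",
"    return(out)\n",
"}\n",
"\n",
"safeRange(1:100)\n",
"\n",
"# Catch the error and fall back to NA instead of stopping the script\n",
"tryCatch(safeRange('abc'), error = function(e) NA)\n",
"```\n",
"\n",
"Checking the input first gives the caller a clear message, and `tryCatch()` lets the calling code decide how to recover."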
] }, { "cell_type": "markdown", "id": "a25ab3a5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Lab: a summary function\n", "\n", "- Write a function mySummary whose input argument x can be any vector and whose output contains a basic summary (mean, variance, length, maximum and minimum values, type) of that vector.\n", "\n", "- Test your function with some vectors of your own.\n", "\n", "- What happens if the input is not a vector (e.g. the data frame weekPlanNew from our previous example)?" ] }, { "cell_type": "markdown", "id": "9bdd2d4f", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Lab: Roots function for the quadratic equation\n", "\n", "- The roots of the quadratic equation $ax^2+bx+c=0$ are\n", "  $$\n", "  x_1=\\frac{-b + \\sqrt {b^2-4ac}}{2a} \\quad \\text{and} \\quad\n", "  x_2=\\frac{-b - \\sqrt {b^2-4ac}}{2a}$$\n", "\n", "- Write a function named quaroot that computes the roots of a given\n", "  quadratic equation, with a, b and c as input arguments. (Hint: you\n", "  may need the sqrt() function.)\n", "\n", "- Test your function on the following equations: $$\n", "  \\begin{split}\n", "  x^2+4x-1=0\\\\\n", "  -2x^2+2x=0\\\\\n", "  3x^2-9x+1=0\\\\\n", "  x^2 -4 = 0\\\\\n", "  \\end{split}$$\n", "\n", "- Test your function with the equation $5x^2+2x+1=0$. What are the\n", "  results? Why? 
(Hint: check $b^2-4ac$.)\n", "\n", "- Modify your function to return NA if $b^2-4ac < 0$.\n" ] }, { "cell_type": "code", "execution_count": 5, "id": "cbbffd63", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Roots of a*x^2 + b*x + c = 0; sqrt() gives NaN when b^2 - 4ac < 0\n", "quaroot <- function(a, b, c)\n", "{\n", "    x1 <- (-b+sqrt(b^2-4*a*c))/(2*a)\n", "    x2 <- (-b-sqrt(b^2-4*a*c))/(2*a)\n", "\n", "    out <- c(x1, x2)\n", "    return(out)\n", "}" ] }, { "cell_type": "code", "execution_count": 6, "id": "f8d00547", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# Modified version: return NA for both roots when there are no real roots\n", "quarootm <- function(a, b, c)\n", "{\n", "    d <- b^2-4*a*c    # the discriminant\n", "\n", "    if(d<0)\n", "    {\n", "        x1 <- NA\n", "        x2 <- NA\n", "    }\n", "    else\n", "    {\n", "        x1 <- (-b+sqrt(d))/(2*a)\n", "        x2 <- (-b-sqrt(d))/(2*a)\n", "    }\n", "\n", "    out <- c(x1, x2)\n", "    return(out)\n", "}" ] }, { "cell_type": "markdown", "id": "ebb47936", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Using R packages\n", "\n", "- Load an R package: library(\"PackageName\") or\n", "  require(\"PackageName\")\n", "\n", "- Install an R package from the Internet (CRAN):\n", "  install.packages(\"PackageName\")" ] }, { "cell_type": "markdown", "id": "34f22bcd", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "- Install an R package from GitHub with the devtools package:\n", "    \n", "    install.packages(\"devtools\")\n", "    devtools::install_github(\"ykang/gratis\")\n", "    \n", "- If you use Windows and an R package needs to be compiled, you need the \"Rtools\" software for building R packages under Microsoft Windows, available at [https://cran.r-project.org/bin/windows/Rtools/](https://cran.r-project.org/bin/windows/Rtools/)." 
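, "\n", "\n",
"A small sketch of a common pattern: install a package only when it is missing, then load it. The package name stored in `pkg` is just an example.\n", "\n",
"```r\n",
"# Install a CRAN package only if it is not yet available, then load it\n",
"pkg = 'devtools'  # example package name; replace with the one you need\n",
"if (!requireNamespace(pkg, quietly = TRUE)) {\n",
"    install.packages(pkg)\n",
"}\n",
"library(pkg, character.only = TRUE)\n",
"```\n",
"\n",
"Note that `library()` stops with an error when the package is missing, whereas `require()` only returns `FALSE` with a warning."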
] }, { "cell_type": "markdown", "id": "64c9276b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Coding Style\n", "\n", "- Good coding style is like correct punctuation.\n", "- Bottom line: your coding style should make your code easy for other people to understand.\n", "- Suggested reading: the tidyverse style guide, [https://style.tidyverse.org/](https://style.tidyverse.org/)" ] } ], "metadata": { "celltoolbar": "Slideshow", "kernelspec": { "display_name": "R", "language": "R", "name": "ir" }, "language_info": { "codemirror_mode": "r", "file_extension": ".r", "mimetype": "text/x-r-source", "name": "R", "pygments_lexer": "r", "version": "4.3.2" } }, "nbformat": 4, "nbformat_minor": 5 }