-
Notifications
You must be signed in to change notification settings - Fork 3
/
04-rmarkdown.html
348 lines (305 loc) · 15.5 KB
/
04-rmarkdown.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="utf-8">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="generator" content="pandoc" />
<title>Reproducible reports with R Markdown</title>
<script src="libs/jquery-1.11.3/jquery.min.js"></script>
<meta name="viewport" content="width=device-width, initial-scale=1" />
<link href="libs/bootstrap-3.3.5/css/bootstrap.min.css" rel="stylesheet" />
<script src="libs/bootstrap-3.3.5/js/bootstrap.min.js"></script>
<script src="libs/bootstrap-3.3.5/shim/html5shiv.min.js"></script>
<script src="libs/bootstrap-3.3.5/shim/respond.min.js"></script>
<script src="libs/navigation-1.1/tabsets.js"></script>
<style type="text/css">
h1 {
font-size: 34px;
}
h1.title {
font-size: 38px;
}
h2 {
font-size: 30px;
}
h3 {
font-size: 24px;
}
h4 {
font-size: 18px;
}
h5 {
font-size: 16px;
}
h6 {
font-size: 12px;
}
.table th:not([align]) {
text-align: left;
}
</style>
</head>
<body>
<style type = "text/css">
.main-container {
max-width: 940px;
margin-left: auto;
margin-right: auto;
}
code {
color: inherit;
background-color: rgba(0, 0, 0, 0.04);
}
img {
max-width:100%;
height: auto;
}
.tabbed-pane {
padding-top: 12px;
}
button.code-folding-btn:focus {
outline: none;
}
</style>
<div class="container-fluid main-container">
<!-- tabsets -->
<script>
$(document).ready(function () {
window.buildTabsets("TOC");
});
</script>
<!-- code folding -->
<div class="fluid-row" id="header">
<h1 class="title toc-ignore">Reproducible reports with R Markdown</h1>
</div>
<div id="TOC">
<ul>
<li><a href="#data-analysis-reports">Data analysis reports</a></li>
<li><a href="#literate-programming">Literate programming</a></li>
<li><a href="#creating-an-r-markdown-file">Creating an R Markdown file</a></li>
<li><a href="#basic-components-of-r-markdown">Basic components of R Markdown</a></li>
<li><a href="#markdown">Markdown</a></li>
<li><a href="#challenge">Challenge</a></li>
<li><a href="#a-bit-more-markdown">A bit more Markdown</a></li>
<li><a href="#r-code-chunks">R code chunks</a></li>
<li><a href="#challenge-1">Challenge</a></li>
<li><a href="#how-things-get-compiled">How things get compiled</a></li>
<li><a href="#chunk-options">Chunk options</a></li>
<li><a href="#challenge-2">Challenge</a></li>
<li><a href="#inline-r-code">Inline R code</a></li>
<li><a href="#challenge-3">Challenge</a></li>
<li><a href="#other-output-options">Other output options</a></li>
<li><a href="#resources">Resources</a></li>
</ul>
</div>
<blockquote>
<h3 id="learning-objectives" class="objectives">Learning Objectives</h3>
<ul>
<li>Value of reproducible reports</li>
<li>Basics of Markdown</li>
<li>R code chunks</li>
<li>Chunk options</li>
<li>Inline R code</li>
<li>Other output formats</li>
</ul>
</blockquote>
<div id="data-analysis-reports" class="section level3">
<h3>Data analysis reports</h3>
<p>Data analysts tend to write a lot of reports, describing their analyses and results, for their collaborators or to document their work for future reference.</p>
<p>When I was first starting out, I’d write an R script with all of my work, and would just send an email to my collaborator, describing the results and attaching various graphs. In discussing the results, there would often be confusion about which graph was which.</p>
<p>I moved to writing formal reports, with Word or LaTeX, but I’d have to spend a lot of time getting the figures to look right. Mostly, the concern is about page breaks.</p>
<p>Everything is easier now that I create a web page (as an html file). It can be one long stream, so I can use tall figures that wouldn’t ordinary fit on one page. Scrolling is your friend.</p>
</div>
<div id="literate-programming" class="section level3">
<h3>Literate programming</h3>
<p>Ideally, such analysis reports are <em>reproducible</em> documents: If an error is discovered, or if some additional subjects are added to the data, you can just re-compile the report and get the new or corrected results (versus having to reconstruct figures, paste them into a Word document, and further hand-edit various detailed results).</p>
<p>The key tool for R is <a href="http://yihui.name/knitr/">knitr</a>, which allows you to create a document that is a mixture of text and some chunks of code. When the document is processed by knitr, chunks of R code will be executed, and graphs or other results inserted.</p>
<p>This sort of idea has been called “literate programming”.</p>
<p>knitr allows you to mix basically any sort of text with any sort of code, but we recommend that you use R Markdown, which mixes Markdown with R. Markdown is a light-weight mark-up language for creating web pages.</p>
</div>
<div id="creating-an-r-markdown-file" class="section level3">
<h3>Creating an R Markdown file</h3>
<p>Within R Studio, click File → New File → R Markdown and you’ll get a dialog box like this:</p>
<p><img src="img/New_R_Markdown.png" alt="R Markdown dialog box" /><br/></p>
<p>You can stick with the default (HTML output), but give it a title.</p>
</div>
<div id="basic-components-of-r-markdown" class="section level3">
<h3>Basic components of R Markdown</h3>
<p>The initial chunk of text contains instructions for R: you give the thing a title, author, and date, and tell it that you’re going to want to produce html output (in other words, a web page).</p>
<pre><code>---
title: "Initial R Markdown document"
author: "Karl Broman"
date: "April 23, 2015"
output: html_document
---</code></pre>
<p>You can delete any of those fields if you don’t want them included. The double-quotes aren’t strictly <em>necessary</em> in this case. They’re mostly needed if you want to include a colon in the title.</p>
<p>RStudio creates the document with some example text to get you started. Note below that there are chunks like</p>
<pre>
```{r}
summary(cars)
```
</pre>
<p>These are chunks of R code that will be executed by knitr and replaced by their results. More on this later.</p>
<p>Also note the web address that’s put between angle brackets (<code>< ></code>) as well as the double-asterisks in <code>**Knit**</code>. This is <a href="http://daringfireball.net/projects/markdown/syntax">Markdown</a>.</p>
</div>
<div id="markdown" class="section level3">
<h3>Markdown</h3>
<p>Markdown is a system for writing web pages by marking up the text much as you would in an email rather than writing html code. The marked-up text gets <em>converted</em> to html, replacing the marks with the proper html code.</p>
<p>For now, let’s delete all of the stuff that’s there and write a bit of markdown.</p>
<p>You make things <strong>bold</strong> using two asterisks, like this: <code>**bold**</code>, and you make things <em>italics</em> by using underscores, like this: <code>_italics_</code>.</p>
<p>You can make a bulleted list by writing a list with hyphens or asterisks, like this:</p>
<pre><code>* bold with double-asterisks
* italics with underscores
* code-type font with backticks</code></pre>
<p>or like this:</p>
<pre><code>- bold with double-asterisks
- italics with underscores
- code-type font with backticks</code></pre>
<p>Each will appear as:</p>
<ul>
<li>bold with double-asterisks</li>
<li>italics with underscores</li>
<li>code-type font with backticks</li>
</ul>
<p>(I prefer hyphens over asterisks, myself.)</p>
<p>You can make a numbered list by just using numbers. You can use the same number over and over if you want:</p>
<pre><code>1. bold with double-asterisks
1. italics with underscores
1. code-type font with backticks</code></pre>
<p>This will appear as:</p>
<ol style="list-style-type: decimal">
<li>bold with double-asterisks</li>
<li>italics with underscores</li>
<li>code-type font with backticks</li>
</ol>
<p>You can make section headers of different sizes by initiating a line with some number of <code>#</code> symbols:</p>
<pre><code># Title
## Main section
### Sub-section
#### Sub-sub section</code></pre>
<p>You <em>compile</em> the R Markdown document to an html webpage by clicking the “Knit HTML” in the upper-left. And note the little question mark next to it; click the question mark and you’ll get a “Markdown Quick Reference” (with the Markdown syntax) as well to the RStudio documentation on R Markdown.</p>
</div>
<div id="challenge" class="section level3">
<h3>Challenge</h3>
<ul>
<li>Create a new R Markdown document.</li>
<li>Delete all of the R code chunks and write a bit of Markdown (some sections, some italicized text, and an itemized list).</li>
<li>Convert the document to a webpage.</li>
</ul>
<!-- end challenge -->
</div>
<div id="a-bit-more-markdown" class="section level3">
<h3>A bit more Markdown</h3>
<p>You can make a hyperlink like this: <code>[text to show](http://the-web-page.com)</code>.</p>
<p>You can include an image file like this: <code>![caption](http://url/for/file)</code></p>
<p>You can do subscripts (e.g., F<sub>2</sub>) with <code>F~2~</code> and superscripts (e.g., F<sup>2</sup>) with <code>F^2^</code>.</p>
<p>If you know how to write equations in <a href="http://www.latex-project.org/">LaTeX</a>, you’ll be glad to know that you can use <code>$ $</code> and <code>$$ $$</code> to insert math equations, like <code>$E = mc^2$</code> and</p>
<pre><code>$$y = \mu + \sum_{i=1}^p \beta_i x_i + \epsilon$$</code></pre>
</div>
<div id="r-code-chunks" class="section level3">
<h3>R code chunks</h3>
<p>Markdown is interesting and useful, but the real power comes from mixing markdown with chunks of R code. This is R Markdown. When processed, the R code will be executed; if they produce figures, the figures will be inserted in the final document.</p>
<p>The main code chunks look like this:</p>
<pre>
```{r load_data}
reduced <- read.csv("CleanData/portal_data_reduced.csv")
```
</pre>
<p>That is, you place a chunk of R code between <code>```{r chunk_name}</code> and <code>```</code>. It’s a good idea to give each chunk a name, as they will help you to fix errors and, if any graphs are produced, the file names are based on the name of the code chunk that produced them.</p>
</div>
<div id="challenge-1" class="section level3">
<h3>Challenge</h3>
<p>Add code chunks to</p>
<ul>
<li>Load the ggplot2 package</li>
<li>Read the portal data</li>
<li>Create a plot</li>
</ul>
<!-- end challenge -->
</div>
<div id="how-things-get-compiled" class="section level3">
<h3>How things get compiled</h3>
<p>When you press the “Knit HTML” button, the R Markdown document is processed by <a href="http://yihui.name/knitr">knitr</a> and a plain Markdown document is produced (as well as, potentially, a set of figure files): the R code is executed and replaced by both the input and the output; if figures are produced, links to those figures are included.</p>
<p>The Markdown and figure documents are then processed by the tool <a href="http://pandoc.org/">pandoc</a>, which converts the Markdown file into an html file, with the figures embedded.</p>
<p><img src="img/R-ecology-rmd_to_html_fig-1.png" width="768" style="display: block; margin: auto auto auto 0;" /></p>
</div>
<div id="chunk-options" class="section level3">
<h3>Chunk options</h3>
<p>There are a variety of options to affect how the code chunks are treated.</p>
<ul>
<li>Use <code>echo=FALSE</code> to avoid having the code itself shown.</li>
<li>Use <code>results="hide"</code> to avoid having any results printed.</li>
<li>Use <code>eval=FALSE</code> to have the code shown but not evaluated.</li>
<li>Use <code>warning=FALSE</code> and <code>message=FALSE</code> to hide any warnings or messages produced.</li>
<li>Use <code>fig.height</code> and <code>fig.width</code> to control the size of the figures produced (in inches).</li>
</ul>
<p>So you might write:</p>
<pre>
```{r load_libraries, echo=FALSE, message=FALSE}
library(dplyr)
library(ggplot2)
```
</pre>
<p>Often there’ll be particular options that you’ll want to use repeatedly; for this, you can set <em>global</em> chunk options, like so:</p>
<pre>
```{r global_options, echo=FALSE}
knitr::opts_chunk$set(fig.path="Figs/", message=FALSE, warning=FALSE,
echo=FALSE, results="hide", fig.width=11)
```
</pre>
<p>The <code>fig.path</code> option defines where the figures will be saved. The <code>/</code> here is really important; without it, the figures would be saved in the standard place but just with names that being with <code>Figs</code>.</p>
<p>If you have multiple R Markdown files in a common directory, you might want to use <code>fig.path</code> to define separate prefixes for the figure file names, like <code>fig.path="Figs/cleaning-"</code> and <code>fig.path="Figs/analysis-"</code>.</p>
</div>
<div id="challenge-2" class="section level3">
<h3>Challenge</h3>
<p>Use chunk options to control the size of a figure and to hide the code.</p>
<!-- end challenge -->
</div>
<div id="inline-r-code" class="section level3">
<h3>Inline R code</h3>
<p>You can make <em>every</em> number in your report reproducible. Use <code>`r</code> and <code>`</code> for an in-line code chunk, like so: <code>`r round(some_value, 2)`</code>. The code will be executed and replaced with the <em>value</em> of the result.</p>
<p>Don’t let these in-line chunks get split across lines.</p>
<p>Perhaps precede the paragraph with a larger code chunk that does calculations and defines things, with <code>include=FALSE</code> for that larger chunk (which is the same as <code>echo=FALSE</code> and <code>results="hide"</code>).</p>
<p>I’m very particular about rounding in such situations. I may want <code>2.0</code>, but <code>round(2.03, 1)</code> will give just <code>2</code>.</p>
<p>The <a href="https://github.com/kbroman/broman/blob/master/R/myround.R"><code>myround</code></a> function in my <a href="https://github.com/kbroman/broman">R/broman</a> package handles this.</p>
</div>
<div id="challenge-3" class="section level3">
<h3>Challenge</h3>
<p>Try out a bit of in-line R code.</p>
<!-- end challenge -->
</div>
<div id="other-output-options" class="section level3">
<h3>Other output options</h3>
<p>You can also convert R Markdown to a PDF or a Word document. Click the little triangle next to the “Knit HTML” button to get a drop-down menu. Or you could put <code>pdf_document</code> or <code>word_document</code> in the header of the file.</p>
<p>Actually, that drop-down menu seems to just change what’s shown in the header at the top of the file!</p>
</div>
<div id="resources" class="section level3">
<h3>Resources</h3>
<ul>
<li><a href="http://kbroman.org/knitr_knutshell">Knitr in a knutshell tutorial</a></li>
<li><a href="http://www.amazon.com/exec/obidos/ASIN/1482203537/7210-20">Dynamic Documents with R and knitr</a> (book)</li>
<li><a href="http://rmarkdown.rstudio.com">R Markdown documentation</a></li>
<li><a href="http://www.rstudio.com/wp-content/uploads/2015/02/rmarkdown-cheatsheet.pdf">R Markdown cheat sheet</a></li>
</ul>
</div>
</div>
<script>
// add bootstrap table styles to pandoc tables
function bootstrapStylePandocTables() {
$('tr.header').parent('thead').parent('table').addClass('table table-condensed');
}
$(document).ready(function () {
bootstrapStylePandocTables();
});
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
script.src = "https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML";
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>