layout: true
background-image: url(img//R-logo.png)
background-position: middle center
background-size: 25%

# Experimental Design and Analysis

---

<br>
<br>
<br>
<br>

## R session 02 - Data viz

.font120[**Daniel Vaulot**]

2020-01-23

<br>
<br>
<br>

.pull-left[
<img src="img/NTU-Logo-full-colour.png" width="50%" style="display: block; margin: auto auto auto 0;" />
]

.pull-right[
<img src="img/logo_SBR.png" width="50%" style="display: block; margin: auto 0 auto auto;" />
]

---
class: middle

## Outline

.font150[
* Tidy data
* Graph types
* Grammar of graphics
* Read the data
* Playing with ggplot2
* ggplot2 syntax
* Your turn
]

---
layout: false

# Installation and Resources

.pull-left[

## Packages

* readxl : Reading Excel files
* readr : Reading and writing text files
* dplyr : Filter and reformat data frames
* ggplot2 : Plotting

## Download

* R-session-03.zip

## Resources

* [Chapters 5 and 28 of R for Data Science](https://r4ds.had.co.nz/graphics-for-communication.html)
* [Fundamentals of Data Visualization](https://serialmentor.com/dataviz/)
* [Data Visualization: A Practical Introduction](http://socviz.co/lookatdata.html#what-makes-bad-figures-bad)
]

.pull-right[
<img src="img/R_for_datascience.png" width="60%" style="display: block; margin: auto;" />
]

---

# Tidy data

1. Each variable must have its own column.
2. Each observation must have its own row.
3. Each value must have its own cell.
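
As a minimal sketch of these three rules, the base R code below reshapes a small "wide" table (one column per population) into a tidy one. The counts for stations 81 and 85 are copied from the CARBOM table shown later in this session; the object and column names `wide`, `tidy`, `population` and `cells` are illustrative assumptions, not part of the course data.

```r
# Hypothetical "wide" table: one column per population, so the variable
# "population" is hidden in the column names (not tidy)
wide <- data.frame(
  station  = c("81", "85"),
  picoeuks = c(3278, 16312),
  nanoeuks = c(1232, 1615)
)

# Tidy version: one row per observation (station x population),
# one column per variable, one value per cell
tidy <- data.frame(
  station    = rep(wide$station, times = 2),
  population = rep(c("picoeuks", "nanoeuks"), each = nrow(wide)),
  cells      = c(wide$picoeuks, wide$nanoeuks)
)

tidy
```

With the tidy layout, `population` can be mapped directly to an aesthetic such as color, which is not possible while it is spread over two columns.
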
<img src="img/tidy-data.png" width="100%" style="display: block; margin: auto;" />

---

# Tidy workflow

<img src="img/tidy_worflow.png" width="55%" style="display: block; margin: auto;" />

--

## Graph purposes

--

.pull-left[
* **Analysis graphs**
    * designed to see patterns, trends
    * aid the process of data description
    * interpretation]

--

.pull-right[
* **Presentation graphs**
    * designed to attract attention
    * make a point
    * illustrate a conclusion
]

.font70[Source: Michael Friendly - http://datavis.ca/courses/RGraphics/]

---
layout: true

# Graph types

---

.left-column[

## Jitter

* Two numerical variables
]

--

.right-column[
<img src="img/graph_jitter.png" width="90%" style="display: block; margin: auto;" />
]

---

.left-column[

## Bubble

* Two numerical variables
* **Add another numerical variable**
]

.right-column[
<img src="img/graph_bubble.png" width="90%" style="display: block; margin: auto;" />
]

---

.left-column[

## Animate

* Two numerical variables
* One numerical variable
* One categorical variable
* **Animate another variable**
]

.right-column[
<img src="img/graph_animate.gif" width="60%" style="display: block; margin: auto;" />
]

---

## Time series

.left-column[
* Line graph
]

.right-column[
<img src="img/graph_time_series.png" width="90%" style="display: block; margin: auto;" />
]

---

## Bargraphs

.left-column[
* One categorical variable
* One numerical variable
]

.right-column[
<img src="img/graph_bars2.png" width="90%" style="display: block; margin: auto;" />
]

---

## Bargraphs

.left-column[
* Rotate
]

.right-column[
<img src="img/graph_bars1.png" width="90%" style="display: block; margin: auto;" />
]

---

## Bargraphs

.left-column[
* Two categorical variables
* One numerical variable
]

.right-column[
<img src="img/graph_bars3.png" width="90%" style="display: block; margin: auto;" />
]

---

## Boxplots

.left-column[
* One categorical variable
* One numerical variable with many values
]

.right-column[
<img src="img/graph_box.png" width="70%"
style="display: block; margin: auto;" />
]

---

## Treemaps

.left-column[
* One categorical variable
* One numerical variable
* Much better than pie charts
]

.right-column[
<img src="img/graph_treemap.png" width="60%" style="display: block; margin: auto;" />
]

---

## 3D

.left-column[
* Three numerical variables
* Avoid unless it is a simple shape
]

.right-column[
<img src="img/graph_3d.png" width="70%" style="display: block; margin: auto;" />
]

---

## Contours

.left-column[
* Three numerical variables
* Better than 3D
]

.right-column[
<img src="img/graph_contour.png" width="60%" style="display: block; margin: auto;" />
]

---

.left-column[

## Many...

]

.right-column[
<img src="img/graph_gallery.png" width="60%" style="display: block; margin: auto;" />
]

* Choose according to what you want to analyze or the story you want to tell
* https://www.r-graph-gallery.com/all-graphs/

---
layout: true

# Initialize

## Load necessary libraries

```r
library("readxl")  # Import the data from Excel files
library("readr")   # Import the data from text files
library("dplyr")   # Filter and reformat data frames
library("ggplot2") # Graphics
```

---
layout: true

# Reading the data

---

## CARBOM cruise off Brazil

.pull-left[
<img src="img/carbom_cruise.png" width="100%" style="display: block; margin: auto;" />
]

--

.pull-right[
* Stations
* Depth
* Coordinates
* Temperature, Salinity
* Nitrates, Phosphates

<img src="img/carbom_isme.png" width="100%" style="display: block; margin: auto;" />
]

---

## Microbial populations

.pull-left[
<img src="img/picopk-domi.png" width="100%" style="display: block; margin: auto;" />
]

--

.pull-right[
<img src="img/carbom_flow_cytometry.png" width="80%" style="display: block; margin: auto;" />
]

* Flow cytometry :
    * pico-eukaryotes
    * nano-eukaryotes

---

## Text file - TAB delimited

<img src="img/carbom_txt.png" width="100%" style="display: block; margin: auto;" />

---

## Reading a text file

```r
samples <- readr::read_tsv("data/CARBOM data.txt")
```

--

<table class="table table-striped table-hover
table-condensed" style="font-size: 9px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> sample number </th> <th style="text-align:right;"> transect </th> <th style="text-align:left;"> station </th> <th style="text-align:left;"> date </th> <th style="text-align:left;"> time </th> <th style="text-align:right;"> depth </th> <th style="text-align:left;"> level </th> <th style="text-align:right;"> latitude </th> <th style="text-align:right;"> longitude </th> <th style="text-align:right;"> picoeuks </th> <th style="text-align:right;"> nanoeuks </th> <th style="text-align:right;"> phosphates </th> <th style="text-align:right;"> nitrates </th> <th style="text-align:right;"> temperature </th> <th style="text-align:right;"> salinity </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> 10 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 81 </td> <td style="text-align:left;"> 13/11/2013 </td> <td style="text-align:left;"> 01:00:00 </td> <td style="text-align:right;"> 140 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.42 </td> <td style="text-align:right;"> -44.72 </td> <td style="text-align:right;"> 3278 </td> <td style="text-align:right;"> 1232 </td> <td style="text-align:right;"> 0.20 </td> <td style="text-align:right;"> 0.26 </td> <td style="text-align:right;"> 17.3 </td> <td style="text-align:right;"> 35.9 </td> </tr> <tr> <td style="text-align:left;"> 11 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 85 </td> <td style="text-align:left;"> 13/11/2013 </td> <td style="text-align:left;"> 13:30:00 </td> <td style="text-align:right;"> 110 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -26.80 </td> <td style="text-align:right;"> -45.30 </td> <td style="text-align:right;"> 16312 </td> <td style="text-align:right;"> 1615 </td> <td style="text-align:right;"> 0.29 </td> <td style="text-align:right;"> 0.22 </td> <td 
style="text-align:right;"> 21.3 </td> <td style="text-align:right;"> 36.5 </td> </tr> <tr> <td style="text-align:left;"> 120 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 96 </td> <td style="text-align:left;"> 18/11/2013 </td> <td style="text-align:left;"> 23:50:00 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Surf </td> <td style="text-align:right;"> -27.39 </td> <td style="text-align:right;"> -47.82 </td> <td style="text-align:right;"> 1150 </td> <td style="text-align:right;"> 75 </td> <td style="text-align:right;"> 0.43 </td> <td style="text-align:right;"> 0.19 </td> <td style="text-align:right;"> 23.1 </td> <td style="text-align:right;"> 33.5 </td> </tr> <tr> <td style="text-align:left;"> 121 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 96 </td> <td style="text-align:left;"> 18/11/2013 </td> <td style="text-align:left;"> 23:50:00 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.39 </td> <td style="text-align:right;"> -47.82 </td> <td style="text-align:right;"> 1737 </td> <td style="text-align:right;"> 218 </td> <td style="text-align:right;"> 0.43 </td> <td style="text-align:right;"> 0.23 </td> <td style="text-align:right;"> 22.6 </td> <td style="text-align:right;"> 33.7 </td> </tr> <tr> <td style="text-align:left;"> 122 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 96 </td> <td style="text-align:left;"> 18/11/2013 </td> <td style="text-align:left;"> 23:50:00 </td> <td style="text-align:right;"> 50 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.39 </td> <td style="text-align:right;"> -47.82 </td> <td style="text-align:right;"> 853 </td> <td style="text-align:right;"> 234 </td> <td style="text-align:right;"> 0.56 </td> <td style="text-align:right;"> 0.21 </td> <td style="text-align:right;"> 20.3 </td> <td style="text-align:right;"> 35.9 </td> 
</tr> <tr> <td style="text-align:left;"> 125 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 98 </td> <td style="text-align:left;"> 18/11/2013 </td> <td style="text-align:left;"> 05:00:00 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Surf </td> <td style="text-align:right;"> -27.59 </td> <td style="text-align:right;"> -47.39 </td> <td style="text-align:right;"> 3086 </td> <td style="text-align:right;"> 1300 </td> <td style="text-align:right;"> 0.29 </td> <td style="text-align:right;"> 0.25 </td> <td style="text-align:right;"> 23.1 </td> <td style="text-align:right;"> 35.7 </td> </tr> <tr> <td style="text-align:left;"> 126 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 98 </td> <td style="text-align:left;"> 18/11/2013 </td> <td style="text-align:left;"> 05:00:00 </td> <td style="text-align:right;"> 50 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.59 </td> <td style="text-align:right;"> -47.39 </td> <td style="text-align:right;"> 1217 </td> <td style="text-align:right;"> 782 </td> <td style="text-align:right;"> 0.25 </td> <td style="text-align:right;"> 0.20 </td> <td style="text-align:right;"> 23.7 </td> <td style="text-align:right;"> 37.2 </td> </tr> <tr> <td style="text-align:left;"> 127 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 98 </td> <td style="text-align:left;"> 18/11/2013 </td> <td style="text-align:left;"> 05:00:00 </td> <td style="text-align:right;"> 85 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.59 </td> <td style="text-align:right;"> -47.39 </td> <td style="text-align:right;"> 3420 </td> <td style="text-align:right;"> 226 </td> <td style="text-align:right;"> 0.25 </td> <td style="text-align:right;"> 0.47 </td> <td style="text-align:right;"> 22.9 </td> <td style="text-align:right;"> 37.0 </td> </tr> <tr> <td style="text-align:left;"> 13 </td> <td 
style="text-align:right;"> 1 </td> <td style="text-align:left;"> 86 </td> <td style="text-align:left;"> 13/11/2013 </td> <td style="text-align:left;"> 17:00:00 </td> <td style="text-align:right;"> 105 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -26.33 </td> <td style="text-align:right;"> -45.41 </td> <td style="text-align:right;"> 6366 </td> <td style="text-align:right;"> 1007 </td> <td style="text-align:right;"> 0.34 </td> <td style="text-align:right;"> 0.15 </td> <td style="text-align:right;"> 20.9 </td> <td style="text-align:right;"> 36.3 </td> </tr> <tr> <td style="text-align:left;"> 140 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 101 </td> <td style="text-align:left;"> 18/11/2013 </td> <td style="text-align:left;"> 12:00:00 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Surf </td> <td style="text-align:right;"> -27.79 </td> <td style="text-align:right;"> -46.96 </td> <td style="text-align:right;"> 500 </td> <td style="text-align:right;"> 366 </td> <td style="text-align:right;"> 0.29 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> 23.5 </td> <td style="text-align:right;"> 36.5 </td> </tr> </tbody> </table> -- - __readr::read_tsv()__ : read tab delimited files - __readr::read_csv()__ : read comma delimited files - __readr::write_tsv()__ : write tab delimited files --- ## Excel sheet <img src="img/carbom_excel.png" width="80%" style="display: block; margin: auto;" /> --- ## Read the data - read_excel ```r samples <- readxl::read_excel("data/CARBOM data.xlsx", sheet = "Samples_boat") ``` -- <table class="table table-striped table-hover table-condensed" style="font-size: 9px; margin-left: auto; margin-right: auto;"> <thead> <tr> <th style="text-align:left;"> sample number </th> <th style="text-align:right;"> transect </th> <th style="text-align:left;"> station </th> <th style="text-align:left;"> date </th> <th style="text-align:left;"> time 
</th> <th style="text-align:right;"> depth </th> <th style="text-align:left;"> level </th> <th style="text-align:right;"> latitude </th> <th style="text-align:right;"> longitude </th> <th style="text-align:right;"> picoeuks </th> <th style="text-align:right;"> nanoeuks </th> <th style="text-align:right;"> phosphates </th> <th style="text-align:right;"> nitrates </th> <th style="text-align:right;"> temperature </th> <th style="text-align:right;"> salinity </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> 10 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 81 </td> <td style="text-align:left;"> 2013-11-13 </td> <td style="text-align:left;"> 1899-12-31 01:00:00 </td> <td style="text-align:right;"> 140 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.42 </td> <td style="text-align:right;"> -44.72 </td> <td style="text-align:right;"> 3278 </td> <td style="text-align:right;"> 1232 </td> <td style="text-align:right;"> 0.20 </td> <td style="text-align:right;"> 0.26 </td> <td style="text-align:right;"> 17.3 </td> <td style="text-align:right;"> 35.9 </td> </tr> <tr> <td style="text-align:left;"> 11 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 85 </td> <td style="text-align:left;"> 2013-11-13 </td> <td style="text-align:left;"> 1899-12-31 13:30:00 </td> <td style="text-align:right;"> 110 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -26.80 </td> <td style="text-align:right;"> -45.30 </td> <td style="text-align:right;"> 16312 </td> <td style="text-align:right;"> 1615 </td> <td style="text-align:right;"> 0.29 </td> <td style="text-align:right;"> 0.22 </td> <td style="text-align:right;"> 21.3 </td> <td style="text-align:right;"> 36.5 </td> </tr> <tr> <td style="text-align:left;"> 120 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 96 </td> <td style="text-align:left;"> 2013-11-18 </td> <td style="text-align:left;"> 
1899-12-31 23:50:00 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Surf </td> <td style="text-align:right;"> -27.39 </td> <td style="text-align:right;"> -47.82 </td> <td style="text-align:right;"> 1150 </td> <td style="text-align:right;"> 75 </td> <td style="text-align:right;"> 0.43 </td> <td style="text-align:right;"> 0.19 </td> <td style="text-align:right;"> 23.1 </td> <td style="text-align:right;"> 33.5 </td> </tr> <tr> <td style="text-align:left;"> 121 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 96 </td> <td style="text-align:left;"> 2013-11-18 </td> <td style="text-align:left;"> 1899-12-31 23:50:00 </td> <td style="text-align:right;"> 30 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.39 </td> <td style="text-align:right;"> -47.82 </td> <td style="text-align:right;"> 1737 </td> <td style="text-align:right;"> 218 </td> <td style="text-align:right;"> 0.43 </td> <td style="text-align:right;"> 0.23 </td> <td style="text-align:right;"> 22.6 </td> <td style="text-align:right;"> 33.7 </td> </tr> <tr> <td style="text-align:left;"> 122 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 96 </td> <td style="text-align:left;"> 2013-11-18 </td> <td style="text-align:left;"> 1899-12-31 23:50:00 </td> <td style="text-align:right;"> 50 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.39 </td> <td style="text-align:right;"> -47.82 </td> <td style="text-align:right;"> 853 </td> <td style="text-align:right;"> 234 </td> <td style="text-align:right;"> 0.56 </td> <td style="text-align:right;"> 0.21 </td> <td style="text-align:right;"> 20.3 </td> <td style="text-align:right;"> 35.9 </td> </tr> <tr> <td style="text-align:left;"> 125 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 98 </td> <td style="text-align:left;"> 2013-11-18 </td> <td style="text-align:left;"> 1899-12-31 05:00:00 </td> <td 
style="text-align:right;"> 5 </td> <td style="text-align:left;"> Surf </td> <td style="text-align:right;"> -27.59 </td> <td style="text-align:right;"> -47.39 </td> <td style="text-align:right;"> 3086 </td> <td style="text-align:right;"> 1300 </td> <td style="text-align:right;"> 0.29 </td> <td style="text-align:right;"> 0.25 </td> <td style="text-align:right;"> 23.1 </td> <td style="text-align:right;"> 35.7 </td> </tr> <tr> <td style="text-align:left;"> 126 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 98 </td> <td style="text-align:left;"> 2013-11-18 </td> <td style="text-align:left;"> 1899-12-31 05:00:00 </td> <td style="text-align:right;"> 50 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.59 </td> <td style="text-align:right;"> -47.39 </td> <td style="text-align:right;"> 1217 </td> <td style="text-align:right;"> 782 </td> <td style="text-align:right;"> 0.25 </td> <td style="text-align:right;"> 0.20 </td> <td style="text-align:right;"> 23.7 </td> <td style="text-align:right;"> 37.2 </td> </tr> <tr> <td style="text-align:left;"> 127 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 98 </td> <td style="text-align:left;"> 2013-11-18 </td> <td style="text-align:left;"> 1899-12-31 05:00:00 </td> <td style="text-align:right;"> 85 </td> <td style="text-align:left;"> Deep </td> <td style="text-align:right;"> -27.59 </td> <td style="text-align:right;"> -47.39 </td> <td style="text-align:right;"> 3420 </td> <td style="text-align:right;"> 226 </td> <td style="text-align:right;"> 0.25 </td> <td style="text-align:right;"> 0.47 </td> <td style="text-align:right;"> 22.9 </td> <td style="text-align:right;"> 37.0 </td> </tr> <tr> <td style="text-align:left;"> 13 </td> <td style="text-align:right;"> 1 </td> <td style="text-align:left;"> 86 </td> <td style="text-align:left;"> 2013-11-13 </td> <td style="text-align:left;"> 1899-12-31 17:00:00 </td> <td style="text-align:right;"> 105 </td> <td 
style="text-align:left;"> Deep </td> <td style="text-align:right;"> -26.33 </td> <td style="text-align:right;"> -45.41 </td> <td style="text-align:right;"> 6366 </td> <td style="text-align:right;"> 1007 </td> <td style="text-align:right;"> 0.34 </td> <td style="text-align:right;"> 0.15 </td> <td style="text-align:right;"> 20.9 </td> <td style="text-align:right;"> 36.3 </td> </tr> <tr> <td style="text-align:left;"> 140 </td> <td style="text-align:right;"> 2 </td> <td style="text-align:left;"> 101 </td> <td style="text-align:left;"> 2013-11-18 </td> <td style="text-align:left;"> 1899-12-31 12:00:00 </td> <td style="text-align:right;"> 5 </td> <td style="text-align:left;"> Surf </td> <td style="text-align:right;"> -27.79 </td> <td style="text-align:right;"> -46.96 </td> <td style="text-align:right;"> 500 </td> <td style="text-align:right;"> 366 </td> <td style="text-align:right;"> 0.29 </td> <td style="text-align:right;"> 0.14 </td> <td style="text-align:right;"> 23.5 </td> <td style="text-align:right;"> 36.5 </td> </tr> </tbody> </table>

## Writing to Excel file

* __openxlsx::write.xlsx()__ : write Excel file

---
layout: true

# ggplot2

---

<img src="img/ggplot2.jpg" width="60%" style="display: block; margin: auto;" />

@allison_horst

---

## A simple plot

.left-code[
* Choose the data set
* Choose the geometric representation
* Choose the __aesthetics__ : x, y, color, shape, etc.

```r
ggplot(data=samples) +
  geom_point(mapping = aes(x=phosphates, y=nitrates))
```

* All functions are from the __ggplot2__ package unless otherwise specified
]

--

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-30-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## The grammar of graphics

<img src="img/ggplot2_grammar1.png" width="50%" style="display: block; margin: auto;" />

Every graph can be described as a combination of independent building blocks:

* **data**: a data frame: quantitative, categorical; local or database query
* **aes**thetic mapping of variables into visual properties: size, color, x, y
* **geom**etric objects (“geom”): points, lines, areas, arrows, …
* **coord**inate system (“coord”): Cartesian, log, polar, map

---

.left-code[
Syntax

```r
ggplot(data=samples) +
  geom_point(mapping = aes(x=phosphates, y=nitrates))
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-32-1.png" width="70%" style="display: block; margin: auto;" />
]

---

.left-code[
Alternatively

```r
ggplot(data=samples, mapping = aes(x=phosphates, y=nitrates)) +
  geom_point()
```

* If different geometries use different mappings, each mapping must be specified **inside** the corresponding geom function
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-33-1.png" width="70%" style="display: block; margin: auto;" />
]

---

.left-code[
Alternatively

```r
ggplot(samples, aes(x=phosphates, y=nitrates)) +
  geom_point()
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-34-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Make dot size bigger

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates)) +
  geom_point()
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-35-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Make dot size bigger

.left-code[
```r
ggplot(samples,
       aes(x=phosphates, y=nitrates)) +
  geom_point(size=5)
```

* Add: __size=5__ outside of the aesthetics function
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-36-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Color according to depth level (discrete)

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates, color=level)) +
  geom_point(size=5)
```

* Aesthetics mapped to a variable must be arguments of the aes() function
* geom_point(__color=level__, size=5) will generate an error...
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-37-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Color according to depth (continuous)

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates, color=depth)) +
  geom_point(size=5)
```

* Add: __color=depth__
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-38-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Symbol according to transect (continuous)

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates, color=depth, shape=transect)) +
  geom_point(size=5)
```

* Add: __shape=transect__
]

.right-plot[
```
Error: A continuous variable can not be mapped to shape
```

<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-39-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Symbol according to transect (discrete)

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates, color=depth,
                    shape=as.character(transect))) +
  geom_point(size=5)
```

* Add: __shape=as.character(transect)__
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-40-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Panels depending on one variable

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates)) +
  geom_point() +
  facet_wrap(~ level)
```
]

.right-plot[
<img
src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-41-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Adding a regression line

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates, color=level)) +
  geom_point(size=5) +
  geom_smooth(mapping = aes(x=phosphates, y=nitrates), method="lm")
```

* Add: __geom_smooth()__
* You can choose the type of smoothing: "lm" fits a linear model
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-42-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Adding a regression line

.left-code[
```r
ggplot(samples, aes(x=phosphates, y=nitrates)) +
  geom_point(aes(color=level), size=5) +
  geom_smooth(mapping = aes(x=phosphates, y=nitrates), method="lm")
```

* A mapping given in the ggplot function applies to all geoms; a mapping given inside a geom applies only to that geom
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-43-1.png" width="70%" style="display: block; margin: auto;" />
]

---

## Finalizing the graph

.left-code[
```r
ggplot(samples) +
  geom_point(mapping = aes(x=phosphates, y=nitrates, color=level), size=5) +
  geom_smooth(mapping = aes(x=phosphates, y=nitrates), method="lm") +
  xlab("Phosphates") +
  ylab("Nitrates") +
  ggtitle("CARBOM cruise")
```

* Add: __xlab()__, __ylab()__ and __ggtitle()__ to label the axes and give the graph a title
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-44-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true
layout: true

# Putting several graphs together

---
exclude: true

## First graph

.left-code[
```r
g1 <- ggplot(samples) +
  geom_point(mapping = aes(x=phosphates, y=nitrates, color=level), size=5) +
  geom_smooth(mapping = aes(x=phosphates, y=nitrates), method="lm") +
  xlab("Phosphates") +
  ylab("Nitrates")
g1
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-45-1.png" width="70%" style="display: block; margin:
auto;" />
]

---
exclude: true

## Second graph

.left-code[
```r
g2 <- ggplot(samples) +
  geom_point(mapping = aes(x=nanoeuks, y=picoeuks, color=level), size=5) +
  geom_smooth(mapping = aes(x=nanoeuks, y=picoeuks), method="lm") +
  xlab("Nano-eukaryotes") +  # x axis shows nanoeuks
  ylab("Pico-eukaryotes")    # y axis shows picoeuks
g2
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-46-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true

## Putting together

.left-code[
```r
cowplot::plot_grid(g1, g2, nrow = 2, labels = c("A", "B"))
```

* See also package : `gridExtra`
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-47-1.png" width="70%" style="display: block; margin: auto;" />
]

---
layout: true

# ggplot2 syntax

---

## Anatomy of a plot

<img src="img/ggplot2_anatomy.png" width="70%" style="display: block; margin: auto;" />

---

## Geometries

<img src="img/ggplot2_geom.png" width="60%" style="display: block; margin: auto;" />

---

## Continuous x and y

<img src="img/ggplot2_continuous.png" width="40%" style="display: block; margin: auto;" />

---

## Plotting error

<img src="img/ggplot2_error.png" width="60%" style="display: block; margin: auto;" />

---

## Discrete x - Continuous y

<img src="img/ggplot2_discrete.png" width="60%" style="display: block; margin: auto;" />

---

## Continuous x

<img src="img/ggplot2_one_var.png" width="50%" style="display: block; margin: auto;" />

---

## 3D

<img src="img/ggplot2_3d.png" width="100%" style="display: block; margin: auto;" />

---

## Modifying axes and scales

<img src="img/ggplot2_scales.png" width="80%" style="display: block; margin: auto;" />

---

## Palettes

<img src="img/color_palettes.png" width="60%" style="display: block; margin: auto;" />

Package tmaptools : https://github.com/mtennekes/tmaptools

* Function : `palette_explorer()`

Package paletteer : https://github.com/EmilHvitfeldt/paletteer

* More than 1000 palettes

---

## Themes

<img src="img/ggplot2_themes.png"
width="50%" style="display: block; margin: auto;" />

---

## Extensions

http://www.ggplot2-exts.org/gallery/

<img src="img/ggplot2_extensions.png" width="50%" style="display: block; margin: auto;" />

---
layout: false

# What did you learn?

---

<br>
<br>
<br>

.font150[
- Grammar of graphics
]

--

.font150[
- Differentiate the elements that are fixed and those that vary (aes)
]

--

.font150[
- Choose the graphics that best suit your purpose
]

---
layout: true

# Your turn - Write code to reproduce the figure below

---

<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-59-1.png" width="50%" style="display: block; margin: auto;" />

---
exclude: true

.left-code[
```r
ggplot(dplyr::filter(samples, transect==2 & !is.na(depth)),
       aes(y=depth, x=picoeuks)) +
  geom_point(size=3)
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-60-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true

.left-code[
```r
ggplot(dplyr::filter(samples, transect==2 & !is.na(depth)),
       aes(y=depth, x=picoeuks)) +
  geom_point(size=3) +
  facet_wrap(~ station)
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-61-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true

.left-code[
```r
ggplot(dplyr::filter(samples, transect==2 & !is.na(depth)),
       aes(y=depth, x=picoeuks)) +
  geom_point(size=3) +
  facet_wrap(~ station) +
  geom_path()
```

* Do not use `geom_line`: it joins the points in order of x, whereas `geom_path` joins them in the order of the rows in the data
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-62-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true

.left-code[
```r
ggplot(dplyr::filter(samples, transect==2 & !is.na(depth)),
       aes(y=depth, x=picoeuks)) +
  geom_point(size=3) +
  facet_wrap(~ station) +
  geom_path() +
  scale_y_reverse()
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-63-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true

.left-code[
```r
ggplot(dplyr::filter(samples, transect==2 & !is.na(depth)),
       aes(y=depth, x=picoeuks)) +
  geom_point(size=3) +
  facet_wrap(~ station) +
  geom_path() +
  scale_y_reverse() +
  theme_bw()
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-64-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true

.left-code[
```r
ggplot(dplyr::filter(samples, transect==2 & !is.na(depth)),
       aes(y=depth, x=picoeuks)) +
  geom_point(size=3) +
  facet_wrap(~ station) +
  geom_path() +
  scale_y_reverse() +
  theme_bw() +
  ggtitle("Percentage of pico-eukaryotes per station on transect 2") +
  xlab("Pico-eukaryote per mL") +
  ylab("Depth (m)")
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-65-1.png" width="70%" style="display: block; margin: auto;" />
]

---
exclude: true

.left-code[
```r
ggplot(dplyr::filter(samples, transect==2 & !is.na(depth)),
       aes(y=depth, x=picoeuks)) +
  geom_point(size=3) +
  facet_wrap(~ station) +
  geom_path() +
  scale_y_reverse() +
  theme_bw() +
  ggtitle("Percentage of pico-eukaryotes per station on transect 2") +
  xlab("Pico-eukaryote per mL") +
  ylab("Depth (m)") +
  scale_x_log10(limits = c(100, 10000)) +
  annotation_logticks(sides="b")
```
]

.right-plot[
<img src="R-session-02-data_visualization_files/figure-html/unnamed-chunk-66-1.png" width="70%" style="display: block; margin: auto;" />
]
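
---

## Why geom_path for depth profiles

The choice of `geom_path()` over `geom_line()` in the solution can be illustrated with a minimal, self-contained sketch. It assumes a small data frame, here called `profile` (a hypothetical name), holding the three station 98 samples from the table shown earlier. `geom_line()` joins points in order of the x variable, while `geom_path()` joins them in row order, so sorting the rows by depth draws the profile from the surface down.

```r
library(ggplot2)

# Three samples from station 98 (depth in m, pico-eukaryotes per mL),
# copied from the CARBOM table; `profile` is an illustrative name
profile <- data.frame(
  depth    = c(85, 5, 50),
  picoeuks = c(3420, 3086, 1217)
)

# Sort the rows by depth so that geom_path() joins the points
# from the shallowest to the deepest sample
profile <- profile[order(profile$depth), ]

p <- ggplot(profile, aes(x = picoeuks, y = depth)) +
  geom_point(size = 3) +
  geom_path() +        # joins points in row order (here: by depth)
  scale_y_reverse()    # surface at the top of the graph
p
```

Replacing `geom_path()` with `geom_line()` in this sketch would join the 50 m sample to the 5 m sample and then to the 85 m sample, because the points would be ordered by `picoeuks` rather than by depth.
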