class: center, middle, inverse, title-slide

# R
## Tool for scientific research
### Jakub Kuzilek
### 2021-05-11

---

## What is R?

* open-source implementation of the S programming language
* programming language & environment for statistical computing and graphics
* contains facilities for data manipulation, calculation and display
* intensive tasks can be written in C, C++ or Fortran and called at run time
* easy extension using packages
* LaTeX-like documentation
* great community

.right[
![](https://www.r-project.org/logo/Rlogo.svg)
]

---

## History

* 1976 S developed by John Chambers at Bell Labs
* 1991 Ross Ihaka and Robert Gentleman start developing R
* 1995 R released under GNU-GPL
* 1997 [R Core group](https://www.r-project.org/contributors.html) formed
* 1997 [CRAN](https://cran.r-project.org/) founded by Kurt Hornik & Fritz Leisch
* 2004 first [useR! conference](https://user2021.r-project.org/)
* 2009 RStudio founded by J.J. Allaire
* 2012 R Markdown introduced
* 2015 R Consortium founded
* 2017 10,000 published packages on CRAN

---

## What I like about R

* native support for data.frames
* base R supports statistical analysis
* many tools that simplify working with data
* [CRAN](https://cran.r-project.org/)
  * packages are checked by the CRAN team before they are published
  * documentation
* [RMarkdown](https://rmarkdown.rstudio.com/)
  * many surprising possibilities
* functional-style language
  * apply instead of for loops

---

## What I hated about R

* the `<-` assignment operator
* apply instead of for loops
* functional-style language
* indexing from 1
* OOP

---

## R & RStudio

* the latest version of R is available at [CRAN](https://cran.r-project.org/)
  * with base R you can start developing your own analysis
* [RStudio](https://www.rstudio.com/)
  * development IDE for R
  * the most used IDE
  * provides many features and can also be used for Python, Bash, C++ and SQL development
  * RStudio Desktop or Server

.right[
![](https://www.rstudio.com/assets/img/icon-rstudio.svg)
]

---

## Comprehensive R Archive Network

* [CRAN](https://cran.r-project.org/) is a network of servers that store identical, up-to-date versions of code and documentation for R
* base R distributions
* main repository infrastructure for R packages
* since 1997

.center[
![](cran.png)
]

---

## R packages

* one of the most awesome features of R
* provide extensions of base R
  * extend R with new statistical/ML methods, features, DB support, ...
* 17,567 available packages (2021-05-10)
* all packages can be installed from within R
* the list of packages is available on [CRAN](https://cran.r-project.org/)
  * usually anything you need can be found in the list
* package development is simple and follows well-defined rules
  * every package contains documentation and how-tos
* with an extension ([devtools](https://cran.r-project.org/web/packages/devtools/index.html)) one can install packages from other repositories (GitHub, GitLab, ...)

---

## RMarkdown 1/2

* introduced in 2012
* started with the knitr package, whose main idea was to embed code chunks in Markdown documents
* soon knitr was combined with Pandoc
  * knitr executes the code embedded in Markdown and converts the RMarkdown document to a "normal" Markdown document
  * Pandoc then renders the output (PDF, HTML, Word)
* later RStudio started to support RMarkdown Notebooks (like Jupyter notebooks)
  * multiple languages can be used within one notebook
* this enabled a boom of packages that support building books, webpages, posters, papers, ...

---

## RMarkdown 2/2

* Word and PDF documents
* RMarkdown notebooks
* [books](https://bookdown.org/yihui/bookdown/)
* [webpages](https://jakubkuzilek.github.io/)
* [posters](https://pbs.twimg.com/media/D9Qf0JDX4AEXJRd.jpg:large)
* and more ...
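---

## RMarkdown: pipeline sketch

The two steps from the RMarkdown slides (knitr first, then Pandoc) can be sketched roughly as below. This is only an illustration, not how any particular project is built: it assumes the knitr and rmarkdown packages and a Pandoc binary are installed, and the file names and document content are made up.

```r
# Minimal sketch of the knitr -> Pandoc pipeline.
# Assumption: knitr, rmarkdown and Pandoc are installed; demo.Rmd is hypothetical.
fence <- strrep("`", 3)  # chunk delimiter, built here to avoid nesting fences

rmd <- file.path(tempdir(), "demo.Rmd")
writeLines(c(
  "# Demo",
  "",
  paste0(fence, "{r}"),
  "mean(c(1, 2, 3))",
  fence
), rmd)

# Step 1: knitr runs the embedded chunk and writes a plain Markdown file.
md <- knitr::knit(rmd, output = file.path(tempdir(), "demo.md"), quiet = TRUE)

# Step 2: Pandoc converts the Markdown file to the final format (HTML here).
html <- file.path(tempdir(), "demo.html")
rmarkdown::pandoc_convert(md, to = "html", output = html)
```

In everyday use both steps are usually run with a single call to `rmarkdown::render()`.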
---

## Data analysis

* R supports the whole data analysis pipeline
* many tools (packages) can be used for the task
* the predominantly used tools come from the so-called [tidyverse](https://www.tidyverse.org/packages/)

![](https://d33wubrfki0l68.cloudfront.net/571b056757d68e6df81a3e3853f54d3c76ad6efc/32d37/diagrams/data-science.png)

---

## Visualisations

* [ggplot2](https://ggplot2.tidyverse.org/) package
  * implements the grammar of graphics
  * layered approach to building production-quality graphics

---

## Shiny

* by Joe Cheng
* interactive web applications with R
* deployment on Shiny Server or ShinyProxy
* [more details](https://shiny.rstudio.com/)

---

## Other features

* Machine Learning & Deep Learning ([caret](https://topepo.github.io/caret/), [tidymodels](https://www.tidymodels.org/), [targets](https://docs.ropensci.org/targets/), [keras](https://keras.rstudio.com/), [TensorFlow](https://tensorflow.rstudio.com/))
* REST API building and serving from R ([plumber](https://www.rplumber.io/))
* analysis of huge datasets with Spark ([sparklyr](https://spark.rstudio.com/))
* teaching R in R with [swirl](https://swirlstats.com/) or [learnr](https://rstudio.github.io/learnr/)

---

# Our infrastructure

* [RStudio Server](https://r-server.cses.informatik.hu-berlin.de/)
* [ShinyProxy server](https://shiny.cses.informatik.hu-berlin.de/)

---

class: middle, center, inverse

# Thank you!
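---

## Backup: layered plot sketch

As a rough illustration of the layered grammar of graphics mentioned on the Visualisations slide, the sketch below builds a plot one layer at a time. It assumes ggplot2 is installed; the `mtcars` dataset ships with R, and the choice of variables is arbitrary.

```r
# Sketch of a layered ggplot2 plot; assumes the ggplot2 package is installed.
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = mpg)) +        # data and aesthetic mapping
  geom_point() +                                   # layer 1: raw observations
  geom_smooth(method = "lm", formula = y ~ x) +    # layer 2: linear trend
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")
```

Printing `p` draws the figure; further layers can be added with `+` without touching the earlier ones.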