Use of an R package to facilitate reproducible research

The goal of a research compendium is to provide a standard and easily recognisable way of organising a reproducible research project with R. A research compendium is ideal for projects that result in the publication of a paper, because readers of the paper can then access the code and data that generated its results. A research compendium is a convention for how you organise your research artefacts into directories. The guiding principle in creating a research compendium is to organise your files following conventions that many people use. Following these conventions helps other people instantly familiarise themselves with the structure of your project, and also supports tool building that takes advantage of the shared structure. Some of the earliest examples of this approach can be found in Robert Gentleman and Duncan Temple Lang’s 2004 paper ‘Statistical Analyses and Reproducible Research’ in the Bioconductor Project Working Papers, and Gentleman’s 2005 article ‘Reproducible Research: A Bioinformatics Case Study’ in Statistical Applications in Genetics and Molecular Biology. Since then there has been a substantial increase in the use of R as a research tool in many fields, and numerous improvements in the ease of making R packages. This means that making a research compendium based on an R package is now a practical solution to the challenges of organising and communicating research results for many scientists.
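
To make this concrete, one common layout for a package-based compendium (an illustrative sketch of the convention, with hypothetical names, not a prescription from the post) looks like this:

    mycompendium/
      DESCRIPTION        # package metadata and dependencies
      README.md          # overview of the project
      LICENSE
      R/                 # reusable functions used by the analysis
      analysis/
        paper/           # manuscript source (e.g. R Markdown)
        data/            # raw and derived data
        figures/         # figures generated by the scripts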


Shiny app to explore ggplot2

Do you struggle with learning ggplot2? Do you have trouble understanding what aesthetics actually do and how manipulating them changes your plots? Here is the solution! Explore 33 ggplot2 geoms on one website! I created this ggplot2 explorer to help all R learners understand how to plot beautiful/useful charts using the most popular visualization package, ggplot2. It won’t teach you how to write code, but it will definitely show you what ggplot2 geoms look like, and how manipulating their arguments changes the visualization.
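
As a flavour of the kind of variation the explorer lets you play with, here is a minimal sketch (my own toy example, not taken from the app) showing how changing only the aesthetic mappings changes a plot:

    # Same data, same geom; only the aesthetic mappings differ
    library(ggplot2)
    ggplot(mpg, aes(x = displ, y = hwy)) + geom_point()
    ggplot(mpg, aes(x = displ, y = hwy, colour = class, size = cyl)) + geom_point()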


Quick illustration of Metropolis and Metropolis-in-Gibbs Sampling in R

The code below gives a simple implementation of the Metropolis and Metropolis-in-Gibbs sampling algorithms, which are useful for sampling from probability densities whose normalizing constant is difficult to calculate, that are irregular, or that are high-dimensional (Metropolis-in-Gibbs).
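
Since the post’s own code is not reproduced here, the following is a minimal sketch of a random-walk Metropolis sampler in R (the target density, tuning constants, and function names are illustrative assumptions, not the original implementation):

    # Random-walk Metropolis: sample from a density known only up to a constant
    metropolis <- function(log_target, init, n_iter = 5000, proposal_sd = 1) {
      draws <- numeric(n_iter)
      current <- init
      current_lp <- log_target(current)
      for (i in seq_len(n_iter)) {
        proposal <- rnorm(1, mean = current, sd = proposal_sd)
        proposal_lp <- log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(current))
        if (log(runif(1)) < proposal_lp - current_lp) {
          current <- proposal
          current_lp <- proposal_lp
        }
        draws[i] <- current
      }
      draws
    }

    # Example: draw from a standard normal using only its unnormalised log-density
    samples <- metropolis(function(x) -x^2 / 2, init = 0, n_iter = 10000)

Metropolis-in-Gibbs applies the same one-dimensional update to each coordinate of a multivariate target in turn, conditioning on the current values of the other coordinates.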


Deep Learning Dude pt 1

You’ve probably noticed that Deep Learning is all the rage right now. AlphaGo has beaten the world champion at Go, you can google cat photos and be sure you won’t accidentally get photos of canines, and many other near-miraculous feats: all enabled by Deep Learning with neural nets. (I am thinking of coining the phrase “laminar learning” to add some panache to old-school non-deep learning.) I do a lot of my work in R, and it turns out that not one but two R packages have recently been released that enable R users to use the famous Python-based deep learning package, Keras: keras and kerasR.
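
As a taste of what the keras package’s R interface looks like (a minimal sketch under my own assumptions, not code from the post), defining and compiling a small model reads like this:

    # Define and compile a tiny fully-connected network with the keras package
    library(keras)
    model <- keras_model_sequential() %>%
      layer_dense(units = 32, activation = "relu", input_shape = c(10)) %>%
      layer_dense(units = 1, activation = "sigmoid")
    model %>% compile(
      loss = "binary_crossentropy",
      optimizer = "adam",
      metrics = "accuracy"
    )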


(Linear Algebra) Do not scale your matrix

In this post, I will show you that you generally don’t need to explicitly scale a matrix. Maybe you wanted to know more about WHY matrices should be scaled when doing linear algebra; I will briefly recall that at the beginning, but the rest will focus on HOW to avoid explicitly scaling matrices. We will apply our findings to the computation of Principal Component Analysis (PCA) and then to Pearson correlation at the end.
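
To illustrate the idea (a sketch based on my own assumptions, not the post’s code): the cross-product of a centred and scaled matrix can be recovered from the cross-product of the raw matrix together with its column means and standard deviations, so scale(X) never needs to be formed:

    # Cross-product of scale(X) without materialising the scaled matrix
    set.seed(1)
    X  <- matrix(rnorm(200 * 5), nrow = 200)
    n  <- nrow(X)
    mu <- colMeans(X)
    s  <- apply(X, 2, sd)

    direct   <- crossprod(scale(X))                                  # explicit scaling
    implicit <- (crossprod(X) - n * tcrossprod(mu)) / tcrossprod(s)  # no scaled copy

    all.equal(direct, implicit)   # TRUE, up to floating-point error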


Fighting financial crimes and money laundering with graph data

Fighting financial crimes is a daily battle worldwide. Organizations have to deploy intelligent systems to prevent and detect wrongdoings, such as anti-money laundering (AML) control frameworks. We’ll see in this blog post how graph technologies can reinforce those systems.


Introduction to R: The Statistical Programming Language

R is a powerful language used widely for data analysis and statistical computing. It was created in the early 90s. It is one of the most popular languages used by statisticians, data analysts, researchers, and marketers to retrieve, clean, analyse, visualise and present data. It is open source and free. It supports cross-platform interoperability, i.e. R code written on one platform can easily be ported to another without any issues. IEEE publishes a list of the most popular programming languages every year. R was ranked fifth in 2016, up from sixth in 2015. It is a big deal for a domain-specific language like R to be more popular than a general-purpose language like C#. R is easy to learn. All you need is data and a clear purpose for drawing a conclusion from analysis of that data. However, programmers who come from a Python, PHP or Java background may find R peculiar and confusing at first. The syntax that R uses is somewhat different from other common programming languages. To install and run R on your Ubuntu systems, use the following commands: sudo apt-get update and sudo apt-get install r-base. After installation, type R in your terminal and you are ready!
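
Once R is running, a first session can be as simple as the following (an illustrative sketch, not taken from the original post):

    # Create a vector, compute a statistic, and explore a built-in data set
    x <- c(4.2, 5.1, 6.3, 5.9)
    mean(x)
    summary(mtcars)               # summarise the built-in mtcars data
    plot(mpg ~ wt, data = mtcars) # quick scatterplot of fuel economy vs weight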


Another Opinion: The Difference Between Data Science and Data Analytics

Data science and data analytics: people working in the tech field or other related industries probably hear these terms all the time, often interchangeably. However, although they may sound similar, the terms are often quite different and have differing implications for business. Knowing how to use the terms correctly can have a large impact on how a business is run, especially as the amount of available data grows and becomes a greater part of our everyday lives.