Free interactive R exercises on OpenIntro

DataCamp is the first to offer free interactive R tutorials via the OpenIntro platform.

In one week, the ten-week Coursera course on Data Analysis and Statistical Inference by prof. Mine Çetinkaya-Rundel of Duke University comes to an end. At DataCamp it was one of our first experiences providing interactive R exercises on a large scale, and we’re proud to say this journey is coming to a successful end. (We’ll write a more detailed post on this in the near future.)

For those of you who were not able to follow the course in the Coursera format but still want to do the interactive exercises on DataCamp, there is now a great alternative: OpenIntro Statistics. OpenIntro Statistics is part of the OpenIntro project and covers a wide range of educational materials on statistics, such as videos, textbooks, and, as of now, interactive exercises by DataCamp. If you’re a student looking for a great introductory statistics course, or a teacher in need of a fully fledged package of teaching materials, the OpenIntro project is the place to go! (The OpenIntro project is an organisation focused on developing free and affordable education materials. OpenIntro Statistics is their first project.)


All DataCamp R tutorials can be found under the labs section of the OpenIntro website. Just as with the Coursera course, these interactive exercises work best as a complement to the statistical concepts covered in the free OpenIntro Statistics textbook and corresponding videos. If you’re a teacher using OpenIntro in your class and you want to use the DataCamp tutorials as well, you can always contact us if you need more detailed information.

We’re happy to have been offered the opportunity to add our interactive R tutorials to the high-quality OpenIntro curriculum. To us, it is another step toward increasing the understanding and adoption of R in the data science and statistics world.

We hope you will enjoy it!

The DataCamp Team

Note: if you prefer to take the course via Coursera, a new session of the course has been announced and will launch September 1st 2014. 

Get notified when R packages update

Today’s highly active R user base is developing, re-developing, and releasing R packages at a never-before-seen rate. While this is fantastic news for the R community as such, it inevitably also causes growing pains as mentioned before.

One of the often-cited problems is the painful and time-consuming task of keeping track of changes and version updates of packages and functions (see for example the paper of Jeroen Ooms in The R Journal). After all, nothing beats the fun of putting a lot of effort into a project or task, just to realize minutes after finishing the job that package xyz released its latest version. (To say nothing of the frightening but inevitable moment when you load the new version, praying to God the fragile life of your precious code will be spared.)

A better way to deal with these package updates is to be informed automatically when changes are made to the packages you depend on. This is exactly what the brand new notification feature of Rdocumentation does. It gives you the option to subscribe to the R packages of your choice, and when one of these packages gets updated on CRAN, Rdocumentation automatically sends you an email.
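Alongside email notifications, it can help to check from within R which version of a package you currently have installed. The snippet below is a minimal sketch using base R's `packageVersion()`; it is not how Rdocumentation works internally, and the helper name `meets_minimum()` and the version thresholds are made up for illustration.

```r
# Hypothetical helper: is the installed version of `pkg` at least `minimum`?
meets_minimum <- function(pkg, minimum) {
  packageVersion(pkg) >= package_version(minimum)
}

# "stats" ships with every R installation, so this runs anywhere.
installed <- packageVersion("stats")
print(installed)

# Compare against an illustrative minimum version.
print(meets_minimum("stats", "2.0.0"))
```

Swap in the name of a package you depend on (for example ggplot2, if it is installed) to check it before and after an update.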

Getting updates on future package versions via Rdocumentation is simple. Navigate to the package of your choice (let’s say ggplot2 on Rdocumentation), provide your email address, and hit the green subscribe button. A message will pop up to confirm your subscription, and that’s it. This is also shown in the following screenshot:

[Screenshot: ggplot2 on Rdocumentation]

Rdocumentation is a tool that enables you to easily find and browse the documentation of all current packages and functions on CRAN. It offers features such as advanced search, package popularity rankings, community forums, and package download statistics. Rdocumentation is supported by DataCamp, provider of free R tutorials.

April Fools’ Day: The 7 Funniest Data Cartoons

To give this year’s April Fools’ Day a more analytical touch, we decided last week to do a little poll on internet cartoons. We asked our friends and colleagues to select their favourite data-related cartoon on the web, and organized a voting session to construct a top seven list. (You can always share your own favourites in the comments.)

We proudly present the winners of the April Fools’ 2014 Data Cartoon awards:

Number One: The Cloud 


Number Two: A Study on Statistics


Number Three: Pacman Statistics


Number Four: Dilbert One


Number Five: Halloween Statistics


Number Six: Dilbert Two


Number Seven: XKCD Correlation


Disqualified from the competition, but still funny:

big data



The Stack Overflow R Top 5

Like every start-up in the IT and data science sector, we often find ourselves spending more time on Stack Overflow than on our own site. For those of you who are not familiar with it, Stack Overflow is like a Q&A forum on steroids. It features questions and answers on a wide range of topics in programming, and it’s dedicated to answering any and all of these questions. Thanks to its clever reputation system based on points and badges, chances are high you will find a high-quality answer to your particular problem. Believe us, this will save you a lot of time!

Since we use it so often at DataCamp, we wanted to share our ‘Top Five’ list of the most popular R questions:

Position 5 = What statistics should a programmer (or computer scientist) know? (262 votes)     

This question is targeted at programmers who want to understand how their programming efforts can benefit from a more statistical approach. Not only does it provide an overview of statistical techniques, some of the answers also focus on the statistical tools programmers can use in their day-to-day activities.

Position 4 = R Grouping functions: sapply vs lapply vs apply vs tapply vs by vs aggregate (272 votes)     

This is something almost every new R programmer struggles with in the beginning: how and when to use the functions in the apply family. If you are one of them, just check out this Stack Overflow post and it will be a lot clearer to you. Multiple people have responded to the question, and most of them provide very clear answers, with some even including slide presentations.
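To make the difference concrete, here is a minimal sketch of four members of the family applied to small made-up inputs. The data and variable names are our own illustration and are not taken from the Stack Overflow answers.

```r
# Example data: three numeric vectors of different lengths in a list.
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = c(5, 5, 5, 5))

# lapply() always returns a list, one element per input element.
means_list <- lapply(scores, mean)

# sapply() does the same but simplifies the result to a named vector when it can.
means_vec <- sapply(scores, mean)

# apply() works over the rows (1) or columns (2) of a matrix.
m <- matrix(1:6, nrow = 2)
row_sums <- apply(m, 1, sum)

# tapply() applies a function to groups of values defined by a second vector.
values <- c(4, 6, 10, 20)
groups <- c("x", "x", "y", "y")
group_means <- tapply(values, groups, mean)

print(means_vec)
print(row_sums)
print(group_means)
```

As a rough rule of thumb: reach for `lapply()` or `sapply()` with lists and vectors, `apply()` with matrices, and `tapply()` when you want a summary per group.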

Position 3 = How to sort a data frame by column(s) in R (302 votes)     

Again, a very easy but highly relevant question (certainly for new R users switching from Excel). Based on an example, the questioner wants to know how to sort a data frame by multiple columns. This is a standard task in R, but if you’re not familiar with using functions, the barrier to entry might be high. (Spoiler alert: the order function will take you a long way.)
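As a minimal sketch of what the spoiler hints at, the example below sorts a small made-up data frame by one column ascending and a second column descending using base R's `order()`. The data frame and its column names are invented for illustration.

```r
# A small made-up data frame.
df <- data.frame(
  name  = c("Ann", "Bob", "Cy", "Dee"),
  group = c("b", "a", "b", "a"),
  score = c(90, 85, 95, 70)
)

# order() returns the row positions in sorted order.
# Sort by group (ascending), then by score (descending) within each group.
sorted <- df[order(df$group, -df$score), ]
print(sorted)
```

The minus sign only works for numeric columns; for other column types, `order()` also accepts a `decreasing` argument.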

Position 2 = How can we make xkcd style graphs in R (307 votes)     

Close, but no cigar. This question on xkcd style graphs reached second place in our top five list. As a start-up we personally love xkcd style graphs since they have this arty-farty layer over them. They allow you to present information in a very clear way, and their unique and fun style increases the chances your audience will pick it up. A must read for everyone!

Position 1 = How to make a great R reproducible example (525 votes)  

Simply put: great question and great answers! Reproducible examples are fundamental for teaching, research, and even when asking questions on, for example, Stack Overflow. However, creating reproducible examples is not that easy, and requires a certain finesse. This post will guide you through the ins and outs of creating such reproducible examples, so make sure to check it out since it will definitely help you to better understand R in the long run.
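The sketch below shows a few ingredients that commonly go into a reproducible example: a fixed random seed, a small data set created in code, `dput()` output that others can paste, and version information. It is our own minimal illustration using base R, not a summary of the Stack Overflow answers.

```r
# Fix the random seed so everyone gets the same "random" numbers.
set.seed(42)

# Build a small data set in code instead of referring to a local file.
df <- data.frame(id = 1:5, value = round(rnorm(5), 2))

# dput() prints R code that recreates the object; paste its output into
# your question so others can rebuild `df` with a single copy-paste.
dput(df)

# Rebuilding from the dput() text gives back an equivalent object.
rebuilt <- eval(parse(text = paste(capture.output(dput(df)), collapse = "\n")))
print(all.equal(df, rebuilt))

# Include version details, since behaviour can differ between releases.
print(R.version.string)
```

For a real question you would follow this with the smallest piece of code that triggers the problem, plus the output of `sessionInfo()`.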

Bonus: What’s your favorite data analysis cartoon?

For the not so serious moments…

A new series: R-fiddle of the Week

Now that our ‘Learning R’ series is coming to an end (for those who missed it, have a look at our Twitter or Facebook), it is time to announce the start of a new series: R-fiddle of the Week. Every week, we will share an R-fiddle link that contains the code of some popular or well-liked R blog posts. Since R-fiddle allows you to write and run R code right inside your browser, it is easy to start playing around with the code yourself and make your own versions and adaptations of it. You can even share your code experiments with your friends, colleagues, students…

For the first week we wanted to start with something very tailored and visual, so we made an R-fiddle that uses Google’s API and your personal input. Based on the address you provide, it will return the corresponding coordinates and show you the location on a Google Map via a plot. It is easy to think of variations (e.g. multiple addresses), so we are curious to see what you will come up with.
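The fiddle's own code is not reproduced here. As a rough sketch of the first step, the snippet below only builds request URLs for Google's Geocoding API and does not send anything. The helper name `build_geocode_url()`, the example addresses, and the placeholder key are assumptions for illustration; a real API key is needed to actually call the service.

```r
# Hypothetical helper: build a Geocoding API request URL for one address.
# `api_key` is a placeholder; a real key is needed to send the request.
build_geocode_url <- function(address, api_key = "YOUR_API_KEY") {
  base <- "https://maps.googleapis.com/maps/api/geocode/json"
  # URLencode() with reserved = TRUE escapes spaces, commas and the like.
  query <- paste0("address=", URLencode(address, reserved = TRUE),
                  "&key=", api_key)
  paste0(base, "?", query)
}

# One URL per address, which covers the "multiple addresses" variation.
addresses <- c("Leuven, Belgium", "Durham, NC")
urls <- vapply(addresses, build_geocode_url, character(1), USE.NAMES = FALSE)
print(urls)
```

Fetching each URL and parsing the JSON response (for example with a JSON package of your choice) would then give the coordinates to plot.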

With this new set of posts we aim to show new R users the power of R, and introduce experienced users to some nifty R features they might not be aware of. You can follow the ‘R-fiddle of the Week’ series via Twitter or Facebook.

If you have any suggestions or ideas for an R-fiddle we should make, just send them to us.

R-fiddle provides you with a free and powerful environment to write, run and share R code right inside your browser. We designed it for those situations where you have code that you need to prototype quickly and then possibly share with others for feedback. All this without needing a user account, or any scrap projects or files! We even included a very easy-to-use ‘embed’ function for blogs and websites, so your visitors can edit and run R code on your own website or blog.



Quandl is a “Wikipedia” for numerical data that allows you to search rapidly through 8 million ready-to-use data sets. At DataCamp we created a free in-browser coding tutorial on how to use the corresponding R package to access Quandl data from within R.

As every real-world data analyst knows, finding and formatting numerical data for analysis in R is often a hard and rigid task. Quandl wants to make this task less painful by providing you with a ‘search engine’ for numerical data. Not only does it allow you to find data fast, but once you find it, it is ready to use. This is because Quandl’s bot returns data in a standard format, meaning you can translate it to any format you want. One of the great things is that Quandl has its own R package. This package is built on top of the Quandl API, and allows you to access many of the Quandl functionalities right inside the R console.


Our free interactive Quandl course introduces you to the main functionality in the Quandl R package. In two short chapters you learn how to search through Quandl’s data sets, how to access them, and how you can easily manipulate them for your own purposes. All exercises are based on real-life examples (e.g. Bitcoin exchange rates), and take place in the comfort of your own browser thanks to DataCamp’s interactive learning platform for R.

We hope you will enjoy the course! If you have suggestions on future courses we should develop, or if you want us to develop a course for you, just contact us.


Two new free interactive courses with R on DataCamp

We’re happy to announce that as of today, DataCamp has added two new and free online interactive courses to its curriculum: ‘Data Analysis and Statistical Inference’ and ‘Introduction to Computational Finance’. They will be the biggest DataCamp courses to date, so we’re very excited to see how they turn out.

We developed these courses in close collaboration with the teaching professors of the Coursera courses of the same name. Hence, you can expect the same high-quality standards as from an academic course, but presented in DataCamp’s fun, learning-by-doing environment. Students who choose to enroll in the course on Coursera will be directed to DataCamp to practice their skills and to complete assignments.

In ‘Data Analysis and Statistical Inference’, taught by Dr. Mine Çetinkaya-Rundel from Duke University, you learn how to make use of data in the face of uncertainty. Throughout the course, you’ll learn how to collect, analyze, and use data to make inferences and draw conclusions about real-world phenomena.

‘Introduction to Computational Finance’ focuses on mathematical and statistical tools and techniques used in quantitative and computational finance. Professor Eric Zivot (University of Washington) designed the course and, with the help of real-life examples, introduces you to the do’s and don’ts of analyzing financial data, estimating statistical models, and constructing optimized portfolios.

To follow the pace of the two Coursera courses, the different chapters will be released on DataCamp periodically over the next few weeks.  Once fully released, the courses will remain available on the DataCamp platform as a stand-alone version. The courses require no formal background, but some basic mathematical skills will come in handy. A genuine interest in data analysis is a plus!

We hope to welcome you in our online classroom  soon!

Any ideas on new courses we should launch? Let us know via Facebook or Twitter!

DataMind goes to DataCamp

We’re happy to announce that effective immediately, we’ve officially changed our startup’s name from DataMind to DataCamp.

It was very obvious from the start that we did not want to become the next consultancy firm (in a long line of many) that offered training and learning services on the side. We believed the time was ripe to build a company within the field of data science that had education and training as its sole core. A company that would develop tailored educational technology, and use it to offer something more exciting than the traditional two-week seminars or long monotonous webinars (depending on which of the two you can afford). The vision was to build a tailored online learning platform that offered students and professionals an engaging, learning-by-doing environment where they could build their knowledge through in-browser coding and exercises.

Today, it seems like there is indeed room for a vision like ours. Every day, more and more (soon-to-be) data analysts are finding their way to our free interactive intro to R course, and based on the increasing retention figures we have (at least) the impression that they like the interactive learning approach a lot. This traction allowed us to make improvements faster, and just recently we managed to get out of the beta stage.

So why the name change? In the process of building the learning platform, and spreading the message of it to students, professionals and academics, we learnt that a more professional image would benefit us if we wanted to access bigger players in the market, more funding sources, and better mentors. So for the benefit of the project’s growth and future we decided to do a name switch. Instead of the playful domain name you can now find us on the more professional

We felt the timing was right because in the upcoming months we’re releasing some interesting new features to the online interactive learning platform (like a new gamification system). Even more exciting is that we recently started working together with Coursera professors on how to integrate DataCamp with their courses. This will hopefully allow even more students and starting data scientists to become familiar with the power and benefits of R. But more on that in our next post…

We hope you’ll love our new name as much as we do!


Complete list of Coursera courses using R ranked by “popularity”

Coursera – an online education startup – has rapidly expanded its curriculum of statistics and data analysis courses. Today, there are already 33 modules directly linked to the field, excluding the courses where statistics and data science are solely used as a supportive tool (e.g. finance). These courses make use of multiple statistical software packages like Python, MATLAB and of course R.

I decided to make a list of all Coursera courses that use R either as their first choice, or as one of the statistical software packages students are allowed to use for the homework assignments. Coursera does not publish all data on how many students enroll in their courses, but most (some?) courses reach well over a hundred thousand students each year.

To have some kind of indication of their popularity, I list below all courses using R ranked by the number of Facebook likes:

Ranking | Course title | Professor | University | Facebook likes | Tweets
1 | Social Network Analysis | Lada Adamic | University of Michigan | 12000 | 3543
2 | Statistics One | Andrew Conway | Princeton University | 9600 | 1421
3 | Computing for Data Analysis | Roger Peng | Johns Hopkins University | 8500 | 1934
4 | Data Analysis | Jeff Leek | Johns Hopkins University | 5200 | 1408
5 | Introduction to Data Science | Bill Howe | University of Washington | 2600 | 1103
6 | Introduction to Computational Finance and Financial Econometrics | Eric Zivot | University of Washington | 2100 | 351
7 | Mathematical Biostatistics Boot Camp 1 | Brian Caffo | Johns Hopkins University | 1400 | 239
8 | Statistics: Making Sense of Data | Alison Gibbs & Jeffrey Rosenthal | University of Toronto | 1400 | 243
9 | Asset Pricing | John H. Cochrane | University of Chicago Booth | 855 | 102
10 | Mathematical Methods for Quantitative Finance | Kjell Konis | University of Washington | 635 | 92
11 | Case-Based Introduction to Biostatistics | Scott L. Zeger | Johns Hopkins University | 424 | 110
12 | Financial Engineering 2 | Martin Haugh & Garud Iyengar | Columbia University | 109 | 13
13 | Data Analysis and Statistical Inference | Mine Çetinkaya-Rundel | Duke University | 80 | 18
14 | Core Concepts in Data Analysis | Boris Mirkin | Higher School of Economics | 77 | 15
15 | Mathematical Biostatistics Boot Camp 2 | Brian Caffo | Johns Hopkins University | 60 | 21

Given the limitations of Coursera’s search function, I had to compile the list above manually. Therefore, it is possible I overlooked some of the courses. Feel free to mention them in the comment section, and I will make sure to update the list. In case you are interested in taking (or teaching) interactive data analysis courses, make sure to have a look at our own educational startup DataMind.

While I expect that most of you are familiar with Coursera, for those who aren’t, here is a quick summary: Coursera is one of the leading providers of Massive Open Online Courses (MOOCs). Today they have more than 100 institutional partners offering 500+ courses to over 5 million students worldwide. So despite being criticized by some, it is becoming more and more clear that they are here to stay.

R-Fiddle: An online playground for R code

R-fiddle is an early-stage beta that provides you with a free and powerful environment to write, run and share R code right inside your browser. It even offers the option to include packages. Over the past couple of days it has been gaining more and more traction, and it was mentioned on the front page of Hacker News.

We designed it for those situations where you have code that you need to prototype quickly and then possibly share with others for feedback. All this without needing a user account, or any scrap projects or files! We even included a very easy-to-use ‘embed’ function for blogs and websites, so your visitors can edit and run R code on your own website or blog. This is the first version of R-fiddle, so do not hesitate to give us feedback.

Working together with the help of R-fiddle

You can use R-fiddle to share code snippets with colleagues when tossing around ideas, to find that annoying bug, or to make your own variations on other people’s code. It’s easy: just go to R-fiddle, type your code, and get your public URL by pressing ‘share’. This is a lot easier for your potential troubleshooter or colleague, since (s)he can immediately run and check the code, save it once finished and share it again. So by sharing your R code through R-fiddle, you not only help others to better understand your code, but they can also help you!

Embedding an R-fiddle in your blog or website

Embedding the interactive code of your fiddle on a website or blog is easy. R-fiddle automatically generates a piece of code that you can then simply paste in your HTML at the desired place.

You can choose between two ways to embed the code: with or without the console. If you embed a fiddle with the console, your visitors can edit and run your code within the environment of your own site. If you embed a fiddle without the console, your visitors can see the code with a link to the R-fiddle website where they can edit and run it. For more information on how to embed interactive code, just check the documentation.

The R-fiddle working environment

Working with R-fiddle is very straightforward. The page consists of two sections. The main section of the site (on the left) is divided into two areas: the editor and the console. This is where you put your code. They work just like the standard editor and console you are familiar with from your IDE; for example, the editor colour-codes the syntax. The right pane is the discussion area. Here others can comment on your code, make suggestions, or ask questions. You can immediately see the comments others made, making collaboration easy.


The R-fiddle buttons

The R-fiddle interface provides plenty of features to assist in your development. The buttons at the top of the page include:

  • Save: By clicking save you activate the Embed and Share buttons. You always have to click save first, that’s when R-fiddle knows things are getting serious.
  • Embed: This allows you to embed your code on your website and blog with the help of an iframe.
  • Share: This allows you to share code from the R-fiddle page with other users. You can share it through a web link, Facebook and Twitter. These users can then provide feedback or even adapt/fix your code within their own browser.
  • Run: Executes the code entered in the editor, and displays the results in the console area.
  • Graph: Here you can find any graphs created by your code.

In conclusion:

With this quick tour of R-fiddle, we hope to have given you a better understanding of what it provides and why you should use it. Please be aware that R-fiddle is a hosted application in beta, so performance can degrade during peak usage. As R-fiddle usage increases, we will add more servers as soon as possible. Check out R-fiddle today, and you will discover its power!

For any questions or suggestions, do not hesitate to contact us.