Blog Posts

43 Posts

Data Science & Covid --- 2020-11-14

Many commenters have noted the salience of data visualization in the current Covid-19 crisis and its new-found popularity since March. While I think this is positive, the data scientist in me has been frustrated by some of the ideas and intent behind these visualizations as well…

modelscript --- 2020-06-08

{modelscript} is a little RStudio add-in I wrote to help me with modelling. I've been testing out the tidymodels framework for R (fantastic, btw) and I thought it would be handy to be able to create a new .R file with all the steps you'd need for most modelling tasks. You can…

What do the mtcars actually look like? --- 2020-05-02

It popped into my head the other day that I had no idea what most of the cars in the mtcars dataset look like. Some Google image searches later, I had a folder of them (you can get them here -- mtcars.zip, they're all free to use as far as I could tell from the image search…

Render RMarkdown Code Chunks Based on Output Document Type --- 2020-04-26

RMarkdown users -- did you know you can render code chunks based on the type of output you want to produce? It's even easy-peasy 🤓. knitr makes available some parameters that you can access with opts_knit$get(). The one we want is "rmarkdown.pandoc.to". Once you have that, you…

From R to Gatsby --- 2020-03-12

This post details how I use Gatsby.js to blog about R stuff. My site is deployed by Netlify, which builds it after any merges into the master branch of the repo on GitHub. I use a little tool I wrote called writeMDX to help me out. The workflow is quite simple: write something…

Images as column headers in R --- 2020-03-01

Have you ever wanted to include an image as a column header in a data frame? Of course you have! All joking aside, this is actually surprisingly common in corporate environments, where tables may have the company logo in the header, probably as the first ‘column’. You can…

One liner to show all colours available in R --- 2020-02-28

Some years ago, I came across a great little repo that contained R code to display all the colours available in R. You can source it as so: It creates a two-page PDF that looks like this: Super nice. Back then, I thought it would be cool to see how to do this with ggplot2, so I…

Using Docker for Data Science --- 2020-02-08

In this post, I'll go through a few examples of how you can use Docker for data science, from running a simple script to making reports. It's based on real usage, so I think there are a couple of things in there that are interesting. In fact, using Docker for data science…

Brazilian Legislative Data with congressbr --- 2020-01-17

Recently, a paper by myself and two friends was published in the Latin American Research Review (you can read it here). As we write in the paper: Plenty of social science researchers are not a) interested in programming and/or b) not very good programmers (probably because of…

UK Elections 2019 --- 2019-12-13

After the UK elections in 2017, I posted about how easy it was to plot the results in R. Given that the UK just had another election, I thought I’d update that post with another one. So here ya go. What we'll do is make a plot of the results using R, and then we'll compare it…

Easily Use Python and R together with {reticulate} --- 2019-11-27

I work in an environment where R and Python are used interchangeably, and most of the data scientists here have some familiarity with both languages. We regularly use one language to call the other and I’ve been struck by just how easy this is, particularly with RStudio’s…

Taking RStudio's renv for a spin --- 2019-08-17

I’ve been working on a project recently where we’ve been building a data analysis pipeline that involves bits of R code and bits of Python. Since the whole thing runs on Docker, on a secured server with no internet access, it’s been illuminating seeing the different ways that…

Improving your DataViz --- 2019-07-23

A while ago, I posted about including D3 charts in MDX documents. I was pretty chuffed at my little D3-React bar chart (with tooltips!), but that was mainly to do with me being technically able to do it, not because it was a beautiful piece of data visualization (it isn’t). In…

D3, React and MDX --- 2019-06-03

Recently, I moved my blog away from being built in R with blogdown and RMarkdown. Hugo is the engine that blogdown uses, and while it was fast and very handy to create blog posts from RStudio, I had problems once I tried to put in D3 plots. That in combination with the fact that…

Visualizing the Irish Divorce Referendum in R --- 2019-05-27

A while ago, I wrote a blog post on visualizing the results of the UK elections in 2017 (quite a while ago!). After the Irish elections and divorce referendum on Friday last, I thought it would be a nice opportunity to do something similar with Irish political data. For a…

Shuffling Strings in R --- 2019-05-14

Let's say you need to share some data that has some potentially identifiable sensitive information in it -- people's addresses, phone numbers etc. Maybe these fields are not particularly important, but you don't want to take them out exactly, and neither do you want to have to go…

Avoiding the tiresome training & test data split --- 2018-09-03

I really don't like splitting data into 'train' and 'test'. I don't mean that I'm against the idea of it, though you could say it's a waste of data that could be used to better your model, but I mean the actual assignment in R of 'train' and 'test'. I always liked destructuring…

Mapping Economic Partners with flagfillr --- 2018-02-10

🇨🇨 🇨🇽 🇵🇹 🇩🇴 🇫🇲 🇰🇷 Recently I wrote a little package for R called flagfillr (you can read more details here). One of the main reasons I made this is because I had seen a few maps of economic partners, for example this one, from here: These types of maps (some more…

Stan IRT Code --- 2018-01-05

(This turned out to be a bit of a ramble; for the code, go here 😄) My PhD thesis focused on latent variable models as a way to model legislative voting behaviour. The main model I used is called the Bayesian Item Response model, and the idea is that, from the observed votes…

Blogdown & Netlify --- 2018-01-03

I had some problems setting up my website to work properly with blogdown and Netlify (draft posts kept getting built), so in the process of learning how to do it properly (and repeatedly badgering Yihui haha -- sorry, Yihui!), I realised plenty of others are having the same…

Customize Interactive R Visuals in Power BI --- 2017-12-01

Some of us, through no fault of our own, have to work with things like Power BI. While it's a powerful application, it's just a little...you know. For anybody who works with R, Python or JavaScript or anything like that, it just feels like closing the black box a bit, not to…

Gauge-style plots with ggplot2 --- 2017-10-24

I've been working on a project where the client wanted a "cockpit" style dashboard, with meter/gauge/speedometer type things. Even though this wasn't likely to be implemented in R for the final version, I started thinking about how I could do this with ggplot2, influenced by some…

UK Elections 2017 --- 2017-09-27

This post is a quickie to show how we can visualize the UK election results with just a few lines of R code. (Really, very few). We can load in our usual tidyverse tools, along with a handy little data package, parlitools. Thanks to this R Bloggers post, we have the data…

Analyzing Prison Data in R --- 2017-07-28

My good friend Danilo Freire and I have just finished a little R data package, called prisonbrief. We hope that it will be useful for R users, particularly researchers in the area, since this is still a much understudied topic. Why does prison population change? In many…

TFW you have to copy and paste something into R... --- 2017-04-22

From time to time, you might need to copy and paste something into R and turn it into a character string. Maybe it's something from the output of an error message, or from someone else's malformed data, or something copied from a document or the internet. If it's something small…

Update R from inside R --- 2017-03-16

I was just about to update R a while ago when I thought to myself that there must be a way to do this inside of R (RStudio, I mean). A quick Google search brought me to the installr package. Very nice, but I use a Mac. Hmmm... A bit more searching and I found Andrea Cirillo's…

Peace, Bread and Data! --- 2017-02-19

I really like this image by Tom Burns. The liberal in me appreciates making cheap fun of people who were horribly mistaken (Lenin; Marx, although I don't mean to slight his contributions to social science), scum like Stalin, and Fidel Castro, who might have started out with a…

Carnaval! --- 2017-02-18

I've been getting more and more interested in web graphics, particularly d3. All of this of course depends on JavaScript, a language I don't know very well. As a way to start learning it, I thought I'd give Shiny a go, as a bridge between R and JavaScript (I've since started…

How to make a GitHub pages blog with RStudio and Hugo --- 2017-02-01

Update: for some people who may have some issues setting up the blog the way I've set out here, see Kate's helpful comments below. Since April or so of last year, I've had a personal website on GitHub pages, where I keep this blog and a few other things. Setting it up was at…

Tips and Tricks for R Markdown html --- 2017-01-02

Here are a couple of little tips and tricks that I've picked up for use with RMarkdown html documents (including presentations and notebooks). This post is aimed at the R user who doesn't know much, if anything, about html and css. Background images: Sometimes it's useful (or just…

Suicides in Ireland --- 2016-12-21

The Irish radio station Newstalk published this video the other day, in which director and actor Terry McMahon spoke out against the austerity programme running in Ireland since the aftermath of the financial crisis in 2008. Leaving aside his conflation of any type of business…

Theme-Specific Voting in the European Parliament --- 2016-10-20

Since it's European Statistics Day, I thought I would make a quick post showing how to utilise some of the data that we have on the European Union in R. In particular, I will use European Parliament voting data from Simon Hix's website. The data is freely available, so by…

Map-making with R and electionsBR --- 2016-10-09

For those interested in Brazilian politics, there's a great new package called electionsBR (those who understand Portuguese can find a post on it here). This package takes data from the Tribunal Superior Eleitoral and makes it available in a tidy format for users of R…

Re-creating Plots from The Economist in R and ggplot2 --- 2016-08-21

The Economist is well known for its graphs and images, and I personally like them a lot. I was doing some work on Brexit when I spied the image above, and thought how much I would like to make something similar. Since my go-to environment is R, and its go-to plotting package…

Inhaling/Boozing Earth --- 2016-08-14

After seeing Nadieh Bremer’s great Breathing Earth infographic, I thought it would be cool to recreate it in R, as you do. Then I saw that it was made from lots of tif files…hmmm. I did some work with those before, ain’t doin it again voluntarily, no thanks. So then I started…

Geo-reference an image in R --- 2016-08-13

R is actually great for working with spatial data (for example, see here and here for fantastic graphs and maps made with R); however, you often need data that is actually spatial to get started! What do you do if you have an image, a map, let's say, that is not geo…

Rating R Packages --- 2016-08-13

The rOpenSci package packagemetrics is a new ‘meta’ package for R with info on packages: dependencies, how long issues take to be resolved, how many watchers on GitHub, and more. Let’s take a look at a few packages I use and some of my own. Install: Then load the packages…

Easier web scraping in R --- 2016-08-05

In an earlier post, I described some ways in which you can interact with a web browser using R and RSelenium. This is ideal when you need to access data through drop-down menus and search bars. However, working with RSelenium can be tricky. There are, of course, easier ways…

Bayesian IRT in R and Stan --- 2016-05-21

This blog post is a bit outdated -- for newer, cleaner IRT R code, see this GitHub repo and this blog post. The code below on Stan is also available as an RPub webpage, if you'd rather work through the examples than read all of the post. One of the first areas where…

Bayesian Stats: Book Recommendations --- 2016-05-03

The first time I came across Bayes’ Theorem, I must admit I was pretty confused. It was in Introductory Statistics by Neil A. Weiss, the course book in a statistics course I was taking at the time. Neither the logic of it nor the formula for it made much sense to me. For…

Web Navigation in R with RSelenium --- 2016-04-27

It goes almost without saying that the internet itself is the richest database available to us. A 2014 blog post claimed that every minute: Facebook users share nearly 2.5 million pieces of content. Twitter users tweet nearly 300,000 times. Instagram users post…

Write your thesis or paper in R Markdown! --- 2016-04-15

There are many reasons why you would want to use some variant of Markdown for writing, and indeed, posts are common on the net as to why you should. A simple summary of the reasons is that Markdown is: 1) easy; 2) easy; 3) yup, you guessed it – it’s easy. One variant of…

Stan or JAGS for Bayesian ideal-point IRT? --- 2016-04-13

Anybody who has ever tried to run even a moderately-sized Bayesian IRT model in R (for ideal points as in the political science literature, or otherwise) will know that these models can take a long time. It’s not R’s fault: these are usually big models with lots of parameters…