(This turned out to be a bit of a ramble; for the code, go here 😄)
My PhD thesis focused on latent variable models as a way to model legislative voting behaviour. The main model I used is called the Bayesian Item Response model, and the idea is that, from the observed votes of the legislators, we can build a scale on which we can place them, relative to one another. This is a common way of analysing legislative voting behaviour, and I’m sure you’ve seen it before somewhere. There are basic ways to do it, using factor analysis or something like NOMINATE. These work well when you have lots of data (i.e. lots of legislators voting often on lots of votes) and when you either suspect that there is low dimensionality (in other words, there is basically one scale on which you can place the legislators, not two or more) or you have interest in only one aspect of the voting space. In practice, most of these scales are interpreted to be between ‘left’ and ‘right’ as commonly understood in politics. Whether this accurately captures voting behaviour is, of course, another question altogether.
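For readers who prefer a formula, here is a rough sketch of the one-dimensional version of this kind of model, in the probit form used by Clinton, Jackman and Rivers. The symbols are my shorthand for this post and not necessarily the exact specification in my thesis or the repo: $y_{ij}$ is legislator $i$'s vote on item $j$ (1 for yea, 0 for nay), $x_i$ is the legislator's ideal point, $\beta_j$ is the item's discrimination parameter and $\alpha_j$ its difficulty parameter.

```latex
\Pr(y_{ij} = 1 \mid x_i, \beta_j, \alpha_j) = \Phi(\beta_j x_i - \alpha_j)
```

Here $\Phi$ is the standard normal CDF. The $x_i$ are the positions on the scale: legislators with similar $x_i$ tend to vote alike, and the sign and size of $\beta_j$ say how strongly (and in which direction) a given vote separates one end of the scale from the other.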
So why didn’t I just do it the easy way? Well, apart from being a glutton for punishment (joking 😄), I had two reasons for diving into the world of Bayesian statistics, R, JAGS, and later Stan. The first is that Bayesian stats made sense to me on a deep, intuitive level from the moment I first read about it, whereas frequentist statistics always seemed convoluted to me. And after reading Clinton, Jackman and Rivers’ 2004 paper on Bayesian analysis of roll-call votes (earlier working paper here), I was convinced it was the scientifically sound thing to do as well. The second is that when you don’t have lots of data (as was the case for me), the Bayesian version of these models tends to give better results (see the paper above).
I started off using Simon Jackman’s pscl package in R to model these votes and to create ideal points. I also used MCMCpack by Quinn & Martin, which turned out to be better suited to my case. However, I soon realised that more complex models were difficult or impossible with these (fast) ready-made tools. Enter JAGS and the world of probabilistic programming. Martyn Plummer has done a great job with JAGS, and personally, I like it a lot, especially the syntax, but for IRT models like the ones I was using, it was painfully slow. Days and days running models that just didn’t seem to converge (I later found out why).
Well, that left me with two months to finish my thesis and models that wouldn’t converge. Lovely. Previously, I had tried to learn Stan but I found it unappealing for various reasons, including the difficult syntax. Actually, the syntax makes much more sense to me now that I’ve learned some other programming languages, which shows where Stan is coming from: it’s a fully fledged computer language, designed to be a computer language. Hence things that PhD students in non-computer science areas are not accustomed to seeing pop up, like variable type declarations and so on. But my looming deadline left me with no choice but to give Stan a shot and see if it could help me out.
It turns out Stan did help: it runs much faster for these IRT models, and I was able to get everything done and make a thesis I’m proud of. At the time, I published some example code on a GitHub repo, and this led to me regularly getting emails from students and professors around the world asking how in the hell Stan works. I understand your pain!
Anyway…this blog post was supposed to be a simple, quick note to say that I have updated said repo with new and better Stan IRT code. It’s deliberately simple and comes with example data and example R code for how to run the models and plot ideal points with ggplot2 afterwards. The plots are quite nice, I think. They look a little somethin’ like this: