
Three hours of pure soccer emotion, visualized with R

By David Smith

The biggest prize in English soccer, the Premier League Championship, is decided by a points system. Unlike many sports competitions, there's no final round or playoff series: once the regular round of games is complete, the team that has accumulated the most points (three for a win, and one for a draw) is the champion of English football. In the event of a tie in points, the winner is decided by goal difference (total goals scored minus total goals conceded), and then by total goals scored.
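To make the ranking rule concrete, here is a minimal sketch in base R. The function name `rank_table`, the column names, and the three teams are all made up for illustration (this is not real league data): it awards three points per win and one per draw, then breaks ties on goal difference and finally on goals scored.

```r
# Hypothetical helper: rank teams by points, then goal difference,
# then total goals scored (all highest first).
rank_table <- function(results) {
  results$points <- 3 * results$wins + 1 * results$draws
  results$goal_diff <- results$goals_for - results$goals_against
  # order() sorts ascending, so negate each key to put the largest first
  ranked <- results[order(-results$points,
                          -results$goal_diff,
                          -results$goals_for), ]
  rownames(ranked) <- NULL
  ranked
}

# Made-up records for three fictional teams
toy <- data.frame(
  team          = c("Team A", "Team B", "Team C"),
  wins          = c(10, 10, 9),
  draws         = c(2, 2, 4),
  goals_for     = c(30, 28, 35),
  goals_against = c(15, 10, 20),
  stringsAsFactors = FALSE
)

standings <- rank_table(toy)
print(standings[, c("team", "points", "goal_diff", "goals_for")])
```

In this toy table, Team A and Team B finish level on 32 points, so Team B takes first place on goal difference (+18 against +15), even though Team A scored more goals.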

The last games of the 2011-2012 season were played nearly simultaneously on May 13. The two contenders for the title were Manchester City and Manchester United, each playing in separate games. To win the title, Manchester United needed to win their game and for Manchester City to drop points in theirs. The United game finished first, with the title seemingly in the bag, until an injury-time goal in the other game took City to an historic championship victory.

As you can imagine, for viewers watching at home it was a tense experience, with the final minutes of both games being displayed split-screen. For the fans at the United game, it was a surreal experience, with the local play over but the ultimate result being decided hundreds of miles away. At the useR! 2012 conference last month, Przemyslaw Biecek showed how he used the R language to perform sentiment analysis on tweets captured during the game and stored in the IBM Netezza data warehousing appliance, capturing that emotion in the data visualization below:

Look at the middle panel in particular: the blue line is the sentiment of Man City fans (higher is more positive) and the red line is that of Man United fans. You can see the transition from elation to crushing disappointment at 5:47 PM.
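For readers curious how a sentiment line like this can be computed, here is a rough sketch in base R. It is not Przemyslaw's actual method (his post has the details): it assumes a simple word-count approach, and the tiny lexicon, the tweets, and the timestamps are all made up for illustration. Each tweet scores the number of positive words minus the number of negative words, and scores are then averaged per minute for each group of fans.

```r
# Tiny illustrative lexicon (hypothetical; real word lists are far larger)
positive <- c("great", "win", "champions", "brilliant")
negative <- c("gutted", "lose", "awful", "heartbroken")

# Score one tweet: count of positive words minus count of negative words
score_tweet <- function(text) {
  cleaned <- tolower(gsub("[^A-Za-z ]", "", text))  # drop punctuation
  words <- strsplit(cleaned, " +")[[1]]
  sum(words %in% positive) - sum(words %in% negative)
}

# Made-up tweets, tagged by minute and by which team the author supports
tweets <- data.frame(
  minute = c("17:45", "17:45", "17:48", "17:48"),
  fans   = c("City", "United", "City", "United"),
  text   = c("Gutted, we are going to lose this",
             "Champions! What a great win",
             "Brilliant! Champions at last",
             "Heartbroken. Awful way to lose it"),
  stringsAsFactors = FALSE
)

tweets$score <- vapply(tweets$text, score_tweet, integer(1),
                       USE.NAMES = FALSE)

# Mean sentiment per minute for each group of fans
sentiment <- aggregate(score ~ minute + fans, data = tweets, FUN = mean)
print(sentiment)
```

Plotting `score` against `minute`, with one line per value of `fans`, would give a chart in the spirit of the middle panel.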

For complete details of the analysis, check out Przemyslaw's post at IBM developerWorks below.

IBM developerWorks: Premier Emotions League: title fight Twitter visualization in R