Reading List 2: Week 53

September 3, 2018
Reading List Quant Finance Data Science

This week includes some nice visualisation showcases and a bit of theory: better understanding deep learning, surrogate losses in machine learning, and Bayesian statistical inference. On the practical side, there is a tutorial on predicting movie popularity with either linear regression or Bayesian modelling. Reflections on how we work include a talk about Jupyter notebooks and a post about the downsides of remote working. The quantitative finance side this week includes an article about why statistical arbitrage breaks down, including an example pair that diverges in the out-of-sample data. Also: how to animate a plot with matplotlib, how to internationalise a Shiny app, and advice on writing for data science.

Data Science

  • Surrogate Loss Functions in Machine Learning In machine learning, we approximate the true risk by the empirical risk, computed with a loss function (e.g. the Hamming loss). However, such losses are hard to optimise because they are not continuous. Surrogate loss functions approximate discontinuous losses with continuous ones to make the optimisation problem tractable. This post introduces the concept and some background with really nice visualisations. Read more
  • Recent Advances for a Better Understanding of Deep Learning − Part I introduces some of the recent advances that help to better understand deep learning and neural networks. I think there is still a lot of room for improvement, and the field should look more closely at things like stochastic optimisation. Read more
  • LearnBayes: Functions for Learning Bayesian Inference is an R package providing vignettes and functions that introduce Bayesian statistical inference. Start with the PDF Introduction to Bayes Factors. It might not be suitable for total beginners. Read more
  • Linear and bayesian modelling in R: Predicting movie popularity showcases movie popularity prediction by linear regression as well as a Bayesian model. Read more
  • Internationalisation of Shiny Apps A neat new package that helps with the internationalisation of Shiny apps. Read more
  • How to Create Animated Graphs in Python gives an example of how to animate a plot with matplotlib. Read more
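
To make the surrogate loss idea from the first item above more concrete, here is a minimal sketch in Python. It assumes binary classification with labels in {-1, +1} and uses the 0-1 loss on the margin y * f(x) as the discontinuous loss. The hinge and (base-2) logistic losses are my choice of surrogates for illustration, not necessarily the ones the linked post focuses on.

```python
import numpy as np


def zero_one_loss(margin):
    """Discontinuous 0-1 loss: 1 for a wrong (or zero-margin) prediction, else 0."""
    return (margin <= 0).astype(float)


def hinge_loss(margin):
    """Hinge loss: a continuous, convex upper bound on the 0-1 loss."""
    return np.maximum(0.0, 1.0 - margin)


def logistic_loss(margin):
    """Logistic loss in base 2, scaled so that it equals 1 at margin 0."""
    # logaddexp(0, -m) computes log(1 + exp(-m)) in a numerically stable way.
    return np.logaddexp(0.0, -margin) / np.log(2.0)


# Margin = y * f(x); negative values are misclassifications.
margins = np.linspace(-2.0, 2.0, 9)

for m, z, h, l in zip(margins,
                      zero_one_loss(margins),
                      hinge_loss(margins),
                      logistic_loss(margins)):
    print(f"margin={m:+.1f}  0-1={z:.0f}  hinge={h:.2f}  logistic={l:.2f}")
```

Both surrogates sit on or above the 0-1 loss for every margin, but unlike the 0-1 loss they vary continuously with the margin, which is what makes gradient-based optimisation possible.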

Data Visualisation

  • Tufte in R Tufte-style plots implemented with several plotting libraries for the R language. Read more
  • Top 50 ggplot2 Visualizations - The Master List (With Full R Code) Overview of different types of visualisations with examples styled after Tufte rules in ggplot. Read more

How We Work

  • I don’t like notebooks A talk that reflects on Jupyter notebook usage and patterns, especially regarding reproducibility. Read more
  • The difficulties of Remote Machine Learning Work details some of the downsides of remote working, all of which I agree with as a 100% remote worker. Read more
  • Practical Advice for Data Science Writing “Useful tips to get started writing about your data science projects” Read more

Quant Finance


  • Why Statistical Arbitrage Breaks Down gives an example of, and background on, a pair that is temporarily cointegrated but whose relationship breaks down later on. It is interesting to see a full analysis of a pair considered for trading. Read more
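
The kind of breakdown described in the item above can be illustrated with a small simulation. This is only a sketch on synthetic data, not the article's analysis. The two price series, the hedge ratio of 2, and the out-of-sample drift are all made-up assumptions. The hedge ratio is fitted by ordinary least squares on the in-sample window only, and the spread is then tracked out of sample.

```python
import numpy as np


def simulate_pair(n_in=500, n_out=250, seed=0):
    """Simulate a hypothetical pair that tracks in-sample and diverges afterwards."""
    rng = np.random.default_rng(seed)
    n = n_in + n_out
    # Asset X: a random walk (made-up data).
    x = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n))
    # Asset Y: 2 * X plus stationary noise in-sample; out of sample a drift
    # is added, so the relationship slowly breaks down.
    noise = rng.normal(0.0, 1.0, n)
    drift = np.concatenate([np.zeros(n_in), np.linspace(0.0, 20.0, n_out)])
    y = 2.0 * x + noise + drift
    return x, y


def spread_zscores(x, y, n_in):
    """Fit the hedge ratio in-sample only and z-score the spread over the full series."""
    beta, alpha = np.polyfit(x[:n_in], y[:n_in], 1)  # slope, intercept
    spread = y - (alpha + beta * x)
    mu, sigma = spread[:n_in].mean(), spread[:n_in].std()
    return (spread - mu) / sigma


n_in = 500
x, y = simulate_pair(n_in=n_in)
z = spread_zscores(x, y, n_in)

max_z_in = np.abs(z[:n_in]).max()
max_z_out = np.abs(z[n_in:]).max()
print(f"largest |z| in-sample:     {max_z_in:.1f}")
print(f"largest |z| out-of-sample: {max_z_out:.1f}")
```

In this toy setup the spread looks mean-reverting in the fitting window but keeps widening afterwards, so a strategy that bets on reversion would keep adding to a losing position.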