Seeing The Future From The Past

Hop in my DeLorean and we’ll travel back in time and appropriately tweak our predictive models.  Image courtesy of William Warby via Flickr CC.

I just finished reading The Signal and the Noise, a book about predictions by the American statistician and blogger turned big-time data journalist Nate Silver.  I highly recommend it.  The book came out in 2012, and there was some sort of meta-instructive quality to reading a book whose main theme is using the past to predict the future, written in a past that was (mostly) unaware of its own future.  Still with me?

For example, reading anything written about correctly interpreting political polling in the pre-2016 world induces the reader to struggle under the weight of her own misconceptions.  This in turn causes her to stare off into the distance for one dim moment and think about all of the futures that could have been but at the same time were never destined to be.   Then she shakes it off and keeps reading.

A theme that Silver returns to repeatedly is the impact of our own biases on our ability to interpret and deliver statistical predictions.  Our biases almost inevitably seep into our mathematical models: which variables we choose to include, how heavily we choose to weight them, and how willing we are to adjust the models as we move through time.  Bias has been a hot topic for Google lately; the company has landed in hot water because its human-programmed (and not actually a magical oracle at all) search engine autocompletes and prioritizes some really racist results.  This could be compounded as Google and others become more reliant on AI trained on fallible human data, so that the output gets simultaneously farther from an actual human brain and yet somehow more deeply steeped in human thought.

The blog Overcoming Bias, written (mostly) by Robin Hanson, explores a lot of the places where bias interferes with our understanding of predictive modeling.  In his most recent post, Hanson explores the idea of introducing what he calls news accuracy bonds to combat the spread of fake news.  The basic idea is that each article comes paired with a token, or bond, and a reader can claim the bond by provably demonstrating that the article is false.  The post goes into more of the details on how this might work, but the upshot is that an article with a high-value bond is more likely to be true: a high bond value signals a high degree of certainty on the part of the publisher.
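If you like to think in code, here's a minimal toy sketch of the mechanism (my own illustrative Python, not anything from Hanson's post; the names and payout rules are made up): a publisher locks up a bond with each article, a successful debunking claim pays it out to the challenger, and otherwise the publisher keeps it.

```python
# Toy sketch of Hanson-style "news accuracy bonds" (hypothetical details):
# a publisher escrows a bond with each article; a reader who provably
# debunks the article collects the bond.

from dataclasses import dataclass

@dataclass
class Article:
    title: str
    bond: float          # amount the publisher puts at stake
    debunked: bool = False

class BondLedger:
    def __init__(self):
        self.escrow = {}  # article title -> bonded amount held in escrow

    def publish(self, article: Article):
        """Publisher posts the article and locks up its bond."""
        self.escrow[article.title] = article.bond

    def challenge(self, article: Article, proof_accepted: bool) -> float:
        """If a challenge is upheld, the challenger collects the bond;
        otherwise the bond stays in escrow for the publisher."""
        if proof_accepted and not article.debunked:
            article.debunked = True
            return self.escrow.pop(article.title)
        return 0.0

# A higher bond signals more publisher confidence: they stand to lose more.
ledger = BondLedger()
story = Article(title="Dog bites man", bond=500.0)
ledger.publish(story)
payout = ledger.challenge(story, proof_accepted=False)  # challenge fails, payout = 0.0
```

The economics does all the work here: the only reason to attach a big bond is that you're confident nobody will ever be able to collect it.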

A related idea has been put forward by Facebook to introduce reputation scores, where effectively users’ scores get decreased every time they post content that the company considers suspicious.  It sounds a little bit spooky (and maybe a bit too reminiscent of a certain episode of Black Mirror), but the idea is similar to Hanson’s in that a number attached to an article or user profile acts like a confidence score assigned by the platform or publisher.
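As a rough illustration (again my own toy Python, not Facebook's actual system; the penalty and recovery numbers are invented), a reputation score like this could just be a number that gets docked whenever a user shares flagged content and then gets read as a confidence weight on their future posts.

```python
# Toy sketch of a reputation score (hypothetical, not Facebook's algorithm):
# each flagged post knocks the user's score down; the score is then read as
# a rough confidence weight on whatever they post next.

def update_reputation(score: float, post_flagged: bool,
                      penalty: float = 0.1, recovery: float = 0.01) -> float:
    """Decrease the score for suspicious posts, let it drift back up slowly
    for clean ones, and keep it pinned between 0 and 1."""
    score = score - penalty if post_flagged else score + recovery
    return min(1.0, max(0.0, score))

score = 1.0
for flagged in [False, True, True, False]:   # a short posting history
    score = update_reputation(score, flagged)
print(f"confidence weight on this user's posts: {score:.2f}")  # prints 0.81
```

Of course, everything interesting (and everything worrying) lives in who decides what counts as "flagged," which is exactly where the bias question comes back in.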

What do you think?  Would you trust Facebook to responsibly sort dubious information?  Do you ever think about how your biases from the past make it into your assessment of the present or predictions of the future?  If you need me I’ll be in the garage tinkering with my time machine, or as usual, you can find me on Twitter @extremefriday.
