Thursday, October 20, 2016

Stop Writing Dead Papers

The idea struck me while listening to Bret Victor's talk "Stop Drawing Dead Fish" (hence the title of this post).

We have been writing and publishing papers for roughly 350 years (according to Wikipedia, one of the earliest journals started in the 17th century!), and yet, somehow, we are still using the same format and writing our papers as if hard copies were the main, if not the only, medium for distributing them. This is really disturbing, as we are now living in the 21st century, where we have a far more powerful medium available to us: interactive digital interfaces.

Most academic papers written nowadays are dead: they are static, with no interactive content. I'm talking particularly about papers that report empirical results, showing graphs and tables filled with numbers and statistics, supported by long discussions to help the reader understand and visualise (in her head) what cannot be fully articulated by static content.

Take for instance the figure below, which appeared in the Cho et al. (2014) paper and is meant to visualise the space of representations of four-word phrases learned by a recurrent neural network. The authors clearly put a lot of effort into visualising the space and presenting their results in a convincing and expressive way. But because they lacked an interactive medium, they had to present the full graph (clearly hard to understand) and some close-ups (not fully representative of the space).

2–D embedding of the phrase representation learned with RNN. Cho et al. (2014)
Some zoom-ins from the figure above. Cho et al. (2014)

This is not only inadequate: presenting a number of figures to support an argument also takes up a lot of the limited space available in academic papers, space that could be put to better use. Moreover, despite all these static illustrations, one still wishes she could hover over some points to see what they represent, or zoom in to get a better understanding.
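To make the "hover over some points" wish concrete, here is a minimal sketch in Python (standard library only) that writes a 2-D scatter plot as SVG, where each point carries a `<title>` element that browsers typically show as a tooltip on hover. The function name `render_hover_svg`, the canvas size, and the example coordinates and phrases are all made up for illustration; they are not taken from Cho et al. (2014). Zooming is not covered here and would need some JavaScript or a plotting library.

```python
from html import escape


def render_hover_svg(points, width=400, height=300, margin=20):
    """Return an SVG string in which each point shows its label on hover.

    `points` is a list of (x, y, label) tuples in arbitrary data units.
    """
    xs = [x for x, _, _ in points]
    ys = [y for _, y, _ in points]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    # Avoid dividing by zero when all points share a coordinate.
    x_span = (x_max - x_min) or 1.0
    y_span = (y_max - y_min) or 1.0

    circles = []
    for x, y, label in points:
        # Scale data coordinates into the drawable area; SVG's y axis points down.
        cx = margin + (x - x_min) / x_span * (width - 2 * margin)
        cy = height - margin - (y - y_min) / y_span * (height - 2 * margin)
        # The nested <title> is what the browser displays as a hover tooltip.
        circles.append(
            f'<circle cx="{cx:.1f}" cy="{cy:.1f}" r="4">'
            f"<title>{escape(label)}</title></circle>"
        )
    body = "\n".join(circles)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{width}" height="{height}">\n{body}\n</svg>'
    )


# Hypothetical coordinates and phrases, purely for illustration.
example_points = [
    (0.1, 0.9, "a few days ago"),
    (0.2, 0.8, "over the last two months"),
    (0.9, 0.1, "at the end of"),
]
svg = render_hover_svg(example_points)
```

Saving `svg` to a file with an `.svg` extension and opening it in a browser should be enough to try the hover behaviour; the same markup can also be embedded directly in an HTML page.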

While this format was perfectly acceptable in the 17th century, it is badly outdated in the 21st and no longer enough!

To compensate for the limitations of a medium that poorly accommodates our goals, a number of researchers have started a tradition of writing blog posts that serve as fancier versions of their papers, usually supported by interactive visualisations and analytics that are easier to access and understand. Take for instance this great blog with many interactive examples for some of the results in the Dai et al. (2014) paper, which tries to cluster Wikipedia articles. You can still see the same figures presented in the paper (like the one below), while also being able to interact with them and play with the parameters.

Visualisation of Wikipedia paragraph vectors using t-SNE. Dai et al. (2014)

Now I understand that this does not apply equally to all fields (I don't expect researchers working in literature to move directly to, or accept, such a new medium). But I believe that researchers with a computer science background are capable of making, and would arguably welcome, such a move. Such functionality could be integrated into new editing tools or traditional ones (such as web-based LaTeX editors). Ultimately, if papers could be submitted in a scripting-language format, say in PHP (or an editor built on top of it), into which the many interactive tools already available can easily be integrated, one would have the opportunity to take the creativity and accessibility of academic papers to a whole new level.

As someone who reads, writes and reviews papers, I'm really looking forward to the day when academic papers become more interactive, and I strongly believe that this will lead to research that is more accessible, easier to understand and evaluate, and more fun to work with.
