Hairballs

Network visualizations are notoriously difficult to interpret. Their canonical visual representation has earned the moniker hairball, and you can probably guess why. If you are unfamiliar with hairballs, or doubt their prevalence in the biological sciences, explore what is always a good source of them: the study of yeast and systems biology.

You can already guess that nothing with the name hairball can truly be useful. In general, they are not. These views are at best accidentally informative, and cannot be relied upon to consistently reveal meaningful patterns.

Conventional network visualization is unsuitable for visual analytics of large networks. So-called hairballs earn their moniker by becoming impenetrably complex as your network grows. They are least effective when visualization is most needed — for large networks.

Hairballs turn complex data into visualizations that are just as complex, or even more so. They can even seduce us into believing that they carry high information value. But looking complex does not mean that they can communicate complex information. Hairballs are the junk food of network visualization: they have very low nutritional value and leave the user hungry.

In a hairball, data is subordinate to layout: node positions and edge lengths depend as much on the layout algorithm (of which there are many) as on the data. The effect of layout rules is difficult to predict, making direct comparisons of these visualizations impossible. For example, imagine trying to compare two scatter plots in which the ordering of the scale values was altered (e.g. x = 1, 2, 3, … in one and x = 3, 1, 2, … in the other).
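To make the dependence on layout concrete, here is a minimal sketch of a toy force-directed layout in plain Python. It is not the algorithm of any particular tool; the function name `spring_layout`, its parameters, and the five-node example graph are all assumptions made for illustration. The only thing it is meant to show is that the same nodes and edges end up at different coordinates when nothing but the random starting positions change.

```python
import math
import random


def spring_layout(nodes, edges, seed, iterations=200, k=1.0, step=0.05):
    """Toy force-directed layout (illustrative only): returns {node: (x, y)}."""
    rng = random.Random(seed)
    # Random starting positions: the only thing the seed controls.
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}

        # Every pair of nodes repels, more strongly when close together.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist

        # Connected nodes attract, more strongly when far apart.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = dist * dist / k
            force[a][0] -= pull * dx / dist
            force[a][1] -= pull * dy / dist
            force[b][0] += pull * dx / dist
            force[b][1] += pull * dy / dist

        # Move each node along its net force, capped at `step` per iteration.
        for n in nodes:
            fx, fy = force[n]
            mag = math.hypot(fx, fy) or 1e-9
            move = min(mag, step)
            pos[n][0] += fx / mag * move
            pos[n][1] += fy / mag * move

    return {n: (round(x, 3), round(y, 3)) for n, (x, y) in pos.items()}


# Hypothetical example network: identical data for both layouts.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e"), ("e", "a"), ("a", "c")]

layout_1 = spring_layout(nodes, edges, seed=1)
layout_2 = spring_layout(nodes, edges, seed=2)

for n in nodes:
    print(n, layout_1[n], layout_2[n])
```

Each node's coordinates differ between the two runs even though the network is identical, so a cluster or gap seen in one drawing cannot be compared position by position with the other.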

As a result, a great deal of detail about the structure of a network is irretrievably lost in a hairball, and any emergent patterns may be either real (reflected in the data) or accidental (an artefact of the layout). In a hive plot, by contrast, there is no aesthetic magic sauce added to the layout. If the layout shows a pattern, you can be sure it is due to structure in the underlying data and not to the layout algorithm’s interpretation of how the data should be shown.
http://www.hiveplot.net
