A Critical Analysis of the Usage of Dimensionality Reduction in Four Domains

This figure shows how dimensionality reduction is used across scientific domains outside of computer science. Through a bibliometric study of over 21,000 publications and a detailed survey of 71 papers in biology, chemistry, physics, and business, the study reveals common workflows, frequent misinterpretations, and opportunities for visualization research to better support scientific needs.

Abstract

Dimensionality reduction is an important tool for unraveling the complexities of high-dimensional datasets in many fields of science, such as cell biology, chemical informatics, and physics. Visualizations of the dimensionally reduced data enable scientists to delve into the intrinsic structures of their datasets and align them with established hypotheses. Visualization researchers have thus proposed many dimensionality reduction methods and interactive systems designed to uncover latent structures. At the same time, different scientific domains have formulated guidelines or common workflows for using dimensionality reduction techniques and visualizations in their respective fields. In this work, we present a critical analysis of the usage of dimensionality reduction in scientific domains outside of computer science. First, we conduct a bibliometric analysis of 21,249 academic publications that use dimensionality reduction to observe differences in the frequency of techniques across fields. Next, we conduct a survey of a 71-paper sample from four fields: biology, chemistry, physics, and business. Through this survey, we uncover common workflows, processes, and usage patterns, including the mixed use of confirmatory data analysis to validate a dataset and projection method, followed by exploratory data analysis to generate more hypotheses. We also find that misinterpretations and inappropriate usage are common, particularly in the visual interpretation of the resulting dimensionally reduced view. Lastly, we compare our observations with recent works in the visualization community to match work within our community to potential areas of impact outside our community.
By comparing the usage found within scientific fields to the recent research output of the visualization community, we offer both validation of the progress of visualization research into dimensionality reduction and a call to action to produce techniques that meet the needs of scientific users.

Publication
IEEE Transactions on Visualization and Computer Graphics
Date