Compressed Nonnegative Matrix Factorization Is Fast and Accurate

Project Summary

Nonnegative matrix factorization (NMF) has an established reputation as a useful data analysis technique in numerous applications. However, its use in practice has become increasingly challenging in recent years. The main cause is the ever-growing size of the datasets available and needed in the information sciences. To address this, we propose to use structured random compression, that is, random projections that exploit the data structure, for two NMF variants: classical and separable. In separable NMF (SNMF), the left factors are a subset of the columns of the input matrix. We present suitable formulations for each problem and treat representative algorithms for each one.
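To make "random projections that exploit the data structure" more concrete, here is a minimal sketch in Python with NumPy. It assumes the compression matrix is built with a standard randomized range finder with power iterations, which may differ from the exact formulation used in this project. The function name structured_compression, the parameters oversample and power_iters, and the synthetic test matrix are all hypothetical and chosen only for illustration.

```python
import numpy as np

def structured_compression(A, rank, oversample=10, power_iters=2, seed=0):
    """Return an orthonormal basis Q (m x k) adapted to the range of A.

    Hypothetical helper: a randomized range finder with power iterations.
    Unlike a purely random projection, Q is computed from A itself.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    k = min(rank + oversample, m)
    # Random Gaussian test matrix, then sample the range of A.
    Omega = rng.standard_normal((n, k))
    Q, _ = np.linalg.qr(A @ Omega)
    # Power iterations sharpen the basis; QR at each step keeps it stable.
    for _ in range(power_iters):
        Z, _ = np.linalg.qr(A.T @ Q)
        Q, _ = np.linalg.qr(A @ Z)
    return Q

# Synthetic nonnegative matrix of exact rank 5 (illustrative data only).
rng = np.random.default_rng(1)
A = rng.random((500, 5)) @ rng.random((5, 300))

Q = structured_compression(A, rank=5)
A_compressed = Q.T @ A  # 15 x 300 instead of 500 x 300

# Relative error of the compressed representation.
rel_err = np.linalg.norm(A - Q @ A_compressed) / np.linalg.norm(A)
print(A_compressed.shape, rel_err)
```

In this sketch, an NMF solver would then work with the much smaller matrix Q.T @ A rather than A. Note that Q.T @ A is generally not nonnegative, so the nonnegativity constraints have to be imposed on the factors themselves.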

Contact
Mariano Tepper
Electrical and Computer Engineering
mariano.tepper@duke.edu

We show that the resulting compressed techniques are faster than their uncompressed variants, vastly reduce memory demands, and do not incur any significant deterioration in performance. The proposed structured random projections for SNMF make it possible to handle arbitrarily shaped large matrices, beyond the standard limit of tall-and-skinny matrices, granting access to very efficient computations in this general setting. We accompany the algorithmic presentation with theoretical foundations and numerous and diverse examples that show the suitability of the proposed approaches.
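The following sketch illustrates why column selection for separable NMF can survive compression. It uses the successive projection algorithm (SPA), a well-known SNMF method chosen here only as an example and not necessarily one of the algorithms treated in this project. The helper name spa, the matrix sizes, and the synthetic data are assumptions. For brevity, the basis Q comes from a single randomized range-finding step without power iterations.

```python
import numpy as np

def spa(M, r):
    """Successive projection algorithm: pick r column indices of M.

    Each step selects the column with the largest norm, then projects
    all columns onto the orthogonal complement of the selected one.
    """
    R = M.astype(float).copy()
    selected = []
    for _ in range(r):
        j = int(np.argmax(np.sum(R**2, axis=0)))
        selected.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R -= np.outer(u, u @ R)
    return selected

rng = np.random.default_rng(0)
r, m, n_mixed = 4, 200, 50

# Separable data: the first r columns of M are the left factors W;
# the remaining columns are convex combinations of them.
W = rng.random((m, r))
H_mixed = rng.dirichlet(2.0 * np.ones(r), size=n_mixed).T  # columns sum to 1
M = W @ np.hstack([np.eye(r), H_mixed])                    # m x (r + n_mixed)

# Compress the rows with an orthonormal basis adapted to M.
k = r + 10
Q, _ = np.linalg.qr(M @ rng.standard_normal((M.shape[1], k)))
B = Q.T @ M  # k x (r + n_mixed)

cols_full = sorted(spa(M, r))
cols_compressed = sorted(spa(B, r))
print(cols_full, cols_compressed)
```

Because Q has orthonormal columns that span the range of M, the projection preserves column norms and inner products, so in this noise-free setting SPA selects the same columns from the compressed matrix as from the full one.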

See full figure and text (PDF).

Related Projects

How are women influenced by the spaces that they are allowed to occupy? A group of students, led by English Professor Charlotte Sussman, will examine how the spaces and places women can inhabit have changed over time, and how such changes have affected women’s rights and opportunities. The team will analyze the visual representations of women in magazines from the nineteenth to the twenty-first century through the Women’s Magazine Archive, considering how images of women influence the reality that women can both imagine and live. Using this data, the group will design and visualize a potential women’s space that can empower and support women to reach their highest potential.

What do we mean by the term “poverty”?

A team of students under the direction of Professor Astrid Giugni will analyze how the way we talk about poverty and public policy has changed over time. The team will work with two databases containing visual, textual, and audio documents from 1473 to the present, allowing students to track how our understanding of poverty has evolved. The group will tackle the challenge of analyzing the political and popular language and imagery of poverty in order to create a visualization that contextualizes how financial and welfare policy is influenced by how we talk about poverty.

Students in the Performance and Technology Class create a series of performances that explore the interface between society and our machines. With the theme of the cloud to guide them, they have created increasingly complex art using digital media, microcontrollers, and motion tracking. Their work will be on display at the Duke Choreolab 2016.