Geometric Analysis of Musical Audio Data

Project Summary

In this work, we turn musical audio time series data into shapes for various tasks in music matching and musical structure understanding. 

Themes and Categories
Christopher Tralie
Electrical and Computer Engineering

In particular, we use sliding window representations of chunks of audio to create high dimensional time-ordered point clouds, and we extract information by analyzing the geometry of these clouds.  We have shown, for example, that sequences of these shapes can be used to identify two different versions of the same song, or "cover songs."  We have also shown that both local and global musical properties can be expressed in geometric language.  For instance, hip hop is very "wiggly" while classical music is very "smooth."  Choruses and verses tend to live in distinct clusters connected by paths, and "bridges," or detours to a different musical idea, show up as large loops.

Related People

Related Projects

Nathan Liang (Psychology, Statistics), Sandra Luksic (Philosophy, Political Science),and Alexis Malone (Statistics) began their 10-week project as an open-ended exploration how women are depicted both physically and figuratively in women's magazines, seeking to consider what role magazines play in the imagined and real lives of women.

Click here to read the Executive Summary

In tracing the publication history, geographical spread, and content of “pirated” copies of Daniel Defoe’s Robinson Crusoe, Gabriel Guedes (Math, Global Cultural Studies), Lucian Li (Computer Science, History), and Orgil Batzaya (Math, Computer Science) explored the complications of looking at a data set that saw drastic changes over the last three centuries in terms of spelling and grammar, which offered new challenges to data cleanup. By asking questions of the effectiveness of “distant reading” techniques for comparing thousands of different editions of Robinson Crusoe, the students learned how to think about the appropriateness of myriad computational methods like doc2vec and topic modeling. Through these methods, the students started to ask, at what point does one start seeing patterns that were invisible at a human scale of reading (reading one book at a time)? While the project did not definitively answer these questions, it did provide paths for further inquiry.

The team published their results at:

Click here for the Executive Summary

Ashley Murray (Chemistry/Math), Brian Glucksman (Global Cultural Studies), and Michelle Gao (Statistics/Economics) spent 10 weeks analyzing how meaning and use of the work “poverty” changed in presidential documents from the 1930s to the present. The students found that American presidential rhetoric about poverty has shifted in measurable ways over time. Presidential rhetoric, however, doesn’t necessarily affect policy change. As Michelle Gao explained, “The statistical methods we used provided another more quantitative way of analyzing the text. The database had around 130,000 documents, which is pretty impossible to read one by one and get all the poverty related documents by brute force. As a result, web-scraping and word filtering provided a more efficient and systematic way of extracting all the valuable information while minimizing human errors.” Through techniques such as linear regression, machine learning, and image analysis, the team effectively analyzed large swaths of textual and visual data. This approach allowed them to zero in on significant documents for closer and more in-depth analysis, paying particular attention to documents by presidents such as Franklin Delano Roosevelt or Lyndon B. Johnson, both leaders in what LBJ famously called “The War on Poverty.”

Click Here for the Executive Summary