The Making of Natural Language
An Evening with the Enron Email Archive
Date: October 6 2017
Identifier: 17_01006_The Mak
This panel discussion took place in conjunction with the online exhibition “Sam Lavigne and Tega Brain: The Good Life,” copresented by the New Museum and Rhizome as part of the series First Look: New Art Online.
Datasets often appear to us as if by magic: tidy lists set apart from the messiness of history and the politics of the human world. Much less visible, however, is the fact that each dataset emerges from a specific historical moment and is the product of a massive amount of human labor. These contextual details are fundamental to the nature of all datasets and, by extension, to the systems they go on to train. Many early machine learning systems—software systems that make decisions based on correlations in given data—were trained using the Enron email archive, which consists of hundreds of thousands of messages that were released into the public domain as part of the investigation into the company in the early 2000s. Although it was sourced from participants in a massive corporate fraud, the archive was assumed to represent how a much broader population used email and language. In this way, machine learning systems extend the historical conditions of their training data into the present.
Building on the recent New Museum events that delved into the implications of machine learning for issues of subjectivity and bias, this panel discussion convened artists and researchers who considered the historical and human context of the Enron email archive and its ongoing use in machine learning. This event was organized by New Museum affiliate Rhizome with Sam Lavigne and Tega Brain, in conjunction with their online exhibition “First Look: The Good Life,” which invited users to subscribe to receive hundreds of thousands of emails from the Enron archive over the course of seven years.
The panel included Lavigne and Brain; Finn Brunton, scholar and author of Spam: A Shadow History of the Internet; artist Constant Dullaart, who presented an experiment with the Enron corpus undertaken with NYU data scientist Leon Yin; Devin Kenny, artist, writer, musician, and curator; and Mimi Onuoha, artist and researcher. It was chaired by Kate Crawford, researcher, academic, and cofounder of the AI Now Institute. Blending historical analysis and artistic response, the panel called attention to the significance of datasets like the Enron email archive, which are largely invisible but instrumental in our daily lives.