Programme

(T164)
The Potential Futures of Data Science: A Roundtable Intervention
Location 117a
Date and Start Time 01 September, 2016 at 09:00
Sessions 1

Convenor

  • Brian Beaton (California Polytechnic State University)

Short Abstract

This panel focuses on new and breaking STS research on data science. Panel participants will present research on data science's communication patterns, tools, styles of work, analytical habits, standards, visual strategies, professional ethics, and on data science research cultures.

Long Abstract

This panel will focus on new and breaking STS research on data science: the systematic process of creating, building, and structuring knowledge with data. Panel participants will present research on data science's communication patterns, tools, styles of work, analytical habits, standards, visual strategies, professional ethics, and on the social facets of data science research cultures. Each paper will draw on historical, philosophical, and social scientific methods. The larger goal of the panel is to advance critical understanding of data science among students and scholars in STS. Another goal is to continue building a coherent STS research agenda on data science. Because data science is relatively new as a scientific profession, the research to be presented on this panel has the potential to advance data science as it formalizes: helping the profession to become an ethical, self-aware, and conscientious domain of expertise that more deeply understands the implications of how it produces and presents knowledge, and a profession that understands how its knowledge production practices are quietly changing notions of what counts as scientific evidence. This panel builds directly on a panel called "Debating Data Science" that was held at 4S 2015 in Denver. The 2015 panel focused on mapping potential obstacles to making data science researchable on STS terms. The 2015 panel also focused on debating what a broader STS research agenda on data science might look like. This panel will focus on how data science fits within larger shifts in scientific communications, computing, and consulting.

This track is closed to new paper proposals.

Papers

Reshaping Data Science Research Cultures

Author: Brian Beaton (California Polytechnic State University)

Short Abstract

This paper argues that STS has a key role to play in shaping the future of data science and that STS should proactively steer data science toward more progressive and interesting ends.

Long Abstract

This paper discusses data science as part of larger shifts in knowledge production and knowledge politics. A science-in-the-making, data science is beginning to formalize as a consultative profession that sits at the fulcrum point between data and prediction, and at the center of larger efforts to figure out (through a tremendous amount of backend labor) new ways of generating wealth and value through the gathering and use of data, scientific and otherwise. The paper argues that the data science profession is strangely adrift: rooting everywhere (e.g., industry, government, NGOs) and yet following no particular compass when it comes to creating a coherent set of professional ethics, standards, and values. The paper also argues that STS has a key role to play in shaping the future of data science. Although data science and STS have not yet interacted to a large degree, STS will be essential to helping data scientists make sense of themselves, overcome internal obstacles, document their story, and become an ethical, communicative, and transparent profession that minimizes the authoritarian tendencies that data scientists themselves have flagged as lurking within their profession (Rudder 2014), given the ease with which data gathering and publishing can be used to accomplish unwanted levels of population surveillance and reporting. Engaging directly with other panelists via a roundtable format, this paper provocatively advocates for the following position: STS should proactively steer data science toward more progressive and interesting ends.

Data Science and the Security State

Author: Lauren Di Monte (North Carolina State University)

Short Abstract

This paper studies the relationship between data science and the American security state. It examines exchanges between data science and federal intelligence agencies, and describes how work in this field is enabled by particular infrastructures and regimes of data collection.

Long Abstract

This paper will study the relationship between the American security state and the technologies, practices, and networks of data science. It will demonstrate how and why data science relies on surveillance and inscription technologies borrowed from intelligence agencies and will show how the work of data science is enabled by particular infrastructures and regimes of data collection. By describing the technological and organizational origins of data science, this paper will provide an enriched context for understanding contemporary state-sponsored data science projects, like the NSA's MARINA database, which logs and analyzes data culled from millions of web browsers to predict emerging threats, as well as relationships between state and industry data science projects grounded in predictive analytics, like Amazon's "anticipatory package shipping" model (Spiegel, McKenna, Lakshman, & Nordstrom, 2013). Moreover, by describing how state-sponsored data science tools and processes migrate into new domains, like industry and academia, this paper will help expose possible trajectories for the future development of the data science profession.

More data, more work: problems, evidence, and collecting futures

Author: Amelia Acker (The University of Texas at Austin)

Short Abstract

This paper presents a survey of emerging data work coming out of the professionalization and 'trickle down' effects of the institutionalization and enrollment of data science into higher education and the professionalization of information workers.

Long Abstract

Currently, we are witnessing the emergence of new kinds of evidence generated by human trace data, supported by networked infrastructures, captured in vast digital collections, and mediated through new kinds of data work. The scales of these digital collections (big, small, unstructured, and sometimes dark) are subject to a range of problems, particularly those related to data storage and data processing. The nature of these "data problems" rests on a figurative seesaw of expertise and technical constraints: at one end are issues of where data goes, how it is accessed, how it lives. At the other end of the plank are the actual motives, skills, techniques, models, and frameworks for processing data in digital collections into meaningful evidence and putting it to use. At the fulcrum of this seesaw are the new data workers, a group that is solidifying and continuing to grow around these specialized problems of storing and processing such data collections at scale. These new data workers themselves are gradually becoming experts at mediating different kinds of evidence in order to be seen as legitimate information professionals. This paper presents a survey of emerging data work coming out of the professionalization and 'trickle down' effects of the institutionalization and enrollment of data science into higher education and the professionalization of information workers.

Data Science as a Service: Emergent Cultures of Modeling and the Production of Insight

Author: Shivrang Setlur (Cornell University)

Short Abstract

This paper explores values of management in data science. Focusing on new modeling practices, I explore how data science generates "insightful" knowledge, paying attention to how data science enables novel notions of evidence, expertise, and interdisciplinary practice.

Long Abstract

Data scientists frequently describe their work as providing not just knowledge, but "insight," according well with data science's promise that by experimentally combining data from different life worlds the profession can capture or disclose new patterns, behaviors, or objects that would not otherwise be knowable. Insight is part of how data scientists distinguish themselves from their parent disciplines, such as statistics, information science, and the human sciences, enabling the self-image of the data scientist as a new kind of super-consultant who provides science-as-a-service.

The history of data science suggests that values from management and the promise of "insight" have played a crucial role in crystallizing a distinctive approach to modeling within the young profession. As practitioners sought to loosen their historical bonds to medicine, science, and policy to enter the lucrative worlds of consulting and engineering, they turned away from exploring why things happened, adopting approaches attuned to prediction and speculation. Recent developments indicate similar trends at work, as the emerging field of "computational social science" seeks to scale up and unify the social and behavioral sciences through data science modeling (Gonçalves & Perra, 2015; Lazer et al., 2009; Raghavan, 2014).

Drawing on the literature on modeling cultures (Morgan & Morrison, 1999), this paper will probe how practitioners in the rapidly expanding field of data science generate distinctively "insightful" knowledge, paying particular attention to how data science enables novel notions of evidence, new forms of scientific self-presentation, and new forms of interdisciplinary practice.

Data Silence(s): Data Science, Inclusivity, and Barriers to Social Change

Author: Tonia Sutherland (University of Alabama)

Short Abstract

Increasingly there are data science projects that aim to address planet-scale social problems. However, Western and Northern perspectives govern the current data science landscape. This paper discusses the ways underrepresented groups are being further silenced by vagaries in data science practices.

Long Abstract

Data work that aspires to global scales is increasingly common in data science. Often collaboratively or collectively undertaken, many of these data science projects aim to address planet-scale social problems and enact social change. What makes these projects all the more fascinating (and provocative) is the thoroughly Western and Northern perspectives that govern the current data science landscape. Even data science teams working on world-historical social and economic inequality thus far have very little data from the Global South. It isn't that these data don't exist. Rather, the issue is that the data science teams, despite the rhetoric of breadth and depth, are frequently narrow when it comes to including anyone from areas of the world beyond highly industrialized, financialized, and densely networked locales. Whereas in professions like early anthropology data was gathered from elsewhere to make claims about specific groups of others, what we currently see in data science is data gathered around cleanliness and convenience to make claims about everyone everywhere by teams that, at least for now, lack broad representation, inclusivity, and trans-hemispheric participation. Building on this point, this paper discusses the ways underrepresented groups and cultures are being further silenced by vagaries in data science practices, from collection and cleaning to description and representation.
