Studying missing data: an interdisciplinary approach to data gaps © Clint Adair

Studying missing data: an interdisciplinary approach to data gaps

Sarah Giest and Annemarie Samuels are kicking off this series 'Large Issues, Interdisciplinary Approaches' on interdisciplinary research and teaching at the Leiden Anthropology Blog. They offer one example of how talking across disciplines can open up novel questions and perspectives.

Sarah Giest (Public Administration) and Annemarie Samuels (Cultural Anthropology and Development Sociology) met in the Young Academy Leiden and collaborated on the topic of data gaps.

On a summer afternoon in 2019, we found ourselves in one of Leiden’s beautiful new buildings (Van Steenis) giving 3-minute pitches on our individual research projects for a small group of Young Academy Leiden colleagues. The two of us talked about the use of big data in policymaking (Sarah) and silences in healthcare (Annemarie). Afterwards we started a conversation on missing data. How do missing data influence policymaking? How do the data gaps that bias policymaking adversely affect vulnerable groups? And what can governments do to address these gaps if they want to truly commit to inclusive policymaking? We decided to join forces and cross disciplines to see what answers to these questions might look like if we took a truly interdisciplinary approach. The resulting discussion paper – “‘For Good Measure’: Data Gaps in a Big Data World” – was published open access this year in Policy Sciences.

Data gaps
In our paper, we point out that while governments increasingly rely on large datasets for policymaking, relatively little attention has been paid to the systematic omissions in such datasets and the ways they especially affect already marginalized groups. These groups tend to produce fewer of the data points that make it into large datasets on, for example, mobility, social media, or lifestyle. More data clearly does not equal better data. We distinguish three types of data gaps. First, a government may know that data are missing and unavailable, but have limited capacity (or will) to follow up on this. We call this a primary data gap. We note that current ways to address it, such as using proxy variables, have their own potential biases. Moreover, the subsequent (political) pressure to report the (so far missing) data may itself lead to distortion. A secondary data gap emerges when governments know that there is a data gap and data that could fill it do exist, but those data are inaccessible, difficult to use, or of poor quality. Finally, there may be what we call a ‘hidden data gap’, where governments are unaware that some groups are not well represented in the data they use for policymaking. This can have huge consequences for making and implementing policies, as they may reproduce patterns of exclusion. It is therefore urgent to highlight who and what gets misrecognized and miscategorized in the social production of data – whether in targeted surveys or ‘automatically’ generated digital datasets.

Data politics
Government efforts to measure and count are always political, and datasets, big and small, will never be free of gaps. In our paper, we maintain that in order to create more inclusive policymaking, it is crucial for governments to be aware of data gaps and start addressing them. We suggest several ways of doing so, for example including domain experts (not just data analysts) in data analysis, combining quantitative and qualitative research methods, taking a critical look at what questions are asked and by whom, and increasing the level of granularity of the data. The latter means, for example, including sex and gender differences in data collection – research shows that where this has not been done, results are often less representative of women’s situations. In all of this, ethical questions about who collects data and for what use remain crucial: there are very good reasons why some – often marginalized – groups do not want to be included in big data collection, and big data may create new forms of inequality.

Large questions, interdisciplinary approaches
The turn to big data in governance is a complex and urgent societal issue that – like so many pressing questions in our present-day society – is most effectively addressed by combining methods, views, theories and experiences from different scientific disciplinary traditions. Large questions need interdisciplinary approaches. It is important, therefore, to build academic structures that facilitate interdisciplinary collaboration. Let’s creatively think together about how to open up new possibilities and perspectives, in the study of big data, policymaking, and beyond.

- - - - - -
About this series

There is a growing awareness in and beyond academia that the large challenges of our times cannot be addressed from a single disciplinary perspective. Increasingly, therefore, anthropologists and others are putting their disciplinary knowledge to work in interdisciplinary research and teaching. The collaborative work done in the PortCityFutures project highlighted on this blog is an excellent example. In this series, called 'Large Issues, Interdisciplinary Approaches', Institute members and their collaborators reflect on working across disciplines and offer a peek into the results of their collaboration.

