February 28, 2023
Collecting, understanding and using data is central to digital services of all kinds. Making data-driven decisions, rather than relying on assumptions or untested hypotheses, is eminently sensible. However, data only helps if we draw the correct conclusions from it, and at Valtech we’re committed to helping organisations do so. That means understanding that data can be partial or reflect biases. As we discussed at a recent Valtech webinar, data bias can take myriad forms and have considerable unintended consequences.
Not long ago, there wasn’t always data available to guide the design and development of digital services. Teams would have to model, hypothesise and make assumptions about what was required. Today, being data-driven is increasingly the default. However, ensuring that the data we collect or inherit gives us valid answers is also essential.
Partial or skewed data can be a real challenge in digital transformation and the design of digital services. Everything from process and user experience to research and even investment decisions can be based on data representing only part of the whole picture. Perhaps it over-represents particular people, needs or scenarios, or ignores others.
Caroline Criado Perez, author of Invisible Women: Exposing Data Bias in a World Designed for Men, recently shared insights on the gender data gap at a Valtech webinar. She explained how, for hundreds, even thousands of years, most products, services, buildings, experiences, policies and decisions have been designed from a male point of view (as described below). The words ‘human’ and ‘mankind’ in practice mean the needs of men, with women considered to be a subset of men by scientists, architects and philosophers. Whilst not always deliberate, biases shape much of our understanding, models and actions.
Male perspectives dominate design and decisions
The book explores how the male perspective dominates in everything from the workplace to healthcare, product design to automotive safety. The result isn’t just that it’s unfair. Ignoring the needs, perspectives and physiology of 51% of the world’s population can be life-limiting and even fatal for individuals. It also harms productivity and the economy.
Her point is that even today, we often aren’t collecting and analysing sufficient data about women. Even when decisions are based on data rather than assumptions or the needs of what she calls Reference Man (a Caucasian male, around 70kg and 25-35 years old who is believed to represent everyone), there is a Gender Data Gap. This might be a case of not collecting data about women or just looking at the average of all respondents without segmenting women separately.
What does this mean in practice?
One example is deciding which roads to clear of snow in metropolitan areas. The typical approach is to prioritise clearing the arterial roads to keep commuters moving, on the assumption that this serves the economy’s needs. However, Caroline points out that both women and men have travel requirements that contribute to the economy. Men are more likely to have simple commutes, travelling by car along arterial roads directly to their place of work. Women are more likely to walk or take public transport, and their travel patterns tend to be more complicated, for example, dropping children off at school on the way to work or taking an elderly relative to the doctor on the way home. Not only is public transport infrastructure poorly designed to support these journeys, but leaving side roads and pavements covered in snow also leads to more accidents and significant healthcare costs. It also has a massive economic impact in terms of sick days (including the loss of unpaid care work).
Another particularly chilling example is that for decades crash test dummies haven’t reflected women’s needs. Women and men have differently shaped hip bones, and seatbelts are designed to restrain us around the hips. A 2019 study from the University of Virginia concluded that, as a result, women wearing a seatbelt are 73% more likely to be seriously injured or die in a crash than men wearing one.
When I read Caroline’s book, I found it as enlightening as it was informative. In particular, I repeatedly asked myself how many design decisions we trust – because data and research supposedly back them – are actually flawed, either because the data doesn’t represent everyone’s needs or because it hides significant differences in averaged-out analyses. The book is a clear warning about what happens when you don’t take diversity into account.
I should add that we’ve only touched on a few examples here, and would encourage anyone reading this article to also read Caroline’s book “Invisible Women: Exposing Data Bias in a World Designed for Men” to understand the extent of the problem more fully.
Avoid amplifying biases
A common theme across every project we’re involved with at Valtech is the desire to make data-based decisions. It’s a vital part of the National Data Strategy, a Government-led strategy that suggests “Better use of data can help organisations of every kind succeed – across the public, private and third sectors”. Clearly, we need to assure ourselves that the data we are using is suitable to inform unbiased decision making. And with the growth of Artificial Intelligence, dealing with these biases is more important than ever as we otherwise risk amplifying any bias present in the data.
Sometimes we gather new data through research to inform design decisions. Sometimes we’re leveraging existing data. Often, we’re doing both. In every case, we must ask questions about the data itself. The only way to deal with data bias is to be aware of it and address it when we collect data. If we’re inheriting data, we must make sure we are interrogating and validating it.
- If you’re collecting data as part of a research project to inform policy, are you gathering sufficient data to represent your user base?
- If you’re tapping into Open Data, are you doing enough to assure yourself of its provenance?
During Q&A, we were asked what the single most important step is that data practitioners can take. The answer is simple: make sure you disaggregate your data by sex. There is of course much more we should be considering to address data bias, but this is a great practical first step.
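To make that first step concrete, here is a minimal sketch in Python of what disaggregating by sex can look like. The survey responses, the field names (`sex`, `score`) and the helper functions are all hypothetical and invented purely for illustration; they are not drawn from the webinar or from any real dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey responses (illustrative only): a satisfaction
# score from 1 to 10 for an imagined digital service.
responses = [
    {"sex": "female", "score": 4},
    {"sex": "female", "score": 5},
    {"sex": "female", "score": 3},
    {"sex": "male", "score": 8},
    {"sex": "male", "score": 9},
    {"sex": "male", "score": 7},
]


def overall_mean(rows, value_key):
    """Average a value across all respondents, ignoring any grouping."""
    return mean(row[value_key] for row in rows)


def disaggregated_mean(rows, group_key, value_key):
    """Average a value separately for each group (here, each sex)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])
    return {group: mean(values) for group, values in groups.items()}


overall = overall_mean(responses, "score")
by_sex = disaggregated_mean(responses, "sex", "score")

print("Overall average:", overall)
print("Average by sex:", by_sex)
```

In this made-up data, the overall average sits midway between the two groups and describes neither of them well. Reporting the figures separately is what makes the gap visible. The same grouping approach could, in principle, be applied to other characteristics such as ethnicity or socio-economic background.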
Sharing data insights through storytelling
The webinar continued by exploring another critical challenge when using data to make decisions and inform the design of digital services. Most deliveries involve large, cross-functional teams working across multiple departments, organisations or agencies. It is crucial to communicate the meaning of data and share an understanding of what it’s telling us.
Dr Nicholas Heck-Grossek of Tableau highlighted that “with great data power comes great responsibility”. People have used storytelling to illustrate the meaning of data for hundreds of years, of course. Data storytelling is a structured approach to communicating data insights that combines the data with narrative and visuals to bring context and understanding.
Understanding data bias in practice
The webinar concluded with a panel discussion. We were lucky enough to be joined by four people who’ve played significant roles in delivering public sector digital services.
Sophie Adams heads up consumer experience for the Office for Zero Emission Vehicles and recognised the challenges of ensuring that the roll-out of charge points and infrastructure is designed for a diverse audience, not just male drivers and early adopters.
Ian Gordon, Head of Data for the Parliament Restoration and Renewal programme, highlighted the need to continue to challenge actual bias as well as tackle data bias.
Louisa Nolan, Head of Public Health Data Science at Public Health Wales, is focused on ensuring that different groups aren’t left out – whether based on gender, ethnicity, socio-economic background or other factors. She welcomed the increasing acceptance of data science in public health (and other) departments. However, she also pointed to the need to not just “disaggregate, analyse and monitor” but to act and engage with the people we want to help and the need for user research and other skills.
Among his responsibilities, Giuseppe Sollazzo, Deputy Director of the NHS AI Lab and founder of the Open Data Camp, is setting up ‘skunkworks’ AI initiatives that establish proof-of-concepts to test models. He recognised that established processes and existing datasets have biases embedded. The challenge is to understand these, try to make legacy data usable with adequate workarounds and rights of redress, and collect new data in a way that doesn’t introduce biases.
The panel were united on the importance of how we use data to design services, the dramatic effect that Artificial Intelligence could have in amplifying bias and that the level of automated decision making should be considered on a case-by-case basis to ensure it’s appropriate. There was clear support for codes of ethics and standards regarding data use, as well as the need to understand them at the point of collection.
At Valtech, we believe every Discovery needs to consider the provenance of the data it uses or collects. If you’d like to understand more about dealing with data bias in public sector digital services, contact us today.