Practical Data Ethics — How You Can Make Your Data Work More Ethical
As a junior data professional, getting your company to implement ethical approaches can feel as daunting as climbing a mountain. But there are some ways you can make it easier!
Data ethics is important. We’ve all seen notable examples of questionable methodology in data work, from invasive approaches to collecting data for facial recognition algorithms to flawed crime data used in predictive policing models, along with the problems that can arise from them.
Many of us aren’t anywhere near data like this. But we are still concerned with how the projects we work on can be poorly handled or our findings misinterpreted, both of which can have serious ethical implications.
In this post, I’ll cover ways a junior data professional can start conversations about ethical approaches to analytics, especially when they have little to no decision-making power in their larger organization.
Whether attempting to address issues with an existing project or proactively seeking to prevent future mishaps, these approaches have proven useful in starting the necessary conversations.
Identifying pain points
Before you can begin, it’s essential to reflect on the kinds of data that you, your team, and your larger organization work with and rely on. You should also think about the kinds of reporting and analytics that your team creates.
Organizationally, this practice of reflection personalizes conversations, prevents people from engaging with these topics as abstract concepts, and is more likely to motivate actionable steps to embed ethics in existing workflows.
To illustrate this, let’s use data pertaining to education as an example.
In this sector, you might be analyzing data on student performance or engagement, which in many instances will be analyzed along with demographic variables such as race/ethnicity, gender, and income.
If the research questions being asked are about vulnerable or marginalized groups, it’s important to ask the following:
- What value or support do these questions offer to said groups?
- Are there unspoken but implied assumptions about the people the data is collected on?
It’s also common to collect more data on students in the form of surveys, which raises the following questions:
- What are we asking students and why?
- Do we ask some questions to specific students and not others?
- Do some of these questions in and of themselves harbor implicit assumptions that may have a negative effect on those responding?
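If your survey design is recorded anywhere in tabular form, the second question above can be checked mechanically. The following is a minimal sketch, assuming you can export a list of (group, question) pairs; the group names, question texts, and function names here are hypothetical and for illustration only. It flags questions that were put to some groups but not all of them, which is a prompt for discussion and not a verdict.

```python
from collections import defaultdict


def questions_by_group(assignments):
    """Map each survey question to the set of groups that were asked it."""
    asked = defaultdict(set)
    for group, question in assignments:
        asked[question].add(group)
    return dict(asked)


def unevenly_asked(assignments):
    """Return questions that were not asked of every group.

    The result maps each such question to the sorted list of groups
    that did receive it.
    """
    asked = questions_by_group(assignments)
    all_groups = {group for group, _ in assignments}
    return {
        question: sorted(groups)
        for question, groups in asked.items()
        if groups != all_groups
    }


# Hypothetical survey design: (group, question) pairs.
assignments = [
    ("program participants", "How often do you skip class?"),
    ("program participants", "What do you enjoy most about school?"),
    ("all other students", "What do you enjoy most about school?"),
]

flagged = unevenly_asked(assignments)
for question, groups in flagged.items():
    print(f"Only asked of {groups}: {question}")
```

A flagged question is not automatically a problem, since some questions only make sense for program participants. The point is to make the team say out loud why a question is targeted and what it assumes about the people receiving it.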
Not engaging with these questions can lead to poor research.
For example, imagine a survey project meant to gauge how low-income students are engaging with a program meant to support them.
If the work has not been done to explicitly call out assumptions about the capabilities, interests, and life experiences of students, it’s not uncommon to see survey questions explicitly asking students how often they skip class, bully other children, or even how frequently they engage in criminal activity.
Unfortunately, these are real questions that have been asked of students in such programs.
Not only are these questions pointless (we should not expect respondents to admit to such things, even anonymously), they also weaken the trust of the students this program is trying to help.
These questions can make students feel as though the organization has already painted them with a broad brush. How likely is a child to believe that you are really trying to help them if your survey questions assume they’re misbehaving?
Whether you’re attempting to address an ongoing ethical quandary or preempt potential issues in your work, you’re going to need support.
A key next step involves inviting others into conversations about these topics. There are two key points when it comes to enlisting allies. First:
- Start within your own team and those who work closely with the data
If you’re new in your role, a fresh perspective can be especially crucial in revealing blind spots: issues that may have been overlooked by more senior members who’ve grown accustomed to “how things are done here.”
Over time and with a growing consensus, a unified front from those working with the data can really help with making suggestions for new directions when engaging in reporting to internal and external stakeholders.
This leads to the second key point:
- Invite non-technical professionals in your organization whose projects rely heavily on analytics
While your perspective can reveal blind spots to the more senior members of your team and other data-savvy individuals, outside perspectives from people who never touch the data can reveal blind spots that you yourself may have on these issues.
Including more people in the conversation also brings with it the added benefit of demystifying data within your organization.
At any company, data should be critically discussed and engaged with, not treated like an objective solution only experts can use.
Speaking to your audience(s)
As you navigate these conversations, it’s important to bring the right framing to each group. In terms of technical professionals, I’ve found it best to approach the conversations focusing on a mixture of ethics and methodology:
- Is the way we’re going about our work causing more harm than good in terms of the questions and reports we’re putting together?
- Are our questions laden with assumptions that need serious reflection?
From a methodological standpoint, by improving our ethical approaches we will no doubt more accurately investigate our questions. We are also likely to find more interesting and impactful stories in the data we analyze.
Many of the ethical issues we’ve seen in recent years in data science cause harm, and that’s the most important issue to address. But they also paint a grossly inaccurate picture of what is going on in the real world. When you need to convince certain stakeholders at your organization, framing the discussion in terms of accuracy may help.
The ethics and methodology approach can also be useful framing for those not immersed in the data. But if that doesn’t work, concerns over liability and image can grab the attention of those who, for whatever reason, remain unmoved by ethical or methodological arguments.
In other words: even if a team member doesn’t care that your approach is causing harm or risking inaccurate results, they may care that it represents a potential legal or PR threat.
We’re a long way from real accountability for the misuse of analytics, but issues such as data breaches already bring real-world repercussions.
- To what extent does the way data is collected, stored, and reported open your organization up to legal and other real-world repercussions?
- To what extent is the image of an organization affected if knowledge of biased and unethical approaches is ever exposed?
While the focus should always be on good methodology and ethical concerns, sometimes framing the conversation in these ways may be the only way to get some attention and at least get the conversation started.
A mixture of these and other framing perspectives can be used as you, and hopefully those invested in this work alongside you, have larger conversations about the work being done with the data at your organization.
Patience is a “virtue”?
Finally, it goes without saying that this kind of work takes time, and with that come pros and cons. Let’s start with the bad news.
- You may have to watch a flawed project go through multiple iterations before any changes are made
Once you’ve come across problems embedded in a workflow and have a few key people on your side who agree, there may still be roadblocks to making the necessary shifts.
These may be due to external resistance or an internal priority not to change anything that could potentially upset revenue streams or client satisfaction.
This can be frustrating, to say the least, and you’ll need to find a balance between knowing when to push for a change and when you’ve reached the limits of what can be done given institutional barriers. However, even in these situations, your work need not be in vain.
- Just having the conversation can have lasting positive outcomes
Real changes are possible when it comes to improving the ethical framework in data workflows.
For those flawed projects that for one reason or another still go through in their current form, the genie is out of the bottle when it comes to their inherent problems. Changes may not be happening, but no one can say the issues aren’t known.
More importantly, once ethical issues are raised, people are more likely to embed an ethical approach into new projects during their development, which goes a long way toward preventing new issues from becoming part of the status quo.
Keeping at it
Win some, lose some is the name of the game when it comes to this kind of work. Every organization is different, with varying levels of enthusiasm for tackling these issues.
However, at the end of the day, applying an ethical framework to data work does not have to be an abstract concept.
Reflecting on the ways in which our own work can be improved, and generating buy-in to meet these goals collaboratively, are both possible.
This work is vital, and unfortunately there will be roadblocks that make it challenging to use data to tell accurate stories and avoid causing harm.
But that’s what’s at stake, so don’t be deterred!