Show Notes on Encode Equity
Organizations have flocked to data science as a means of achieving unbiased results in decision-making on the premise that “the data doesn’t lie.” Yet, as data is reflective of the biases in our culture, in our history, and in our perspectives, it is particularly naïve to assume that models will somehow smooth everything out and provide equitable results. The truth is that it falls on the shoulders of everyone working on and with data to question whether it is likely to produce the intended, more equitable outcomes or if, instead, it may propagate a pattern of injustice that is endemic to the data itself based on representations of the past.
Today, we talk with Renee Cummings, Data Activist in Residence at the University of Virginia and Founder of Urban AI, on the need to encode equity into data science and artificial intelligence.
Additional Links on Encode Equity
Renee Cummings, Data Activist in Residence and Criminologist within the School of Data Science at University of Virginia
Equality, Equity, and Social Justice – a succinct overview of the distinction between equity and equality from the University of Florida
Encode Equity Episode Transcript
Welcome to the Data Science Ethics Podcast. My name is Lexy and I’m your host. This podcast is free and independent thanks to member contributions. You can help by signing up to support us at datascienceethics.com. For just $5 per month, you’ll get access to the members-only podcast, Data Science Ethics in Pop Culture. At the $10 per month level, you will also be able to attend live chats and debates with Marie and me. Plus, you’ll be helping us to deliver more and better content. Now on with the show.
Lexy: Welcome to the Data Science Ethics Podcast. This is Lexy Kassan.
Marie: And Marie Weber.
Lexy: And today we are joined by Renee Cummings. Renee is a criminal psychologist and data ethicist. She is currently serving as the data activist in residence at the University of Virginia. Renee, thank you so much for joining us.
Renee: Thank you for having me. It’s certainly an honor, and I know it’s going to be a pleasure.
Lexy: Wonderful. Today, we’re gonna be talking about Encoding Equity. This is the last in our framework of practicing data science ethics. I’m thrilled that you can join us for this one.
Lexy: First of all, as we think about this topic of encoding equity, I wanted to dig in because we even had some debate on what to call this – specifically about the difference between equality and equity. Renee, in your opinion, what is the distinction between equality and equity? And why is equity perhaps more pertinent in data science?
Renee: Well, I think equality is about providing the same for everyone. Equity recognizes that we all don’t start from the same place, which means we must always acknowledge that diversity and ensure that we are inclusive in the ways in which we are thinking and ensure that we make those adjustments to always correct the imbalance. So equity is more about fairness and impartiality and justice.
Lexy: In terms of where we see equity or equality playing out in data science, what are some of the areas you’ve focused on in your research and your activism, areas where there are opportunities for better equity in data science?
Renee: Well, I think my work looks at equity in general because I look at data. And one of the things that I do is I try to deconstruct the data. I try to disrupt the data. We’ve got to think about data, at least for me, I look at it, as a power structure. So it’s an infrastructure and it’s also a power structure. And the thing about data is where there is power, there is privilege. And where there is privilege, often there is less equity than we need to see. So when you think of a dataset, and you think that the dataset has a history and it has a memory, you are bringing with it, sometimes, those old biases and old forms of discriminatory thinking and old forms of prejudice that are moving their way into the new technology, which we are using data science to design.
Renee: So if we want to be fair and we want to be impartial, then we’ve got to reimagine the data sets that we are using. We’ve got to reimagine the ways in which we classify and interpret and analyze data. And we’ve also got to recognize, and this is something I talk a lot about, the fact that through data sharing we can transmit intergenerational trauma. So when you think about equity and you think about data, you realize that without equity, you are not going to get the data that you need to build systems that we can trust.
Marie: Wow, Renee. You just painted so many pictures for our listeners to sit with and imagine. I love how you said that you are looking at ways to disrupt the data. I think Lexy and I can both appreciate where you’re coming from there.
Lexy: We’ve talked in the past about collecting data carefully and evaluating data sets. It sounds like, from your experience and what you were just describing, part of that is ensuring not only that you collect data in a cautious manner, trying to remove the biases that you’re specifically trying to strip out of the systems that you’re encoding, but also that you’re not propagating some of those biases, even accidentally. So it’s not just the collection, but also how that data redistributes. What advice would you have for people who are in positions where there are data sets that they can share, or that they’re providing for others to use in decision making and in other algorithms? How can they ensure they are not pushing data that has the types of biases that would then get encoded and push inequities into other algorithms?
Renee: So what I say is, we need to build a new type of consciousness around our understanding of data, because data equals decision making. And when you are making decisions, it means you are creating access. You are creating opportunities, you are sharing resources, you are sharing wealth. Certain communities are going to thrive. Certain communities may not survive. We are giving priority to certain groups, and it’s also about progress. So that’s how I think about data. So if it is that you are working with data, you have got to be vigilant of those areas, and you have got to have a kind of due diligence that is so sharp that you understand that with every decision you make, you may be impacting due process, or you may be impacting duty of care. So one of the things that I really like to encourage, just from a risk management perspective and from a rights-based perspective, is that we understand the risks within the business model. And we also understand data rights. As we think about human rights, and as we think about civil rights, we have got to think about data rights. And if it is you bring that kind of consciousness and intention to your work as a data scientist, then you are ensuring that your work is as fair as it could be and is certainly impartial, and that equity and diversity and inclusion are critical aspects of any product or program or process that you are designing.
Lexy: You mentioned due diligence. That’s a fantastic concept; it comes from law and from accounting and so forth. But it really does speak to the heart of the need for this type of practice, of being very intentional about the use of data and the use of algorithms, how that’s going to manifest in the world, and the impacts it’s going to have. We’ve talked too, in prior episodes, about considering the context in which you’re building an algorithm. For example, if you’re in a fairly high-stakes environment, where the decisions that you’re making really do dictate the resources being distributed, systems that will impact people’s lives materially, versus maybe something that’s lower risk, lower impact, there’s a need to be that much more vigilant, to do that due diligence proportionally. Do you feel that there are areas where it is less of a concern?
Renee: You know, I don’t think so. I always think of data science, and the movement into artificial intelligence as we advance with new and emerging technologies, as all being high stakes because they leave a legacy. And when you think about the painful past encoded in data, and when you think of some of the punitive uses of data to punish and to disenfranchise certain groups, and when you think of the adverse data experiences of vulnerable and marginalized communities and those misleading data narratives often encoded in the algorithms, nothing is low stakes. Everything to me is high stakes. So this is why I’m saying that we’ve got to build a new kind of consciousness, because we don’t want to continue those stories. You may think it may just be a product, but you don’t know the impact, or you may not have been able to foresee, because of a lack of due diligence, the impact it could have on a particular group. Because if we’re talking about diversity, we’re talking about race, culture, gender, ability, access, geography, sexuality. So we don’t know what impact a data-driven product or policy or program may have on a particular community, and how the future progress of that community could be impacted. So I think it’s something that we have got to pay particular attention to, and we really have to be vigilant in any application of data, because it leaves a history, it creates a memory, and it leaves a legacy.
Lexy: That’s a phenomenal quote. It’s so powerful. As much as we say that data is reflective of history, it is reflective of the data that was collected, not necessarily all of history. It’s so true.
Renee: Exactly. And then I would ask you, who did the collection?
Lexy: Absolutely. And what did they care about? What were they looking for?
Renee: Exactly. And then how did they classify that? How did they interpret that? What did they include? Who did they exclude? And these are the reasons why we’ve got to be so vigilant moving forward with new and emerging technologies.
Lexy: We’ve talked at this point about the equity part of this phrase, “encoding equity”. As we think about the encoding part of it, data science is typically done with code, or generates code that then does the same thing. What is the longevity of something that is encoded, in your mind?
Renee: It is forever. It is forever. Think of the criminal justice system. Think of the data sets that have been used in the criminal justice system. When you think of the kinds of data points that have been used to create particular models, and then you realize that many of those data points are actually proxies for race, and you create these systems using algorithms, and you design these algorithmic decision-making systems, and then you use them in policing, you use them in sentencing, you use them in corrections, and then you realize, wait a minute, the data sets are biased and discriminatory. And now we have a situation in the criminal justice system where disproportionate sentences have been handed down to black and brown people. And now we’ve got to pull back and we’ve got to push back against these systems. And now we’re thinking about the kinds of challenges facing the criminal justice system.
Renee: And when you think of something like the black box, and you say, you know, the algorithms in there are now trade secrets or intellectual property. And now you have someone who is trying to get parole, trying to leave the system and return to society, and that person is constantly being denied because of an algorithmic decision-making system or risk assessment tool that’s part of a parole hearing. I mean, where is the equity? Where is the fairness? Where is the justice? Something so simple creates an extraordinary amount of trauma, not only for that individual who’s incarcerated, but think of the family, the generation. So when we look at it, we’ve got to think about what these algorithms are doing when they misbehave. It’s not, as people sometimes say, well, that’s what the computer said. It’s so much more than that. And we’ve got to think about that. And I think criminal justice always creates the requisite level of conscience for AI.
Lexy: In addition to that need for consideration, not only of the data, but also of the algorithm itself, that it will have a legacy, it seems as though it kind of hearkens back to retaining responsibility for that algorithm. You can’t just say, “that’s what the computer said. Therefore, the decision is made and thus ends the conversation.” Is there a need, do you think, in the industry, even beyond criminal justice, to have some sort of a cycle or a lifetime to a given algorithm, some sort of means of revisitation?
Renee: Sure. Well, we know there are impact assessments, and we know there are supposed to be audits of algorithms, and a law was passed in 2020 that spoke to that. And it impacts every industry. You know, when you think about it, HR, education, healthcare, credit scoring, finance, just about every industry, we have the impact of algorithms misbehaving, or algorithms really creating some very contentious situations. And what we see is that even the organizations that have the technical resources and the talent to do the audits are not doing them. And this is why, for me, it’s so important in the design and development stage to really have high levels of intellectual confrontation among teams, and to really explore that question of diversity, equity, and inclusion from design across the life cycle to deployment, to understand that there are certain risks.
Renee: So you’ve got to detect, you’ve got to mitigate, you’ve got to monitor those risks. You’ve got to evaluate those systems and you want to improve those systems. So auditing the impact assessments are what have been presented at this moment to create the requisite levels of checks and balances algorithms, but yet they are not working as efficiently and effectively as we would’ve hoped. So something is definitely missing from that conversation. It could be that because it is something that is required of these organizations and really not enforced that there is really no will to really get it done. But then there’s also the concept of accountability and transparency, and the fact that many of these algorithms and many of these algorithm decision making systems are really opaque and that opacity could be with intention. And that is something that we have been debating because what we are seeing is when these algorithms are opaque, they really impede the ability to gather ENT evidence that is necessary to see how they work and to see whether or not they are working fairly. So when you think about accountability and transparency and explainability and all those great things, and then you have these really opaque algorithms and we are unable to truly interrogate them, you have to ask your self, you know, what’s the question there really?
Marie: Renee, you may not be trying to, but I love how you are touching on so many of the areas of our PRACTICE framework. It’s really exciting that this is an example of you being in the industry, kind of in the weeds day to day, and you’re hitting our framework naturally, organically. So that’s exciting.
Marie: Since you’re bringing up a bigger question about how these systems function in our society, are there certain datasets that maybe we have included in systems, that we thought were going to be useful, and now we realize that they might be making the system make biased decisions, or they might not be needed for the system to come to equitable decisions? Are there times when you think it makes sense for data to be removed from a system, so it can be more equitable by actually having less data?
Renee: I think, for me, you know, there is no perfect data set, as there is no perfect algorithm. But if we were to have that kind of intention and understanding and consciousness and vigilance, then when we bring that kind of critical thinking before the application of these datasets, what we are doing is naturally creating those really sturdy guardrails of ethics that are required to examine the data that we are going to use. So it comes back to due diligence. This is why, you know, people always say to me, what is a data activist? And one of the things that I always say is, a data activist is the conscience of the data scientist. And that’s what I’m speaking to. We have to have this broader, more dynamic and multicultural cosmology as individuals who work with data, and not take data at face value. That’s what we’ve got to do: get behind it. Look for those historic and systemic and institutional biases that may be baked into it. Remember, we live in a society where widespread biases persist. Remember, there are many preexisting patterns of exclusion and inequality that have impacted many communities. So what I am simply saying is that data scientists have got to broaden their perspective when they are looking at a data set.
Lexy: We have talked about inclusivity, essentially, in garnering perspectives from a multicultural, diverse group that can evaluate data from a different lens. Have you seen this work well in organizations, in data science teams? And if so, how?
Renee: Well, I think data science teams are now looking in that direction, and you would find, across academia as well as companies, they’re starting to realize that we need these interdisciplinary teams to work. What they’re realizing is that we need to combine the data scientists with the social scientists, with the psychologists, with the educators, because what we want is that kind of diversity of perspective, diversity in approaches, diversity in interventions. Because where there is diversity, you are enhancing the imagination that you are bringing to the table. So if you don’t have equity, you’re not going to have diversity, and you’re not going to have inclusion, and you’re not going to have that interdisciplinary imagination that’s required to really enhance not only the business model, but to build up the organization and to build up the technology. Those are the critical aspects that we always want to look at when we are thinking about this data power structure, when we are thinking about the design, the definitions, the designations, the distinctions that we’re going to use. Because when it comes to data, I think we can all appreciate that data is a system that is not only economic, but social, political, and cultural.
Lexy: It sounds like, in that system, there’s the potential for several different vicious cycles. One is the type of bias that gets encoded into algorithms and creates further inequity; that’s what we’ve been talking about primarily today. But there’s also the lack of inclusivity or diversity potentially leading to inequitable outcomes that put the specific people who’ve been building data science algorithms into even further power, which means they stay in their bubble without including additional perspectives. And it just keeps going. In these cases, how do you break out? Especially for data science teams that are smaller, maybe in organizations that don’t have the kind of interdisciplinary capabilities that may exist in, say, academia, where can you find groups to bounce ideas off of and to look at data with you, essentially?
Renee: Well, I think any organization worth its salt would realize that this is the trend. This is the movement. And I think every organization pays attention to some of the crises that happen. We have seen, with algorithms, a proliferation of cases in which extraordinary bias and discrimination keep popping up. So we know this is a challenge for the industry. At that point, I think anyone, or someone in the organization, has got to look at what the culture of the organization is when it comes to an ethical approach to new and emerging technologies. And beyond that, you’ve got to look at the organization and see whether or not it’s diverse, whether or not it’s inclusive. You’ve got to look at who’s at the table when we are coming up with these concepts, when we are developing, when we’re deploying. And if you don’t have that mix, then that’s something that the organization itself should really try to procure, because you want to procure that level of diversity.
Renee: And it’s more than just looking for bias or dealing with discrimination. I spoke about this earlier: it’s about decisions, and when decisions are made, which are the groups that are impacted fairly and justly, and which are the groups that are denied access to opportunity. It’s also about the digital divide, and the wider the digital divide gets, the wider the wealth gap gets. We’ve got to also think about the differences in outcomes. When we talk about equity and this whole concept of the lack of diversity, equity, and inclusion, it comes back to the differences in outcomes for certain groups: which groups are dealt with fairly, which groups are able to prosper, and which groups may be denied. So there are so many things to consider when we think about data and when we think about equity. That’s why it’s critical for any organization, no matter its size, to really procure that diverse talent and that interdisciplinary talent to add to and enhance its business model.
Marie: I feel like outcomes is where so many organizations start, in terms of why they pursue different algorithms or different data science projects. It’s a promise of, “well, we’ll be able to provide more loans for people, and we won’t be biased, because we won’t need somebody to go into a branch and sit in front of a loan officer. They can do it online. It can be automated. It can be faster. It can be a better user experience.” But then, I think, like you’re saying, Renee, you need to make sure that the outcomes you envisioned at the start are really being actualized, and if they’re not, take those hard looks at how to adjust the system. Or, even better, really think about how to make sure that new and improved user experience is going to have equity encoded in it, so you don’t end up with a different set of biases.
Renee: Exactly, because you’ve got to ask yourself: this user experience is for what profile of user? Look at the datasets that you’ve been using, or the data points that you’ve been using. When you think about something like digital or algorithmic redlining, or you think about how particular groups have been denied access to credit or finance, and then if you’re building financial tools, who are you building them for? Because those groups that you have denied access, with the data sets that you’ve used, are certainly not going to get access to these new financial tools where, you know, they’re going to have that great user experience. This is why it’s so critical when we think about wealth, and we think about resources, and we think about opportunities, and we think about those critical decisions that we use data to make, and how certain groups are really continuously marginalized and disenfranchised. When we speak about equity, what we are speaking about is representation, amplification, and visibility of all groups.
Lexy: You mentioned that this is the trend, being more conscious, being able to ensure that the outcomes you are driving towards are more equitable and are not disparately impacting different groups. But we’ve also talked about the fact that some organizations are doing assessments and compliance essentially for fear of a reputational rap on the wrist. At what point does that trend tip? Have we hit the point where companies and other organizations are being more proactive, not just from the fear of somebody scrutinizing them and saying, “oh, you messed up pretty badly here,” but from the perspective of really wanting to do the right thing, even if it means potentially less profit or less cost savings, or a slower path to growth for what they’re trying to build in outcomes?
Renee: Well, I think companies are realizing that ethics washing, or window dressing with AI and data ethicists, or ethical theater as I call it, that full ethical performance of “we are being so ethical because we have designed these fantastic frameworks,” these codes of conduct or professional standards for the application of data and the ways in which we deploy new and emerging technologies, I think people have really seen through that. And many companies now are realizing that there is a huge and very vocal and vibrant and dynamic movement in artificial intelligence, and technology in general, that calls for responsible technology, a movement that is really examining and exploring the intersections of technology with racial justice and social justice, design justice, algorithmic justice, and data rights. What they’re realizing is that, at this moment, you really have got to do the right thing for the right reasons, because there’s a whole movement of advocacy and activism that’s looking at the impact of data on society. So it’s not about trying to get away with it because we can, or trying to get away with it because we are so big as a technology company that we are making the rules, or because we are so advanced in our technology that the law is so far behind it can’t catch up to us, or because we are working with so many governments, providing not only the infrastructure and the hardware and the software but being everywhere within a government, that we have the kind of relationship with governments where you are not going to see the kinds of regulation and legislation that are required. I think what they are realizing is that there are individuals who really care about the long-term impacts of this technology on society, and these individuals are really calling them out.
And I think we saw that last summer, when several of the big tech companies put a moratorium on further research and development of facial recognition technologies and said, wait a minute, we need to pause. We really need to look at the impact of this technology on certain groups, and we really need to look at how it’s being deployed. So I think, beyond the ethical theater that we’ve all so enjoyed, and all those great actors and performances when it comes to ethics, people are really stepping up to the plate and realizing, you know, we’ve got to do the right thing in real time.
Marie: I have kind of a left-field question. Renee, as you’re talking through the ethical theater, and people becoming more aware of how these things play into their daily life, do you think there might be people also just thinking more about their privacy and what that means, realizing what they’ve given up over the past few years? Might there be more of a return to people wanting more privacy, and a shift towards that in our technology as well?
Renee: I think that’s an excellent question. Privacy is huge. And I think, more and more, things like predictive policing and mass surveillance technologies are bringing privacy right to the front burner, because people are starting to realize, “I could be impacted as well because of my data.” People are starting to say, wait a minute, I would like to opt out of this service. And then they’re realizing that sometimes organizations or companies make it so difficult for you to opt out of a particular service. So I think people are realizing that, and data rights, I keep saying, are really going to be like the civil rights movement of this generation, because people have realized, wait a minute, my data has been monetized by big companies, and they are profiting off of my data, and I’m not seeing a penny of that.
Renee: And now they’re seeing, through things like big data policing and predictive policing and mass surveillance technologies and all of those great things that are being designed, that my data could now be weaponized against me. So not only is it being monetized, it’s also being weaponized. And I think, more so because of COVID-19, as we are now seeing that systems integration between healthcare data and national security data, people are starting to wake up to the fact that we may have given up too much of our privacy for the convenience and expediency of technology. And there is that movement that’s saying that if we move forward with new and emerging technologies, we want something that is responsible, something that is principled, and something that is trustworthy.
Marie: Totally agree.
Lexy: Are there any other thoughts you’d like to share on encoding equity or its place in practicing ethical data science?
Renee: Well, I think, you know, this is a very exciting time for technology. I’m very passionate about AI. I’m really passionate about its potential, its promise, and the extraordinary impact it could have on the progress of society. I mean, this is really a great time for technology, and while I love what this technology can do, I believe we’ve got to provide those protections and those ethical guardrails to ensure that it is fair, it is accountable, it is transparent, it is explainable, and to ensure that diversity, equity, and inclusion are critical to the design matrix. Because what we want to do with this technology is really ensure we create new systems that benefit the greatest majority, and what we want to totally ensure is that whatever we create does no harm. And we understand that we want to create a legacy that we could be proud of.
Lexy: That’s a wonderful sentiment to finish with Renee. Thank you so much for joining us. This has been an absolutely marvelous conversation.
Renee: Thank you so much for having me. It’s certainly been a pleasure.
Lexy: And thank you all for joining us on the Data Science Ethics Podcast. This has been Lexy Kassan.
Marie: And Marie Weber.
Lexy: See you next time.
We hope you’ve enjoyed listening to this episode of the Data Science Ethics podcast. If you have, please like and subscribe via your favorite podcast app. Also, please consider supporting us for just $5 per month. You can help us deliver more and better content.
Join in the conversation at datascienceethics.com, or on Facebook and Twitter at @DSEthics where we’re discussing model behavior. See you next time.
This podcast is copyright Alexis Kassan. All rights reserved. Music for this podcast is by DJ Shahmoney. Find him on Soundcloud or YouTube as DJShahMoneyBeatz.