Show Notes on Retain Responsibility
One of the core tenets of ethical behavior in data science is the need to retain responsibility or accountability. Where our take differs from the one most commonly conveyed is in the distinction between those two terms. Why, then, do we use the term “responsibility” instead of “accountability”?
Accountability is reserved for the holding of one person to the outcomes of a completed task or project. This person is answerable for whatever has happened.
Responsibility is finer-tuned. It can be shared among a group of people, each with responsibility for some amount of the outcome. Each constituent may then retain responsibility for resolving what they are able to if a problem arises.
This shared aspect of the outcome and remediation for data science gone awry is why we choose to look for responsibility rather than accountability. Data scientists, companies, governments, even everyday people are partially responsible for the data and algorithms in common use. We each play a role and must retain responsibility for our contributions.
In this episode, we explore the responsibilities that we all have in developing, demanding, and consuming data science-based tools.
Additional Links on Retain Responsibility
Accountability vs Responsibility: Diffen’s succinct definition of the difference between these terms, along with examples.
Retain Responsibility Episode Transcript
Welcome to the Data Science Ethics Podcast. My name is Lexy and I’m your host. This podcast is free and independent thanks to member contributions. You can help by signing up to support us at datascienceethics.com. For just $5 per month, you’ll get access to the members-only podcast, Data Science Ethics in Pop Culture. At the $10 per month level, you will also be able to attend live chats and debates with Marie and me. Plus you’ll be helping us to deliver more and better content. Now on with the show.
Marie: Hello everybody and welcome to the Data Science Ethics Podcast. This is Marie Weber.
Lexy: And Lexy Kassan.
Marie: And today we are going to talk about retain responsibility. So this is a concept when we are approaching data science and we want to make sure that we’re following our best practices. So Lexy, when you are putting together a data science model, how do you make sure that you retain responsibility?
Lexy: There are a couple of aspects to this. One is the ownership of the model: it’s the care and feeding of that model over time. We’ve talked about that as part of the data science process. Then it’s also kind of the accountability for what that model does in the future. At some point you may find that your model needs to be changed because there’ve been unanticipated effects, and it’s keeping the responsibility of doing that maintenance, doing that adjustment over time, so that your model is doing what it’s supposed to do and hopefully not what it’s not supposed to do. The first one really is, as you’re creating your model, being responsible and using those best practices. Obviously we talk a lot about them. It’s also, from the scientific aspect, making sure that you do your testing and validation and statistical testing and all the things that go into making sure that a model is generalizable, is repeatable, and is useful in the real world.
Lexy: That’s part one of modeling responsibly. It’s using some of our other best practices, making sure that you’re considering the context of the model and really approaching the problem from an ethical standpoint from the start. If you’re listening to this podcast, hopefully we’ve given you some principles to start with. As you get into the iteration and the care and feeding, it’s making adjustments not only to keep your model accurate and generalizable and applicable to the business and the situation that you’re in, but also to the changing environment at large. Over time, as your model is used for more and different things, it may start to hit upon circumstances that it was not trained on, and that can not only degrade the model but also have consequences that maybe were not anticipated and that you need to adjust for. And I think part of this is really the accountability, saying, “Yes, it was me that did this model. I worked on this. I am responsible for its actions.” Not that I’m saying you need to be [accountable] to the public at large for a model that does something that was unanticipated, but that you’re willing to take the steps necessary to remediate any problems, whether they be statistical issues or ethical issues.
Marie: As you look at retraining the model and analyzing the model, that’s something that can also change in terms of the timeframe in which you need to do it. In a certain situation you might be able to review a model once every six months, and in other situations you might need to review a model and update it on a much more frequent basis. As technology evolves, those timings can also change.
Lexy: Sure. The circumstances in which models are used and run are real time in many cases, which means that a small change, a change in the market, or a change in the news, you know, what’s going on in the world, can suddenly have an impact on a model that was otherwise doing its job just fine. Now the circumstances around it have changed. So even if you had been planning on evaluating it in a couple of months, when something comes up, you need to be responsible enough to say, “Okay, the context has changed. I need to look at this again,” regardless of the fact that from a timing perspective it wasn’t necessarily planned. That sure happens, and you can notice it in a number of ways. The most unfortunate is a news article that smears your company, and potentially your algorithm, with whatever has occurred, which now may be negative press. More often, you would see it in a degradation of a model’s performance.
Lexy: From a statistical standpoint, as a data scientist, you often look at certain metrics for your model’s accuracy and performance. Over time you might see that maybe it’s simply taking too long to run. That’s a very minor thing: going back, revisiting how it’s processing, and fixing it accordingly. It could be that the model is simply not performing as well in terms of the outcomes that it’s generating because something’s changed. We see this all the time, and most often it’s not the fault of the model. It’s that the model doesn’t know about other extraneous circumstances that have occurred, and so now it needs to know. We add these things in. You test the model, you retrain the model, you validate the model. You do all of those responsible things as a data scientist to make sure that once it’s done, hopefully it will remain applicable for a period of time so that you’re not doing this every single day.
Lexy: Because the hope is that you can set up the model to run for a while and, with machine learning techniques, it should learn over time as new patterns emerge so that you don’t have to retrain it specifically. But any change in the larger business context could mean gathering new data. It could mean adding new independent variables. It could mean using a different approach. So even a machine learning technique must be revisited over time to make sure that it’s getting the right inputs, the right food for the model to be able to produce the results it needs. It’s being responsible enough to say, “Yep, this needs to be redone,” not, “Well, I did my job and hands off now.” If you’re the one that’s still working in that organization, and you created that model, and you’re the one that knows it best, be responsible enough to continue to nurture that model and make sure that, from an ethical standpoint, it’s doing what it needs to be doing and that you’re doing what you need to be doing for it.
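The idea of watching a model’s metrics and revisiting it when performance degrades can be illustrated with a minimal Python sketch. The function name `needs_review`, the 0.05 tolerance, and the sample scores are assumptions made up for illustration; they are not from the episode, and a real check would use whatever metric and threshold suit the model.

```python
from statistics import mean


def needs_review(baseline_scores, recent_scores, tolerance=0.05):
    """Return True when the recent average metric has fallen more than
    `tolerance` below the baseline average.

    The metric (accuracy, AUC, and so on) and the tolerance are
    illustrative assumptions, not values from the episode.
    """
    if not baseline_scores or not recent_scores:
        raise ValueError("both score lists must be non-empty")
    drop = mean(baseline_scores) - mean(recent_scores)
    return drop > tolerance


# Hypothetical accuracy readings taken at validation time and more recently.
baseline = [0.82, 0.81, 0.83]
recent = [0.74, 0.73, 0.75]

if needs_review(baseline, recent):
    print("Performance has degraded; revisit the model ahead of schedule.")
```

A check like this only flags statistical degradation. A change in business context or an ethical concern still needs a person to notice it and act.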
Marie: And when you think about your role in the organization, another thing that ties into retain responsibility is listening to the business, listening to the business objectives, and maybe even things that people are looking at testing or even ways that the business is evolving and saying, okay, if we’re going to make that change, then here are some things that I need to go back and look at updating in my model.
Lexy: Certainly. And that really takes a lot of organizational communication. I hate to say it, but it’s rare that people will come to you proactively and let you know that certain things are changing in the business. More often than not, it’s not because they don’t think you should know. It’s just that they don’t know that you need to know. But the fact is that if you’ve created a model and they change something that’s an input to your model, the model will not perform as well. It’s just a fact of life that the business changes. And ideally, if you create a model, you’re doing this to impact a business process, to impact the business in some way. So over time that impact will have to be remodeled in. If you’re doing your job well, it actually creates the problem that we then have to solve. But it’s a good problem to have, because ideally you’ve developed a model that improves the performance of the business in some meaningful way. And so in order to predict the next change that you need to have in the business, you need to then revisit the model and revisit what you’ve put into it.
Marie: Right. Like, for example, if you had been tasked with designing a model that can help you identify customers that have the greatest ability of increasing their lifetime value, and then you actually are able to deploy a campaign that does increase the lifetime value, then what you probably need to do is say, “Okay, now we need to find the next segment that we have the best opportunity of improving lifetime value on,” and
Lexy: Redeploy the model, certainly. Well, hopefully in a situation like that, you’ve put together your model in such a way that you’re incorporating whether or not they responded to your recent campaigns, so that you don’t actually have to revisit the model. But what you might have to say is, “I need to revisit, from a business perspective, what the threshold is for the people that I’m going to say are likely to come back,” their propensity to return. Then you would say, “Well, maybe rather than somebody who’s between, let’s say, 70 and 80%, now I need to look a tier down. Maybe it’s 60 to 70%, because we’ve already touched these people who we thought had a really good likelihood to return.” Not necessarily people who were already going to return, since that’s throwing good money at a problem that doesn’t exist, but potentially going to that next group and saying, “You know, we’ve already touched them, and if they were going to come back, hopefully we’ve gotten them to come back.” At some point, you don’t want to pester them.
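The tier-down idea Lexy describes (move from the 70 to 80% propensity band to the 60 to 70% band, skipping people already contacted) can be sketched in a few lines of Python. The function `select_tier`, the customer IDs, and the scores are hypothetical and only illustrate the thresholding step, not any particular model.

```python
def select_tier(propensity, low, high, already_contacted=frozenset()):
    """Return customers whose score falls in [low, high) and who have
    not already been contacted, sorted by customer ID."""
    return sorted(
        customer
        for customer, score in propensity.items()
        if low <= score < high and customer not in already_contacted
    )


# Hypothetical propensity-to-return scores from an existing model.
propensity = {"a": 0.78, "b": 0.65, "c": 0.72, "d": 0.61, "e": 0.40}

# First campaign targets the 70-80% band.
first_wave = select_tier(propensity, 0.70, 0.80)

# The next campaign moves a tier down and skips anyone already touched.
next_wave = select_tier(
    propensity, 0.60, 0.70, already_contacted=set(first_wave)
)

print(first_wave)
print(next_wave)
```

Here the model itself is unchanged; only the business threshold applied to its scores moves, which is the distinction drawn in the conversation.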
Marie: As someone that comes from more of the marketing side of this equation, I’m always thinking about the balance between reaching out to somebody and getting our message out there, and not reaching out to them too much. So having those types of things included in a model, so I know I’m reaching out to the right audience with the right message, is always key.
Lexy: That actually brings up a really good point with regard to responsibility, which is that as a data scientist you may have an opportunity to proactively suggest models that make for more ethical business practices. What we were speaking to just now is a concept called marketing fatigue, which is that people get exhausted from seeing emails or messages from a given company. Think about how often some of the more frequent businesses that send you emails send you an email. I can tell you that from one or two brands that I subscribe to, I might get three or four a day. As a data scientist, I want to be able to say to that company, “Hey, would you like a model that maybe helps you predict how you should cut off your communications to somebody, or prioritize your communications to somebody, so that you don’t exhaust them as a consumer and so that they don’t unsubscribe?”
Lexy: It may not be in that context. There are other, potentially more sensitive contexts in which this would occur from an ethical standpoint. But it’s still very possible that as a data scientist you could come up with an idea that provides for a better experience, or provides for a more ethical use of the information that you have, that maybe the business isn’t asking for but should, and it’s up to you to say this is possible. Paint that picture of the art of the possible and indicate to the business that there’s value to that. Communicate that value, because sometimes the business just goes after what it thinks it should be going after and isn’t considering some of these other obligations and other areas and avenues that it should. So be proactive in considering what you can do with data science and in helping the business to see that opportunity.
Marie: And even as a marketer, there are certain tools and systems that you use that might include email automation or different marketing automation systems. What I’ve seen at different points in my career is that people will set up campaigns or systems to send out emails, but they don’t consider all the different emails that are going out at the same time. So one of the ways that you really want to retain responsibility, even on the marketing side, when you’re looking at the different people that you’re communicating to (and this does play into data science ethics, because you don’t want to over-touch people and you don’t want to frustrate people with your brand), is to think through what somebody is experiencing when they’re receiving campaigns from your company. Am I emailing them, and then calling them, and then maybe even mailing them, a couple of times a month or multiple times a day? And what’s the right balance?
Marie: And you can basically walk through your campaigns as if you were the customer and really envision what that user experience is, and then say, “What would be a better user experience? Maybe I should just stop this campaign that’s been running for a while, because maybe we now have better campaigns that do a better job of achieving the business goals and we don’t need this other one that was an older version.” Sometimes people get very scared, like, “But we’ve had that campaign for so long, we don’t want to pause it.” Sometimes that’s the best thing that you can do, and if you touch your customers fewer times, you can actually get a better response when you do reach out to them.
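Marie’s point about looking at every touch a customer receives across channels, rather than one campaign at a time, can be sketched as a simple contact-frequency check. Everything here is an illustrative assumption: the function `over_contacted`, the 30-day window, the cap of four touches, and the sample data are invented and not from the episode.

```python
from collections import Counter
from datetime import date, timedelta


def over_contacted(touches, as_of, window_days=30, max_touches=4):
    """Return the set of customers touched more than `max_touches` times,
    across all channels, in the `window_days` before and including `as_of`.

    `touches` is an iterable of (customer, channel, day) tuples.
    """
    window_start = as_of - timedelta(days=window_days)
    counts = Counter(
        customer
        for customer, _channel, day in touches
        if window_start < day <= as_of
    )
    return {customer for customer, n in counts.items() if n > max_touches}


# Hypothetical touch log combining email, phone, and direct mail.
touches = [
    ("x", "email", date(2024, 3, 2)),
    ("x", "email", date(2024, 3, 5)),
    ("x", "phone", date(2024, 3, 10)),
    ("x", "mail", date(2024, 3, 15)),
    ("x", "email", date(2024, 3, 20)),
    ("x", "email", date(2024, 1, 1)),  # outside the window, not counted
    ("y", "email", date(2024, 3, 3)),
    ("y", "phone", date(2024, 3, 25)),
]

flagged = over_contacted(touches, as_of=date(2024, 3, 31))
print(flagged)
```

Counting across channels is the design choice that matters: a per-campaign cap would miss a customer who is emailed, called, and mailed by three separate campaigns.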
Lexy: [inaudible] Another example, maybe in the medical field, might be as AIs get better at diagnosing specific conditions. There’s a lot of work going on in medical imagery right now in identifying, for example, cancerous growths and so forth. If there were an AI that was indicating whether or not a given patient would be a good candidate for a surgical procedure, for instance, it might be a better option to ask, “Is this person a good candidate for a more conservative treatment prior to a surgical procedure?” Now, admittedly, in the medical field there is not an AI right now that is prescribing a remediation for a problem. It’s simply identifying a problem and then alerting the physician, who would then prescribe a course of action or discuss a course of action with the patient. However, at some point we have to wonder whether there may be something that the computer could see, that the algorithm could see, to say this person is a better candidate than this other person for a given procedure, and maybe rather than just saying yes or no to surgery, it could have some of that nuance. From an ethical standpoint, medical ethics is a whole other ball game that I’m not going to go hugely into, so admittedly we’re going to kind of sidebar a lot of this. But from an algorithmic ethics standpoint, we could approach at least some amount of the furthering of the algorithm to be, rather than a binary decision, more of a best-action, prescriptive (and I use that loosely) course-of-action algorithm to be able to give a conclusion.
Marie: After you’ve retained responsibility for the algorithm that you’ve worked on, that kind of leads into something else that we’ve talked about before, which is to anticipate adversaries. So if somebody is out there and does something with your algorithm that was unanticipated, how do you address that? We’d like to point you to our other episode on that topic, “Anticipate Adversaries,” for more details in that arena. We will link to it from this one. So Lexy, thank you for going over how to retain responsibility with your data science models. This has been Marie Weber.
Lexy: And Lexy Kassan.
Marie: Thanks so much.
We hope you’ve enjoyed listening to this episode of the Data Science Ethics Podcast. If you have, please like and subscribe via your favorite podcast app. Also, please consider supporting us for just $5 per month. You can help us deliver more and better content.
Join in the conversation at datascienceethics.com, or on Facebook and Twitter at @DSEthics where we’re discussing model behavior. See you next time.
This podcast is copyright Alexis Kassan. All rights reserved. Music for this podcast is by DJ Shahmoney. Find him on Soundcloud or YouTube as DJShahMoneyBeatz.