Colleges Fake Data for Rankings

Colleges and universities fake data to get better rankings

Episode 30: Colleges Fake Data for Rankings – Show Notes

Going to college is a life-altering decision for young adults. They prepare themselves with academics, standardized tests, extracurricular activities, and a host of research as to the best fit for their interests. The colleges, for their part, provide marketing materials, campus outreach, tours, and more to try to lure talented students with good potential. They also provide information to third-party ranking companies to list them among their peers. The inputs to the ranking algorithm are public knowledge and the stakes are high. That’s why some colleges have been discovered using fake data for rankings.

Today we discuss what happens with too much transparency and not enough oversight.

Additional Links on Colleges Fake Data for Rankings

UMKC Bloch School submitted false data for rankings, but results may not have changed, auditors say: Kansas City Star article detailing the many ways that the University of Missouri-Kansas City falsified data provided to the Princeton Review to receive higher rankings

Updates to 8 Schools’ 2018 Best College Rankings Data: US News report that eight schools were discovered to have submitted incorrect information for ranking and were unranked for the year

University of Oklahoma Gave False Data to US News College Ranking for 20 Years: CNN coverage of the 2019 discovery that the University of Oklahoma had provided fake data for decades before being outed

In the New Age of College Transparency, Who’s Checking the Facts? Hechinger Report article describing Texas Christian University’s independent audit process and a general call for better regulation of data provided for rankings

Colleges Fake Data for Rankings Episode Transcript


Welcome to the Data Science Ethics Podcast. My name is Lexy and I’m your host. This podcast is free and independent thanks to member contributions. You can help by signing up to support us at datascienceethics.com. For just $5 per month, you’ll get access to the members-only podcast, Data Science Ethics in Pop Culture. At the $10 per month level, you will also be able to attend live chats and debates with Marie and me. Plus you’ll be helping us to deliver more and better content. Now on with the show.

Marie: Hello everyone and welcome to the Data Science Ethics Podcast. This is Marie Weber.

Lexy: And Lexy Kassan.

Marie: And today we are going to talk about universities behaving badly. Indeed, we actually found a few different articles. We’re going to spend most of today talking about an example that came from UMKC, the University of Missouri-Kansas City. The local newspaper there found that they had been reporting some of their information inaccurately to some of the different places that do college rankings. So that investigation was done by the Kansas City Star. And then there was a follow-up report by PricewaterhouseCoopers that shows that their rankings for the Princeton Review were not correct.

Lexy: This one is from a few years ago, but there was also a recent incident that was very similar from the University of Oklahoma. So this has by no means gone away and it has in fact seemed to get worse.

Marie: Exactly. So we’re going to start with this Kansas City example ’cause there are a lot of examples within this one school and what they were doing. So it seems that people at the university, top officials, felt a lot of pressure to submit their information to these different ranking programs. They had been getting rankings in the Princeton Review over a number of years and they felt that it was important for them to continue to get high rankings. So they knew how the rankings were scored and they looked for ways to make sure that they were submitting numbers that would help them get higher rankings in the system.

Lexy: So this is really them gaming the system of college rankings and college rankings are something that, especially in the United States, get a lot of credence with students who are looking to apply to colleges and looking at where their best prospects would be based on what they want to do. Highly ranked universities receive more applicants, they have a pick of better students who theoretically would then become better alumni who would contribute to the college and so forth. So getting ranked is a very important factor for universities to try to attract good students who are looking to go to college.

Marie: Exactly. This was another piece of the pressures that the university felt because they had a benefactor that was helping to support their entrepreneurial school and they were afraid that if they didn’t continue to get these high rankings that they wouldn’t continue to get the support of this benefactor. So when it comes to schools, there’s this whole ecosystem. Getting the rankings helps you get the applicants, which helps you get the graduates who can help support the school in the future. And when you get supporters and donors supporting the school, that helps you get more press coverage, which can also help you get more applicants. So that ecosystem can really be something that the admission office is thinking about a lot.

Lexy: And all of those are factors in the rankings for Princeton Review as well as US News. These are the top ranking groups in the United States and they take into consideration things like the average SAT scores or ACT scores of applicants to a school, the admission rate, the donation rate of alumni, the graduation rate of their students. All of these are inputs into the algorithm that they use. And that algorithm is known to the universities and to the students, so that there is transparency. In this case though, that transparency got them into trouble because it was that transparency that allowed schools to game the system. So in the case of UMKC, Marie, what were some of the factors that they had changed in their actual numbers to game the system?

Marie: So this is where it gets interesting because you would normally think, okay, somebody’s gonna fudge numbers a little bit. But there were areas where there were just total leaps of faith between the reality on the ground on campus and what they actually submitted in their report. So when they were applying for the rankings, they told Princeton Review that they had 27 to 29 officially recognized clubs and organizations specifically open to entrepreneurial students. And again, the entrepreneurship program was one of the specific schools that they had been getting high rankings for. And from the investigation by the Kansas City Star, it looked like, in fact, it was more of a wish list, and that on campus there was more like a handful of clubs. So, kind of leaps of faith or leaps of the imagination in terms of what they were submitting in their application. And they also were asked about their mentorship programs. They said that they had 78 mentorship programs, but they were looking at the different business executives that advise students and they felt like there were 130 different specialties that these business executives had knowledge of. There were basically no mentorship programs. They had five or six mentorship programs for entrepreneurial students. And again, 78 was never a reasonable number.

Lexy: So we’ve got all these people who could advise students, not in an official mentorship capacity, but they have expertise in 130 areas. So we’re just going to count some subset of those and say that’s the number, even though that’s not a program; that’s just someone who could talk to a student about this topic.

Marie: Yeah, that’s how I read it as well, where, okay, this person has a business and they’re really good at marketing and this person has a business and they’re really good at research and this person has a business and they’re really good at technology development. So we have all of these different entrepreneurs that have access to this great knowledge. And so that means that collectively we have 78 mentorship programs.

Lexy: Seems a bit far-fetched. Yeah. And it keeps going.

Marie: So one question was how many students had started a business, and instead of counting formally enrolled, degree-seeking entrepreneurship students, they counted people that had enrolled in an E-Scholars program, and that was to get a certificate. And because of who was able to enroll in this certificate program, it could be non-students and it could also be somebody that already had a degree and was just looking to get this certificate. And again, they were looking at how many of your students have started a business. Because of how the question was asked, they were able to say 100% because they had people coming into this E-Scholars program that were already entrepreneurs themselves.

Lexy: Well, it was a requirement, from what I read, to get the certificate, right? So to get the certificate, it was required that you have started a business. So they said, oh, 100% of our students that enrolled for this particular certificate have started a business. So yes, 100%. Wow. Correlation does not equal causation.
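A small sketch can make the denominator problem in this exchange concrete. The enrollment figures below are invented for illustration and are not UMKC's actual numbers; the only thing taken from the discussion is the idea that everyone in the certificate group had to have started a business already.

```python
# Hypothetical figures, invented for illustration only.
degree_students = {"enrolled": 200, "started_business": 30}

# In the certificate group, starting a business is assumed to be an
# entry requirement, so every participant counts as a "success".
certificate_participants = {"enrolled": 40, "started_business": 40}


def pct_started(group):
    """Percentage of a group that has started a business."""
    return 100 * group["started_business"] / group["enrolled"]


print(f"degree-seeking students: {pct_started(degree_students):.0f}%")
print(f"certificate participants: {pct_started(certificate_participants):.0f}%")
```

The same question gives very different answers depending on which group is treated as "our students", which is why a precise definition of the denominator matters.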

Marie: You know, let’s put ourselves in the shoes of a student that is looking to see which program they want to get into, where they want to apply for college. They want to be an entrepreneur. They’re evaluating different entrepreneurship programs across the country. They see that this school has a 100% success rate in terms of the students that go to this program starting a business. That’s really compelling for a potential applicant, provided they don’t question it.

Lexy: True. Always question the numbers. Trust but verify, trust but verify. Yes. That said, this article specified that there was a ranking in which UMKC was placed above some of the top recognized Ivy League schools in the country. So it was showing that UMKC was ranked above MIT and Harvard and others, which you would think would already be a little bit questionable. I mean, you never know. Harvard and MIT have a lot of other focus areas. It’s very possible that their entrepreneurship program might not be the best. However, UMKC is a relatively unknown university by comparison to those. And somehow this got through and was not questioned until the Kansas City Star brought it up. Yes. And the Star found even more. So they also found that part of how the rankings were put together is they looked at reports that talked about the rankings of entrepreneurship programs, and there was a report that was submitted, but it actually had some of the people that worked at UMKC involved in the report, and that wasn’t disclosed.

Lexy: Ooh. And they used a non-traditional method for ranking universities, and a more traditional method would not have ranked UMKC at the top because most of its researchers’ work occurred at other universities before they came to UMKC. All right, so if I interpret that right, it’s saying that generally speaking they want to know, for a given, let’s say, professor at a university, what research have they been involved in while tenured at that university? Yes. And in this case they were saying, well, this person had written this paper 20 years ago, but they still obviously have this knowledge, so we get to get credit for it now because they came here two years ago. Sure. Whatever the case may be. Yeah, okay. That’s how I understand it. And then the person, I believe, that was in charge of running the entrepreneurship school, their involvement in writing, editing and publishing this article that ranked them as the world’s top scholar in innovation management research was not disclosed. Wait, they wrote their own review saying that they were the top researcher? Yeah. Isn’t that a crazy story? Oh man. This just takes me back to, like, Enron and other back-office, front-office debacles. Yeah.

Marie: Some additional details around that. Michael Song’s involvement in writing, editing and publication of the article that ranked him as the world’s top scholar in innovation management research had not been revealed. Song said he may have written parts of the paper beyond basic editing and grammatical changes, and he and the authors also said that the paper was largely completed by the time he signed it and submitted it to JPIM, the Journal of Product Innovation Management, for publication. Hmm.

Lexy: So there are obviously a number of egregious lies of data here. When we think about data science ethics, you’re basically playing against an algorithm. It goes back to kind of the adversary. In this case, there was sufficient gain to be realized by the university to make it worthwhile to them to try to game the system, to be adversarial to this system. In attempting to do this, they falsified any number of input variables. This happened a couple of years ago, and these types of situations, particularly amongst college rankings, have been coming out for several years. Some of them were actually discussed in Weapons of Math Destruction by Cathy O’Neil. There was one from, I think it was Temple University, several years ago.

Marie: Yeah. So to make sure that we’re not singling out UMKC: U.S. News & World Report has basically, um, unranked schools who have submitted inaccurate information. So there have been multiple schools that have been caught basically having issues with the information they have submitted to different ranking organizations and then being either unlisted or called out for the information that they’ve submitted. As Lexy said earlier, it’s something that isn’t isolated to just one school or one ranking institution.

Lexy: True. So in addition to the fact that this one was called out by Princeton Review, there were eight that were delisted from US News. Yup. There were some that were delisted from Forbes. This is something that is, I won’t say pervasive, but certainly not as anomalous as one would hope. There have been times when schools have gotten away with it. This most recent one from the University of Oklahoma, it said they’d been submitting information incorrectly for 20 years or something like that. It was an incredible amount of time. As we look at these different situations, it seems like because the algorithms are known and the impact is substantial, it makes it worthwhile, essentially, for schools to consider this. There was one instance where a school specifically called out that they were getting a third-party auditor to come in and validate that they were submitting accurate information. I believe it was Texas Christian.

Marie: Yup, and we’ll be linking to that article as well. So Texas Christian, they basically have all the stats that they submit audited for accuracy.

Lexy: Yeah, but there is no authority to provide oversight right now for these different statistics that are being submitted. The ranking boards are reliant upon the colleges to self-report and they don’t have the capacity to go in and audit all of those numbers.

Marie: Well, the other exception to that is apparently the American Board Association is auditing all of the legal colleges because they found the same issue. The American Bar Association? Sorry, the American Bar Association, yes. Interesting. Yeah. When they were talking about the law schools that were getting in trouble, you can only imagine the American Bar Association going after these schools and what they were ready to bring to the table. Because, you know, where these other ones are talking about getting unranked or delisted, when it comes to the law schools, they were censured, they were fined, they were placed on probation, and they could also face lawsuits for fraud, unfair competition, false advertising, alleged misreporting. And you just know that you don’t want to deal with that situation.

Lexy: Here’s something I thought I’d never say: way to go, lawyers. True. In this case, lawyers are the heroes. Yes. Wow.

Marie: So if you want credibility, you either need to go to Texas Christian or to a law school.

Lexy: Or we need to find some other way to ensure that colleges have very clear guidelines on what should be reported, and oversight to ensure that any numbers that are submitted are accurate to those definitions. Yes. In one of the articles that we’ll link to, there were some suggestions as to groups or organizations that could potentially oversee that kind of work. So for example, maybe it would be the Department of Education, or some of the ranking boards themselves might be able to have kind of an audit arm that would be able to validate (again, trust but verify) these numbers to make sure that they comply with the definitions that they’ve laid out for the inputs to their algorithm. Because up to this point, to what Marie had said, most of the repercussion has been that the school was delisted from the ranking for one or two years. That doesn’t necessarily seem like a lot for a breach of ethics, a falsification of information.

Marie: Going back to how the legal world has been dealing with this, there’s now been a Law School Transparency group that’s been founded, and the director makes sure that they’re pushing law schools to provide accurate admission and job placement statistics. And then the ABA and the Law School Admission Council have also stepped in to check and certify that law schools are accurately reporting entrance test scores and undergraduate grade point averages. So there can be a framework that is set up to make sure that the definitions are clearly known and the information is accurately reported, and then you can get proper rankings out of that. And until then it’s reliant on internal whistleblowers, or basically on the schools to self-report that they had previously reported inaccurately, which, given the consequences, they’re unlikely to do.

Marie: Very unlikely. There’s also the fact that there are individuals that are in a job and they want to make sure they continue to have their job. And a lot of times these articles and news reports also highlight that people feel pressure, just in terms of keeping their own job security, to submit these numbers that will help them get these rankings for their college or school or university. Three kinds of lies: there’s lies, damn lies and statistics. And it’s something that, when I was a junior statistician, I remember having conversations about with my superiors. Sometimes I would provide an interpretation of the data and they would come back at me and say, well, that doesn’t tell the story that we want to tell. Go change it. As a statistician, you know what ways it kind of can be changed to maybe shift the perspective without actually altering the numbers. But in this case it sounds like there were people who didn’t have that same sort of sense of it, and so they just flat out picked numbers and said that’ll work. Or they had people that they reported to that gave them numbers to report, and even though they didn’t agree with the numbers, they felt pressured to send those numbers along anyway. True. So they could have been fed that information and somebody else tried to provide the justification, whatever the case may be.

Lexy: In economic terms, we’re all self-interested. The base assumption for economics is that we are all going to act in a self-interested manner. You know, if your self-interest is driven by having your job and continuing to have your job, then you’re going to do what you need to do. But it doesn’t mean that it’s right. And I think that’s where, in time, those whistleblowers have come out and said, look, I wasn’t comfortable with this when it happened. I think it needs to be known that this is what was reported, which is good. I mean, at least there’s that. But how many people don’t get to that point or don’t feel that they can get to that point and report? So how many more of these are we gonna find? It may not seem like this is, you know, the be-all and end-all of everything. It’s still a very disturbing trend. As we continue to get more algorithms in our lives and those algorithms are known, there are even more opportunities for this kind of adversarial data, which is that people will specifically provide information that pushes an algorithm’s results in their favor.
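To illustrate why a published scoring formula combined with self-reported inputs invites this kind of gaming, here is a minimal sketch. The weights, the scaling caps, and the "honest" business-start percentage are assumptions invented for the example and are not the actual Princeton Review or US News methodology; the inflated club and mentorship counts echo the figures discussed earlier in the episode.

```python
# Hypothetical weights: NOT any real ranking organization's formula.
WEIGHTS = {
    "clubs": 0.2,
    "mentorship_programs": 0.3,
    "students_started_business_pct": 0.5,
}

# Assumed upper bounds used to scale each input into the 0-1 range.
MAX_VALUES = {
    "clubs": 30,
    "mentorship_programs": 80,
    "students_started_business_pct": 100,
}


def ranking_score(inputs):
    """Weighted sum of inputs, each scaled to 0-1 and capped at 1."""
    return sum(
        weight * min(inputs[name] / MAX_VALUES[name], 1.0)
        for name, weight in WEIGHTS.items()
    )


# "Honest" values: a handful of clubs, five or six mentorship programs,
# and an invented 20% business-start rate.
honest = {"clubs": 5, "mentorship_programs": 6,
          "students_started_business_pct": 20}

# Inflated values in line with what was reportedly submitted.
inflated = {"clubs": 28, "mentorship_programs": 78,
            "students_started_business_pct": 100}

print(f"honest score:   {ranking_score(honest):.3f}")
print(f"inflated score: {ranking_score(inflated):.3f}")
```

Because the scoring function is known in advance, whoever fills in the inputs can see exactly which numbers move the score most, and nothing in the formula itself checks whether those numbers are true.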

Marie: Very true. So in this case though, it seems like it’s about putting in measures where schools know specifically, when something’s being asked for, what that is, so it can’t be fudged, and putting in mechanisms that can verify and audit that information. Then the different ranking services can put their algorithm out there and it can be known in the public, but there’s less chance of people gaming it because they’re not just self-reporting, and there are mechanisms in place that can double-check and make sure that it is accurate. It seems like this is an example of data science ethics where there are parameters that can be put in place to make sure that people are playing by the rules of the road, so to speak.
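The verification mechanism described here could look something like the sketch below, which compares self-reported figures against independently audited ones and flags large gaps. The function name, the 5% tolerance, and the sample metrics are all assumptions for illustration; no ranking organization is known to use this exact check.

```python
def audit(reported, verified, tolerance=0.05):
    """Flag metrics whose reported value differs from the verified value
    by more than `tolerance` (a fraction of the verified value), or for
    which no independent record exists."""
    flagged = {}
    for metric, reported_value in reported.items():
        verified_value = verified.get(metric)
        if verified_value is None:
            flagged[metric] = "no independent record"
            continue
        allowed = tolerance * abs(verified_value)
        if abs(reported_value - verified_value) > allowed:
            flagged[metric] = (
                f"reported {reported_value}, verified {verified_value}"
            )
    return flagged


# Hypothetical submission and audit results.
reported = {"clubs": 28, "mentorship_programs": 78,
            "average_gpa": 3.4, "graduation_rate_pct": 61}
verified = {"clubs": 5, "mentorship_programs": 6, "average_gpa": 3.4}

for metric, reason in audit(reported, verified).items():
    print(f"{metric}: {reason}")
```

A check like this only works if each metric has an agreed definition and someone other than the school supplies the verified values, which is the oversight gap the conversation keeps returning to.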

Lexy: Absolutely. Again, it requires a lot of oversight that is not currently in anybody’s scope and purview, and so it really would rely on a new system being set up for that. Especially because many of these schools are private institutions. They’re not required to report everything in their financials outside of their own organization, and so it’s a little bit different than maybe in the public sector where you have quarterly reports, annual reports, what have you. Schools will generally provide reporting, but again it’s their interpretation, and so there needs to be some sort of consistent set of, like you said, rules of the road, and that needs to be enforced by someone, and that someone is yet to be determined.

Marie: Yeah. Again, another podcast where we don’t come up with all the answers. Of course, we hope that you have enjoyed this episode of Data Science Ethics. This has been Marie…

Lexy: And Lexy.

Marie: Thanks so much.

Lexy: Catch you next time.

We hope you’ve enjoyed listening to this episode of the Data Science Ethics podcast. If you have, please like and subscribe via your favorite podcast App. Also, please consider supporting us for just $5 per month. You can help us deliver more and better content.

Join in the conversation at datascienceethics.com, or on Facebook and Twitter at @DSEthics where we’re discussing model behavior. See you next time.

This podcast is copyright Alexis Kassan. All rights reserved. Music for this podcast is by DJ Shahmoney. Find him on Soundcloud or YouTube as DJShahMoneyBeatz.