“People are looking at the same data but coming to different conclusions”: How Universities are Planning for September

I started reporting this story a month before it was due. My email to every university media relations department was the same:

I am writing a story about how colleges and universities are using data to help bring students back to campus safely or to decide on alternate methods of education. I’d like to interview someone there who can discuss what your university is doing.

Usually, when I send this type of email I hear back within hours with enthusiastically proffered time slots. This time: crickets. It was only over the last few days that a handful of people started replying to my requests – and almost all of them were associated with a webinar that goes live the day this story drops: Safely Reopening Education and Campus Life. During my discussions with these experts, a few stark facts bubbled to the surface: There is no consensus on how to bring students back safely, there’s no uniformity in the type of data being used or how it’s being used, and some students are probably in for a bumpy ride this fall.

Reading the Room

For most of us, our lives revolve around local infection numbers – hospitalizations, deaths, and the ratio of positive tests to the number of overall tests administered. However, for universities that are bringing people in from all over the country and, in some cases, the globe, it’s nearly impossible to analyze all of the necessary data points as they flow in.

Administrators, in many cases, are also using internal data to inform their decisions. But the quality of this data – or the complete lack of data altogether – makes it unlikely that university decision makers will come to a consensus, says Charlton McIlwain, Vice Provost for Faculty Engagement & Development at New York University.

“Every school routinely collects some of the same kind of data, but we all may be collecting different kinds of data. Whether that’s data on students or faculty or about classes and availability, there’s a wide variety of data collection happening. Given the nature of COVID and its relative suddenness, a lot of individual decisions depend on what they’re collecting and what shape everyone’s data is in,” McIlwain explains. “You may collect data about which students enroll in which classes. I may collect that but only sparingly. My systems and software might not be as robust. In that case, you’re left wondering if your data is reliable and sufficient and if you can even trust it at all.”

This is a problem that wouldn’t have been easy to solve even in pre-COVID-19 times. Many smaller educational institutions never had the time or a real impetus to invest in data scientists or in systems to extract and analyze data. Now, faced with COVID-19, larger schools may have advanced data science and computing resources on campus, along with alumni and sponsors with deep pockets to finance additional detective work. But smaller schools are struggling.

Even those schools that possess the resources to collect extra data points are having difficulties because there’s no consensus around how useful any of the data they’re collecting actually is. So school presidents are taking cues from their colleagues.

“I was very impressed by my colleague at Tufts [University president] Anthony Monaco,” says Michael S. Roth, President of Wesleyan University. “He is a scientist himself and he worked with other colleagues in Massachusetts on their plan for reopening colleges and universities. He told us that surveillance testing was useless because by the time you got your results, you have too many contacts. There is no way of isolating or tracing people.”

Indeed, an April study out of Cornell University bears this out. Researchers analyzed course enrollment patterns at the school to uncover how connected students actually are. The results demonstrate that – at least in a medium-sized American university – student networks are expansive and interconnected. “Course enrollment networks are small-world networks, with high clustering and short average path lengths. Although only a small share of students are connected directly (in the same class), nearly all students are connected indirectly through a third student,” explains Kim Weeden, a sociology professor at Cornell who co-wrote the study and posted her analysis and results to Twitter.
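To make the idea of “connected indirectly through a third student” concrete, here is a minimal sketch in Python. It is not the Cornell researchers’ code or data: the course names, student names, and helper functions are all invented for illustration. It links two students whenever they share a class, then counts the share of student pairs tied directly and the share tied either directly or through one shared classmate.

```python
from itertools import combinations

# Hypothetical enrollment records (toy data, not from the study):
# course -> set of enrolled students.
ENROLLMENT = {
    "SOC101": {"ana", "ben", "cho"},
    "BIO110": {"cho", "dev", "eli"},
    "MAT201": {"eli", "fay"},
}

def co_enrollment_graph(enrollment):
    """Link two students whenever they share at least one course."""
    neighbors = {s: set() for roster in enrollment.values() for s in roster}
    for roster in enrollment.values():
        for a, b in combinations(sorted(roster), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return neighbors

def pair_shares(neighbors):
    """Return (share of pairs tied directly,
    share tied directly or through one shared classmate)."""
    pairs = list(combinations(sorted(neighbors), 2))
    direct = sum(1 for a, b in pairs if b in neighbors[a])
    within_two = sum(
        1 for a, b in pairs
        if b in neighbors[a] or neighbors[a] & neighbors[b]
    )
    return direct / len(pairs), within_two / len(pairs)

direct_share, within_two_share = pair_shares(co_enrollment_graph(ENROLLMENT))
print(f"directly connected: {direct_share:.0%}")
print(f"connected within two steps: {within_two_share:.0%}")
```

Even in this six-student toy example, fewer than half of the pairs share a class, yet most pairs are linked once a single intermediary is allowed.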

Weeden says that colleges and universities, which are still in the planning stage, are trying to match an ever-changing supply of resources to an ever-changing set of demands on those resources in the context of an ever-changing national and local health situation. “It’s a tall order,” she adds. “Ideally, they would have high quality data on everything from the supply of faculty instructional time to actual classroom space under physical distancing to the student demand for on-line or in-person courses to local testing and hospital capacity.”

Same Data, Different Results

The day-to-day nature of the virus is keeping everyone guessing, even forcing some schools within the same educational system to adopt wildly divergent return-to-campus plans. The State University of New York system, which has 64 campuses, is one example. No two of its schools have exactly the same opening protocol. Binghamton University, for instance, is welcoming all students back to campus, asking those from out of state to quarantine for 14 days when they return. Students will be in classrooms, but learning will shift online after Thanksgiving, with classes ending completely on December 7 and final exams falling by the wayside. Meanwhile, Nassau Community College on Long Island has committed to the exclusive use of remote learning. The University at Albany, on the other hand, is bringing students back but taking a more hybrid approach. It reduced classroom capacity to between 20 and 40 percent, depending on whether seating can be moved, and eliminated fall break so it can complete the fall semester entirely before Thanksgiving. But if, theoretically, all schools in a system are working with the same basic data, shouldn’t most of them be following a similar path?

“I remember saying to my board in late March, early April, that it seemed to me that schools would cluster around one or two options,” says Wesleyan’s Roth. “Because at that point, in conversation with my colleagues in our athletic division, which are large colleges in New England, and another consortium of very highly selective universities and colleges, it just sounded like people were looking at the same data. And they are, but they’ve come to different conclusions.”

For some, that means creating as much of their own data as possible so they have more to analyze and to inform campus policies. Wesleyan University, like about 100 other universities, contracted with the Broad Institute of MIT and Harvard and will be using genetic-based COVID-19 testing for everyone who sets foot on its campus. Testing will happen twice a week, and results are set to come back within 24 to 48 hours. (The school took over a hotel so it could isolate everyone who tests positive.) It is also using predictive modeling right from the start, looking at testing results from different states (more than two-thirds of Wesleyan students come from the Northeast, which has very low infection rates right now) and combining them with student tests to make sure it can contain any outbreaks, says Roth.

“We have been using a suite of data sources to try to predict what our likely number of positive cases would be in the first ten days so we can isolate those cases, give those people the care they need, and break any infection chain of transmission,” Roth says. “We are going to limit the size of gatherings and we won’t have any intercollegiate sports this fall.” He’s also counting on the fact that only 80 percent of students are set to come back and that the school is creating “family pods” within dormitories, giving people ways to interact while still limiting contacts.

Tying Everything Together

Cornell University’s Weeden believes schools must expand their data collection if they want to avoid having to send everyone home. Recently, she and her colleagues worked with a small private school that gave them access to de-identified data on its students’ classes, residence hall assignments, advisor groups, extracurricular activities, and dining hall assignments. They used that data to help the school identify particular classes and extracurricular activities that were especially central to the network: those that created many direct ties between students or shortened path lengths between pairs of students who weren’t tied together directly.

“The school also asked us to model different scenarios where particular types of activities – classes, sports, advising – moved on line and where the school shifted to block scheduling,” she says. “Breaking the semester up into two mini-terms; each student takes half as many classes as in a full term, but each class is twice as many hours of instruction. This gave them some tools to think about how they could structure the upcoming fall term, assuming it’s at least partially residential, to reduce the share of students connected through just one or two degrees of separation.”
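The kind of scenario modeling Weeden describes can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not her team’s method: the rosters, the student IDs, and the `share_within_two` helper are all hypothetical. The sketch measures the share of student pairs connected within two degrees of separation, then re-runs the measurement with each activity moved online to see which one matters most.

```python
from itertools import combinations

# Hypothetical activity rosters (toy data):
# activity -> set of de-identified student IDs.
ROSTERS = {
    "intro_lecture": {"s1", "s2", "s3", "s4", "s5", "s6"},
    "lab_a": {"s1", "s2"},
    "lab_b": {"s3", "s4"},
    "soccer": {"s5", "s6", "s7"},
    "advising_1": {"s7", "s8"},
}

def share_within_two(rosters, online=frozenset()):
    """Share of student pairs linked directly or through one shared contact,
    counting only activities that still meet in person."""
    students = sorted(set().union(*rosters.values()))
    neighbors = {s: set() for s in students}
    for name, roster in rosters.items():
        if name in online:
            continue  # activity moved online: it creates no in-person ties
        for a, b in combinations(sorted(roster), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    pairs = list(combinations(students, 2))
    close = sum(
        1 for a, b in pairs
        if b in neighbors[a] or neighbors[a] & neighbors[b]
    )
    return close / len(pairs)

baseline = share_within_two(ROSTERS)
# Rank each activity by how much moving it online, on its own, lowers the share.
impact = {
    name: baseline - share_within_two(ROSTERS, online={name})
    for name in ROSTERS
}
most_central = max(impact, key=impact.get)
print(f"baseline share within two degrees: {baseline:.0%}")
print(f"biggest reduction comes from moving online: {most_central}")
```

In this toy data the large lecture does most of the connecting, so moving it online cuts the share of closely linked pairs far more than moving a small lab does. A real analysis would also need to account for dorms, dining halls, and other contact outside scheduled activities.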

While it’s too early to say if this kind of data, preparation and analytics will be effective, it shows yet again that data is a new form of currency. And in some cases, having the best data might equate to having the most cold hard cash.

“Some schools are making a financial calculation and saying, ‘This is going to cost us all this money for testing and analysis and at the end of the day we’re only slightly more able to make decisions,’” NYU’s McIlwain explains. “It’s difficult to say that this is definitely going to be better than that because there’s still so much we don’t know and can’t know about COVID.”

Whatever schools decide, data will definitely be a part of the next steps, Weeden agrees. “Once past the planning stage, colleges and universities should develop real-time indicators, not only of the overall campus health situation but also of particular sets of students, courses, or activities that are especially high priority for testing that day due to their position in the network that connects students, faculty, and staff through courses, dorms, teams, or other activities.”

While a pandemic might not have been the optimal choice for teaching people the value of real data, it is a way to rapidly bring entire fields into the world of true data-based decision-making. Universities seem to be doing the best they can under the circumstances. Even if they don’t want to talk to any journalists about it.