Discussing Data Privacy Provocations: A Q&A with Maya Berry

Public interest technologists know that it can be difficult to balance privacy with data collection. While this is a challenge when you’re dealing with run-of-the-mill data, the problem is amplified when it comes to data about marginalized and vulnerable populations and communities. Data can help lift up and support these populations, but it can also be used against a person or community and cause harm. The Common’s editor-in-chief Karen Bannan sat down with Maya Berry, the executive director of the Arab American Institute, to discuss this issue as it relates to her organization. Here are her answers.

Q: Tell us a little about the Arab American Institute. 

Berry: We were established in 1985 for the purpose of promoting the civic engagement and political empowerment of the roughly 3.7 million Arab Americans in this country. We do a great deal of work defending the civil rights and civil liberties of Arab Americans and, as a result of that work, advancing the rights of all Americans. Most of our work is related to that intersection — civil rights and civil liberties. In particular, we are a community who’s been targeted by national security policies of our country, so a lot of the things in place tend to have a disparate impact on how every American relates to their government and the work that they do. This includes policies around immigration reform, hate crimes, disinformation work, the decennial census. We’ve been working on the decennial census since before our founding in 1985. The first major campaign where we partnered with the Census Bureau was in 1990. I’m sitting in my office, looking at a poster that literally says, “We’re Arab Americans and Proud.”

Q: How does data privacy affect your organization?

Berry: [Data privacy] has been a barrier for us in terms of making sure that our community fully participates in the collection of data on the census. We’re one of these communities that has to often say, “Title 13 says when you fill out the census, this information can’t be tracked down to you individually.” And the reason for that has to do with both the combination of the legitimate concern regarding the individual person’s privacy, but also the problem of publicly available data and how that’s actually being used to specifically target our community. So when we hear the phrase “data privacy” it’s a lot of things. It’s the importance of knowing or demanding that our rights be protected in terms of our right to privacy, both when it’s a corporation that does the tracking, and whether it’s government policy that may violate that. But it’s also a concern about individual Arab Americans who’ve been sort of weary of policies that have violated their right to privacy.

Q: What makes the experience of the Arab American community unique when it comes to data privacy?

Berry: There is a well-founded perception that the government treats us as a suspect community. We coined a term for it: “a securitized relationship.” There is a concern that our data is shared with government agencies that may be seeking to get information about us. And when I say well founded, we all know and talk a great deal, for good reason, about census data being used for the internment of the Japanese American community during World War Two. The second example is something that took place after 9/11, when the Department of Homeland Security (DHS) made a request, which officials at the Census Bureau honored, and gave them information about Arab Americans. We’re familiar with how some of that information can be used. Then look at the NYPD profiling of American Muslims, also post-9/11, when they decided they were going to target and surveil our community (this ended up being a Pulitzer Prize-winning investigation by the AP). In that case they used publicly available demographic data.

In fact, when you look at the New York Police Department’s information reports, they actually cite the Arab American Institute’s demographic information. That’s how police, local law enforcement, and agents decided which neighborhoods they were going to profile, most directly based on the demographic information that was available from our website. So the concern about the privacy of data is quite legitimate in our community.

I’m an individual who’s well aware of what’s happening with technology today. I have concerns about my own individual data, and I try to limit the apps that I use. I try to prevent companies from targeting me specifically by using different dates of birth and different information all the time, to scramble that information as much as possible. It’s a different set of problems for an entire community that has to worry about how it is targeted and profiled by government policies.

Q: Where do you see the biggest challenges in terms of ensuring data privacy? Does it come from the people who are actually collecting the data or is it educating people as to how to handle, manage, store, and analyze data? How do you see this being resolved?

Berry: It is about how data is used after it is gathered. I don’t worry about it at the point of collection. I worry about it after: how it’s utilized. You can collect data, and it’s ideologically neutral; it’s benign in nature. It’s the intent. What happens after you collect it, and how could it be used to target communities?

The example I cited post 9/11 — where a friend asked a friend of a friend at DHS for data — I think both of them saw themselves as benign actors. I don’t think either one of them thought there was negative intent there, but the outcome was a very serious data breach. After, there were protocols put into place so that it wouldn’t happen in the future. Ideally, what would happen in these situations is that smart people would think about the possibilities before there is a problem that requires a policy in place to correct it. My community continues to talk about that example when we try to tell them that the data is safe. We need to see people sitting down and having the conversations that need to be had and looking at who has access to data, what level of access is provided, and how much data is available.

Q: It seems like there’s a lack of transparency from government agencies on how they’re using the data outside of the census. 

Berry: Going back to our own setup here at the organization: we have multiple employees but we have one database, which gives different access permissions to different people for different parts of it. It’s not like one user can see everything there. The idea that there would be data protocols in place for small, scrappy nonprofits in D.C. but not at the federal government level isn’t right. When it comes to huge government agencies, the who-can-access-what and the transparency piece should be there.
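For technologists, the setup Berry describes (one database, with different people allowed to see different parts of it) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Institute's actual system: the role names, the record sections, and the `read_section` helper are all hypothetical. The sketch also records every request, which is one simple way to support the transparency piece.

```python
from dataclasses import dataclass, field

# Hypothetical roles and the record sections each one may read.
PERMISSIONS = {
    "organizer": {"contact"},
    "fundraiser": {"contact", "donations"},
    "admin": {"contact", "donations", "notes"},
}


@dataclass
class AccessLog:
    """Keeps one entry per request: (role, section, whether it was allowed)."""
    entries: list = field(default_factory=list)


def read_section(record, role, section, log):
    """Return one section of a record if the role is permitted to read it."""
    allowed = section in PERMISSIONS.get(role, set())
    # Log the request before deciding, so denied attempts are visible too.
    log.entries.append((role, section, allowed))
    if not allowed:
        raise PermissionError(f"{role} may not read {section}")
    return record[section]
```

Because denied requests are logged as well as granted ones, the log answers both "who can access what" and "who tried."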

One of the things we’ve asked for is that we would like to know the number of times a request was put in. If you’re going to try to gain information on how many Egyptians are in this neighborhood, is there a way for us to know when it happens and what agency made that request? Those are certainly things that I think would bring, if not comfort, at least an understanding of what’s happening. In the conversations we’ve been having around the differential privacy problem, this is where I think there’s the challenge for communities like ours: the privacy protection we want cannot come at the expense of accurate data about our community. We have got to strike a better balance in that regard.
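The tension Berry raises between privacy protection and accurate data can be illustrated with a small Python sketch. It uses the textbook Laplace mechanism from differential privacy, which adds random noise to a count. This is an assumption made for illustration only, not a description of the Census Bureau's actual method, and the population sizes and the `epsilon` value are made up. The point is that the same amount of noise distorts a small community's count far more, in relative terms, than a large one's.

```python
import random


def noisy_count(true_count, epsilon, rng):
    """Return true_count plus Laplace noise with scale 1/epsilon."""
    scale = 1.0 / epsilon
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centered at zero.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


def mean_relative_error(true_count, epsilon, trials=2000, seed=0):
    """Average absolute error of the noisy count, as a fraction of the truth."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += abs(noisy_count(true_count, epsilon, rng) - true_count)
    return total / trials / true_count


# Hypothetical neighborhood counts: a small community and a large one.
small = mean_relative_error(true_count=40, epsilon=0.5)
large = mean_relative_error(true_count=40000, epsilon=0.5)
print(f"relative error, small community: {small:.4f}")
print(f"relative error, large community: {large:.6f}")
```

Lowering `epsilon` strengthens the privacy guarantee but increases the noise, so the accuracy cost falls hardest on the smallest groups.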

Q: What should public interest technologists know when they are thinking about this issue and hopefully developing data privacy guidelines? 

Berry: We emphasize the importance of data driving policy. Many are of the opinion that if data is not informing your policy, you’re likely going to have some problematic policy outcomes that you hadn’t considered. It’s almost like stepping back and understanding that, in some ways, even something as basic as what data looks like may be different for different communities.

We did focus groups in our communities back at the end of 2018, when we were starting to develop our programming for the decennial census. I was of the opinion that we should just tell everybody up front that we know that the government has misused census data when it comes to Arab Americans. The truth is they can have all the information they have about us, and we still have to fill out the census because of cost benefit analysis. We lose more than we gain if we don’t. My messaging completely failed. That’s not what they wanted to hear. They wanted really positive messaging. One of the things that I found most interesting is recent immigrants in a refugee community were all about disclosing everything. There was no sense of, ‘You’re violating my right to privacy. Why are you asking me all these questions?’ It was all about, ‘Take everything you need to know.’ Perhaps that’s not surprising. This is a community that spent two and a half years going through a vetting process before they were allowed to enter the United States.

Contrast that with an Arab American focus group in Miami, Florida who’d been here multiple generations. Their perspective was completely different. It was a very individualized sense of, ‘Why do you need to know this? And how will this help your policy development? Because you’re asking me to disclose something that is my personal data. Why do you need it to understand things broader?’ I think it is interesting that the more time someone spends in this country, the more there is an individualized sense of right to privacy than we see in newer immigrants. As data scientists and public interest technologists, you have to think about how the individual connects to your request and what it means to them individually, and that can really vary in communities.
