Interview with Kumar Sambhav

Concerned by shrinking newsrooms, Kumar Sambhav and fellow journalists founded The Reporters’ Collective, which is dedicated to encouraging and promoting investigative journalism. Recently Sambhav, in conjunction with international journalists, has uncovered the ways in which Meta’s ad platform gave India’s ruling party an edge over political competitors. Listen in as Varoon Bashyakarla is joined by Kumar Sambhav to discuss the latest reportage, the Indian state ID system and the challenges to creating meaningful legislation around data privacy.

“The strength of a democracy should be that you know you have a multiparty system, but these smaller parties would never be able to get there. Those smaller parties would never be able to compete because they don't have that kind of ecosystem. So in that sense the skewing of having a level playing field, at least on the Meta platforms, Instagram and Facebook, has been sort of skewed tremendously in favor of BJP, which is very, very damaging for electoral democracy.” – Kumar Sambhav

About the Speaker:

Kumar Sambhav is an investigative journalist based in India. He is the founder of Land Conflict Watch, a multidisciplinary organization that conducts research on natural resources, and a co-founder of The Reporters' Collective, a non-profit media collective that carries out investigative journalism. He is currently also an AI Accountability Fellow at the Pulitzer Center. For his reporting, Sambhav has received India’s prestigious Red Ink Award, the Press Council of India’s National Award and the Shriram Award for Excellence in Financial Journalism. You can connect with Kumar on Twitter at @Kum_Sambhav.

This interview is also available on YouTube, PeerTube, Vimeo

Please note that this interview has been edited for clarity and brevity.


You are a co-founder of The Reporters’ Collective, a nonprofit media collective that conducts investigative journalism. Can you tell us a bit about it?


You know how the space for journalism in India has been in the past few years – both in terms of legacy media's reportage quality, which has been deteriorating, and in terms of the censorship and pressure on free media. In fact, if you look at the Press Freedom Index, India features way below many of the developing countries across the world. So that score is pretty low. Me and my co-founder Nitin Sethi, who has been a journalist reporting on environment, finance and climate change issues for several years, we’ve been working in different legacy media organizations and we realized something about the space for journalism that holds powerful people to account, be it government, corporations, or anybody who's sort of responsible for the well-being of people and citizens. Holding them accountable is the main objective of this profession in any country. And we saw that the space for doing that kind of journalism has been increasingly shrinking in India. Some newsrooms were critical, say, of people in power, and they would point out their wrongdoings, but that would mostly happen through opinion pieces and not through reportage.

So the concern that many of us felt was that really good reportage that is based on facts and investigations was not being done in many newsrooms and even people who were trying to do it were sort of lone wolves trying to do things in their own small spaces. And eventually, for those individuals, that space was also shrinking. So we at that point, about three years ago, we started informally working with each other. A bunch of journalists helping each other to do this kind of journalism, reading their documents, helping them fact check their articles, suggesting, playing devil’s advocate.

It was a sort of informal collective among many journalists in different newsrooms trying to do this. Eventually, even that space shrunk and some of us were actually thrown out of the newsrooms for doing this kind of work. We realized that there was a need to do this independently as a collective and then we started working together a little bit more formally and we registered a trust called The Reporters' Collective there. Apart from me and my co-founder, we have an editor who has experience with editing in mainstream newsrooms for about 20 years, we have a bunch of young reporters and journalists who are working full-time with us. And then, in addition to that, because obviously we work on very limited resources, we truly depend on donations from readers. So we try and help a lot of journalists outside the collective as associate members. So on the story basis if journalists want editorial support, legal support, help in investigations, then we work with them and then we publish their story. So that's largely the back story of The Reporters’ Collective.

Actually our first project was something on the electoral bonds scam. The current government came up with an opaque electoral funding mechanism. We investigated the whole process and how it was flagged by various authorities within the government, that it would sort of lead to dubious funding within elections and that the ruling party actually benefits from it the most. We exposed how there were flaws and how it was cleared.

Our objective is that while the space shrinks in other newsrooms and at other places we want to retain our focus on the government, on the corporations, on people in power and keep making them accountable through our reportage as much as we can.


Before we get into exploring the reporting you and Nayantara [Nayantara Ranganathan] have done on Facebook and Facebook's mishandling of the 2019 federal and state elections in India, tell me a little bit, Kumar, about your inspiration or motivation for this piece.


There had been a lot of reportage about Facebook's mischievous doings in different elections in different countries. Cambridge Analytica, then the Facebook Files in the US, and in the UK elections as well. There's been a lot of reportage around that in India as well, in bits and pieces. But that was mostly focusing on how some of the topmost executives on the Indian Facebook team were sort of working very closely with the ruling political party in India. There had been reportage about how people in Facebook's public policy team in India often helped some of the politicians, and that pages linked to the ruling party allegedly got away with hate speech and misinformation.

So there's been this and that isolated reportage in different publications. But I felt that there's been not a similar kind of systematic investigation into how the platform might be, in a more systematic arranged way, helping the ruling party. We had seen Cambridge Analytica in other countries in other elections. So we wanted to look at what could be the real or actual impact of some of the policy understandings or policy decisions that Facebook's public policy team had taken. Or just the algorithm – how the business model and the algorithm of Facebook works.

There's been a lot of understanding and reportage – even academic research — about how that kind of bias of the algorithm might be affecting the political playing field in different countries. And our question was, can we look at similar kinds of questions in the Indian democratic election space? There is huge influence of big tech – Facebook, WhatsApp, Twitter – on the Indian polity. For us that was the motivation to really pin it down to say, look, this could be the scale and impact of what might be going behind the scenes.


I will try to very briefly summarize some of the high-level findings. The headline takeaway is that Meta has undermined democratic principles in India, at least in the context of the 2019 federal elections and state elections over this 22-month period that you analyzed, in which you and Nayantara looked at 500,000 political ads on Facebook. I should mention the world's largest democracy of 1.2 billion citizens elected a prime minister and held these 9 different state elections across the country. And the theme here seems to be that Meta was continually undermining democratic principles by giving BJP, the incumbent party, an unfair advantage. And that advantage played out in a couple of different ways. First through an ecosystem of surrogate and ghost advertisers. The BJP was evidently effectively able to double their presence on Facebook and Instagram there.

I should note that according to Indian electoral law these surrogate and ghost advertisers are actually not permitted. In an effort to control the influence of unknown sources of money, political parties are required to disclose their spending. What was interesting is that Facebook had previously cracked down on surrogate and ghost advertisers in the context of these elections, but that crackdown appeared to be selectively excluding the BJP and much more targeted at BJP’s primary opposition party, the Congress Party.

We know from Frances Haugen's whistleblowing that Facebook has been encouraging an industry group in India to encourage the electoral commission of India to be less stringent against social media during elections.

Another really interesting finding from your reporting is that the BJP had been charged less for ads. Granted, this may not be a simple and conscious decision by someone at Facebook choosing to hard code a discount for BJP. But insofar as Facebook's auction pricing mechanism for ads is conscious and intentional, the BJP seems to have an unfair advantage here. There are cases in which entities vying for ads can offer a lower price. And if their ad is deemed to be more relevant to the target audience, Facebook will actually choose to display that ad, effectively subsidizing it in the process. “Relevance” is defined by many different things, but things that garner engagement tend to be “relevant” in Facebook's eyes and we know that divisive content tends to garner a lot of engagement. And this is not to say anything about the way in which Facebook executives have worked with the BJP campaign on how to optimize its use of Facebook. And so on the whole it seems that Congress, the main opposition party, was charged between 30 and 32% more than BJP was.

And this phenomenon has not been observed only in India. The NYU Ad Observatory observed a similar phenomenon in the case of the 2020 US presidential election, in which the Trump campaign paid less for ads relative to the Biden campaign.

Okay, that was a lot at once. Is there anything on this collection of reporting that you want to add before we talk about why all of this really matters and what it means for Indian democracy?


Let me just close that loop on the summary of the story. When you have a large network of surrogate advertisers and ghost advertisers supporting one party – and in this case the BJP’s visibility just in sheer number of impressions for its promotion campaigns was doubled by these surrogate advertisers, right? And then the money that was spent by these surrogate advertisers for doubling BJP’s visibility was really not accounted for in the BJP’s books.

So you have this additional almost double amount of money coming in to promote one party and doubling their visibility, and then when you have that kind of visibility on a platform like Facebook, there is a snowball effect. More visibility means the algorithm considers the content of what is more visible as more popular.

So these surrogate advertisers essentially game the algorithm or bluff the algorithm to make it believe that by default BJP’s content is more popular than other parties because there's this entire ecosystem of ghost advertisers promoting that content. Now, once the algorithm believes that this is more popular, it makes it cheaper. It gives it more subsidies. That means the ability of BJP to get even more eyeballs for less money increases and in competition the ability of other parties to gain the same number of eyeballs for spending the same amount of money diminishes. So they have to spend more and more to even get close to what BJP eyeballs would be.

With the snowball effect, the difference keeps becoming bigger. So what eventually happens is the largest party, which is able to game the algorithm through multiple dubious ways including surrogate advertisers, keeps getting bigger and bigger and the competitors keep becoming smaller and smaller on that platform. You can actually compare the visibility of Prime Minister Modi on Facebook and BJP followers as compared to say the leading opposition party, and its main leader, Rahul Gandhi. So Congress and Rahul Gandhi’s visibility is less compared to BJP.

But even now you consider, say, much smaller parties, younger parties in different states and in different regions. The strength of a democracy should be that you know you have a multiparty system, but these smaller parties would never be able to get there. Those smaller parties would never be able to compete because they don't have that kind of ecosystem.

So in that sense the skewing of having a level playing field, at least on the Meta platforms, Instagram and Facebook, has been sort of skewed tremendously in favor of BJP, which is very, very damaging for electoral democracy. The basic principles of having equal space in democracy are diminished here. In Indian laws and Indian constitution that is a very fundamental thing. The whole reason for putting a cap on money in election campaigns was to ensure that there is a level playing field and money and power doesn't skew this level playing field.

So that has been sort of violated in this case and I think it did have pretty far-reaching implications. After we came out with this investigation and this report, it was raised in the parliament. There was a lot of reporting around how the opposition parties in the parliament picked this up. But unfortunately – you know how things work with Meta and especially in countries like India – nothing really changes. And I see now, eight months after our reportage, that there are ghost advertisers, surrogate advertisers, still continuing on the platform. Even after us pointing it out, there was not even any counter reply or any sort of response from Facebook to counter these findings. Their Meta global integrity team reached out to me to get input and suggestions on how these lapses could be fixed and how the platform could be improved, but I don't see any of the changes happening.


Tell us a little bit about your methodology. I can imagine that there are people who will listen to this and who want to replicate or at least adapt some of the work that you've done in their own country and many of the pieces actually outline some of your methodology. But one of the parts I found most interesting was the identification of the surrogate advertisers, which in some cases seem to be a rather labor-intensive process. Obviously there's a need for this kind of work to be done and in a scalable manner.

Where do you think that responsibility should lie: with Facebook or with the electoral commission of India? Walk us through a little bit of that process of how you identified these entities that had failed to disclose their relationship with political parties.


As we are giving more and more scrutiny through tech investigations, especially in India right now, there have been a lot of discussions about how journalistic investigations on tech need to be more transparent and the methodology should be out so that people can replicate it and verify it, etc. Which I think should be the case with most investigations – not just tech related, especially with the kind of political atmosphere we are in and the kind of onslaught that journalists face nowadays – the kind of misinformation that is spread, manipulation of information is everywhere around us. There is a decline in the trust in journalists and journalistic work in general and I think one way to really counter that and not let it affect good journalistic investigations is to put your methodology out, so that others can replicate it. So in that sense I think this is a very important question.

Within the limited word count that journalistic publications and pieces generally have to adhere to, we tried to put our methodology in the pieces. If any more of this work is done then it would in fact just amplify the importance and the impact of these kinds of investigations.

To give credit where due, the idea for this work, at least in terms of the content and the resources that we needed for this work, came from Nayantara and Manuel [Manuel Beltrán], who are the founders of Ad.Watch. They came to us suggesting that there is a way to sort of scrape data from Facebook's Ad Library, which they had already done in almost 100 countries by now, which is there on their website. Ad Library provides an API, Application Programming Interface, for advertisers as well as researchers to analyze how the advertisements are working. Now the same API can also be used by researchers to download data, scrape data from the Ad Library and conduct their own research. The structure in which that data comes out or could be scraped is not very research-friendly. It's very jumbled up. You can't really segregate it properly. So to get around that problem, Manuel and Nayantara wrote code to scrape it in a way that can be easily analyzed in software like Excel, in CSV form, and in other programming software where analytical understanding could be developed on this data.

So we chose a period starting three months before the national elections of 2019 – that's exactly the period when Facebook also started differentiating between political ads and commercial ads and started archiving the political ads in the Ad Library. Starting from there for about 22 months to the point when we started our work. We scraped data from all the advertisements that were tagged as political by Facebook's own algorithm. That data has multiple attributes – so say, for example, one attribute is who is the funding entity for a particular advertisement. So you have an advertisement, you have the funding entity, you have the page on Facebook on which this advertisement was published. Then you can also find out how much money was spent on this ad – there is not a fixed amount but there is a range that Facebook gives. Similarly, it also gives details about how many impressions that ad would have got, and it gives from which date to which date these ads were running. It also gives information about which region or which area in the country those ads were shown.
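The attributes described above (funding entity, page, spend range, impression range, run dates, regions) can be pictured as one nested record per ad that gets flattened into spreadsheet columns. The following is only a minimal sketch of that idea, not the code Manuel and Nayantara wrote: the field names, the sample record and the helper names `flatten_ad` and `ads_to_csv` are all assumptions made for illustration and are not taken from the Ad Library API.

```python
import csv
import io

# Hypothetical record shape. These field names are assumptions for
# illustration and may not match what the Ad Library API returns.
sample_ads = [
    {
        "id": "ad-001",
        "page_name": "Example Page",
        "funding_entity": "Example Funder",
        "spend": {"lower_bound": "100", "upper_bound": "499"},
        "impressions": {"lower_bound": "1000", "upper_bound": "4999"},
        "start_date": "2019-03-01",
        "stop_date": "2019-03-07",
        "regions": [{"region": "Delhi"}, {"region": "Uttar Pradesh"}],
    }
]

FIELDNAMES = [
    "id", "page_name", "funding_entity",
    "spend_min", "spend_max",
    "impressions_min", "impressions_max",
    "start_date", "stop_date", "regions",
]


def flatten_ad(ad):
    """Turn one nested ad record into a flat, spreadsheet-friendly dict."""
    spend = ad.get("spend", {})
    impressions = ad.get("impressions", {})
    return {
        "id": ad.get("id", ""),
        "page_name": ad.get("page_name", ""),
        "funding_entity": ad.get("funding_entity", ""),
        # Spend and impressions arrive as ranges, so keep both bounds.
        "spend_min": int(spend.get("lower_bound", 0)),
        "spend_max": int(spend.get("upper_bound", 0)),
        "impressions_min": int(impressions.get("lower_bound", 0)),
        "impressions_max": int(impressions.get("upper_bound", 0)),
        "start_date": ad.get("start_date", ""),
        "stop_date": ad.get("stop_date", ""),
        # Collapse the list of regions into one readable cell.
        "regions": "; ".join(r["region"] for r in ad.get("regions", [])),
    }


def ads_to_csv(ads):
    """Write flattened ads to CSV text that Excel or similar tools can open."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(flatten_ad(ad) for ad in ads)
    return buffer.getvalue()


csv_text = ads_to_csv(sample_ads)
```

Keeping both ends of each range as separate columns leaves the choice of how to collapse a range (midpoint, lower bound, upper bound) to the later analysis.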

Once we had this data for about 536,000 ads, we segregated these ads by the funding entity and the pages on which they were published. The funding entity here is the advertiser who's placing these ads and the pages are the publishing platform on Facebook itself on which these ads were first shown. And from there they spread to other places. A lot of the time the funding entity is mentioned as the political party. The election commission of India requires all authorized political parties and candidates to get their ads verified by the election commission of India first before publishing them on any platform. That's the way to actually prevent any kind of surrogate advertiser. So if, say, the BJP has to place an ad, they'll send a formal request to the election commission of India that this is the ad going from BJP and we are publishing it here. The election commission of India approves the content to make sure it's not religiously inflammatory, or it's not misinformation, etc.


So the standard practice is for the political entity to get the ad passed by the electoral commission, and then only once they get the green light to then try to place the ad?


Absolutely. That's the legal practice.


That process of getting consent or permission from the electoral commission does not include information about, for example, whom the target audience is?


No, it won't include whom they are targeting – only the content is verified. And Facebook also has this in its own policies. It has said that if a political party or an authorized agent of a political party is placing an ad, it also has to submit the authorization letter from the election commission of India. So in all such cases, the funding entity would be the political party or it would be the candidate from the political party. So their names will appear as the funding entity.

So what we did is we segregated all these ads as per the funding entity in the database, starting with all the authorized political party pages. There would be different units of a political party: the BJP national unit, BJP state units. So these different units would place ads and their names would appear. Then there would be candidates who would also place ads through their own expenditure. So their candidates’ authorized names would appear there. What we did is segregate all the authorized candidates and authorized political parties into one bunch. And we categorized them according to political parties. Then once we did that we found that there were a bunch of these advertisers whose identity was very difficult to establish.

So there were a lot of funding entities whose names were not the names of any organization or the names of any individual – some gibberish name right in the database of Facebook. And there would sometimes be the link of the website in place of funding entities, and if you clicked on those websites they would not open – so there would be so many dead websites.
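The segregation step described above can be sketched as a simple bucketing pass: funders on a hand-built list of authorized parties and candidates go into one bunch, and anything blank, URL-like or unrecognized is set aside for manual investigation. This is a hedged illustration only. The `AUTHORIZED` lookup, the bucket labels and the URL heuristic are assumptions, not the rules the reporters actually applied.

```python
import re
from collections import defaultdict

# Hypothetical lookup of verified funders -> party. In practice a list like
# this would be compiled by hand from party and candidate records.
AUTHORIZED = {
    "example party national unit": "Party A",
    "example party state unit": "Party A",
    "candidate x": "Party B",
}

# Rough heuristic for names that are really website links (an assumption).
URL_LIKE = re.compile(r"^(https?://|www\.)|\.(com|in|org|net)$", re.IGNORECASE)


def bucket_for(funding_entity):
    """Assign a funding entity name to an analysis bucket."""
    name = funding_entity.strip().lower()
    if name in AUTHORIZED:
        return "authorized:" + AUTHORIZED[name]
    if not name or URL_LIKE.search(name):
        return "needs_review:url_or_blank"
    return "needs_review:unknown_name"


def group_ads(ads):
    """Group ad ids by the bucket of their funding entity."""
    buckets = defaultdict(list)
    for ad in ads:
        buckets[bucket_for(ad["funding_entity"])].append(ad["id"])
    return dict(buckets)


example_ads = [
    {"id": "a1", "funding_entity": "Example Party National Unit"},
    {"id": "a2", "funding_entity": "examplenews.in"},
    {"id": "a3", "funding_entity": "xqzt 123"},
]
grouped = group_ads(example_ads)
```

A pass like this only narrows the field. Deciding who is actually behind a name in the review buckets still takes the manual checks described in the interview.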

All these dubious kinds of names of funding entities, we clubbed them separately. And then once we got this one bucket of such names where we can't really figure out who these funding entities are, we tried investigating what these websites are and who these people are. Now there were different ways of verifying who these funding entities are. One is this data that you download from the Ad Library; others are the Facebook pages on which these advertisements are published. And those pages have some information about who the administrator of these pages is. And each ad will also have some information about the funding entity, their contact details, their address, etc. So we went to the ads on Facebook on these pages and then we looked at these details: who the funding entity is, whether their contact numbers are given, whether their address is given. Or we tried to establish the link to the page on which it's published and see who's running these pages, since there would be some information about those administrators.

Some of these websites actually opened at that point, but they would not give any information about who is behind them. So then we looked into their code, at who has registered these websites, then tried to investigate the contact details of those people and establish links. So there were multiple traditional online investigative methods. And then accordingly we assigned those identities to all these sort of dubious advertisers’ details that we found in the database.

We also looked at the content of the advertisements that these advertisers were placing. From the content it was very clear that either they were promoting a certain party or they were sort of disparaging another political party. And accordingly we also clubbed them into whether they are pro-BJP or pro-Congress or pro-Aam Aadmi Party or against. And that's how we built these surrogate networks of different parties.

In many cases, actually we could find the links of these surrogate advertisers to the party. For example, there were the largest surrogate advertisers of BJP which do not give any public information that they are linked with the party. We could find that they are registered on the same address as the BJP headquarters. They also had similar phone numbers. So we knew that they have connections with the party. But in public, and to Facebook and to the election commission of India, they are not disclosing that. So their content could be distanced from the political party. Their funding could be distanced from the political party.
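One way to picture the address and phone check described above is to normalize the contact details before comparing them, so that differences in punctuation, spacing or dialing prefixes do not hide a match. The sketch below is an assumption-laden illustration rather than the reporters' procedure: the record fields, the sample values and the rule of comparing the last ten digits of a phone number are all hypothetical, and exact matching is stricter than the "similar phone numbers" mentioned in the interview.

```python
import re


def normalize_phone(phone):
    """Keep digits only, then the last 10, to ignore country or trunk prefixes."""
    digits = re.sub(r"\D", "", phone)
    return digits[-10:]


def normalize_address(address):
    """Lowercase, drop punctuation and collapse repeated spaces."""
    cleaned = re.sub(r"[^a-z0-9 ]", " ", address.lower())
    return " ".join(cleaned.split())


def shared_contact_details(advertiser, party):
    """Return which contact fields match after normalization."""
    matches = []
    if normalize_address(advertiser["address"]) == normalize_address(party["address"]):
        matches.append("address")
    if normalize_phone(advertiser["phone"]) == normalize_phone(party["phone"]):
        matches.append("phone")
    return matches


# Made-up records purely for illustration.
advertiser = {"address": "12, Example Road, New Delhi", "phone": "+91 11-5555-0100"}
party_office = {"address": "12 example road new delhi", "phone": "011 5555 0100"}
overlap = shared_contact_details(advertiser, party_office)
```

A match on these fields is a lead to verify by other means, not proof of a connection on its own.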

The second part was obviously about trying to find pricing for the advertisements and how the algorithm sort of auctions them, and how that leads to BJP getting cheaper ads than the Congress. The pricing was impressions received by an ad per Indian rupee spent by these parties. So since we had a range of money spent and a range of impressions, we calculated the advertiser’s price. We did this for different elections.
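The pricing comparison described above can be sketched as follows. The interview does not say how the spend and impression ranges were collapsed into single numbers, so this example assumes the midpoint of each range, and the function names and toy figures are invented for illustration and do not reproduce the investigation's data or results.

```python
def midpoint(low, high):
    """Collapse a reported range to its midpoint (an assumption, see above)."""
    return (low + high) / 2


def impressions_per_rupee(ads):
    """Total estimated impressions divided by total estimated spend."""
    total_spend = sum(midpoint(a["spend_min"], a["spend_max"]) for a in ads)
    total_impressions = sum(
        midpoint(a["impressions_min"], a["impressions_max"]) for a in ads
    )
    if total_spend == 0:
        raise ValueError("no spend recorded for these ads")
    return total_impressions / total_spend


def percent_more_paid(ipr_cheaper, ipr_dearer):
    """How much more the second advertiser pays per impression, in percent."""
    return (ipr_cheaper / ipr_dearer - 1) * 100


# Toy numbers only; they are not drawn from the reporting.
party_a_ads = [
    {"spend_min": 100, "spend_max": 300, "impressions_min": 1000, "impressions_max": 3000},
]
party_b_ads = [
    {"spend_min": 100, "spend_max": 150, "impressions_min": 500, "impressions_max": 1500},
]

ipr_a = impressions_per_rupee(party_a_ads)
ipr_b = impressions_per_rupee(party_b_ads)
premium = percent_more_paid(ipr_a, ipr_b)
```

Summing across all of a party's ads before dividing weights large campaigns more heavily than averaging per-ad ratios would. Either choice is defensible, and the sketch simply picks one.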


I am tempted now to talk a little bit about some of your other work on the Aadhaar system and its connection to voting. It would be helpful to just hear a little bit from you about what it is and how it's used in daily life.


Aadhaar has been one of the most controversial tech tools in the privacy space in India for a while. And in fact, the journey of Aadhaar in some sense has also been the journey of how the misuse of personal data has evolved in India.

Basically in India the one way that the government knows about its population and its people is through the census. So the census has been happening every 10 years in India. Enumeration officers from the government go to each and every household and collect a lot of data about each household and the individuals within that household. Now, the purpose of the census is very different. It's supposed to be the bedrock for the government and the planning departments of the government to plan the resources for the country. Surprisingly, the census, which is conducted under an Independence-era law, had a provision that the census data could only be used as aggregates by any authority or any government. Nobody can share personal data of individuals or any family from the census with anyone. So there has been this protection about how personal identities of people are preserved and protected.

But in the early 2000s there was more and more need felt by the national security agencies to sort of survey who is a citizen and who is not a citizen in India and then give them that tag. To do that, they started to build something called the National Register of Citizens, and it was done under new rules that they passed in 2003. For creating that register, they started first building a National Population Register (NPR), which was almost a parallel exercise to the census. But in this case, there was no protection for personal data. So the government would know each and every individual, each and every family, what they do, how much they earn, what their religion is, what their caste is... All that data was collected for the creation of NPR. And that was enforced by the national security agencies in the country.

But around the same time, the tech industry in India also saw an opportunity to build an identity-based tech and business ecosystem in India. So a bunch of technocrats in India proposed the idea of building a tech-based identity card, which would register the biometrics of every individual and give them a code and a number. And through that identity number and the biometric data that is stored in a national database, welfare schemes delivery will happen.

Now in India welfare delivery is huge. More than half of our population actually depends on these welfare benefits. So, that idea was taken up by the government when it was proposed and they started building this database called the national identity database. Eventually it got so much buy-in from the business community in India. There was a huge amount of business ecosystem that was created around it – from fintech companies to telecoms to banking. Tech-based identity became a very easy solution for identifying people. And even for providing these identity and verification services, several business entities cropped up. So you have a huge business ecosystem developing around it now using that. So then you have edu-tech companies, ad tech companies, fintech companies and everywhere a system is being built of how citizens’ personal data can be used to deliver services and benefit from it.

In early 2010, a lot of people – social activists, privacy activists – saw a problem in that and they feared that this could create a surveillance state. Because once you are linked to a technological solution and a database, anybody who has access to this database would know what you do, where you go, where you are moving, what you are earning. It would also let businesses benefit from citizens’ personal data at the cost of the government, which is public money. So, building on government infrastructure, which is built on public money and on citizens’ personal data, private businesses would get profit out of it. And then there was this huge opportunity of misusing personal data for surveillance purposes.

So the court eventually partly agreed with these petitioners and they said, yes, there is a threat of surveillance. They also agreed that business parties were earning profit out of public money and citizens’ personal data. So they restricted the use of Aadhaar in 2016 to only the welfare delivery schemes. They said for nothing else you can use Aadhaar without the consent of individuals. Now you know how the consent in India works – half of the time you're not aware what is the meaning of consent, half of the time the consent could be just coerced by people. And to make things worse, the government changed the rules of taking consent. They said that if a business entity or a government entity takes consent from an Aadhaar holder to use their data once, they can extend on their own the consent for using it again and again for any other purpose as well. So the whole idea of purpose limitation got screwed here. And then they also said that unless somebody comes up to us and says you can't use it, the consent would be considered given.


You can't even call it consent.


Exactly, which is against the definition of consent itself. You know the definition is that unless it's an explicit “yes,” you can't assume it. So here because it's not an explicit “no,” they are assuming that it's yes. So that's how Aadhaar became a very pervasive tech tool that has entered the private spaces of every individual in India. And you can't control what people are doing with your data. There are rules and regulations, but they could be changed or diluted at any point at the whims of the government. Nobody is going to make anybody accountable. And if there is a misuse, a lot of times you don't even get to know how your data is getting misused.

And that's where I think some of our reportage actually showed that there has been tremendous misuse of this Aadhaar data in some of the elections. And it has been swept under the carpet – when issues were raised and it came out, it was hushed up. And it was completely denied, but there is all the evidence out in public about how it has been misused and now the government is actually linking the voter IDs of individuals with Aadhaar, which as I said earlier, has led to disenfranchisement of voters and other kinds of manipulations in elections.


Tell us a little bit about the unification of these systems with elections.


You know how technology and its persuasiveness or intrusiveness really evolved? The understanding of it was lacking and still lacks in a lot of well-intentioned bureaucrats and well-intentioned people in policy circles. The threats of it have really been established and understood in the past decade. So they sold these techno-solutions to all the governance problems and many of the people in the government bought these and started building these. And eventually they realized the harms of it. Which is what the case was with Mr Manaranja Kumar, as I pointed out in the reportage.

And the problem is, there are always goods and bads about everything. To ensure that there are no harms, to ensure that the good parts are established and it's not misused, you need to have very robust regulations and a framework where harm is prevented. And if there is a harm, then accountability for that harm is fixed. And that's why we need the data protection law. We need privacy laws.

In India the problem has been, while you have these technological solutions developing at a fast pace, you have businesses influencing the government policies on how technology should be used to create business solutions and governance solutions. You have national security agencies pushing for more and more strict surveillance and tools and laws that could have umbrella surveillance in different spaces. The regulation on privacy and data protection has been missing in India.

The Supreme Court, when they delivered the Aadhaar judgment in 2016, asked the government to frame a data protection law. Sorry, it was I think 2017 or ‘18 – I'll need to check those numbers, but it's been almost four years since then, and there have been committee after committee and draft after draft of the law. And finally, right now you don't even have a draft law. They have been scrapped. And the government is saying they'll redraft the entire thing. So you never know for how many years India will not have a data protection law or privacy law. Meanwhile, these tools have been developed, with all the good intentions, for a social registry, which is actually a 360-degree database of every individual in the country, and you cannot say how misuse of it would be prevented. There's no framework for it. If somebody goes rogue in the government, gets all this data and misuses it, there would be no protection. And as I said, there have been instances in the past where ruling political parties in different states have misused the data from Aadhaar. There are no safeguards against it.

And to my surprise – I can understand that creating a 360-degree registry while there is already Aadhaar, creating another registry and linking all the databases, would take a lot of time, so that's why we still do not have a social registry functioning at the central level – many states have already built social registries with the support of the World Bank. And in fact, in one of these states I interviewed a private company that is building the front end for this registry. There are other companies which are involved in the back end, in how data is managed, etc. But this company is just building the dashboard, the front end, for the social registry. They have access to all the data of that state for that fee. When I asked them what the security of this data is and what the assurance is that this data is not misused, etc., the company told me that the data is only used by different government departments on a need-to-use basis. There is no legal protection against misuse.


One of the people you interviewed for one of your pieces was talking about how they adhere to a data minimization principle. Even though this data system is, in fact, the exact opposite. It's a data maximization mechanism.

What's so interesting about what's happening here is that it's very similar to what we have seen in Kenya and what we've seen in Brazil: national citizen registries being justified under the guise of national security, tech companies that have a lot to gain having a hand in the deal, and civil society being completely excluded from the conversation. It sounds like a whole 360-degree profiling infrastructure has been developed that completely circumvents all of this Aadhaar discussion.


So what has happened is, because Aadhaar was entangled in so many legal questions and the court cases were going on, now you have much more sophisticated software. There are actually private companies working to provide unique identification of customers to different businesses. Different online businesses survey their customers, identify what they want and at what time they want it, and provide advertisements, products and services to those customers. There are a bunch of companies that are working on giving data insights and intelligence about customers to these businesses.

Now the government has also started using the services of these private companies and their proprietary software and their AI tools to de-duplicate and create 360-degree profiles of citizens for government services without having to use tools like Aadhaar. So what they'll do is they'll combine all the different government databases – public and private databases – and then collect all their data and then identify individuals on the basis of their address, name, phone number, and de-duplicate and find unique parameters to attribute to every individual and create these profiles. So now, while there has been a certain amount of success for civil society and privacy practitioners in holding the Aadhaar authority and the Aadhaar ecosystem accountable through that Supreme Court case, which established that it can't be used without consent for any purpose other than welfare, you cannot counter so many other technologies that do exactly the same thing that they feared Aadhaar would do.


Shortly after the Cambridge Analytica revelations went public, there was discussion in the United States about the Honest Ads Act. The Honest Ads Act was an attempt to bring more transparency to political ads, precisely to help better understand who's paying for them. I was talking to a political consultant and he was saying to me, you know, even if it passes, here are three plans I've got up my sleeve to continue doing exactly what I'm doing. And this is very much in the same vein – this pattern of technology being used to circumvent legislation.


Absolutely. I mean I genuinely wonder if regulations can ever be ahead of technology.


And I think in some ways it's great that our systems are not preemptive in that way.




You know, this fast-moving, nimble technology and these intentionally slow-moving legal bodies... there's a lot of friction there.

In one case that you report on, a private company stole the data of over 78 million residents in two states and then used that information to build a mobile app for a political party. What your reporting covers is the way in which people were purged from voter rolls. We often think about this phenomenon of just data hoarding. But there's a kind of erasure that happens behind the scenes as well.


Again, because you are talking about how technologies always find ways to circumvent the legal systems: in this particular case, the Aadhaar data of the states of Andhra Pradesh and Telangana was leaked to a private company, which built an app to campaign during elections for the ruling party. What happened is, when the Aadhaar infrastructure was being created, the Aadhaar authority at the center allowed the state governments to keep copies of the entire Aadhaar data of the residents of that particular state. So every state in India also had a copy of its residents’ data from the Aadhaar central database.

Once the concerns were raised about misuse of this data, and privacy activists and civil society raised with the Supreme Court how the state governments were making copies of this data and might misuse it, the court asked the state governments to destroy this data. Whether this data is destroyed, whether this data gets deleted, whether it's hoarded by someone else on some other servers, whether copies are made and copies are renamed and the data format is changed, is very difficult to monitor. And that's exactly what has happened in this case. This copy of the state Aadhaar data was leaked by the ruling political party to a private company, despite the Supreme Court order that it should be deleted. And then they misused it for political campaigns during the election.

And again, that brings us back to the question of skewing the political competition. Because in this case, the ruling party has access to this data. The opposition parties wouldn't have that. And then, using very personal data, they personalize the campaign for each and every individual, almost like Cambridge Analytica. They had information about what schemes this person is getting benefits from; they had data about their political preferences. And the political party's volunteers went to each and every individual and family and messaged them accordingly.


The property of the state came to be the property of the ruling party, and that's a massive problem, when this line between the temporary occupant of the state infrastructure – the incumbent party – and the infrastructure of the state is blurred. I think, you know, it's particularly these data-intensive infrastructures that are so worrisome, because it's precisely in cases like this that simply having access to that kind of information can massively skew the dynamic. It can massively enable a party, for example, to maintain a stranglehold on power, and can prevent any other party from having a chance at acquiring power. And obviously that's the complete antithesis of what these democratic ideals are about.


Interested in staying connected to The Reporters’ Collective? Subscribe to their newsletter. The newsletter is sent only when a new story or feature has been published.

The Influence Industry Project has been led since 2016 by Tactical Tech’s Data and Politics team, which addresses the pervasive data-driven technologies used by political groups in elections and political campaigns.

This interview was edited by Cassiane Cladis.

First published: January 30, 2023
