Government data in political hands: Aadhaar citizen ID and the 2024 Indian election campaigns

Governmental policies in India are increasingly being digitised, making large amounts of personal data available, to the ruling party in particular. Such access potentially allows it to develop targeted campaigns before the 2024 elections. This article examines a few of these data collection systems, including the Aadhaar card, looking at their original purpose, their creeping scope and their potential impact on democratic elections.

In the past decade, increasing Internet penetration, the so-called ‘data revolution’, and mushrooming political consultancies have had wide-reaching consequences on the political landscape across India and the way political campaigns are carried out. The influence of the burgeoning technocracy on Indian politics and election campaigns has raised concerns about the collection of personal data of individuals that is processed and stored with limited functional accountability. Also of concern is the security of the stored data, which many argue is being exploited for political campaigning, and ultimately for political gain, including by the political parties in power.

According to the Bipartisan Policy Center, “The Cambridge Analytica controversy profoundly impacted the world of data privacy, political campaigning, and social media.” While this UK- and US-based scandal was covered worldwide, including by Tactical Tech, it also had wide-reaching consequences for many other countries. The scandal led to the Indian government opening inquiries and holding hearings to further investigate the topic of data use in political campaigns. In 2018, India’s Central Bureau of Investigation (CBI) investigated Cambridge Analytica, and ultimately filed a claim against the company for illegally accessing Indian data. These investigations, however, did not stop the increasing use of data in Indian politics.

Having detailed data on the demographics, religion and former caste classifications of the constituents in an electoral region could be a considerable advantage to a political campaign. Targeting groups based on their individual data points was already a tactic reported in the lead-up to previous elections. Political parties may work on the premise that voters will support a politician who represents or directly appeals to that specific group’s interests. For example, if the majority of the population in a certain locality is Baniya – a former Indian caste classification found chiefly in northern and western parts of the country – a political party might be more likely to field a representative from that community, even if the constituency has a diverse population.

However, caste and religious data is particularly sensitive, as it can be used for divisive and identity-based targeted campaigns. In particular, this data is steeped in bias and takes on added significance against the backdrop of the incumbent Indian government, which is often blamed for endorsing militant Hindu nationalism. India has the second largest Muslim population in the world, last counted at a little over 172 million (as per the 2011 census), but Muslims are still a minority, given that Hindus make up about 80 percent of India’s roughly 1.4 billion population. India has been led by the BJP since 2014. The right-wing party has pushed controversial policies that critics say ignore Muslims’ rights and are intended to disenfranchise millions of Muslims.

With the goal of elevating the place of Hinduism in state and society, there is the real possibility that this data will be used to help inform policies that could end up undermining Indian Muslims.
Michael Kugelman, in an interview with the author

Just as data is used to inform policies in ways that may mirror the biases of political parties, it can also be used to inform the campaigns they will run, ultimately providing information to certain groups and ignoring others, or more dangerously, applying divisive and disruptive influence techniques to encourage, or discourage, certain groups to express their political opinion or even to vote.

Previously, data collection exercises in India were limited to the census, a mammoth task involving extensive documentation processes. In India, the last census was held in 2011. The Covid-19 pandemic derailed the scheduled plans for the 2021 census, and with the 2024 Lok Sabha elections approaching, political parties have applied pressure to keep the plan in abeyance for now, worried that the ruling party will use this data to its advantage.

While the data is outdated, the census still remains a utilised data source. Political parties, especially the ones in power with first access to the census, can use this information to optimise their campaigning efforts to certain groups. In discussion about the importance of a census and data, Michael Kugelman, Director of the South Asia Institute at Woodrow Wilson International Center said,

A census is inherently a beneficial process as it is meant to help governments better allocate public services...But there’s always the risk that census data will be politicised and turned into an opportunity to shape a government’s thinking around planning for elections and other political activities.

The same data, and more, can now be collected and analysed from other sources. As Shivam Shankar Singh writes, “In many large states of India such as Uttar Pradesh and Bihar, the caste of over 70% of the people can be determined simply by analysing their names...With a relatively simple code, a political party can identify the caste and religion of a large number of voters by analysing their last names (surnames)...” This data can be used to design communication and electoral strategies catering to those communities.

However, this data can be inaccurate, either because it does not reflect shifts in the system and culture regarding identity, caste and religion, or simply because data is not so simple: as Shankar Singh highlights, a single surname, such as Choudhary, can signify multiple castes in Bihar. In these cases, political parties deploy local workers or employ private firms to cross-check and categorise the data. Political parties, and the consultancies that they might work with, can conduct broad surveys similar to the census, or at least along the same lines.

Translating this data into electoral strategy can require large amounts of time and human resources, which can limit the scope of the data usage. However, the emergence of digital tools offering statistical analysis, and of consultants offering to carry out the work, means that data mining, decoding and electoral strategy can now be combined far more easily. For example, in the book ‘How to win an Indian Election: What Political Parties Don’t Want You To Know,’ former BJP data analyst Shivam Shankar Singh writes about how data from the Election Commission of India website can be collated and used to map vote swings and correlated with data from telephone companies and research firms, such as Apriori Data.
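To make the idea of mapping vote swings concrete, here is a minimal Python sketch. It assumes that constituency-level results have already been collated into simple records. The `Result` class, the function names, the seat and party labels and all vote counts are hypothetical; none of them come from Election Commission of India data or from Shankar Singh's book. 'Swing' is taken here to mean the percentage-point change in one party's vote share between two elections, which is only one of several definitions in use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    """One party's vote count in one constituency (hypothetical record layout)."""
    constituency: str
    party: str
    votes: int


def vote_shares(results):
    """Return {constituency: {party: share of votes cast, in percent}}."""
    totals = {}
    for r in results:
        totals[r.constituency] = totals.get(r.constituency, 0) + r.votes
    shares = {}
    for r in results:
        share = 100.0 * r.votes / totals[r.constituency]
        shares.setdefault(r.constituency, {})[r.party] = share
    return shares


def swing(previous, current, party):
    """Percentage-point change in a party's vote share, per constituency.

    Only constituencies present in both elections are compared.
    """
    prev, curr = vote_shares(previous), vote_shares(current)
    return {
        seat: round(curr[seat].get(party, 0.0) - prev[seat].get(party, 0.0), 2)
        for seat in prev.keys() & curr.keys()
    }


# Invented figures for two elections in two made-up seats.
earlier = [
    Result("Seat A", "Party X", 40_000), Result("Seat A", "Party Y", 60_000),
    Result("Seat B", "Party X", 30_000), Result("Seat B", "Party Y", 20_000),
]
later = [
    Result("Seat A", "Party X", 55_000), Result("Seat A", "Party Y", 45_000),
    Result("Seat B", "Party X", 24_000), Result("Seat B", "Party Y", 36_000),
]

swings = swing(earlier, later, "Party X")
for seat in sorted(swings):
    print(seat, swings[seat])
# Seat A 15.0
# Seat B -20.0
```

In this made-up example, Party X gained 15 percentage points in one seat and lost 20 in the other. A campaign could use a table like this to decide where to concentrate its resources, which is why access to additional data sources matters.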

There are increasingly blurred lines between the data available for political campaigning and data available for governance. The government, and consequently the ruling party, has even more access to vast amounts of data based on the data collection carried out to inform their work. As an example, the Ministry of Health and Family Welfare conducts the National Family Health Survey (NFHS), a survey of the population. Such surveys only cover a sample, but that still means that they collect information on around 200,000 or 250,000 households per site. This access can be to the advantage of ruling parties:

Since there is so much of profiling happening at the state level, the state knows what kind of government schemes to announce just before the elections to lure voters.
Srinivas Kodali, in an interview with the author

The data collected through in-depth surveys allows whoever runs or commissions the study (in this case, the ruling party) to gain a fair idea about the needs of the people living in a particular place.

Aadhaar is a 12-digit individual identification number issued by the Unique Identification Authority of India (UIDAI) on behalf of the government. The Aadhaar Card was first introduced in September 2010 by the then-PM Manmohan Singh. The number serves as a proof of identity and address anywhere in India. UIDAI chairman Nandan Nilekani had an ambitious goal for this project, claiming that within four years, over 600 million people would have an Aadhaar card. The card could be connected to their Know Your Customer (KYC, sometimes Know Your Client) system, which is used to identify and verify the identity of individuals. Earlier, in 2007, the Planning Commission of India had devised a project for an Integrated Smart Card, whereby every member of a household could be identified and inter-linked to track and manage welfare schemes, including subsidies for rations like cooking gas, scholarships for schools, passports, bank accounts, pensions, driving licenses, and insurance policies. As of November 2022, around 1.3 billion Aadhaar cards had been generated.

Aadhaar, which also contains biometric data including fingerprints and iris scans, has become the thread that connects data and services offered to an Indian citizen and resident. Journalists, including Srinivas Kodali, have noted that through Aadhaar, each state government has the infrastructure to begin building 360-degree profiles of individuals and households to track welfare spending.

[The government] would know every individual household's electricity or water bills...they are now tracking what is the wealth of the household and whether the household is getting ration, whether the household is getting pensions....Any data that the government is collecting is going to the parties and whatever parties are collecting on the ground when they come to power, it comes back into the government system.
Srinivas Kodali, in an interview with the author

To further integrate Aadhaar into the fabric of Indian society, in 2021, the Lok Sabha passed the Election Laws Amendment Bill to create a legal framework for linking voter registration information belonging to the Election Commission of India (ECI) and the Aadhaar system, though this link was to remain voluntary. In February 2023, The Hindu reported that, according to the Election Commission, roughly 60% of voters had already linked their Aadhaar to their voter ID. Although the link is formally voluntary, reports have emerged that voters must declare that they do not have an Aadhaar number in order to avoid the connection between these identities being made. Journalists and opposition researchers noted their concerns: “Aadhaar is not proof of the right to vote...Aadhaar numbers were issued to all residents and not citizens,” meaning that the data used by political parties is not necessarily an accurate snapshot of the voters they are meant to be campaigning to and serving if they are elected.

The data available through Aadhaar-based profiles has quickly been seen as an asset for political parties. In a 2019 case in Andhra Pradesh, data apparently drawn from Aadhaar was found to have been used by staff working on the election campaign of the regional Telugu Desam Party (TDP). Two months before the 2019 general election, a complaint was filed against IT Grids Pvt. Ltd, the company hired by the TDP to build its party workers' app “Seva Mitra”, alleging the improper use of data. Ultimately the investigators found 78 million Aadhaar records of residents in Telangana and Andhra Pradesh in IT Grids’ possession. This data included individuals’ “Voter ID, individual’s color photographs and other details including welfare schemes available to an individual”. Furthermore, the firm had access to potentially illegally sourced photos of individuals, and the data infrastructure in use also seemed to be taken from elsewhere, as it was similar to that “originally owned by Unique Identification Authority of India (UIDAI).”

Image: The Aadhaar card is increasingly intertwined with other government services. Source: Tactical Tech

It was noted that the possession of such a database was a violation of the IT Act of 2008, that holding the data on removable storage was a violation of the Aadhaar Act, 2016, and that hosting the database on Amazon’s web services was a national security risk. The TDP countered that the information found in the “Seva Mitra” app was based on data that is publicly available and therefore not unethically gathered, arguing that “those complaining against it are doing it for political reasons.” The case was eventually handed to a special investigation team and is still under investigation.

Using this data doesn’t necessarily mean that political parties have good information on voters: errors in Aadhaar data were one-and-a-half times higher than errors in the electoral database. In states like Andhra Pradesh, the government has a workforce to collect and update information and “every month these volunteers provide welfare and other subsidies by going door to door and collecting Aadhar-based eKYC of residents at their doors.” These constant updates and links between Aadhaar and other services have led some privacy researchers to fear the slow creep of the aforementioned 360-degree database. In an affidavit to the Supreme Court last year, the UIDAI claimed that government agencies “will never have or will not be able to build a 360-degree view of any of its customers or beneficiaries.” Despite this statement, an Aadhaar number can be thought of, in many instances, as a single identity number that links many disparate databases.
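The concern about a shared number acting as a link can be illustrated with a short Python sketch. Everything in it is invented for illustration: the dataset names, the field names, the `id_number` key and the placeholder identifiers are assumptions, and the code does not describe how the UIDAI or any government system actually stores or shares data. It only shows that once separately held record sets carry the same identifier, combining them takes very little effort.

```python
def link_by_identifier(datasets, key="id_number"):
    """Group records from separately held datasets under their shared identifier.

    If one dataset holds several records for the same identifier,
    the last one seen is kept.
    """
    profiles = {}
    for dataset_name, records in datasets.items():
        for record in records:
            identifier = record[key]
            # Keep every field except the identifier itself.
            details = {k: v for k, v in record.items() if k != key}
            profiles.setdefault(identifier, {})[dataset_name] = details
    return profiles


# Invented records held by three hypothetical, unrelated services.
datasets = {
    "rations": [{"id_number": "ID-0001", "monthly_ration": True}],
    "electricity": [
        {"id_number": "ID-0001", "last_bill": 1250},
        {"id_number": "ID-0002", "last_bill": 430},
    ],
    "pensions": [{"id_number": "ID-0002", "receives_pension": True}],
}

profiles = link_by_identifier(datasets)
print(sorted(profiles["ID-0001"]))  # ['electricity', 'rations']
```

Without a common identifier, the same combination would require error-prone matching on names or addresses, which is why privacy researchers focus on the linking role of the number rather than on any single database.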

The lack of transparency and accountability within the UIDAI and other responsible parties also leads some to worry about whether meaningful, systemic change is possible to curb the use of inappropriately and illegally accessed data. There are many ongoing public leaks: in 2019, the tech site Medianama held a list of 37 cases of Aadhaar-related leaks, and as recently as October 2023, a US cybersecurity firm found continued leaks of Aadhaar data across the web. Once the issues or leaks are discovered, there are reported difficulties in unraveling what went wrong after a security breach. Apar Gupta of the Internet Freedom Foundation said that the “lack of accountability is an inherent feature of the Aadhaar system.” Concern over security and privacy has grown and there is yet to be an adequate response from governing bodies.

In the first performance review of the UIDAI in 2022, India’s independent auditor, the Comptroller and Auditor General (CAG), reported that UIDAI had generally “deficient data management” and that there were several weak points in the system, particularly with regard to the lack of a data archiving policy, limited regulations for third parties to keep or copy data that might put citizens and residents at risk, and duplicate and incomplete registrations. In light of the continuing proliferation of ways to collect and use information about people, stringent policy reforms are needed to put a check on how people’s data is used and processed by political parties.

The next Indian general election is expected to be held between April and May 2024 to elect the members of the 18th Lok Sabha. As reported by Al Jazeera, “Indian lawmakers have already passed a data protection law that will dictate how tech companies process users’ data amid criticism that it will likely lead to increased surveillance by the government. The Digital Personal Data Protection Bill, 2023 gives the government the power to exempt state agencies from the law and gives users the right to correct or erase their personal data.”

However, opposition lawmakers and digital experts say:

legislation would allow the government and its agencies to access user data from companies and personal data of individuals without their consent as well as collect private data that could be used for their vested interests, election campaigning in a country where digital freedoms have been shrinking since Prime Minister Narendra Modi took office in 2014.
Al Jazeera

Ahead of the 2024 general election, concerns are being raised over the increased chances for micro-targeting, voter profiling and even fraud. Election monitors, journalists and other civic organisations need to be aware of the vast amounts of data in use by all political parties, especially those in power, and the risks this data creates.

More case studies focused on India

Regulating Diffuse Actors in the 2024 Indian Elections by Amber Sinha

How an Indian political consulting firm works: A case study of I-PAC by Vasudevan Sridharan

About the Author:

Safina Nabi is an independent multi-media journalist. She covers South Asia and Indian-administered Kashmir, reporting on conflict, human rights, gender, health, culture, social justice, and the environment. Her work has appeared in a range of Indian and international publications, including the Guardian, Al Jazeera, Slate, Vice, OpenDemocracy, Christian Science Monitor, and Article 14. Safina has received grants and fellowships from Pulitzer Center, Reporters Without Borders, and Dart Center. Safina can be reached on X, formerly Twitter, @soul_restlesss or on Instagram @safinanabireports.

Cassiane Cladis and Amber Macintyre also contributed to this piece.

If you want to find out more about the firms that help political parties engage with Google advertising, head over to The Influence Industry Explorer.

The Influence Industry has been led since 2016 by Tactical Tech’s Data and Politics team, which addresses the pervasive data-driven technologies used by political groups within elections and political campaigns.

First published: December 20, 2023.
