How Data Logic Alters Politics
Data-driven practices have proliferated in politics, in part due to a belief that they can provide granular information about an electorate and support good decision-making. In political campaigns, candidates use data-driven tools to measure – among other factors – what voters believe. However, as campaigners increasingly adapt their tactics to data-driven conclusions, new issues arise: invisible biases affecting how voters are targeted, for-profit algorithms controlling political content, and the quantification, and therefore simplification, of individual political opinions.
Politicians adjust their images in order to maintain or gain power; with current technology, information on how politicians are perceived, how successful their campaigns are, and what voters are saying and doing is readily available. This information can be used to inform a politician’s or campaign’s actions, which leads to data-driven political communications.
To learn more about how personal data is applied to political messaging, check out Learning Module 1.

The adoption of these practices is due to an underlying trust and hope in the ability of data-driven processes to produce more impartial, accurate information on which to base decisions. This so-called “data logic” describes how scholars across disciplines and practitioners across sectors stress the importance of data practices in forming knowledge. The logic shows in the expectation that datafication, quantification, scale, technical systems and algorithms are efficient and effective ways to gain impartial knowledge. However, examining the assumptions of data logic is critical to understanding the value of the knowledge produced, the actions taken on its basis and, consequently, how the influence industry’s data-driven assumptions change the way politics takes place. This section explores those assumptions.
Debunking Assumptions
Assumption 1: Individuals are the sum of their data points
In order to digitally collect and process data, it must be in a quantified format – either as numerical values or as keywords that isolate phrases from their broader context. For example, electoral registers can use a unique identifier to connect someone’s name with their address and whether they are registered to vote – while also counting repeats, such as how many people live at the same address or how many people are registered to vote.
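As a toy illustration of how quantified records make such counting trivial – the names, addresses and identifiers below are invented, not a real register schema:

```python
from collections import Counter

# Hypothetical records: each row links a unique voter ID
# to a name, an address, and a registration flag.
register = [
    {"id": 1, "name": "A. Diaz", "address": "12 Elm St", "registered": True},
    {"id": 2, "name": "B. Khan", "address": "12 Elm St", "registered": True},
    {"id": 3, "name": "C. Osei", "address": "4 Oak Ave", "registered": False},
    {"id": 4, "name": "D. Lund", "address": "4 Oak Ave", "registered": True},
]

# How many people share each address?
people_per_address = Counter(row["address"] for row in register)

# How many people are registered to vote?
registered_total = sum(row["registered"] for row in register)

print(people_per_address)  # Counter({'12 Elm St': 2, '4 Oak Ave': 2})
print(registered_total)    # 3
```

Note what is lost in this form: everything about these four people that does not fit into a column.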
The acknowledgment that data can and does contain errors from either technical malfunctions or human error is often overlooked in the rush for quantification. Not only are there explicit errors in the data, there are also mistakes in the interpretation of its meaning. While surveys, polls and account information allow individuals to choose what information and opinions to share about themselves, large parts of datasets are made up of “latent” or “trace” data, collected from users unknowingly as they go about other activities, through digital technologies such as cookies and tracking pixels.
One industry expert, who agreed to speak with us on the condition of anonymity, described part of their social media listening technology:
“We don’t do questionnaires. We basically track data, everything [that] people [are] posting in a public forum and we arrange those simply around a specific keyword or a topic. Because we’re not asking people, introduce your attitudinal bias. So we just passively observe what people say naturally. I know how you can manipulate a poll question to give whatever answer you want. So we simply just observe. We don’t even bother asking questions, and that’s where it gets really scary because there was a service where you can track somebody’s like, 70 of their previous likes. You can deduce their attitude about social issues.”
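Stripped to its core, the keyword arrangement the expert describes can be sketched as matching public posts against a keyword list – the posts and keywords below are invented, and real listening tools are far more elaborate:

```python
# Toy sketch of social media "listening": group public posts by keyword
# instead of asking people questions.
posts = [
    "The new housing bill is a disaster",
    "Loved the park cleanup this weekend",
    "Housing prices keep climbing in my area",
    "The election debate was chaotic",
]

keywords = ["housing", "election"]

by_keyword = {
    kw: [p for p in posts if kw in p.lower()]
    for kw in keywords
}

print(by_keyword["housing"])  # the two posts that mention housing
```

Even this crude version shows the interpretive leap involved: a keyword match records that a phrase was used, not what its author meant by it.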
The data collected represents specific actions of a user but may differ from how an individual defines themselves in different contexts. For example, what we do on an evening at home and who we want to be at work are two of the many complex ways of defining oneself. Political entities use data points not only to make broad, sweeping assumptions about their voters, but also to reduce voters’ complex, varied political opinions to their search history or demographic facts. By relying on data collected about our past or present online activities, political groups may fail to recognise the nuance and complexity of the voters they are profiling.
Assumption 2: More data is always better than less data
The inherent trust in computer processing capabilities has led to companies collecting massive, almost incomprehensible amounts of data. This has manifested in large, centralised databases like that of the consumer credit reporting agency, Equifax, which collects details on 500 million consumers and 81 million businesses in 17 countries. Some estimate that by 2025, 463 exabytes of data will be created each day globally. The technologies that collect the information are able to track large groups of people at once in order to create profiles. These profiles are then used to divide the datasets into packaged audiences, such as the ones from Acxiom’s Personicx market segmentation tool. Now organisations are able to generate, store and analyse more data than ever before.
Yet with more data, more problems arise. Large data sets can also be more varied, making analysis more complex and potentially leading to the identification of false patterns. Furthermore, because data can be collected with little sense of the question it will later be used to answer, it can be difficult to determine whether a sample is representative or inclusive enough to draw concrete conclusions. Despite the frequent assumption that more is better, too much data produces too much noise and compromises the performance, profitability and security of any enterprise.
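The false-pattern risk is easy to demonstrate: given enough unrelated variables, some will correlate with any outcome purely by chance. The sketch below uses entirely random, invented data – there is no real relationship to find, yet a "pattern" appears anyway:

```python
import random

random.seed(0)

n_rows, n_features = 50, 200

# Invented data: a random "outcome" and 200 random, unrelated features.
outcome = [random.gauss(0, 1) for _ in range(n_rows)]
features = [[random.gauss(0, 1) for _ in range(n_rows)]
            for _ in range(n_features)]

def corr(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Scan all features for the one that "predicts" the outcome best.
best = max(abs(corr(f, outcome)) for f in features)
print(f"strongest correlation among 200 random features: {best:.2f}")
```

The strongest of these chance correlations looks impressive in isolation; an analyst who scanned the dataset for "what predicts the outcome" would report it as a finding.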
Finally, organisations can prioritise growth over quality: a specific number of members, supporters or votes is privileged over the quality of political conversation or engagement. While the number of votes in an election is important, there is more to political participation than votes on election day. The quality of conversation, debate and active, safe engagement from voters is also critical to a functioning political system.
Read more about Data Breaches, Leaks and Hacks in Learning Hub Module 2: Data as a Political Asset
Hear for yourself:
"And with the amount of data being collected, the scale and the probability of those data being exposed also grows exponentially."
Listen as cyber-security expert, Bob Diachenko, discusses the data security angle of mass data collection.
Find the complete audio and transcript for Bob's interview or listen on PeerTube
Assumption 3: Algorithms are Unbiased and Impersonal
Computer software, algorithms, and technical systems that can run the same action repeatedly, quickly, and millions of times, are used and trusted to gather and analyse data and to provide accurate results. However, it is not uncommon for these standardised processes to contain technical errors or bugs, which can be repeated and magnified by the processing power of computers.
Even without technical errors, these algorithms must still be developed, interpreted and implemented by computer programmers, technical experts and data scientists – in other words, by people. Machine learning models and big data analytics can encode common human biases such as implicit bias or in-group bias. These technologies have the potential to reinforce and even accelerate existing racial, political or gender inequalities, and can paint a misleading, distorted picture of the facts on the ground.
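As a deliberately crude illustration of how in-group bias can be baked into a tool – the keyword list and posts below are invented – consider a "political interest" filter whose vocabulary reflects only its authors' frame of reference:

```python
# A toy classifier whose keyword list mirrors its authors' political
# vocabulary. Posts phrased in other communities' terms are silently
# dropped, and that blind spot repeats on every run, at any scale.
interest_keywords = {"congress", "senate", "ballot"}

posts = [
    "New ballot measures announced today",
    "Community land rights meeting tonight",   # political, but unmatched
    "Union organising drive at the plant",     # political, but unmatched
]

flagged = [
    p for p in posts
    if any(kw in p.lower() for kw in interest_keywords)
]

print(flagged)  # only the first post is flagged
```

No individual run contains a "bug": the distortion lives in the keyword list itself, and every downstream analysis inherits it.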
These standardised data points can lead politicians to cater to certain segments of the population while ignoring large swaths of their constituents. Biases in political spaces may reduce individuals to what they will most likely support or dislike. Ultimately, the content they see is shaped by these biases, with some groups never receiving all the information a political group has to share. All in Data sums up this phenomenon well:
“Data bias exists, and in the age of big data, that bias can [be] amplified and more easily hidden” – All in Data

Furthermore, this kind of algorithmic reasoning can extend into testing behavioural aspects of website interactions, such as through A/B testing – the strategy of testing variations of content in order to measure their impact on users’ behaviour. Leading up to the 2016 United States Presidential election, the Trump campaign ran between 40,000 and 50,000 ad variants a day. One commentator wrote, “every product, brand, politician, charity, and social movement is trying to manipulate your emotions on some level, and they're running A/B tests to find out how.”
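A minimal sketch of the A/B mechanic itself – not any campaign's actual tooling, and with invented click-through rates – shows how variant performance is measured:

```python
import random

random.seed(42)

# Hypothetical A/B test: two ad variants, measured by click-through.
variants = {"A": {"shown": 0, "clicked": 0},
            "B": {"shown": 0, "clicked": 0}}

# Invented "true" click probabilities, unknown to the campaign.
true_rates = {"A": 0.03, "B": 0.05}

for _ in range(10_000):
    v = random.choice("AB")            # each visitor is randomly assigned
    variants[v]["shown"] += 1
    if random.random() < true_rates[v]:
        variants[v]["clicked"] += 1

for name, counts in variants.items():
    rate = counts["clicked"] / counts["shown"]
    print(f"variant {name}: {rate:.3f} click-through")
```

The campaign then keeps the winning variant and repeats the cycle – which is exactly what makes tens of thousands of daily variants a systematic search for whatever moves users most, with no user aware they are in an experiment.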
This algorithmic reasoning also extends to how campaigns think about engaging with people, trending towards behavioural science. In these instances, groups believe they can learn to manage and control people’s behaviour.
While these strategies can change users’ behaviour, they represent a significant hidden influence over our ways of thinking – one that should be transparent. Shadowy, non-consensual tests and behavioural nudges are not the only way to understand how behaviour change happens.