“All through my life, I have been tested.
My will has been tested, my courage has been tested, my strength has been tested.
Now my patience and endurance are being tested.”
— Muhammad Ali1
1 Introduction
More than twenty-five years ago, in a formative paper for the sociology of testing, Trevor Pinch observed: “we live in the age of the test” (Pinch, 1993, p. 27). If the claim was valid then, it is even more true today. As I write this essay, in the midst of the COVID-19 crisis, it is almost impossible to read a newspaper that does not have at least one headline about testing. Many of these are about diagnostic tests for the coronavirus and the results or shortages thereof. Prime ministers, health ministers, and ministers of the economy all proclaim the importance of testing. “Testing is our way out,” write economists Paul Romer and Rajiv Shah.2 While some call for increasing the number of tests by 10 or 25 times,3 others advocate increasing testing by 100 times — in the US context to millions or even tens of millions of tests per day.4 Many other headlines are about a second type of test: how a given person or institution is being tested in the crisis. As of this writing, if you were to type “put to the test” and the name of a prominent politician or important institution (be it parliament, or universities, or even abstractions like science, expertise, or religion), Google would likely refer you to hundreds if not thousands of items.5
It should not be surprising that testing and being tested are taking place in such abundance in this pandemic time of crisis. Because this is a medical crisis, testing is vital for health care at the individual level and for public health at the social level. And it is entirely understandable that individuals, organizations, and institutions will be tested in a moment of political and social crisis. Taking the conjuncture of these two major types of tests as its background, this essay analyzes the similarities and differences between the different kinds of tests in order to understand their entanglements in the crisis. In the process, we shall see a great diversity of tests operating in multiple registers, themselves not clearly demarcated, often combining and sometimes conflating, for example, scientific and public discourse.
To do so, I will open by identifying three aspects of testing, drawn from the sociology of testing. First, tests are frequently proxies (or projections) that stand for something. Second, a test is a critical moment that stands out — whether because it is a moment deliberately separated out or because it is a puzzling or troublesome “situation” that disrupts the flow of social life. Third, when someone or something is put to the test, of interest is whether it stands up to the challenge.
These insights from Part 2 serve as the building blocks for addressing three major issues — representation, selection, and accountability — regarding testing in the time of the coronavirus crisis. Although the three aspects of testing are analytically distinguishable, they do not map exactly to the three issues. Thus, Part 3 explores the role of tests, models, and other projections in representing the coronavirus. In examining what is selected as standing out by the test results, Part 4 distinguishes individual, statistical, and algorithmic modes of selection in the pandemic. Part 5 analyzes similarities and differences between stress tests and tests under stress to study the difficulties of public accountability.
This essay will not survey developments in the sociology of testing from Pinch’s 1993 paper6 to the present.7 Instead, it offers my immediate impressions of problems, processes, and practices related to testing and being tested that are taking place while events are still unfolding. That immediacy presents drawbacks but also opportunities. The danger, of course, is that these observations and reflections will soon appear dated. Will my insights pass the test of time? But here we are already in the realm of the sociology of testing — by which test and according to what time? There are many such tests and times.
2 Standing For, Standing Out, and Standing Up
Tests come in many forms, shapes, and sizes.8 They can take the form, for example, of a trial, a tryout, or a try-on (Formilan & Stark, 2020). There are blood tests and breath tests; tests of identity and tests of paternity; there are audits, but also auditions; wargames to prepare for battle, SATs to see if you have aptitude, and Rorschach tests to assess if you are paranoid. There are Turing tests to see if you are a computer and Captcha tests to see if you are a human. Tests of faith are at least as old as Abraham and Job. My point of departure for analyzing testing and being tested in the time of the pandemic is three basic insights from the sociology of testing.
1) A test is frequently a proxy that stands for something.
A test is a critical moment, meaningful because it claims to stand for something (Marres & Stark, 2020, p. 437). The Scholastic Aptitude Test (SAT) purports to be a meaningful guide predicting a given student’s performance in college or university. In place of admitting students and waiting to see how they actually perform in their first college courses (which would itself, as a probationary period, be a kind of trial, albeit quite expensive), a college requires students to take a multiple-choice test lasting a couple of hours. These claims are contestable and, in fact, are contested (Cooper, 2018; and for an excellent general discussion, Carson, 2007). Similarly, does a driving test predict safe practices behind the wheel? Can a multiple-choice exam and a 20- to 30-minute road test represent or stand for good driving on actual road conditions? As soon as one starts to think about how to test the test, one is in the territory of the sociology of testing. Sure, take a random sample (a part that stands for the whole) of people who apply for a driver’s license. Test them all and then give all of them a license to drive regardless of whether they passed or failed the driving test. Wait three years and see whether passing/failing stood for accident records. No way, say insurance companies, parents, and the proverbial little old ladies at crosswalks.
Sociologists of testing refer to this representational aspect when they point to a test as a proxy. In the canonical paper on the subject, Trevor Pinch (1993, pp. 28–29) writes:
Testing always proceeds by a process of projection. If a scale model of a Boeing 747 airfoil performs satisfactorily in a wind tunnel, we can project that the wing will perform satisfactorily in actual flight. If a microphone works now, we can project that it will work later when being held by Mick Jagger.
Pinch emphasizes that this act of projection crucially depends on establishing a similarity relationship. In a careful and exceptionally clever study of what it means for a test to be representative of the real world, John Downer (2007) offers a detailed, if sometimes gruesome, account of how the US Federal Aviation Administration (FAA) tested new turbojet engine designs by launching chickens into engine prototypes securely mounted on outdoor stands. In studying the difficulties of achieving the similarity relationship, Downer documents how the FAA tried all sorts of variations of “birdstrikes” — unplucked, frozen, or rubber chickens; varying by size and weight; volleys launching smaller numbers of larger birds vs larger numbers of smaller birds; speed and angle of launch, and so on. Downer is especially attentive to how the FAA attempted to balance realism and reproducibility. Test designs that more closely resembled real-world conditions gave way to more controllable ones, the latter gaining importance as the object of testing came to be defined more in terms of comparability. Although it remained important that the tests be “realistic,” what was being tested was the performance of new engines versus old ones:
When evaluating a new design, rather than approaching it ad novum, and asking, “How reliable will this engine be?”, they ask, “How is this engine different from its predecessors?” Then they will ask, “Will these differences make it more or less reliable than its predecessors?” (Downer, 2007, p. 20).
To the extent that they are predictive, all tests, even so-called real world tests, are proxies, a part that claims to stand for something. The question is not how to escape the problem of representation, but how to deal with it. Broadly speaking, there are two strategies, which in the past more or less mapped onto the distinction between field test and laboratory (Gieryn, 2006) and today increasingly map onto the distinction between field test and model. What field tests gain in authenticity they lose in accuracy. On the other hand, the controlled conditions of the laboratory (or of the model) allow for reproducibility, but their abstract character makes it more difficult for them to be convincing and appear relevant to the problem at hand (Downer, 2007; Gieryn, 2006). Most schematically, then: whereas field tests make claims to be authentic, convincing, and relevant, laboratory-like models claim to be accurate, controlled, and reproducible.
2) A test is a critical moment that stands out.
As a concerted effort to reveal the properties or capacities of some entity (Marres & Stark, 2020), a test typically takes place in a deliberate and separate setting. In some types, testing occurs in a site that is physically separated from everyday life. A courtroom trial — with its elaborate rules about who can be admitted to the courtroom, who can sit where, and who can speak when — is one such example (Dodier & Barbot, 2016, pp. 308–313). Other types of tests can be distributed, even widely, in spatial terms but are conducted in relatively discrete moments in time. Tests that involve sampling can adopt physical or temporal slices or combinations of both. An audit is conducted at a particular moment in time; but what matters is that, for some prior time period, it selects information and data points for extra scrutiny (Power, 1997).
Tests stand out; they can be pointed to because they are not just ordinary moments. They have a special status. And yet, for many tests (although not for all), the materials selected for examination will have a very distinctive special status precisely because they must be absolutely ordinary. Once selected, they stand out to be assayed, analyzed, and assessed. But they must not be selected because they stand out.9 This is not the case for all types of tests, however. Tests of power, tests of principles, and tests of courage or character can happen in critical moments that are special precisely because they are experienced as extraordinary. Every test is situated. But certain types of critical moments are ones in which someone might think or even say, “Oh, we have a situation here.”10 You have surely been in such moments. They can involve a vague or perhaps acute sense of awkwardness that things are uncomfortably out of joint because it is unclear by which principle the situation is to be assessed, or they can be favorably out of joint exactly for the same reason (Stark, 2011).
3) When someone or something is put to the test, of interest is whether it stands up.
In a press conference on March 11, 2020 (her first public address on the coronavirus outbreak), German Chancellor Angela Merkel warned that up to 70% of the population could contract the virus. In so doing, Merkel was the first leader of a major democratic country to provide shocking estimates of the magnitude of the problem. With Lothar Wieler, president of the Robert Koch Institute for public health, at her side, and taking pains to emphasize that the information she was sharing came from experts, Merkel soberly told the German people:
This is putting our solidarity, our common sense and our openheartedness for one another to the test. I hope that we will pass it.11
One week later, on March 18, Merkel spoke to the nation in an unprecedented evening address. This time, as if to ensure that no one should miss it, the message of testing was evident in her opening words:
The coronavirus is changing daily life in our country dramatically at the present. Our idea of normality, of public life, social togetherness — all of this is being put to the test as never before… I firmly believe that we will pass this test if all citizens genuinely see this as THEIR task.12
The specific terms varied slightly — solidarity, sense of common purpose, care for one another, normality, public life, social togetherness — but in each instance they were being put to the test. In each case, the moment stands out in history: “Since German reunification, no, since the Second World War there has not been a challenge… so important.” Also, in each case, the challenge stands out as a test. And in each, as well, the challenge is expressed with a hope, a belief, that “we will pass” the test.
As elaborated in more detail below, a stress test is a special type of diagnostic test. In engineering, for example, prior to actual construction, the struts of a bridge might be subject to extreme conditions to see whether they stand up to temperature, humidity, torques, and stresses. Banks are stress tested.13 And if you have a heart condition, you’ve likely been closely monitored on a treadmill in a cardiac stress test. As I will also explore in further detail below, there is another type of test, like the ones which Angela Merkel invoked, in which a person or institution, indeed a nation, does not simply take a test but is put to the test. Such tests do not stand for anything. These are tests that stand out as the real thing.
Does something stand up to the test? Does someone step up to the challenge and stand up to the test? Have you been tested?14
4 Who or What Stands Out in the Testing?: Three Models of Selection in Testing
If, as I argued in Part 3, tests can stand for something as part of a process of representation, here in Part 4 I look at who or what stands out in the testing. Think, for example, of an exam. When results are binary, some pass, others fail. When results are ranked, some stand out as substandard, some are judged mediocre, others as qualified, and a select few are outstanding. Now, instead of an academic exam, let’s think of an examination, a medical test for the coronavirus. Anticipating my argument, whereas in the previous section I pointed to how a test figures in structures and practices of representation, here I consider how a medical device can serve as a device for selection.
As an organizing framework, I propose three models, roughly corresponding to three phases27 of testing in the current crisis: 1) Individual, 2) Statistical, and 3) Algorithmic. Although in all three phases the unit of observation is at the level of the individual, as we shall see, the unit of analysis, as well as the method of analysis, varies.
Individual. In the first model of selection, the purpose of testing is to identify who is infectious. In the very early phase of the infection, when the virus is not being spread through rampant community transmission, the point is to locate and isolate those persons who carry the virus so that others with whom they have been in recent contact can be notified and, if necessary, also isolated. As in all models, the unit of observation is the individual, but here, the individual is also the unit of analysis, singularly identified. Test results are positive or negative. What stands out as a result of testing are named individuals.
Statistical. The purpose of testing in the statistical model of selection is to understand how the virus is distributed. In this phase when the infection is pandemic, test results are positive or negative for individuals, but the results of testing are analyzed at the population level. Data gathered from visits to doctors and hospitals (and later from drive-through or other emergency testing facilities) are analyzed by epidemiologists to identify properties of the virus — how contagious or fatal, for example (Ganyani et al., 2020; Li et al., 2020; Verity et al., 2020). As of this writing, public health officials are launching nation-wide targeted and random sample surveys, typically involving tens of thousands of people followed up for as long as a year.28 Using tests to detect the presence of the virus29 (and hence infectiousness) combined with serological tests to detect the presence of antibodies (i.e., having been infected and hence the possibility of immunity), the purpose of these surveys is to understand how widespread the viral infection is and how strong and long-lasting immunity might be (as well as the distribution of infection and immunity according to various demographic30 and medical factors). Meanwhile, data collected by states and provinces,31 by university studies, and even by firms32 function as tests of tests to get a handle on the rate of false positives and false negatives, for example, among the many different serological tests currently in experimental use.
Like testing in the individual model of selection, here too in the statistical model, the unit of observation is the individual. But in this model, by contrast, surveys and other programs aggregate tests taken from individuals in order to conduct analysis with statistical methods at the population level. At its basis is an actuarial approach to risk calculation33 meant to guide decision-makers in the public management of the pandemic, for example, through periodic enforcement of lockdowns followed by relaxation of social distancing in waves or cycles34 over an extended period.
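To make the “tests of tests” mentioned above concrete, consider how false positive and false negative rates constrain what a serological survey can conclude. The short sketch below computes the positive predictive value of a hypothetical antibody test; the sensitivity, specificity, and prevalence figures are illustrative assumptions, not values drawn from any of the studies cited here.

```python
# Illustrative sketch: how false positives shape population-level estimates.
# Sensitivity, specificity, and prevalence values below are hypothetical.

def positive_predictive_value(sensitivity, specificity, prevalence):
    """Probability that a positive serological result reflects true past infection."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

# Hypothetical antibody test: 95% sensitivity, 97% specificity.
# At low prevalence, a large share of positives are false.
for prevalence in (0.02, 0.10, 0.30):
    ppv = positive_predictive_value(0.95, 0.97, prevalence)
    print(f"prevalence {prevalence:.0%}: PPV = {ppv:.1%}")
```

At low prevalence, even a seemingly accurate test yields many false positives, which is one reason the error rates of the tests themselves become objects of testing in the statistical model.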
Algorithmic. Readers familiar with Michel Foucault’s (1995; 2007) work will recognize that my differentiation of testing as Individual (in phase 1) and Statistical (in phase 2) loosely corresponds to Foucault’s contrast between the models of the plague (individualizing discipline and exclusion) and that of smallpox (operating at the level of “population”). The ideas are similar but not the same, in part because I am focusing on testing, but also because I am interested in making a contrast to the possibility of a new model, one that Foucault could not have studied.35 As we shall see, testing in the phase that we are now entering in the COVID-19 pandemic bears a superficial resemblance to the earlier phases but, I will argue, represents a new model of testing — algorithmic testing.
Whereas what stands out from the test in the statistical model identifies characteristics of the virus/pandemic at the population level, the algorithmic model is once again about identifying individuals — yet without simply returning to the methods and purposes of the individual model characteristic of phase 1. First, in addition to selecting those who stand out as infected, testing in this model also promises to select those who stand out as immune. Second, the methods of selection are different — turning from statistical methods to algorithmic predictions. And, third, the process of identification might or might not result in naming the individuals who are classified, whether as infected or as immune.
We begin with Digital Contact Tracing (DCT). Traditional public health methods have long used testing to identify carriers of dangerous communicable diseases, conventionally employing agents who questioned the infected individual and manually notified those with whom they had been in contact during some prior period. DCT operates differently. Various programs are currently under development.36 What they have in common is the use of an app for a smartphone (or bracelet) with Bluetooth technology that detects encounters, timestamps them, and stores them for three weeks or so. Note that the system architecture does not require GPS location data to be collected. Instead, devices that come near each other would share pseudonymized IDs (Lomas, 2020a; Pueyo, 2020). A person who tests positive for the coronavirus must report this to the system, which in turn automatically sends a message to those whose pseudonymized IDs are registered as having encountered the infected person.37 Without naming the infected person, the message instructs the recipient to get in touch with health authorities because they have been in contact with someone who has tested positive for the virus.
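A minimal sketch can make the protocol just described more tangible. The code below is illustrative only and does not reproduce the design of any particular program under development; the rotating pseudonymous IDs, the roughly three-week retention window, and the anonymous notification step follow the description above, while the class and function names are my own.

```python
# Minimal, illustrative sketch of privacy-preserving digital contact tracing.
# Assumptions: rotating random pseudonymous IDs, ~3-week local retention,
# and notification without revealing the infected person's identity.
import secrets
import time

RETENTION_SECONDS = 21 * 24 * 3600  # keep encounters for roughly three weeks

class Device:
    def __init__(self):
        self.my_ids = []       # pseudonymous IDs this device has broadcast
        self.encounters = []   # (observed_id, timestamp) pairs seen via Bluetooth

    def rotate_id(self):
        """Broadcast a fresh random ID so encounters cannot be linked to a name."""
        new_id = secrets.token_hex(16)
        self.my_ids.append(new_id)
        return new_id

    def record_encounter(self, observed_id):
        self.encounters.append((observed_id, time.time()))

    def prune(self):
        cutoff = time.time() - RETENTION_SECONDS
        self.encounters = [(i, t) for (i, t) in self.encounters if t >= cutoff]

    def check_exposure(self, published_positive_ids):
        """Compare locally stored encounters against IDs reported as positive."""
        self.prune()
        return any(i in published_positive_ids for (i, _) in self.encounters)

# Usage: a person who tests positive uploads only their own pseudonymous IDs;
# other devices check locally and, on a match, are told to contact health authorities.
alice, bob = Device(), Device()
bob.record_encounter(alice.rotate_id())
if bob.check_exposure(published_positive_ids=set(alice.my_ids)):
    print("You have been in contact with someone who tested positive; contact health authorities.")
```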
Before turning to selection on the basis of immunity, a word about how DCT differs from the more conventional statistical approach of epidemiology in phase 2. If DCT is introduced (and that is a very big if), it would be one of the most significant uses of digital technology since the network architecture that gave birth to the internet. One might argue that the distinctive figure of digital technology is that of the network. DCT certainly builds on network concepts and network architecture, but it explicitly highlights a different figure — the trace. By contrast, the basic figure of Phase 2 epidemiology — consistent with its roots in bureaucratic accounting and control — is the form, the standardized protocol that must be filled out and filed.38 With its bureaucratic boxes, the form is very different from the trace. Moreover, a trace is not a trail. One can follow a trail that was paved by others. A trace can follow the one who uniquely produced it. In its privacy-by-design mode, the network structure39 of digital contact tracing epitomizes the ethos of a laterally coordinated, technological mode of governance in which actors are individuated yet anonymous.
The results of testing for immunity present something altogether different — different not only technically and conceptually but also in their consequences. As we shall see, such testing selects persons into a system of classification (Fourcade & Healy, 2013; Fourcade, 2019). In serological tests for the presence of antibodies, a positive test result can be interpreted to stand for immunity.40 That claim is contested, not least by the World Health Organization. Nonetheless, various proposals are being seriously considered (and in the Chilean case, recently adopted) in which such tests form the basis for “immunity passports” or “immunity certificates” (Wighton & Chazan, 2020; Proctor, Sample, & Oltermann, 2020; Gruener, 2020). Verified by health authorities and carried on one’s smartphone — perhaps as a QR code that could be easily scanned by an interested party — these digital credentials would allow individuals to share their COVID-19 status. The results of testing, in this case, provide for the selection, the naming, the certification of individuals with a distinctive health status that could (depending on the protocols) allow them employment, access to restaurants and places of entertainment, freedom of movement to meet the vulnerable or for intimate relations, and so on. The results of testing for immunity thus select persons into a system of classification.
To date, no European or North American country has adopted an immunity passport system. But a tripartite classification system purporting to signal one’s riskiness to others has been in operation in China since mid-February 2020, when the “Alipay Health Code” was launched in Hangzhou, home to many of China’s biggest tech companies. The system assigns users one of three colored QR codes, indicating different levels of risk the person poses to public health.41 Persons with a green code are free to move around the city; yellow code holders are scrutinized and have restricted movement; and those with a red code are detained. The QR code is required for use of the subway, entering or exiting certain highways, and can even be used for entrance to or exit from the gated communities in which many Chinese live.
The system was developed for the city of Hangzhou by Ant Financial, an affiliate of e-commerce giant Alibaba, and can be accessed through the Alipay payment app. Initially adopted in three provinces with a total population of nearly 180 million (Hu, 2020), by the beginning of March it had been adopted in more than 200 cities across the country (Mozur, Zhong, & Krolik, 2020). Comparable programs now reside on TenCent’s ubiquitous app WeChat. Complaints in the Chinese press indicate that problems in the system range from fraud on the side of users to mistaken assignment of yellow or red codes on the side of the system.42 Given the large numbers involved, a 1% misassignment rate involves millions of people, and even smaller error rates are not inconsequential to tens or even hundreds of thousands.
According to Alipay and Chinese officials, risk levels for the Health Code are determined by test results and the user’s self-reported symptoms but also by big-data analytics of factors such as travel history, relationship to potential carriers of the virus, and time spent in an outbreak-stricken area (Hu, 2020). Details about how the system classifies people remain obscure (Mozur, Zhong, & Krolik, 2020). A major question today is how the Health Code will relate to the Chinese Social Credit System (CSCS), currently being rolled out in many variants throughout the country and involving the same online payment systems that are producing the QR health codes for COVID-19 monitoring.43 The promise of the CSCS is that it purports to measure not only one’s creditworthiness but many vectors of activity in many different domains in which one could be credited as worthy. But a system for measuring correlates of worthiness could also be one for measuring riskiness.44
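Because, as just noted, the actual classification logic is not public, any reconstruction is necessarily hypothetical. The sketch below merely illustrates the general form a tripartite assignment could take, combining a test result, self-reported symptoms, and coarse behavioral flags into a green, yellow, or red code; the inputs and thresholds are invented and do not describe how the Alipay Health Code actually works.

```python
# Hypothetical illustration of a tripartite (green/yellow/red) health code.
# The inputs and rules are invented; the real system's logic is not public.
from dataclasses import dataclass

@dataclass
class Profile:
    tested_positive: bool
    reports_symptoms: bool
    visited_outbreak_area: bool   # e.g., a travel-history flag
    contact_with_carrier: bool    # e.g., from contact-detection data

def assign_code(p: Profile) -> str:
    if p.tested_positive:
        return "red"              # isolation / detention
    if p.reports_symptoms or p.contact_with_carrier:
        return "yellow"           # restricted movement, further scrutiny
    if p.visited_outbreak_area:
        return "yellow"
    return "green"                # free to move around the city

print(assign_code(Profile(False, False, True, False)))   # -> yellow
print(assign_code(Profile(False, False, False, False)))  # -> green
```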
A system that combined the Health Code, the Social Credit System, and “close contact detection” (as digital contact tracing is known in China) could not only categorically classify the infectious and the immune but also produce a score of the variable level of riskiness45 which could be used to monitor movement, employment, and such. Instead of a test that stands out as a discrete moment in time and in a deliberate and separate setting, this is testing which is ubiquitous and continuous (Marres & Stark, 2020, pp. 433–437). More importantly, it would be based on a different regime of testing — algorithmic prediction.46
From using test results that measure whether you are infected (and hence a risk to others) or using results from DCT that measure whether you have been in contact with someone who is infectious, such ongoing testing would attempt to measure whether you engage in risky behavior as well as whether you associate with others who engage in risky behavior (and hence are at greater risk of behavior that could bring harm to others if you are infected but asymptomatic). But algorithmic prediction need not (and already in some cases it does not) stop there. Do you have personality traits that are predictive of risky behavior? And then a further step: Do you exhibit behavior that correlates with personality traits that are predictive of risky behavior?
In the ongoing, seamless testing of algorithmic prediction, the test subject is in a state combining ignorance and awareness of being tested that Bucher (2017) describes as the “algorithmic imaginary.” Such testing is already under development in producing risk profiles in the insurance industry.47 Particularly instructive is the case of Admiral, the UK insurance company. News got out in November 2016 that Admiral was using data from social media accounts to price driver’s insurance for first-time car owners. Admiral’s algorithm uses data from social media to make a personality assessment and then — on the basis of correlations between behavior on social media and actual claims data — analyzes the risk of insuring the driver. The story involved the company’s “firstcarquote” project that analyzed users’ habits in posts and likes on Facebook (Ruddick, 2016a). Those who write in concise sentences, use lists, and arrange to meet friends at a specified time and place rather than “tonight” were thought to be conscientious, safer drivers. More reckless, overconfident drivers, by contrast, show up in the use of phrases such as “never” or “always” rather than “maybe” and in the excessive use of exclamation marks [!!!] in their social media posts (Ruddick, 2016a; Wu, 2020).
As the story broke, Facebook pulled the plug on the project (Ruddick, 2016b). But Admiral continues to work with VisualDNA, a firm that uses algorithmic prediction to assess personality from online behavior (Fisher, 2017). We can’t know whether Alipay or WeChat are doing algorithmic prediction using big-data analytics of social media habits in making assessments of public health riskiness, but they have no need for cooperation with Facebook or other providers since those data (and much more) are already in their systems.
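Press accounts of the “firstcarquote” project describe the kinds of signals involved but not the model itself. Purely as an illustration of how such text features could be turned into a score, the toy function below counts exclamation marks, absolutist words like “never” and “always”, and vague arrangements like “tonight” against concrete meeting times. The features loosely follow the reporting cited above, but the weights, the function, and its name are invented and do not reflect Admiral’s or VisualDNA’s algorithms.

```python
# Toy illustration of scoring "riskiness" from text features reported in press
# coverage of the firstcarquote project. Feature weights are invented.
import re

def toy_risk_score(posts):
    """Return a higher number for posts with 'riskier' features under invented weights."""
    text = " ".join(posts).lower()
    exclamations = text.count("!")
    absolutist = len(re.findall(r"\b(never|always)\b", text))
    vague_plans = len(re.findall(r"\btonight\b", text))
    concrete_times = len(re.findall(r"\b\d{1,2}(:\d{2})?\s?(am|pm)\b", text))
    return 1.0 * exclamations + 2.0 * absolutist + 1.5 * vague_plans - 2.0 * concrete_times

print(toy_risk_score(["Never missing this!!!", "See you tonight"]))   # higher score
print(toy_risk_score(["Meet at the cafe at 7 pm on Friday."]))        # lower score
```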
Will individualized health security “tests” and their resulting certificates based on machine learning be adopted in western democracies? Perhaps serological tests will prove unreliable, thereby increasing the role of machine learning techniques to predict riskiness based on other behavioral factors. It is too early to assess the rollouts of digital tracing apps and individualized health security programs. Perhaps the various apps currently under development by Apple and Google, as well as by Alibaba and TenCent, will come to be widely adopted in the current pandemic. Or perhaps they will be tested now for application in pandemics yet to come. In either case, we will see a move to a new model of testing: from statistical calculation of risk in a population to algorithmic prediction about the riskiness of particular persons.
5 Tests and Accountability: Who or What Stands Up to the Test?
In October 2016, government and public health officials from the national, regional, and local levels gathered for a three-day simulation conducted for the UK’s National Health Service (NHS). The stress test, code-named Exercise Cygnus, was designed to test the preparedness of the hospital and health care system in response to a pandemic brought about by a virus that spreads rapidly and kills by causing acute respiratory illness.48 The exercise was mentioned in an official report (Swindells, 2017) presented at the Board meeting of NHS England in March 2017 as well as at a meeting in June 2017 of the New and Emerging Respiratory Virus Threats Advisory Group (Nervtag), which advises the government on pandemics (Doward, 2020a). The findings of the stress test were effectively buried: little to nothing was known about them until, in the midst of the coronavirus pandemic, journalists broke the story (Nuki, 2020; Doward, 2020a; Gardner & Nuki, 2020) that the stress test had exposed that British hospitals and medical care delivery would be quickly overwhelmed by a severe viral outbreak because of shortages of personal protective equipment, critical care beds, and morgue capacity.
The Exercise Cygnus stress test revealed that the UK health system was unprepared. Viewed from the sociology of testing, however, the question is less the test results than what resulted from the test: in addressing the consequences of the test, in addition to the questions that it settled, we must be especially attentive to the new controversies that it generates (Robinson, 2016; Marres & Stark, 2020). Whereas in Part 3 we examined the aspect of testing as standing for (testing and representation) and in Part 4 we analyzed what stands out from the test (testing and selection), here in Part 5 we study who or what stands up to the test in order to address the problem of testing and accountability.
Stress tests are a good way to begin the discussion about accountability. At first appearance it might seem that a stress test — as a simulation — is quite different from being put to the test in a moment of crisis. As a proxy, the stress test stands for something but should not be confused with the real thing, at least not in the way that being “tested” under stress in real time (“trial by fire,” as the phrase goes) is very much the real thing. As we shall see, the comparison gets more interesting when — instead of starting from the distinction of proxy versus “real thing” — we look at the distinction between test results and what results from the test.
The insight starts with the quite obvious observation that an organization might fail the stress test but subsequently pass the “ultimate test” of coping with actual calamity. That could be so, for example, because the organization learned from the inadequacies that were revealed by the stress test. Alternatively, the organization could pass the stress test and fail in a real crisis, perhaps because the rehearsal, the drill, the stress test did not simulate or provoke conditions that were stressful enough to expose weaknesses. In the latter case, the results of the test are complacency; in the former, an opportunity for reflexivity and a call to action.
It might seem that being tested in the real world is different. To fail the proxy (the simulated stand-in) gives leadership the chance to learn; failing the real-world test gives the public the chance to learn as it makes its judgment. That would be perfect,49 but accountability is not that simple. The difference in who learns is real, but it does not rule out the problem that there is no automatic and direct transmission between test results and the results of the test. Just as for the stress test, so for accountability in being put to the test, what matters is the consequences of the test.
Let’s return to the NHS stress test. In this case the problem begins with the fact that the test results themselves remain secret. Why did Theresa May’s Tory government choose not to release the findings of Exercise Cygnus? A former senior government official with direct involvement in the exercise was quoted by The Sunday Telegraph (a newspaper generally known for its favorable view of Tory policies): “There has been a reluctance to put Cygnus out in the public domain because frankly it would terrify people” (Nuki, 2020). To be fair to the government, one could understand that “failing” a stress test might be difficult to explain to the public. In fact, one could argue that a stress test that did not result in a breakdown of the health system was inadequate because it did not include sufficiently extreme conditions in its simulations. That is, a stress test could fail as a stress test because it did not induce enough failures.
The problem of an organization passing a failed test is not unique to health care systems. It can be seen, for example, in the Challenger space shuttle disaster (Vaughan, 1996), and it also shows up acutely in the use of stress tests for regulation, as in the case of banking. Financial stress tests of banks, as Nathan Coombs (2020) argues, serve two purposes. One is to identify weaknesses in the structure of the bank’s assets and liabilities with the aim of making corrections that protect depositors and creditors as well as the stability of the financial system. The other is to assure the public that a particular bank and/or the banking system is sound, thereby mitigating the risk of a run on the bank at a time when the bank, or the banking system in general, is under stress.50 As Coombs (2020) demonstrates, these learning versus legitimating functions need not be in contradiction, but they do require real skill in both educating the public and managing an effective performance.
The report to the NHS England Board in March 2017 indicates that the government gave priority to the legitimation function. It refers to the Cygnus stress test without mentioning any findings, noting any changes, or making any specific recommendations; and it concludes the “Summary and recommendations to the Board” by inviting the Board to “Receive assurance that NHS England and the NHS in England are prepared to respond to an emergency and have resilience in relation to the continued provision of safe patient care” (Swindells, 2017, p. 6). Why does Boris Johnson’s government not disclose the test results today? Whether the NHS passed or failed the Cygnus stress test in 2016 is, of course, no longer the question. The issue is whether the Johnson government corrected shortcomings revealed by the test as part of its pandemic emergency preparedness planning and, once alerted to the coronavirus, how its policies took those findings into account. Government officials refuse to discuss the Cygnus test, but calls for a public inquiry and a pending lawsuit by an NHS doctor on behalf of thousands of NHS medical workers51 might keep these issues before the public.
As I mentioned in the opening paragraphs of this essay, in the past months we have heard much about communities, organizations, leaders, and indeed entire nations facing a “stress test.” At one level, the use of the term is careless: a stress test is a very particular kind of test. As a simulation, it is a kind of prediction that, because deliberately designed to identify weakness, is intended as a means to avert the predicted.52 Sometimes, of course, the term is used appropriately: how a leader responds to a small crisis, for example, could be seen as a proxy (a prediction via projection) about how she will behave in a major crisis sometime in the future. But the slippage of usage is understandable because often one senses that the term is not meant to be taken so literally and, in fact, has a quite different referent — meaning that someone or something has been tested under stress. It is this kind of test that one thinks of when one uses or hears expressions such as “put to the test” or “truly tested.” Whereas failing a stress test is provisional (because what really matters is whether you learn from it), failure or passing in the case of an “ultimate” test is definitive, end of story. Not at all a try-out, a try-on, or a trial run, it seems more like a trial yielding a judgment.
But is accountability really like a trial? We might like it to be so, but, as a matter of public judgment, being put to the test turns out, despite all the differences, to share some similarities to the stress test. One can pass or fail, but still what matters are the consequences.
In the case of the COVID-19 pandemic, who or what is being put to the test? And by what measures? A graph, featured in a recent issue of the South China Morning Post, Hong Kong’s newspaper of record, owned by the Alibaba Group, suggests one test. The interactive graph invites the online reader to make direct comparisons between countries. The default comparison, presented here in Figure 1, shows cumulative coronavirus cases for Hong Kong, China, and the United States, and prominently displays the disparity — even without per capita figures, in which the disparity would appear even greater. The graph scarcely needs a commentary: whether by number of cases or number of deaths, absolute or per capita, the United States has failed in a head-to-head test with China.
But perhaps you doubt the figures from China. If so, The Financial Times will allow you to make country-by-country comparisons for almost any countries in the world. You might start by comparing the United States to an East Asian democracy: the USA has almost 2000 times more new deaths than South Korea (49 times more total deaths on a per capita basis). Closer to home, for striking differences of similar magnitude, compare COVID-19 death rates in New York State to Washington State or New York City to San Francisco.
But why should counts of cases or deaths be the measure for accountability? One might fail the test of number of deaths, but still not be held lacking if the public looks to other factors such as effectiveness in mobilizing resources once the crisis was in evidence or expressions of compassion to help heal a wounded city or state. If you are not already thinking of Governor Cuomo, recall that he suggests that the measure is not the number of deaths but the number of lives saved — reminding us that models (themselves kinds of stress tests) are part and parcel of the reality being tested and can figure not merely as counts but also in accountings.53
Others will claim that the real test is getting the economy moving again, pointing not to counts of deaths but to counts of jobs and GDP. To this, economists (frequently writing together with epidemiologists) respond that public health and national wealth are not in a trade-off (Barbera, Dowdy, & Papageorge, 2020), prompting all manner of debates involving cost-benefit analysis and how to measure the value of a human life (Greenstone & Nigam, 2020). Still others argue that nations and leaders can pass the test of reducing the loss of life as well as the test of minimizing the depth of economic recession, but they fail the true test if this comes at the cost of liberties. But then the question is: which concerns over which liberties? Is it liberals’ concerns about surveillance, privacy, and the Fourth Amendment; or the concerns about freedom of religion, freedom of assembly, and freedom to control one’s body motivating the recent demonstrations of megachurch ministers, gun-toting defenders of the Second Amendment, and anti-vaccine activists?
Thus, going from counts to accountability is not so simple. Intervening between them is the big question: what counts? (Stark, 2020). When there are many values and principles in play, the person or institution being “tested” will be judged as passing by some test but could be held as failing by others. The problem of a crisis is that it is an extreme “situation” that stands out and calls out for a test. But unlike personal situations, in which the structure of the setup makes it more likely that some principle of evaluation will be in effect to frame the accountings and yield a judgment,54 the court of public opinion is not at all a trial.
The recent experience in the United States (recall that the impeachment proceedings against Donald Trump were literally a “trial”) gives one pause about whether trials and formal inquiries are the best venue for holding public officials accountable. For Trump there is only one number that counts: a majority vote — in the Electoral College. Here the comparison to Angela Merkel is telling. Whereas the only test that Donald Trump passed was that of a failed test (the impeachment trial), he might yet be re-elected President of the United States. More effective in managing the crisis, Angela Merkel could not manage her own conservative party and likely will not remain in office as Chancellor of the German Federal Republic.
Is accountability unattainable? If our leaders and also our organizations sometimes fail the test and yet remain unaccountable, shall we despair and thereby fail to try? The problem of stress tests and tests in times of stress is that all the counts, countings, and accountings make accountability difficult when there are so many different answers to the question “what counts?” But that is not a failure. It is not a weakness but is, in fact, the strength of our societies.55 The danger, especially in times of crisis, is to use a single yardstick, a unitary principle, a common and singular organizing value, whether that one logic be market, politics, religion, or whatever. What crisis puts to the test is the fundamental value of liberal democracies: that there should not be a single test of accountability.
References
Adams, J. (2020). What Are COVID-19 Models Modeling? The Society Pages. April 8, 2020. https://thesocietypages.org/specials/what-are-covid-19-models-modeling/
Ali, M., & Ali, H.Y. (2015). The Soul of a Butterfly: Reflections on Life’s Journey. New York: Random House.
Bach, J. (2020a). The Red and the Black: China’s Social Credit Experiment as a Total Test Environment. British Journal of Sociology, 71(3), 489–502. https://doi.org/10.1111/1468-4446.12748
Bach, J. (2020b). Merit, Morality, and Market: The Chinese Social Credit Experiment. In D. Stark (Ed.), The Performance Complex: Competition and Competitions in Social Life. Oxford, UK: Oxford University Press (in press).
Barbera, R.J., Dowdy, D.W., & Papageorge, N.W. (2020). Economists and Epidemiologists, Not at Odds, but in Agreement: We Need a Broad Based COVID-19 Testing Survey. Johns Hopkins University, Coronavirus Resource Center. https://coronavirus.jhu.edu/from-our-experts/economists-and-epidemiologists-not-at-odds-but-in-agreement-we-need-a-broad-based-covid-19-testing-survey
Boltanski, L., & Thévenot, L. (1999). The Sociology of Critical Capacity. European Journal of Social Theory, 2(3), 359–377. https://doi.org/10.1177/136843199002003010
Bucher, T. (2017). The Algorithmic Imaginary: Exploring the Ordinary Affects of Facebook Algorithms. Information, Communication & Society, 20(1), 30–44. https://doi.org/10.1080/1369118X.2016.1154086
Bui, Q., Katz, J., Parlapiano, A., & Sanger-Katz, M. (2020). What 5 Coronavirus Models Say the Next Month will Look Like. The New York Times, April 22. https://www.nytimes.com/interactive/2020/04/22/upshot/coronavirus-models.html
Callon, M. (1986a). The Sociology of an Actor-Network: The Case of the Electric Vehicle. In M. Callon, J. Law & A. Rip (Eds.), Mapping the Dynamics of Science and Technology. London: Palgrave Macmillan.
Callon, M. (1986b). Some Elements of a Sociology of Translation: Domestication of the Scallops and the Fishermen of St. Brieuc Bay. In J. Law (Ed.), Power, Action, and Belief: A New Sociology of Knowledge? London: Routledge.
Callon, M., & Latour, B. (1981). Unscrewing the Big Leviathan: How Actors Macro-Structure Reality and How Sociologists Help Them to Do So. In K. Knorr-Cetina & A.V. Cicourel (Eds.), Advances in Social Theory and Methodology: Toward an Integration of Micro- and Macro-Sociologies (pp. 277–303). London: Routledge & Kegan Paul.
Canca, C. (2020). Why “Mandatory Privacy-Preserving Digital Contact Tracing” is the Ethical Measure against Covid-19. Medium, April 10. https://medium.com/@cansucanca/why-mandatory-privacy-preserving-digital-contact-tracing-is-the-ethical-measure-against-covid-19-a0d143b7c3b6
Carson, J. (2007). The Measure of Merit: Talents, Intelligence, and Inequality in the French and American Republics, 1750–1940. Princeton, NJ: Princeton University Press.
Cevolini, A., & Esposito, E. (2020). From Pool to Profile: Social Consequences of Algorithmic Prediction in Insurance. Big Data & Society, forthcoming.
Chazan, G., & Mancini, D. (2020). Germany to Run Europe’s First Large-Scale Antibody Test Programme. Financial Times, April 9. https://www.ft.com/content/fe211ec7-0ed4-4d36-9d83-14b639efb3ad
Coombs, N. (2020). What do Stress Tests Test? Experimentation, Demonstration, and the Sociotechnical Performance of Regulatory Science. British Journal of Sociology, 71(3), 520–536. https://doi.org/10.1111/1468-4446.12739
Cookson, C. (2020). UK Coronavirus Study to Test 300,000 People for Infection and Immunity. Financial Times, April 23. https://www.ft.com/content/b71adf22-33d1-47e7-8dd8-c808cb710690
Cooper, P. (2018). What Predicts College Completion? High School GPA Beats SAT Score. Forbes, June 11. https://www.forbes.com/sites/prestoncooper2/2018/06/11/what-predicts-college-completion-high-school-gpa-beats-sat-score/#7f122d624b09
Dodier, N., & Barbot, J. (2016). The Force of Dispositifs. Annals HSS (English Edition), 71(2), 291–317. https://doi.org/10.3917/anna.712.0421
Downer, J. (2007). When the Chick Hits the Fan: Representativeness and Reproducibility in Technological Tests. Social Studies of Science, 37(1), 7–26. https://doi.org/10.1177/0306312706064235
Doward, J. (2020a). Government under Fire for Failing to Act on Pandemic Recommendations. The Guardian, April 19. https://www.theguardian.com/world/2020/apr/19/government-under-fire-failing-pandemic-recommendations
Doward, J. (2020b). If Ministers Fail to Reveal 2016 Flu Study They “Will Face Court.” The Guardian, April 26. https://www.theguardian.com/uk-news/2020/apr/26/doctor-sue-results-operation-cygnus
Esposito, E. (2013a). The Structures of Uncertainty: Performativity and Unpredictability in Economic Operations. Economy and Society, 42(1), 102–129. https://doi.org/10.1080/03085147.2012.687908
Esposito, E. (2013b). Economic Circularities and Second-Order Observation: The Reality of Ratings. Sociologica, 7(2), 1–10. https://doi.org/10.2383/74851
Esposito, E. (2019). The Future of Prediction. From Statistical Uncertainty to Algorithmic Forecast. Manuscript, Bielefeld University.
European Commission. eHealth Network. (2020). Mobile Applications to Support Contact Tracing the EU’s Fight Against COVID-19: Common EU Toolbox for Member States. https://ec.europa.eu/health/ehealth/key_documents_en
Ferguson, N.M., Laydon, D., Nedjati-Gilani, G., Imai, N., Ainslie, K., Baguelin, M., Bhatia, S., Boonyasiri, A., Cucunubá, Z., Cuomo-Dannenburg, G., & Dighe, A. (2020). Impact of Non-Pharmaceutical Interventions (NPIs) to Reduce COVID-19 Mortality and Healthcare Demand. Report 9. Imperial College COVID-19 Response Team, London, March 16. https://doi.org/10.25561/77482
Fisher, T. (2017). Social Media Intelligence and Profiling in the Insurance Industry. Medium. April 24. https://medium.com/privacy-international/social-media-intelligence-and-profiling-in-the-insurance-industry-4958fd11f86f
Formilan, G., & Stark, D. (2020). Underground Testing: Name-Altering Practices as Probes in Electronic Music. British Journal of Sociology, 71(3), 572–589. https://doi.org/10.1111/1468-4446.12726
Foucault, M. (1995). Discipline and Punish: The Birth of the Prison. New York: Vintage.
Foucault, M. (2007). Security, Territory, Population: Lectures at the Collège de France, 1977–1978. New York: Springer.
Fourcade, M. (2019). Ordinal Citizenship. British Journal of Sociology. Lecture at the London School of Economics, October 25.
Fourcade, M., & Healy, K. (2013). Classification Situations: Life-Chances in the Neoliberal Era. Accounting, Organizations and Society, 38(8), 559–572. https://doi.org/10.1016/j.aos.2013.11.002
Ganyani, T., Kremer, C., Chen, D., Torneri, A., Faes, C., Wallinga, J., & Hens, N. (2020). Estimating the Generation Interval for COVID-19 Based on Symptom Onset Data. medRxiv. March 8, 2020. https://doi.org/10.1101/2020.03.05.20031815
Gardner, B., & Nuki, P. (2020). England’s Deputy Chief Medical Officer “JVT” Warned Ministers about PPE Three Years Ago. The Telegraph, April 26. https://www.telegraph.co.uk/news/2020/04/26/englands-deputy-chief-medical-officer-warned-ministers-britain/
Giaimo, C. (2020). The Spiky Blob Seen Around the World. The New York Times, April 1. https://www.nytimes.com/2020/04/01/health/coronavirus-illustration-cdc.html
Gieryn, T.F. (2006). City as Truth-Spot: Laboratories and Field-Sites in Urban Studies. Social Studies of Science, 36(1), 5–38. https://doi.org/10.1177/0306312705054526
Greenstone, M., & Nigam, V. (2020). Does Social Distancing Matter? Working Paper no. 2020-6: University of Chicago, Becker Friedman Institute for Economics. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3561244
Gruener, D. (2020). Immunity Certificates: If We Must Have Them, We Must Do It Right. White paper 8, April 20: Edmond J. Safra Center for Ethics, Harvard University. https://ethics.harvard.edu/files/center-for-ethics/files/12immunitycertificates.pdf
Hart, V., Siddarth, D., Cantrell, B., Tretikov, L., Eckersley, P., Langford, J., Leibrand, S., Kakade, S., Latta, S., Lewis, D., Tessaro, S., & Weyl, G. (2020). Outpacing the Virus: Digital Response to Containing the Spread of COVID-19 While Mitigating Privacy Risks. White paper, April 3: Edmond J. Safra Center for Ethics, Harvard University. https://ethics.harvard.edu/outpacing-virus
Horowitz, J. (2020). In Italy, Going Back to Work May Depend on Having the Right Antibodies. The New York Times, April 4. https://www.nytimes.com/2020/04/04/world/europe/italy-coronavirus-antibodies.html
Hu, M. (2020). Beijing Rolls Out Colour-Coded QR System for Coronavirus Tracking despite Concerns over Privacy, Inaccurate Ratings. South China Morning Post, March 2. https://www.scmp.com/print/tech/apps-social/article/3064574/beijing-rolls-out-colour-coded-qr-system-coronavirus-tracking
Jacobs, A. (2020). F.D.A. Approves First Antigen Test for Detecting the Coronavirus. The New York Times, May 9. https://www.nytimes.com/2020/05/09/health/antigen-testing-fda-coronavirus.html
Kissler, S.M., Tedijanto, C., Goldstein, E., Grad, Y.H., & Lipsitch, M. (2020). Projecting the Transmission Dynamics of SARS-CoV-2 through the Postpandemic Period. Science, April 14. https://science.sciencemag.org/content/early/2020/04/24/science.abb5793
Lakoff, A. (2015). Real-Time Biopolitics: The Actuary and the Sentinel in Global Public Health. Economy and Society, 44(1), 40–59. https://doi.org/10.1080/03085147.2014.983833
Lakoff, A. (2017). Unprepared: Global Health in a Time of Emergency. Oakland, CA: University of California Press.
Latour, B. (1988). The Pasteurization of France. Cambridge, MA: Harvard University Press. (Original work published 1984, Paris: Editions A.M. Métailié).
Latour, B. (1999). Pandora’s Hope: Essays on the Reality of Science Studies. Cambridge, MA: Harvard University Press.
Latour, B. (2004). Politics of Nature: How to Bring the Sciences into Democracy. Cambridge, MA: Harvard University Press (Original work published 1999, Paris: Editions La Découverte).
Laurencin, C.T., & McClinton, A. (2020). The COVID-19 Pandemic: A Call to Action to Identify and Address Racial and Ethnic Disparities. Journal of Racial and Ethnic Health Disparities, April 18. https://doi.org/10.1007/s40615-020-00756-0
Li, Q., Guan, X., Wu, P., et al. (2020). Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia. New England Journal of Medicine, 382, 1199–1207. https://doi.org/10.1056/NEJMoa2001316
Lomas, N. (2020a). EU lawmakers Set Out Guidance for Coronavirus Contacts Tracing Apps. Techcrunch, April 16. https://techcrunch.com/2020/04/16/eu-lawmakers-set-out-guidance-for-coronavirus-contacts-tracing-apps/
Lomas, N. (2020b). Germany’s COVID-19 Contacts Tracing App to Link to Lab for Test Result Notification. Techcrunch, April 23. https://techcrunch.com/2020/04/23/germanys-covid-19-contacts-tracing-app-to-link-to-labs-for-test-result-notification/
MacKenzie, D. (1989). From Kwajalein to Armageddon? Testing and the Social Construction of Missile Accuracy. In D. Gooding, T. Pinch & S. Schaffer (Eds.), The Uses of Experiment: Studies in the Natural Sciences (pp. 409–436). Cambridge: Cambridge University Press.
MacKenzie, D.A. (2006). An Engine, Not a Camera: How Financial Models Shape Markets. Cambridge, MA: MIT Press.
Markortoff, K. (2020). Workers without Degrees Hardest Hit by Covid-19 Crisis – Study. The Guardian, April 20. https://www.theguardian.com/business/2020/apr/20/uk-workers-without-degrees-face-deeper-job-insecurity-amid-coronavirus-pandemic
Marres, N., & Stark, D. (2020). Put to the Test: For a New Sociology of Testing. British Journal of Sociology, 71(3), 423–443. https://doi.org/10.1111/1468-4446.12746
Mattern, S. (2020). Andrew Cuomo’s COVID-19 Briefings Draw on the Persuasive Authority of PowerPoint. Artnews.com, April 13. https://www.artnews.com/art-in-america/features/andrew-cuomo-covid-briefings-powerpoint-slideshow-authority-1202683735/
Mozur, P., Zhong, R., & Krolik, A. (2020). In Coronavirus Fight, China Gives Citizens a Color Code, with Red Flags. The New York Times, March 1. https://www.nytimes.com/2020/03/01/business/china-coronavirus-surveillance.html?searchResultPosition=2
Navon, D., & Eyal, G. (2016). Looping Genomes: Diagnostic Change and the Genetic Makeup of the Autism Population. American Journal of Sociology, 121(5), 1416–1471. https://doi.org/10.1086/684201
Neff, G., & Stark, D. (2004). Permanently Beta: Responsive Organization in the Internet Era. In P.N. Howard & S. Jones (Eds.), Society Online: The Internet in Context (pp. 173–188). Thousand Oaks, CA: SAGE.
Nelkin, D., & Tancredi, L. (1989). Dangerous Diagnostics: The Social Power of Biological Information. New York: Basic Books.
Nielsen, R.K. (2012). Ground Wars: Personalized Communication in Political Campaigns. Princeton, NJ: Princeton University Press.
Nuki, P. (2020). Exercise Gygnus Uncovered: The Pandemic Warnings Buried by the Government. The Sunday Telegraph, March 28. https://www.telegraph.co.uk/news/2020/03/28/exercise-cygnus-uncovered-pandemic-warnings-buried-government/
Ozgode, O. (2015). Governing the Economy at the Limits of Neoliberalism: The Genealogy of Systemic Risk Regulation in the United States, 1922–2012. Dissertation, Columbia University.
Pinch, T. (1993). Testing — One, Two, Three… Testing!: Toward a Sociology of Testing. Science, Technology, & Human Values, 18(1), 25–41. https://doi.org/10.1177/016224399301800103
Power, M. (1997). The Audit Society: Rituals of Verification. Oxford: Oxford University Press.
Proctor, K., Sample, I., & Oltermann, P. (2020). “Immunity Passports” Could Speed Up Return to Work after Covid-19. The Guardian, March 30. https://www.theguardian.com/world/2020/mar/30/immunity-passports-could-speed-up-return-to-work-after-covid-19
Pueyo, T. (2020). How to Do Testing and Contact Tracing. Medium, April 28. https://medium.com/@tomaspueyo/coronavirus-how-to-do-testing-and-contact-tracing-bde85b64072e
Resnick, B. (2020). Why It’s So Hard to See into the Future of Covid-19. Vox, April 18. https://www.vox.com/science-and-health/2020/4/10/21209961/coronavirus-models-covid-19-limitations-imhe
Resnick, B., & Irfan, U. (2020). What Immunity to Covid-19 Might Actually Mean. Vox, April 23. https://www.vox.com/science-and-health/2020/4/23/21219028/covid-19-immunity-testing-reinfection-antibodies-explained
Roberts, S. (2020). This is the Future of the Pandemic. The New York Times, May 8. https://www.nytimes.com/2020/05/08/health/coronavirus-pandemic-curve-scenarios.html
Robinson, J.H. (2016). Bringing the Pregnancy Test Home from the Hospital. Social Studies of Science, 46(5), 649–674. https://doi.org/10.1177/0306312716664599
Rosental, C. (2013). Toward a Sociology of Public Demonstrations. Sociological Theory, 31(4), 343–365. https://doi.org/10.1177/0735275113513454
Ruddick, G. (2016a). Admiral to Price Car Insurance Based on Facebook Posts. The Guardian, November 2. https://www.theguardian.com/technology/2016/nov/02/admiral-to-price-car-insurance-based-on-facebook-posts
Ruddick, G. (2016b). Facebook Forces Admiral to Pull Plan to Price Car Insurance Based on Posts. The Guardian, November 2. https://www.theguardian.com/money/2016/nov/02/facebook-admiral-car-insurance-privacy-data
Sánchez-Páramo, C. (2020). COVID-19 Will Hit the Poor Hardest. Here’s What We Can Do about It. World Bank Blog, April 23. https://blogs.worldbank.org/voices/covid-19-will-hit-poor-hardest-heres-what-we-can-do-about-it
Sarac-Lesavre, B., & Laurent, B. (2019). Stress-Testing Europe: Normalizing the Post-Fukushima Crisis. Minerva, 57(2), 239–260. http://doi.org/10.1007/s11024-018-9362-4
Sarasin, P. (2020). Understanding the Coronavirus Pandemic with Foucault? foucaultblog, Universität Zürich, March 31. http://dx.doi.org/10.13095/uzh.fsw.fb.254
Setzer, E. (2020). Contact-Tracing Apps in the United States. Lawfare, May 6. https://www.lawfareblog.com/contact-tracing-apps-united-states
Siegel, E. (2016). Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie or Die. Hoboken, NJ: Wiley.
Stark, D. (2011). The Sense of Dissonance: Accounts of Worth in Economic Life. Princeton, NJ: Princeton University Press.
Stark, D. (2013). Observing Finance as a Network of Observations. Sociologica, 7(2), 21–25. http://dx.doi.org/10.2383/74854
Stark, D. (2014). On Resilience. Social Sciences, 3(1), 60–70. https://doi.org/10.3390/socsci3010060
Stark, D. (2017). For What It’s Worth. Research in the Sociology of Organizations, 52, 383–397. https://doi.org/10.1108/S0733-558X20170000052011
Stark, D. (2020). The Performance Complex. In D. Stark (Ed.), The Performance Complex: Competition and Competitions in Social Life (Introductory Essay). Oxford University Press, in press.
Stark, D., & Paravel, V. (2008). PowerPoint in Public: Digital Technologies and the New Morphology of Demonstration. Theory, Culture & Society, 25(5), 31–56. https://doi.org/10.1177/0263276408095215
Swindells, M. (2017). Emergency Preparedness, Resilience and Response (EPRR). Board Paper, March 2017. NHS England. https://www.england.nhs.uk/wp-content/uploads/2017/03/board-paper-300317-item-10.pdf
Tufekci, Z. (2020). Don’t Believe the COVID-19 Models: That’s Not What They’re For. The Atlantic, April 2. https://www.theatlantic.com/technology/archive/2020/04/coronavirus-models-arent-supposed-be-right/609271/
UK Department of Health & Social Care. (2020). Coronavirus (COVID-19): Scaling Up Our Testing Programmes, April 4. https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/878121/coronavirus-covid-19-testing-strategy.pdf
Vaughan, D. (1996). The Challenger Launch Decision: Risky Technology, Culture, and Deviance at NASA. Chicago: University of Chicago Press.
Verity, R., Okell, L.C., Dorigatti, I., et al. (2020). Estimates of the Severity of Coronavirus Disease 2019: A Model-Based Analysis. The Lancet Infectious Diseases, March 30. https://doi.org/10.1016/S1473-3099(20)30243-7
Wighton, D., & Chazan, D. (2020). Germany Will Issue Coronavirus Antibody Certificates to Allow Quarantined to Re-enter Society. The Telegraph, March 29. https://www.telegraph.co.uk/news/2020/03/29/germany-will-issue-coronavirus-antibody-certificates-allow-quarantined/
Wu, T. (2020). Bigger Brother. New York Review of Books, April 9. https://www.nybooks.com/articles/2020/04/09/bigger-brother-surveillance-capitalism/
Ali, M., & Ali, H.Y. (2015). The Soul of a Butterfly: Reflections on Life’s Journey. London: Random House.↩︎
Romer, P., & Shah, R. (2020). “Testing is Our Way Out.” Wall Street Journal, April 2. See also Romer, P., & Garber, A.M. (2020). “Will Our Economy Die from Coronavirus?” The New York Times, March 23.↩︎
“Testing is a key pillar of our strategy to protect the NHS (National Health Service) and save lives,” for example, writes Matt Hancock, UK Secretary of State for Health and Social Care, in a Policy Paper that calls for immediately scaling testing levels from 10,000 to 100,000 and then to 250,000 tests per day. See UK Department of Health & Social Care (2020). https://www.gov.uk/government/publications/coronavirus-covid-19-scaling-up-testing-programmes↩︎
Siddarth, D., & Weyl, E.G. (2020). “Why We Must Test Millions a Day.” COVID-19 Rapid Response Impact Initiative, White Paper 6, April 8. Edmond J. Safra Center for Ethics, Harvard. For technical background see Hart et al. (2020).↩︎
As if it were not enough that the crisis is “testing trust” in governments and in experts (see, for example, an opinion piece in the Washington Post https://www.washingtonpost.com/opinions/2020/01/22/governments-need-peoples-trust-stop-an-outbreak-where-does-that-leave-us/), John Authers, a senior editor for Bloomberg News, writes that “The pandemic is putting profound philosophical questions to the test.” https://www.bloomberg.com/opinion/articles/2020-03-29/coronavirus-pandemic-puts-moral-philosophy-to-the-test↩︎
Pinch’s paper was not the first word on the sociology of testing. See, for example, important prior contributions by Callon (1986a), MacKenzie (1989), and Nelkin & Tancredi (1989).↩︎
For a partial assessment of the sociology of testing and a proposal for new directions for the subfield, see Marres & Stark (2020).↩︎
For a visual exploration of a wide variety of tests, see my five-minute video, “Have you been tested?” available at http://blogs.cim.warwick.ac.uk/diversityandperformance/special-testing/↩︎
The element of standing out exists in tension with that of standing for, e.g., is the sample, the moment, the protocol representative? As the reader can sense, my three basic organizing principles are difficult to corral.↩︎
On such situations, see especially, Boltanski & Thévenot (1999) and Stark (2011 & 2017).↩︎
Bennhold K., & Eddy, M. (2020). “Merkel Gives Germans a Hard Truth About the Coronavirus.” The New York Times, March 11. https://www.nytimes.com/2020/03/11/world/europe/coronavirus-merkel-germany.html↩︎
Emphasis in the original. Official German government transcript. https://www.bundesregierung.de/breg-de/themen/coronavirus/statement-chancellor-1732296↩︎
On the origins of financial stress testing, traced back to postwar tests of whether Allied bombing contributed substantially to winning the war in Europe (via simulations of the consequences of nuclear war and input-output tables of the economy for Nixon’s wage and price controls), see Stark (2014) and Ozgode (2015). On war games, stress tests, and preparedness simulations, see especially Lakoff (2017).↩︎
“Have you been tested?” is the opening sentence of the introductory essay that Noortje Marres and I wrote as guest editors for a special issue of the British Journal of Sociology. “Put to the Test: For a New Sociology of Testing” (Marres & Stark, 2020) was written in the Fall of 2019, well before the outbreak of the COVID-19 pandemic. Little could we have known that the dual meaning of being tested would become so apposite within a matter of weeks. For an eloquent statement about the conjuncture of the two meanings see the short video by Noortje Marres available at http://blogs.cim.warwick.ac.uk/diversityandperformance/special-testing/↩︎
See especially, Callon & Latour (1981); Callon (1986a & 1986b); Latour (1988). Not all ANT adherents will agree, but I would have preferred the label Network-Actor Theory over Actor-Network Theory. Especially to an American social scientist, the latter sounds as if the unit of action is the Actor, i.e., first come the actors and then comes the Network. With more connotations of action distributed across the network of humans and non-humans, Network-Actor Theory would signal that the unit of action is the network.↩︎
For an acute analysis of the failure of this division of representation see Latour (2004).↩︎
Several days after these paragraphs were written, demonstrations did occur on the streets of several American states (Texas and Michigan prominent among them) protesting public health measures to close businesses, schools, and public gatherings. To my knowledge, no mass protest demonstrations about this issue have occurred in other countries.↩︎
I am grateful to Gernot Grabher for pointing out that the emblematic locus of political demonstration — the city square — remains important. The power of the demonstration now, however, is measured not in “turn-out” (“thousands took to the street”) but in “turn-off” (“thousands were locked up at home”).↩︎
The most famous image of the coronavirus, created by Alissa Eckert, a medical illustrator at the US Centers for Disease Control and Prevention, took more than a week to create. Designers tested different colors, textures, and lighting. The lighting was calibrated so that the spikes of the virus cast long shadows, “to help display the gravity of the situation and draw attention.” And a color scheme of red on gray, with orange and yellow accents, was adopted because the illustration would “have to go along with the branding.” Eckert quoted in Giamio (2020).↩︎
The analogy to the gun control debate is perhaps instructive. Guns kill people, say the liberals. No, people kill people, says the NRA. Actor-Network Theory (ANT) says, people with guns kill people (Latour, 1999, pp. 176–180). By analogy: people are killed by people with the virus. Or, what amounts to the same thing, the virus kills with people. In both cases, the action is distributed across the human and the non-human.↩︎
Modelers prefer “project” to “predict.” The meaning of terms such as projection and forecast varies from one discipline to the next. For useful overviews of modeling in the COVID-19 pandemic see Adams (2020), Resnick (2020), and Tufekci (2020).↩︎
Notwithstanding that a small coding error could have major effects, detecting coding errors is, of course, prosaic. Of larger consequence is scrutiny of the kinds of assumptions that are built into the models that governments have been using for making policy. To my knowledge, none of these models has been released.↩︎
“We have uncertainty on top of uncertainty on top of uncertainty,” exclaims Jeffrey Shaman, co-author of the COVID-19 epidemiological model at Columbia University’s Mailman School of Public Health, quoted in Bui et al. (2020). See also Adams (2020); Tufekci (2020).↩︎
In the Imperial College report cited above, the authors write, for instance: “In total, in an unmitigated epidemic, we would predict approximately 510,000 deaths in GB and 2.2 million in the US, not accounting for the potential negative effects of health systems being overwhelmed on mortality. For an uncontrolled epidemic, we predict critical care bed capacity would be exceeded as early as the second week in April, with an eventual peak in ICU or critical care bed demand that is over 30 times greater than the maximum supply in both countries” (p. 7). They then go on to report the simulated effects of different interventions.↩︎
On Cuomo’s persuasive use of PowerPoint see Mattern (2020).↩︎
YouTube: New York Governor Cuomo holds news conference on coronavirus response 4/6 https://www.youtube.com/watch?v=_mziwN4fjG0 (at approximately 24:00 minutes).↩︎
These phases do not correspond to the idea of “waves” (e.g., “second wave of the pandemic”) or to the phases of re-opening economies after lockdown (e.g., Phases 1, 2, 3, 4… announced by mayors, governors, or prime ministers).↩︎
Germany was the first country to run a large-scale antibody testing program involving three different but interrelated studies. In the US, sero-surveys are being conducted by the CDC and the National Institutes of Health (Chazan & Mancini, 2020). A study in the UK will test over 300,000 people over the next 12 months, involving 20,000 representative households in the first wave of the study (Cookson, 2020).↩︎
For example, the Reverse Transcription Polymerase Chain Reaction (RT-PCR) test, which measures amounts of RNA of the virus, or a new antigen test just approved by the Food and Drug Administration (FDA), which detects protein fragments of the virus. The latter holds promise for home testing along the lines of the home pregnancy test (Jacobs, 2020).↩︎
An important finding concerns alarming class and ethnic disparities in the prevalence of COVID-19 and in mortality rates: the working class, racial minorities, and the global poor are dying disproportionately in this pandemic (Makortoff, 2020; Laurencin & McClinton, 2020; Sánchez-Páramo, 2020).↩︎
The Italian region of Veneto has been a leader in testing, recently launching a study collecting 100,000 blood samples across the region. The city of Vo’ swab-tested the entire population of 3,000 and will carry out antibody testing and genome sequencing on the entire population (Horowitz, 2020).↩︎
Public health officials in the Italian region of Emilia Romagna have issued calls for proposals to partner with local enterprises to test all their workers on an ongoing basis. Data from the firms will be used by epidemiologists to learn more about the virus (the length and strength of immunity, for example) and to test rates of false positives and false negatives in the serological tests (interview with an epidemiologist familiar with the program). With its “Back on Track” project, Ferrari is among the first firms to participate in the program; see “La fase 2 di Maranello ‘È come un Gran Premio’” [“Maranello’s Phase 2 ‘Is Like a Grand Prix’”], La Repubblica, May 5, 2020.↩︎
“An actuarial device is invented for a world in which the possible threats to collective life can be known through careful demographic and epidemiological research; the problem is one of accumulating statistical knowledge to guide cost-effective intervention” (Lakoff, 2015, p. 6). The problem with the actuarial approach is, according to Lakoff: “In the case of a novel pathogen, the virulence of an encroaching epidemic cannot be determined based on accumulated data about the past” (p. 15). In contrast, a “sentinel device (…) is devised in order to stimulate action when decision is imperative but knowledge is incomplete” (p. 6). Some of the real-time models discussed in the previous section call to mind Lakoff’s sentinel devices as attempts to track and respond to transformations in real time.↩︎
See, for example, a recent paper in Science (Kissler et al., 2020) and statements by one of its senior authors, epidemiologist Marc Lipsitch, quoted in the New York Times May 7 (Roberts, 2020). The Lipsitch model is based on analysis of historical data from eight prior influenza epidemics combined with simulations of transmission dynamics using COVID-19 data.↩︎
See Sarasin (2020) for a brief, thoughtful essay that cautions against “the semantic cudgel of ‘biopolitics’” in favor of more differentiated models of power in Foucault’s writings about infectious diseases.↩︎
These include Singapore’s TraceTogether, MIT’s Safe Paths, Stanford’s COVID Watch, a partnership of Apple and Google, a team at Oxford University, and numerous national programs in Europe adopting the Pan-European Privacy-Preserving Proximity Tracing (PEPP-PT) principles. For a useful discussion of the issues and a comprehensive listing of serious programs under development in Europe see European Commission e-Health Network (2020). In addition to these programs, there are likely dozens of fly-by-night operations peddling DCT apps to unsuspecting city and state governments (see Setzer, 2020).↩︎
Digital contact tracing systems differ in their privacy protection. Some systems (such as those under development in France and the UK) store the time-stamped pseudonymized IDs on central servers. Decentralized systems such as the PEPP–PT have so-called privacy-protection-by-design because the pseudonymized IDs are only stored (and temporarily at that) on the devices of the individual users (Lomas, 2020b). See Canca (2020) for an argument that making such systems mandatory (as opposed to consensual) significantly enhances effectiveness with no sacrifice in privacy.↩︎
On the file as the basic technology of bureaucracy, see Yates (1989) and Stark (2011, p. 169). We will know we are free of the legacy of bureaucracy when we no longer have digital files. A relational database is a significant step in that direction.↩︎
Because the systems use Bluetooth rather than GPS technology, the trace is not in geographical space but in network space.↩︎
For a helpful layperson’s guide to serological testing for antibodies and immunity, see Resnick & Irfan (2020).↩︎
“How Big Data Is Dividing the Public in China’s Coronavirus Fight — Green, Yellow, Red.” South China Morning Post, February 22, 2020. https://www.scmp.com/news/china/society/article/3051907/green-yellow-red-how-big-data-dividing-public-chinas-coronavirus↩︎
On privacy concerns that personal data from the Health Code is being shared with law enforcement authorities see Mozur, Zhong, and Krolik (2020). Moreover, the Alipay app “allows users to check the health codes of others by entering their identity numbers.” (Hu, 2020). Similarly, the “close contact detector” used in China allows users to make inquiries for up to three ID numbers (South China Morning Post, Feb 12, 2020).↩︎
In fact, the system is not (or is not yet) even a system so much as a multiplicity of experiments involving 30 national ministries, hundreds of local and regional governments, numerous bike and ride share programs, e-commerce firms, and other entities including competing online payment companies (see especially Bach, 2020a & 2020b).↩︎
On this topic, see http://blogs.cim.warwick.ac.uk/diversityandperformance/special-testing/ for my conversation with Jonathan Bach, the author of several studies of the CSCS (Bach, 2020a & 2020b).↩︎
The notion of combining three programs that are already operating (or in advanced development) is far from fanciful since there are no effective restrictions against such data sharing in China.↩︎
On algorithmic prediction, see especially Esposito (2019). Siegel nicely expresses the difference between, on the one hand, statistical forecasting and, on the other, predictive analytics (PA) using machine learning on big data: “Whereas forecasting estimates the total number of ice cream cones to be purchased next month in Nebraska, PA tells you which individual Nebraskans are most likely to be seen with a cone in hand” (Siegel, 2016, p. 56). In the field of elections, the shift is from surveys that statistically forecast the proportion of probable voters with certain opinions to algorithmic prediction of which particular individual voters will vote for a candidate in the election. (For an early appreciation of such developments, see Nielsen, 2012).↩︎
On the turn from actuarial models based on statistical calculation of pooled risk to individual risk profiles based on machine learning models of predictive analytics in the insurance industry, see Cevolini & Esposito (2020).↩︎
See Lakoff (2017) for an account of an exercise in June 2001, code-named “Dark Winter,” simulating an outbreak of deadly smallpox in the United States.↩︎
Or, in the words of Yogi Berra, “If the world were perfect, it wouldn’t be.”↩︎
In their excellent study of stress tests of nuclear power plants in the wake of the Fukushima disaster, Sarac-Lesavre and Laurent (2019) show that the function of reassuring the public dominated the process. Stress tests were conducted in 145 nuclear power plants in Europe. All passed. At first, it seemed that the Fukushima crisis had created an opportunity for a new regulatory approach, but the tests were reviewed by national regulatory bodies and not by independent experts (as was being demanded by leading NGOs in the field). Whereas banking regulation in the wake of the 2008 financial crisis strengthened the hand of EU supranational regulators by creating a “European gaze” that made banks comparable with each other, the refusal to rank countries and the logic of “continuous improvement” meant the restabilization of national regulatory frameworks and the failure to develop cross-sectoral crisis management capacities such as a European rapid response force (Sarac-Lesavre & Laurent, 2019).↩︎
“Exercise Cygnus and How Failing NHS Foreshadowed Covid-19 Reaction”. The National, March 31, 2020; “If Ministers Fail to Reveal 2016 Flu Study They ‘Will Face Court.’” The Guardian, April 26, 2020.↩︎
A stress test is a very interesting diagnostic test. Not unlike a medical diagnostic, it can be thought of as testing the well-being (robustness) of the organization or thing being tested. But it is very much a projection, simulating a future in the attempt to identify problems in the present. And not unlike the diagnostic, of course, while the test results definitely matter, most consequential is what results from the test.↩︎
On accounts and accountings in economic life, see Stark (2011); on performance metrics in social life, see my introductory essay (Stark, 2020) to an edited volume.↩︎
Boltanski & Thévenot (1999).↩︎
On the heterarchy of evaluative principles in public life, see Stark, 2011, pp. 204–212.↩︎