# On Auditing Elections When Precincts Have Different Sizes

## Abstract

We address the problem of auditing an election when precincts may have different sizes. Prior work in this field has emphasized the simpler case when all precincts have the same size. Using auditing methods developed for use with equal-sized precincts can, however, be inefficient or result in loss of statistical confidence when applied to elections with variable-sized precincts.

We survey, evaluate, and compare a variety of approaches to the variable-sized precinct auditing problem, including the SAFE method [11] which is based on theory developed for equal-sized precincts. We introduce new methods such as the negative-exponential method NEGEXP'' that select precincts independently for auditing with predetermined probabilities, and the PPEBWR'' method that uses a sequence of rounds to select precincts with replacement according to some predetermined probability distribution that may depend on error bounds for each precinct (hence the name PPEBWR: probability proportional to error bounds, with replacement), where the error bounds may depend on the sizes of the precincts, or on how the votes were cast in each precinct.

We give experimental results showing that NEGEXP and PPEBWR can dramatically reduce (by a factor or two or three) the cost of auditing compared to methods such as SAFE that depend on the use of uniform sampling. Sampling so that larger precincts are audited with appropriately larger probability can yield large reductions in expected number of votes counted in an audit.

We also present the optimal auditing strategy, which is nicely representable as a linear programming problem but only efficiently computable for small elections (fewer than a dozen precincts). We conclude with some recommendations for practice.

# 1 Introduction

Post-election audits are an essential tool for ensuring the integrity of election outcomes. They can detect, with high probability, both errors due to machine mis-programming and errors due to malicious manipulation of electronic vote totals. By using statistical samples, they are quite efficient and economical. This paper explores auditing approaches that achieve improved efficiency (sometimes by a factor of two or three, measured in terms of the number of votes counted) over previous methods.

Suppose we have an election with precincts, , ..., . Let denote the number of voters who voted in precinct ; we call the size'' of the precinct . Let the total number of such voters be . Assume without loss of generality that .

We focus on auditing precincts as opposed to votes because this is the common form of auditing encountered in practice. If one is interested in sampling votes, then the results in Aslam et al. [1] apply because the votes can be modeled as precincts of equal size (in particular, of size one). In this paper, we are interested in the more general problem, that is, when precincts have different sizes.

Precinct sizes can vary dramatically, sometimes by an order of magnitude or more. See Figure 2.

Methods for auditing elections must, if they are to be efficient and effective, take such precinct size variations into account.

Suppose further that in precinct we have both electronic records and paper records for each voter. The electronic records are easy to tally. For the purposes of this paper, the paper records are used only as a source of authoritative information when the electronic records are audited. They may be considered more authoritative since the voters may have verified them directly. In practice, more care is needed, since the electronic records could reasonably be judged as more authoritative in situations where the paper records were obviously damaged or lost and the electronic records appear undamaged.

Auditing is desirable since a malicious party, the adversary,'' may have manipulated some of the electronic tallies so that a favored candidate appears to have won the election. It is also possible for a simple software bug caused the electronic tallies to be inaccurate. However, we focus on detecting malicious adversarial behavior because it is the more challenging task.

A precinct can be audited'' by re-counting by hand the paper records of that precinct to confirm that they match the electronic totals for that precinct. We ignore here the important fact that hand-counting may be inaccurate, and assume that any discrepancies are due to fraud on the part of the adversary. In practice, the discrepancy might have to be larger than some prespecified threshold to trigger a conclusion of fraud in that precinct.

See the overviews [8,13,6] for information about current election auditing procedures. In this paper we ignore many of the complexities of real elections; these complexities are addressed in other papers. We do so in order to focus on our central issue: how to select a sample of precincts to audit when the precincts have different sizes. See Neff [12], Cordero et al. [5], Saltman [17], Dopp et al. [7], and Aslam et al. [1], for additional discussion of the mathematics of auditing, and additional references to the literature.

## 1.1 Outline

We begin with an overview of the auditor's general approach in Section 2. In Section 3 we review the adversary's objectives and capabilities. Section 4 then reviews the auditor's strategy. Some known results for auditing when all precincts have equal size are discussed in Section 5. We next review in Section 6 the SAFE'' method, which deals with variable-sized precincts using the mathematics developed for equal-sized precincts, by first deriving a lower bound on the number of precincts that must have been corrupted, if the election outcome was changed. Section 7 introduces basic auditing methods, where each precinct is chosen independently according to a precomputed probability distribution. A particularly attractive basic auditing method is introduced in Section 8; this method is called the negative-exponential'' (NEGEXP) auditing method. We then consider audits where precincts are not chosen independently. Section 9 introduces the method of sampling with probability proportional to error bounds, with replacement (PPEBWR); a special case of this procedure is PPSWR, sampling with probability proportional to size, with replacement.'' Section 10 discusses vote-dependent auditing, where the probability of auditing a precinct depends on the actual vote counts for each candidate. Section 11 gives experimental results using data from Ohio and Minnesota. Section 12 presents a method based on linear programming for determining an optimal auditing procedure, which unfortunately appears to be computationally too expensive for practical use. Section 13 closes with discussion and recommendations for practice.

# 2 Auditing Objectives and Costs

We assume here that the election is a winner-take-all (plurality) election from a field of candidates.

After the election, the auditor randomly selects a sample of precincts for the post-election audit. In each selected precinct the paper ballots are counted by hand; the totals obtained in this manner are then compared with the electronic tallies. We assume that the paper ballots are maintained securely and that they can be accurately counted during the post-election audit.

The auditor wishes to assure himself (and everyone else) that the level of error and/or fraud in the election is likely to be low or nonexistent, or at least insufficient to have changed the election outcome. If the audit finds no (significant) discrepancies between the electronic and paper tallies, the auditor announces that no fraud was discovered, and the election results may be certified by the appropriate election official.

However, if significant discrepancies are found between the electronic and paper tallies, additional investigations may be appropriate. For example, state law may require a full recount of the paper ballots. Stark [19] gives procedures for incrementally auditing larger and larger samples when discrepancies are found, until the desired level of confidence in the election outcome is achieved.

When planning the audit, the auditor knows the number of reported (electronic) votes for each candidate  in precinct , and the total size  (total number of votes cast) of each precinct . The auditor also knows the reported margin of victory, denoted of the winning candidate over the runner-up--this is the difference between the number of votes reported for the apparently victorious candidate and the number of votes reported for the runner-up. Larger audits are appropriate when the margins of victory are smaller (see, e.g., Norden et al. [13]).

## 2.1 Auditing objective

We believe that the audit should be designed to achieve a pre-specified level of confidence in the election outcome, i.e., when an election is ultimately certified, one should be confident, in a statistically quantifiable manner, that the election outcome is correct. It is the correct (and efficient) approach. Naive methods that audit a fixed fraction of precincts tend to waste money when the margin of victory is large, and provide poor confidence in the election outcome when the margin of victory is small. See McCarthy et al. [11].

In order to ensure that an election outcome is correct, one must be able to detect levels of fraud sufficient to change the outcome of the election. We thus assume the auditor desires to test at a certain significance level that error or fraud is unlikely to have affected the election outcome. A well-designed audit can reduce the likelihood that significant fraud or error has gone undetected. A significance level of means that the chance that error large enough to have changed the election outcome will go undetected is one in twenty.

Let denote the confidence level'' of the audit, where Thus, a test at significance level provides a confidence level of . This is independent of the way fraud was committed (at the level of the machine, precinct, vote or other) because we only model the overall fraud in our formulas. We follow Stark [19] in adopting as our null hypothesis the (electronic) election outcome is incorrect'', so that is an upper bound on the probability that the null hypothesis will be rejected (i.e. that the electronic outcome will be accepted) when the null hypothesis is true (the electronic outcome is wrong).

## 2.2 Choosing a sample

Depending on the precinct sizes, the reported votes for each candidate, and thus the reported margin of victory, the auditor determines how to select an appropriately-sized random sample of precincts for auditing.

We explore three methods by which the auditor chooses a sample:

• [BASIC] The auditor determines a probability for each precinct that it will be audited, based on the precinct's size and on the overall margin of victory, and then independently selects each precinct with its specified probability. Such basic auditing strategies are discussed in Sections 7-8.

• [WITH REPLACEMENT] The auditor determines a probability for each precinct that it will be selected during a round, and does rounds of drawing with replacement to select the precincts to be audited. With sampling with replacement,'' a selected precinct is placed back into the collection of precincts and thus may be drawn more than once. Any precinct drawn at least once will be audited. Examples of this approach are the PPEBWR (sampling with probability proportional to error bounds with replacement) method of Section 9 and the PPSWR method: sampling with probability proportional to size, with replacement.''

• [OPTIMAL] The auditor determines a probability for each subset of precincts specifying the probability that that subset will be audited. This includes the optimal auditing strategy of Section 12.

## 2.3 Auditing cost

If all precincts have the same size, one may measure the cost of performing an audit in terms of the (expected) number of precincts audited. If precincts have a variety of sizes, the (expected) number of votes counted appears to be a better measure of auditing cost. The auditing cost is most reasonably measured in person-hours, which will be proportional to the number of votes recounted. The overall cost may have a constant additive term for each precinct (a setup cost), but this should be small compared to the cost to audit the votes.

We assume the adversary wishes to corrupt enough of the electronic tallies so that his favored candidate wins the most votes according to the reported electronic tallies. Without loss of generality, we'll let candidate be the adversary's favored candidate. The adversary tries to do his manipulations in such a way as to minimize the chance that his changes to the electronic tallies will be caught during the post-election audit.

Let denote the actual number of (paper) votes for candidate  in precinct , and let denote the reported number of (electronic) votes for candidate  in precinct . With no adversarial manipulation, we will have for all and . We ignore in this paper small explainable discrepancies that can be handled by slight modifications to the procedures discussed here.

We thus have for all : the total number of paper votes cast in precinct is equal to the number of electronic votes cast in precinct ; this number is , the size'' of precinct . (Our techniques can perhaps be extended to handle situations where such reconciliation is not done; we have not yet examined this question closely.)

Let denote the total actual number of votes for candidate : and let denote the total number of votes reported for candidate : The adversary's favored candidate, candidate , will be the winner of the electronic report totals if

We assume for now that the election is really between candidate and candidate , so that the adversary's objective is to ensure that candidate is reported to win the election and that candidate is not. There may be other candidates in the race, but for the moment we'll assume that they are minor candidates. It is also convenient to consider invalid'' and undervote'' to be such minor candidates'' when doing the tallying.

The adversary can manipulate the election in favor of his or her desired candidate by shifting the electronic tallies from one candidate to another. He or she might move votes from some candidate to candidate . Or move votes from candidate to some other candidate. These manipulations can change the election outcome, and yield a false margin of victory.'' The margin of victory plays a key role in our analysis.

Let denote the actual margin of victory'' (in votes) of candidate over candidate : Let denote the reported margin of victory'' (in votes) for candidate over candidate : Note that will be known to the auditor at the beginning of the audit, but that will not.

The adversary may be in a situation initially where (i.e. ); that is, his or her favored candidate, candidate , has lost to candidate . The adversary must, in order to change the election outcome, manipulate the (electronic) votes so that (i.e. so that ) and do so in a way that goes undetected.

The error'' in favor of candidate introduced in the margin of victory computation in precinct by the adversary's manipulations is (in votes):

Here is the reported margin of victory for candidate , while is his actual margin of victory, so their difference is the amount of error introduced by the adversary in the margin of victory.

An upper bound on the amount by which the adversary can improve the margin of victory in favor of his candidate in precinct  is:

 (1)

Each vote moved from candidate to candidate improves the margin by votes, and each vote moved from candidate () to candidate improves the margin by vote. (See also Stark [19].)

Let denote the total error (in votes, from all precincts) introduced in the margin of victory computation by the adversary: Clearly, That is, the reported margin of victory is equal to the actual margin of victory, plus the error introduced by the adversary.

The adversary has to introduce enough error so that the reported margin of victory becomes positive, even though the initial (actual) margin of victory is negative. Thus, the amount of error introduced satisfies both of the inequalities: and The second inequality is of most interest to the auditor, since at the beginning of the audit the auditor knows but not . For convenience, we shall use in the sequel, and let denote the fraction of votes represented by the margin of victory: (recall that denotes the total number of votes cast: ).

We assume here that the adversary wishes to change the election outcome while minimizing the probability of detection--that is, while minimizing the chance that one or more of the precincts chosen have been corrupted. If the post-election audit fails to find any error, the adversary's candidate might be declared the winner, while in fact some other candidate (e.g. candidate ) actually should have won.

The adversary might not be willing to corrupt all available votes in a precinct; this would generate too much suspicion. Dopp and Stenger [7] suggest that the adversary might not dare to flip more than a fraction of the votes in a precinct. The value is also denoted WPM in the literature, and called the Within-Precinct-Miscount.

Our auditing methods in this paper depend heavily on the use of such upper bounds on , that is, on the maximum amount by which the adversary can change the margin of victory in each precinct. We use to denote such an upper bound on . Following Dopp and Stenger, we would have as an upper bound for :

 (2)

We call this the Linear Error Bound Assumption''. The factor of occurs since we assume that the adversary is able to switch votes from candidate  to candidate .

We may also presume that the adversary knows the general form of the auditing method. Indeed, the auditing method may be mandated by law, or described in public documents. While the adversary may not know which specific precincts will be chosen for auditing, because they are determined by rolls of the dice or other random means, the adversary is assumed to know the method by which those precincts will be chosen, and thus to know the probability that any particular precinct will be chosen for auditing.

We let denote the set of corrupted precincts, and let denote the number of corrupted precincts. In this discussion, we assumed that reconciliation'' is performed when the election is over, confirming that the number of votes recorded electronically is equal to the number of votes recorded on paper; an adversary would presumably not try to make these totals differ, but only shift the electronic tallies to favor his candidate at the expense of other candidates. If reconciliation'' is not performed and an adversary reduces the number of votes cast in a precinct that is, say, known to be favorable to the opponent, our techniques can still discover the fraud within the desired confidence level. This happens if the resulting change in the margin of victory (expressed in votes) is at most the error bound of the resulting precinct. This condition holds when the adversary decreases the total number of votes cast in the precinct by at most a factor of ( for ). Arguably, if the final number of votes cast is reduced even more, such a dramatic corruption should be detected.

# 4 Auditing Method

## 4.1 Types of audits

There are many different ways to perform an audit; see Norden et al. [13] for discussion. In this paper we focus on how the sample is selected; an auditing method is one of following five types:

A fixed audit determines the amount of auditing to do by fiat--e.g., it selects a fixed number of precincts (or votes) to be counted (or perhaps a fixed percentage, instead of a fixed number). It does not pay attention to the precinct sizes, the reported margin of victory, or the reported vote counts. Fixed audits are simple to understand, but are frequently very costly or statistically weak.

If an audit is not a fixed audit, it is an adjustable audit--the size of the audit is adjustable according to various parameters of the election. There are four types of adjustable audits, in order of increasing utilization of available parameter information.

The first (and simplest) type of adjustable audit is a margin-dependent audit. Here the selection of precincts to be audited depends only on the reported margin of victory . An election that is a landslide (with a very large margin of victory) results in smaller audit sample sizes than an election that is tight.

In order for an audit to provide a guaranteed level of confidence in the election outcome while still being efficient (it does not audit significantly more votes/precincts than needed), it must be margin-dependent (or better). The remaining three types of adjustable audits are refinements of the margin-dependent audit. Margin-dependent audits have been proposed by Saltman [17], Lobdill [10], Dopp and Stenger [7], McCarthy et al. [11], among others.

The second type of adjustable audit is a size-dependent audit. Here the selection of precincts to be audited depends not only on the reported margin of victory but also on the precinct sizes . A size-dependent audit audits larger precincts with higher probability and audits small precincts with smaller probability. This reflects the fact that the larger precincts are juicier targets'' for the adversary. Overall, the total amount of auditing work performed may easily be less than for an audit that does not take precinct sizes into account.

The third type of adjustable audit is a vote-dependent audit. Here the selection of precincts to be audited depends not only on the reported margin of victory and the precinct sizes , but also on the reported vote counts . A vote-dependent audit can reflect the intuition that if precinct reports more votes for candidate (the reported winner) than precinct reports, then precinct should perhaps be audited with higher probability, since it may have experienced a larger amount of fraud. See Section 10; also see Calandrino et al. [3].

The fourth type of adjustable audit is a history-dependent audit. Here the selection of precincts to be audited depends not only on the reported margin of victory , the precinct sizes , and the reported vote counts , but also on records of similar data for previous elections. A precinct whose reported vote counts are at odds with those from previous similar elections becomes more likely to be audited.

Here we consider what we call an error-bound-dependent audit, where the auditor computes for each precinct an error bound on the error (change in margin of victory) that the adversary could have made in that precinct. An error-dependent audit is a special case of a size-dependent audit, if the error bound for precinct depends only the size of the precinct, as in the Linear Error Bound Assumption of equation (2) where the error bound is simply proportional to the precinct size. The linear error bound assumption leads, for example, to sampling strategies of the form probability proportional size,'' as we shall see, since our probability proportional to error bound'' strategy becomes probability proportional to size'' when error bound is proportional to size.''

However, the error-dependent audit could be a special case of a vote-dependent audit, if the error bound depends on the votes cast in precinct . We explore this possibility in Section 10. In any case, it is useful to formally decouple'' the error bound from the precinct size; we let denote the sum of these error bounds.

## 4.2 High-level structure of an audit

The post-election audit involves the following steps.

1. Determine the relevant parameters of the election (margin of victory , precinct sizes , reported vote counts , and error bounds ).

2. Select a sample of precincts to be audited.

3. Count by hand all the paper ballots for every precinct in the sample. If precinct is audited, then the actual vote counts and the votes that were changed become known to the auditor. If no discrepancy is observed, precinct is deemed to be good (i.e. uncorrupted); otherwise precinct is detected as being bad (i.e. corrupted).

4. If no errors are found in any audited precinct, announce that candidate (the reported winner of the electronic totals) is the winner of the election. Otherwise, trigger some enlarged examination (escalate the audit).

We do not discuss triggers and escalation in this paper, although such discussion is very important and needs to be included in any complete treatment of post-election auditing (see Stark [19]).

## 4.3 Selecting a sample

How should the auditor select precincts to audit? The auditor wishes to maximize the probability of detection: the probability that the auditor audits at least one bad precinct (with nonzero error ), if there is sufficient error to have changed the election outcome. The auditor's method should be randomized, as is usual in game theory; this unpredictability prevents the adversary from knowing in advance which precincts will be audited.

We first review auditing procedures to use when all precincts have the same size. We then proceed to discuss the case of interest in this paper, that is, when precincts have a variety of sizes.

# 5 Equal-sized Precincts

This section briefly reviews the situation when all of the precincts have the same size (so ). We adopt the Linear Error Bound Assumption () of equation (2) in this section. Let denote the number of precincts that have been corrupted. Since an adversary who changed the election outcome must have introduced sufficient error, so that (see Dopp et al. [7]) is the minimum number of precincts the adversary could have corrupted.

When all precincts have the same size, the auditor should pick an appropriate number of distinct precincts uniformly at random to audit. See Neff [12], Saltman [17], or Aslam et al. [1] for discussion and procedures for calculating appropriate audit sample sizes.

The probability of detecting at least one corrupted precinct in a sample of size is By choosing so that

 (3)

one has a test at significance level (i.e., at confidence level'' ): with probability at least one or more corrupted precincts will be detected, if there are at least corrupted precincts (for detailed explanation see Aslam et al. [1].)

Rivest [16] suggests approximating equation (3) by a Rule of Thumb'': one over the (fractional) margin of victory . For equal-sized precincts (with ), this gives remarkably good results, corresponding to a confidence level of at least .

# 6 The SAFE Auditing Method

The SAFE'' auditing method by McCarthy et al. [11] is perhaps the best-known approach to auditing elections; it adapts the approach for handling equal-sized precincts discussed above to handle variable-sized precincts.

In 2006 Stanislevic [18] presented a conservative way of handling precincts of different sizes; this approach was also developed independently by Dopp et al. [7]. This method is the basis for the SAFE auditing procedure.

It assumes that the adversary corrupts the larger precincts first, yielding a lower bound on the number of precincts that must have been corrupted if the election outcome was changed. The auditor can then use in an auditing method that samples precincts uniformly. More precisely, the auditor knows that if the adversary changed the election outcome, he or she must have corrupted at least precincts, where is the least integer such that (Recall our assumption that .) Then the auditor draws a sample of size precincts uniformly, where satisfies (3); this ensures a probability of at least that a corrupted precinct will be sampled, if the adversary produced enough fraud to have changed the election outcome.

# 7 Basic Auditing Methods

This section reviews basic'' auditing methods, where each precinct is audited independently with a precinct-specific probability determined by the auditor. Many interesting auditing procedures are basic auditing procedures. We try restricting our attention to basic'' methods in an effort to make some of the math simpler; although we shall see in Section 9 that the math is actually fairly simple for some non-basic methods.

This section assumes that the auditor will audit each precinct independently with some probability , where . The auditing method is thus determined by the vector . The probabilities sum to the expected number of precincts audited; they do not normally sum to because commonly we audit more than one precinct. The expected workload (i.e., the expected number of votes to be counted) is

 (4)

because we audit each set of votes with probability . We assume that vectors , , and , are public knowledge and known to everyone, including the adversary. (We ignore the fact that in practice, it might be difficult for the adversary to obtain some of this information, in which case the auditor's success at detecting fraud might even be somewhat greater than we calculate here.)

In the basic auditing procedures we describe in this paper, the chance of auditing a precinct is independent of the error introduced into that precinct by the adversary. Thus, we can assume that the adversary makes the maximum change possible in each corrupted precinct: . This helps the adversary reduce the number of precincts corrupted and reduces the chance of him being caught during an audit.

A basic auditing method is not difficult to implement in practice in an open and transparent way. A table is printed giving for each precinct its corresponding probability of being audited. For each precinct , four ten-sided dice are rolled to give a four-digit decimal number . Here is the digit from the -th dice roll. If , then precinct is audited; otherwise it is not. The probability table and a video-tape of the dice-rolling are published. See [5] for more discussion on the use of dice.

One very nice aspect of basic auditing methods is that we can easily compute the exact significance level for . Given , one can use a dynamic programming algorithm to compute the probability of detecting an adversary who changes the margin by  votes or more. This algorithm, and applications of it to heuristically compute optimal basic auditing strategies, are given by Rivest [15].

# 8 Negative-exponential Auditing Method (NEGEXP)

This section presents the negative exponential'' auditing method NEGEXP, which appears to have near-optimal efficiency, and which is quite simple and elegant. Depending on the details of the audit being performed, either NEGEXP or the PPEBWR of the next section may be the better practical choice.

The negative-exponential'' auditing method (NEGEXP) is a heuristic basic auditing method. Intuitively, the probability that a precinct is audited is one minus a negative exponential function of the error bound for a precinct. See Figure 1.

The value'' to the adversary of corrupting precinct  is assumed to be , the known upper bound on the amount of error (in the margin of victory) that can be introduced in precinct . In a typical situation might be proportional to ; this is the Linear Error Bound Assumption.

Intuitively, the auditor wants to make the adversary's risk of detection grow with the value'' a precinct has to the adversary; this motivates the adversary to leave untouched those precincts with large error bounds. The adversary thus ends up having to corrupt a larger number of smaller precincts, which increases his or her chance of being caught in a random sample.

The motivation for the NEGEXP method is the following strategy for the auditor: determine auditing probabilities so that the chance of auditing at least one precinct from the set of corrupted precincts depends only on the total error bound of that set of precincts. For example, the adversary will then be indifferent between corrupting a single precinct with error bound or corrupting two precincts with respective error bounds and . The chance of being caught on or being caught on at least one of and should be the same.

This implies that the auditor does not audit each with probability , where

 (5)

and where is some fixed constant. Thus, if , we have

from which we can conclude that as desired. Since is constant, is constant.

Our NEGEXP auditing method thus yields, from (5),

 (6)

see Figure 1. The name negative exponential'' refers to the negative exponential appearing in this formula.

With the NEGEXP method, as the error bound increases, the probability of auditing increases, starting off at for and increasing as increases, and levelling off approaching asymptotically for large . The chance of auditing passes as exceeds .

The value can be thought of as approximating a threshold'' value: precincts with larger than have a fairly high probability of being audited, while those smaller than have a smaller chance of being audited. As decreases, the auditing gets more stringent: more precincts are likely to be audited.

An auditor may choose to use the NEGEXP auditing method of equation (6), and choose to achieve an audit with a given significance level.

The design of NEGEXP makes this easy, since NEGEXP has the property that for any set of precincts that the adversary may choose to corrupt satisfying the chance of detection is at least

 (7)

The reason is that the probability of detecting at least one corrupted precinct is one minus the probability of not detecting any of the corrupt precincts in . The latter is the product of the probability of not detecting any precinct in , that is , yielding the desired chance of detection .

This holds no matter what set of precincts, , the adversary chooses.

How can an auditor audit enough to achieve a given significance level? The relationship of equation (7) gives a very nice way for the auditor to choose : by choosing

 (8)

the auditor achieves a test with significance at least : there is probability at least of catching an error of at least , no matter what set of precincts the adversary uses. For example, by choosing , the auditor tests at significance level for margin-shift error of size or greater. If we use equation (8) to determine , then we have
 (9)

With the Linear Error Bound Assumption, this becomes
 (10)

However, an auditor may want to adjust the probabilities to achieve a desired expected number of precincts audited or a desired expected number of votes counted. He or she can use any of several standard packages for root-finding to find a value of that meets the given constraints. In any case, it is easy to print out a table of the precinct probabilities , so that one can utilize a suitable dice-based protocol for actually picking the precincts. We also note that if ,

when is small relative to , so that the NEGEXP method can be viewed as an approximation to a method whereby precincts are selected with probability proportional to their size (PPS).

This completes our description of the NEGEXP auditing method. Section 11 presents experimental results for this method. In the next section, we describe a different method (PPEBWR), which turns out to be nearly identical (but slightly better) in efficiency to the NEGEXP method, and which in some circumstances may be easier to work with, although it is somewhat less flexible.

# 9 Sampling with Probability Proportional to Error Bound with Replacement (PPEBWR)

This section presents the PPEBWR'' (sampling with probability proportional to error bound, with replacement) auditing strategy. It is simple to implement, and does at least as well as the NEGEXP method. Indeed, the PPEBWR is an excellent method in many respects, and we recommend its use, although the NEGEXP may be more useful when additional flexibility is required (e.g. having multiple races with overlapping jurisdictions).

Consider auditing an election with non-uniform error bounds where . Let be the (minimum) level of error one wishes to detect; is the margin of victory. Consider the following sampling-with-replacement procedure. Form a sampling distribution over the precincts:

 (11)

and draw samples with replacement according to . Eliminate duplicates, and audit the set of precincts obtained. It is easy to use dice to select the precincts to be audited in a public and transparent manner. The probabilities of equation (11) can be computed, and then their cumulative values are computed: and printed out. For each of rounds, four decimal dice are rolled, and the four digits , , , and are combined to yield a four-digit decimal number . Then is marked for auditing if The printed tables and a videotape of the dice-rolling are made publicly available. This approach only requires rolling random numbers, whereas the basic methods of Sections 7-8 require rolling random numbers. When the Linear Error Bound Assumption holds, the PPEBWR method performs sampling with probability proportional to size within each round. We call the overall method sampling with probability proportional to size, with replacement, or PPSWR''. The use of sampling with probability proportional to size (PPS) is well-known in a number of fields, including statistics and survey-sampling (see Hansen and Hurwitz [9] and Cochran [4, Ch. 9A]) and financial auditing, where dollar-unit sampling (DUS) samples accounts with probability proportional to their book value (see [14]). Some results from this literature may also be useful or relevant to auditing elections. Indeed, Stark has suggested that some of our results may be alternatively derivable from results (such as the Stringer bound [21]) in this literature. We introduce notation to distinguish the per-round selection probabilities (denoted by ) from the overall selection probabilities (denoted by ). The probability of selecting precinct at least once in rounds is one minus the probability of not selecting it in any round. The probability of not selecting precinct in one round is and over rounds is . Hence, the probability of selecting precinct at least once in rounds is
 (12)

Precinct is audited if and only if it is not missed during each of the selection rounds, and denotes this overall probability that precinct is audited.

While the per-round probabilities are proportional to size, the overall probabilities are generally not: note that as gets large the overall probability of selection of each precinct approaches . Actually, the overall probabilities turn out to be nearly identical (but slightly less) than those computed by the NEGEXP method.

We now show how to determine the number of rounds for a desired audit significance level . Any set of precincts whose total error bound is at least will have probability weight at least . Similar to the derivation in (12) where we replace by , the probability that at least one such precinct is detected is at least

We want this to be at least for the desired confidence level of ; solving

for , we obtain that

 (13)

is the minimum sufficient sample size. Thus, drawing at least samples, with replacement, will guarantee catching fraud of size sufficient to have changed the election outcome, with probability at least .

We can show that the probability with which any given precinct is audited is slightly smaller than the negative-exponential audit probability leading to a slightly more efficient sample size. Our experimental results have shown that the difference in audit sizes of the two methods is nevertheless small.

The costs of the PPEBWR strategy are easy to compute.

The expected number of precincts audited is and the expected number of votes audited is

Note that in both NEGEXP and PPEBWR the confidence level achieved is at least no matters what strategy the adversary follows (within the assumptions made). This includes the best possible strategy in which the adversary is aware of our auditing scheme and minimizes his detection probability; he/she still cannot lower this probability beyond .

# 10 Vote-dependent Auditing

This section drops the assumption that error bounds are proportional to precinct size, i.e., that How else can the auditor obtain a bound on the error? Instead of having a size-dependent audit, he or she may have a vote-dependent audit, using the fact that if

here we are measuring the margin of victory between candidate 1 and candidate 2.

If we are unsure who the runner-up'' is, we can take the maximum bound over any such runner-up'': Note that the candidates'' used for the invalid'' or undervote'' tallies should be excluded--they cannot be winners or runners-up. These bounds will usually be larger than those obtained via a within-shift bound , thus giving worse results. However, in a two-candidate race if a precinct votes almost entirely for the electronic runner-up, the new bound may be smaller.

Stark [19, Section 3.1] suggests pooling'' several obviously losing candidates to create an obviously losing pseudo-candidate'' to reduce the error bounds; this can also be applied here.

# 11 Experimental Results

We illustrate and compare the previously described methods for handling variable-sized precincts using data from Ohio. These results show that taking precinct size into account (e.g. by using NEGEXP or PPSWR) can result in dramatic reductions in auditing cost, compared to methods (such as SAFE) that do not.

## 11.1 Ohio 2004 CD-5

Mark Lindeman kindly supplied a dataset of precinct vote counts (sizes) for the Ohio congressional district 5 race (OH-05) in 2004. A total of votes were cast in 640 precincts, whose sizes ranged from 1637 (largest) to 132 (smallest), a difference by a factor of more than 12. See Figure 2.

Let us assume a margin of victory of : . Assume the adversary changes at most of a precinct's votes, and assume a confidence level of ().

If the precincts were equal-sized, the Rule of Thumb [16] would suggest auditing precincts. The more accurate APR formula (3) suggests auditing 93 precincts (here precincts). The expected workload would be 45852 votes counted. But the precincts are quite far from being equal-sized. If we sample 93 precincts uniformly (using the APR recommendation inappropriately here, since the precincts are variable-sized), we now only achieve a 67% confidence of detecting at least one corrupted precinct, when the adversary has changed enough votes to change the election outcome. The reason is that all of the corruption can fit in the 7 largest precincts now.

The SAFE auditing method [11] would determine that (reduced from for the uniform case, since now the adversary need only corrupt the 7 largest precincts to change the election outcome). Using a uniform sampling procedure to have at least a 92% chance of picking one of those 7 precincts (or any corrupted precinct) requires a sample size of 193 precincts (chosen uniformly), and an expected workload of 95,155 votes to recount.

With the NEGEXP method, larger precincts are sampled with greater probability. The adversary is thus prodded to disperse his corruption more broadly, and thus needs to use more precincts, which makes detecting the corruption easier for the auditor. The NEGEXP method computes , and audits a precinct of size with probability . The largest precinct is audited with probability 0.408, while the smallest is audited with probability 0.041. The expected number of precincts selected for auditing is only 92.6, and the expected workload is only 50,937 votes counted.

The PPEBWR method gave results almost identical to those for the NEGEXP method. The expected number of distinct precincts sampled was 91.6, and the expected workload was 50402 votes counted. Each precinct was sampled with a probability within 0.0031 of the corresponding probability for the NEGEXP method.

We see that for this example the NEGEXP method (or the PPEBWR method) is approximately twice as efficient (in terms of votes counted) as the SAFE method, for the same confidence level.

The program and datasets for our experiments are available at https://people.csail.mit.edu/rivest/pps/varsize.py, https://people.csail.mit.edu/rivest/pps/oh5votesonly.txt (Ohio).

The SAFE method may often be a poor choice when there are variable-precinct-sizes, particularly when there are a few very large precincts. One really needs a method that is tuned to variable-sized precincts by using variable auditing probabilities, rather than a method that uses uniform sampling probabilities.

# 12 Optimal Auditing Method

The optimal auditing method can be represented as a probability distribution assigning a probability to each subset , where indicates the probability that the auditor will choose the subset of precincts, , for auditing. Since there are such subsets, representing these probabilities explicitly takes space exponential in .

The optimal strategy can be found with linear programming, if the number of precincts is not too large (say a dozen at most). The linear programming formulation requires that for each subset of total error bound or more votes, the sum of the probabilities of the subsets having nonnegative intersection with needs to be at least .

In addition to these constraints, the probabilities must form a distribution; i.e., they each must be nonnegative, and their sum must be 1.

Finally, the objective function to be minimized is the expected number of votes to be recounted:

For example, suppose we have precincts with sizes and error bounds , an adversarial corruption target of votes, and a target significance level of . Then an optimal auditing strategy, when the auditor is charged on a per-vote-recounted basis, is:

Here denotes the empty subset; subsets not shown have zero auditing probability. The expected cost of this optimal auditing strategy is 76 votes recounted. (The above strategy also optimizes (at 1.9) the expected number of precincts recounted; however, it is not always the case that the same probability distribution optimizes for both precincts counted and votes counted: a small counterexample occurs for and .)

This approach is the gold standard'' for auditing with variable-sized precincts, in the sense that it definitely provides the most efficient procedure in terms of the stated optimization criterion. (We note that it is easy to refine this approach to handle the following variations: (1) an optimization criterion that is some linear combination of precincts counted and ballots counted and (2) a requirement that exactly (or at least, or at most) a certain number of precincts be audited.)

However, as noted, it may yield an auditing strategy with as many as potential actions (subsets to be audited) for the auditor, and so is not efficient enough for real use, except for very small elections.

# 13 Discussion and Recommendations

## 13.1 Recommendations for practice-PPEBWR

We recommend the use of the PPEBWR method for use in an audit in a simple election. It gives the most efficient audit, for a given confidence level, of the audit methods studied here (other than the optimal method, which is too inefficient for practical use). Figure 3 summarizes the PPEBWR audit procedure recommended for use.

In an election containing multiple races (possibly with overlapping jurisdictions), the NEGEXP method is the more flexible. See Section 13.2 for discussion.

If the error bounds are computed using only the Linear Error Bound Assumption, so that , then the probability of picking precinct is just , so that we are picking with probability proportional to size''--this is then the PPSWR procedure. When the Linear Error Bound Assumption is used, one is assuming that errors larger than in a precinct will be noticed and caught by other means''; one should ensure that this indeed happens. (Letting runners-up pick precincts to audit could be such a mechanism.)

Other considerations may result in interesting and reasonable modifications. Letting runners-up pick precincts to audit is probably helpful, although these precincts should then be ignored during the PPEBWR portion of the audit.

The escalation'' procedure for enlarging the audit when significant discrepancies are found is (intentionally) left rather unspecified here. We recommend reading Stark [19] for guidance. At one extreme, one can perform a full recount of all votes cast. More reasonably, one can utilize a staged procedure, where the error budget is allocated among the stages; only if enough new discrepancies are discovered in one stage does auditing proceed to the next.

## 13.2 Recommendations for practice-NEGEXP

Figure 4 summarizes the NEGEXP audit procedure recommended for use. The NEGEXP method seems intrinsically more flexible than the PPEBWR method.

NEGEXP can handle multiple races with overlapping jurisdictions such that each precinct is audited at most once even when it is marked for auditing in multiple races. As with any basic auditing method, each precinct is audited independently with a precinct-specific probability. Assume that when a precinct is audited, we audit all races voted on in that precinct. Since the results for each race may imply a different auditing probability for the precinct, it suffices to audit the precinct with the maximum of the probabilities corresponding to the different races.

In a similar manner, the NEGEXP method can be used when the auditing probabilities need to be changed (e.g. because of the effect of late-reporting jurisdictions). Assume that the auditing probability changes from to . If the precinct was audited in the first audit, nothing additional needs to be done. If the precinct was not audited and , nothing needs to be done because we already audited the precinct with a larger probability than we need to. Otherwise (when ), a dice-roll with probability should be used to determine if the precinct should now be audited. The additional dice-roll ensures that the overall probability of auditing the precinct in discussion is , the final auditing probability.

## 13.3 Discussion

If the election is not a plurality (winner-take-all), little changes except that the notion of a margin of victory'' needs to be appropriately modified, so that the notion of a candidate'' is replaced by that of an election outcome''. (Elaboration omitted here.) Our auditing problem is closely related to the classic notion of an inspection game'', with an inspector'' (the auditor) and an inspectee'' (the adversary). Inspection games fit within the standard framework of game theory. With optimal play, both auditor and adversary use randomized strategies. See Avenhaus et al. [2] for discussion.

It would be preferable in general, rather than having to deal with precincts of widely differing sizes, if one could somehow group the records for the larger precincts into bins'' for pseudo-precincts'' of some smaller standard size. (One can do this for say paper absentee ballots, by dividing the paper ballots into nominal standard precinct-sized batches before scanning them.) It is harder to do this if you have DRE's with wide disparities between the number of voters voting on each such machine. See Neff [12] and Wand [22] for further discussion.

# 14 Conclusions

We have presented two useful post-election auditing procedures: a powerful and flexible negative-exponential'' (NEGEXP) method, and a slightly more efficient sampling with probability proportional to size, with replacement'' (PPEBWR) method.

# Acknowledgments

Thanks to Mark Lindeman for helpful discussions and the Ohio dataset. Thanks also to Kathy Dopp, Andy Drucker, Silvio Micali, Howard Stanislevic, Christos Papadimitriou, and Jerry Lobdill for constructive suggestions. Thanks to Phil Stark for his detailed feedback and pointers to the financial auditing literature.

## Bibliography

1
ASLAM, J., POPA, R., AND RIVEST, R. L.
On estimating the size and confidence of a statistical audit.
In Proceedings EVT'07 (2007).
https://www.usenix.org/events/evt07/tech/full_papers/aslam/aslam.pdfhttps://www.usenix.org/events/evt07/tech/full_papers/aslam/ aslam.pdf.

2
AVENHAUS, R., STENGEL, B. V., AND ZAMIR, S.
Inspection games.
In Handbook of Game Theory, R. J. Aumann and S. Hart, Eds., vol. III. Elsevier, January 30 1998.
https://citeseer.ist.psu.edu/212144.html.

3
CALANDRINO, J. A., HALDERMAN, J. A., AND FELTEN, E. W.
Machine-assisted election auditing.
In Proc. EVT'07 (2007).
https://www.usenix.org/event/evt07/tech/full_papers/calandrino/calandrino.pdfhttps://www.usenix.org/event/evt07/tech/full_papers/calandrino/ calandrino.pdf.

4
COCHRAN, W. G.
Sampling Techniques (3rd ed.).
Wiley, 1977.

5
CORDERO, A., WAGNER, D., AND DILL, D.
The role of dice in election audits -- extended abstract, June 16 2006.
IAVoSS Workshop on Trustworthy Elections (WOTE 2006). https://www.cs.berkeley.edu/ daw/papers/dice-wote06.pdfhttps://www.cs.berkeley.edu/~daw/papers/dice-wote06.pdf.

6
DOPP, K.
History of confidence election auditing development (1975 to 2007) and overview of election auditing fundamentals, 2007.
https://electionarchive.org/ucvAnalysis/US/paper-audits/History-of-Election-Auditing-Development.pdfhttps://electionarchive.org/ucvAnalysis/US/paper-audits/History-of-Election-Auditing-Development.pdf.

7
DOPP, K., AND STENGER, F.
The election integrity audit, 2006.
https://electionarchive.org/ucvAnalysis/US/paper-audits/ElectionIntegrityAudit.pdfhttps://electionarchive.org/ucvAnalysis/US/paper-audits/ElectionIntegrityAudit.pdf.

8
ELECTIONLINE.ORG.
Case study: Auditing the vote, March 2007.
https://electionline.org/Portals/1/Publications/EB17.pdfhttps://electionline.org/Portals/1/Publications/EB17.pdf.

9
HANSEN, M. H., AND HURWITZ, W. N.
On the theory of sampling from finite populations.
Ann. Math. Stat. 14 (1943), 333-362.

10
LOBDILL, J.
Considering vote count distributions in designing election audits, Oct 9 (rev. 11/26/2006) 2006.
https://vote.nist.gov/Considering-Vote-Count-Distribution-in-Designing-Election-Audits-Rev-2-11-26-06.pdfhttps://vote.nist.gov/Considering-Vote-Count-Distribution-in-Designing-Election-Audits-Rev-2-11-26-06.pdf.

11
MCCARTHY, J., STANISLEVIC, H., LINDEMAN, M., ASH, A., ADDONA, V., AND BATCHER, M.
Percentage-based versus SAFE vote tabulation auditing: a graphic comparison.
The American Statistician 62, 1 (February 2008), 11-16.

12
NEFF, C. A.
Election confidence--a comparison of methodologies and their relative effectiveness at achieving it (revision 6), December 17, 2003.
https://www.votehere.com/papers/ElectionConfidence.pdfhttps://www.votehere.com/papers/ElectionConfidence.pdf.

13
NORDEN, L., BURSTEIN, A., HALL, J. L., AND CHEN, M.
Post-election audits: Restoring trust in elections, 2007.
Brennan Center for Justice at New York University [School of Law and Samuelson Law, Technology and Public Policy Clinic at UC Berkeley School of Law], https://www.brennancenter.org/dynamic/subpages/download_file_50089.pdfhttps://www.brennancenter.org/dynamic/subpages/

14
ON NONSTANDARD MIXTURES OF DISTRIBUTIONS, P.
Statistical models and analysis in auditing: A study of statistical models and methods for analyzing nonstandard mixtures of distributions in auditing.
National Academy Press, Washington, D.C., 1988.

15
RIVEST, R. L.
On auditing elections when precincts have different sizes, 2007.
https://people.csail.mit.edu/rivest/OnAuditingElectionsWhenPrecinctsHaveDifferentSizes.pdfhttps://people.csail.mit.edu/rivest/OnAuditing ElectionsWhenPrecinctsHaveDifferentSizes.pdf.

16
RIVEST, R. L.
A simple rule of thumb for election audit size determination, 2007.
https://people.csail.mit.edu/rivest/Rivest-ASimpleRuleOfThumbForElectionAuditSizeDetermination.pdfhttps://people.csail.mit.edu/rivest/Rivest-ASimpleRuleOfThumbForElectionAuditSizeDetermination.pdf.

17
SALTMAN, R. G.
Effective use of computing technology in vote-tallying.
Tech. Rep. NBSIR 75-687, National Bureau of Standards (Information Technology Division), March 1975.
https://csrc.nist.gov/publications/nistpubs/NBS_SP_500-30.pdfhttps://csrc.nist.gov/publications/nistpubs/NBS_SP_500-30.pdf.

18
STANISLEVIC, H.
Random auditing of e-voting systems: How much is enough?, Revision August 16, 2006.
https://www.votetrustusa.org/pdfs/VTTF/EVEPAuditing.pdfhttps://www.votetrustusa.org/pdfs/VTTF/EVEPAuditing.pdf.

19
STARK, P. B.
Conservative statistical post-election audits, Nov 15 2007.
https://statistics.berkeley.edu/ stark/Preprints/conservativeElectionAudits07.pdfhttps://statistics.berkeley.edu/~stark/ Preprints/conservativeElectionAudits07.pdf.

20
STARK, P. B.
Election audits by sampling with probability proportional to an error bbound: Dealing with discrepancies, Feb 20 2008.
https://www.stat.berkeley.edu/ stark/Preprints/ppebwrwd08.pdfhttps://www.stat.berkeley.edu/~stark/Preprints/ppebwrwd08.pdf.

21
STRINGER, K.
Practical aspects of statistical sampling in auditing.
In Proceedings of the Business and Economic Statistics Section (Washington, D.C., 1963), American Statistical Association, pp. 405-411.

22
WAND, J.
Auditing an election using sampling: The impact of bin size on the probability of detecting manipulation, Feb 2004.
https://wand.stanford.edu/elections/probability.pdfhttps://wand.stanford.edu/elections/probability.pdf.

On Auditing Elections When Precincts Have Different Sizes

This document was generated using the LaTeX2HTML translator Version 2002-2-1 (1.71)

Copyright © 1993, 1994, 1995, 1996, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
Copyright © 1997, 1998, 1999, Ross Moore, Mathematics Department, Macquarie University, Sydney.

The command line arguments were:
latex2html -split 0 -show_section_numbers -local_icons -no_navigation varsize.tex

The translation was initiated by Raluca Ada Popa on 2008-06-30