

Security

One potential security weakness of this scheme is that an adversary who captures the device can conceivably reconstruct $\underline{s}$ not only from one element of the table per row, but from any $m$ elements of the table. For example, if the adversary had reason to believe that the first $m/2$ rows contained no distinguishing features--and thus that all table elements in these rows were valid--then the adversary could set $t'[i] = t[i]$ for $0 \le i < m$ and $\underline{U}' = (\underline{u}_i)_{0 \le i < m}$ and use these to solve for $\underline{s}$. Therefore, it is important to use only features that are more likely than not to be distinguishing.
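To make this observation concrete, here is a toy sketch of why any $m$ valid table elements suffice. It works over the small prime field GF(97) rather than the field the scheme actually uses, and the Vandermonde share matrix and all names are illustrative assumptions, not the paper's construction: each valid element contributes one linear equation $\underline{u}_i \cdot \underline{s} = t_i$, and plain Gaussian elimination then recovers $\underline{s}$.

```python
import random

P = 97   # small prime stand-in for the scheme's field (illustrative)
M = 8    # stand-in for the m unknowns in s

def solve_mod_p(U, t, p=P):
    # Solve U s = t over GF(p) by Gauss-Jordan elimination.
    n = len(U)
    A = [row[:] + [ti] for row, ti in zip(U, t)]      # augmented matrix
    for col in range(n):
        piv = next(r for r in range(col, n) if A[r][col] % p)
        A[col], A[piv] = A[piv], A[col]
        inv = pow(A[col][col], -1, p)                  # modular inverse
        A[col] = [x * inv % p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] % p:
                f = A[r][col]
                A[r] = [(x - f * y) % p for x, y in zip(A[r], A[col])]
    return [A[r][n] for r in range(n)]

random.seed(1)
s = [random.randrange(P) for _ in range(M)]            # the secret
# Vandermonde rows u_i = (1, x_i, x_i^2, ...), invertible for distinct x_i
U = [[pow(i + 1, j, P) for j in range(M)] for i in range(M)]
t = [sum(u * x for u, x in zip(row, s)) % P for row in U]  # "valid" entries
assert solve_mod_p(U, t) == s    # any M valid equations reveal s
```

The point of the sketch is only that validity of the equations, not their position in the table, is what the adversary needs.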

We now describe the fastest attack on our scheme of which we are aware. An attacker who captures the device on which the key is generated, but who has no information about the user's distinguishing features, may attack the system by repeatedly guessing a feature descriptor $b$ at random and solving (4). If there are $d$ distinguishing features, then each guess succeeds with probability $2^{-d}$, but requires the attacker to solve a system of $m$ linear equations, which is time consuming. A faster approach is to choose feature descriptors $b_0, b_1, \ldots$ such that each differs from the previous one in a single bit. Then computing $\underline{U}_{b_i}^{-1}$ requires only one new Gaussian elimination step per $b_i$, and the further optimizations outlined in Section 4.2 can also be applied in this case.
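One standard way to realize such a single-bit-difference ordering is a binary-reflected Gray code (a sketch; note that the analysis that follows models the bit flips as uniformly random rather than in Gray-code order):

```python
def gray(i):
    # i-th binary-reflected Gray code word
    return i ^ (i >> 1)

m = 8                                   # illustrative descriptor length
codes = [gray(i) for i in range(2 ** m)]
# consecutive descriptors differ in exactly one bit, so each b_{i+1}
# costs only one incremental Gaussian elimination step
assert all(bin(a ^ b).count("1") == 1 for a, b in zip(codes, codes[1:]))
assert len(set(codes)) == 2 ** m        # the walk visits every descriptor
```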

The expected time for this attack to succeed can be computed as follows. Assume that $b_i$ and $b_{i+1}$ differ in exactly one position that is chosen at random. Let $G(c)$ for $c \ge 2$ denote the expected number of Gaussian elimination steps performed until reaching a $b_i$ with no errors (i.e., that is consistent with all of the user's distinguishing features), assuming that $b_0$ has $c$ errors. Note that $b_{i+1}$ has a different number of errors than $b_i$ with probability $d/m$, and if it has a different number of errors, then it decreases the number of errors (by one) with probability $c/d$ and increases it with probability $(d-c)/d$. Hence, we get the following equations for $G(c)$.

\begin{eqnarray*}
G(2) & = & 0 \\
G(c) & = & \frac{m}{d} + \frac{c}{d}G(c-1) + \frac{d-c}{d}G(c+1) \\
& & \mbox{ for } 2 < c \le d
\end{eqnarray*}
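As a sanity check on these equations, the following sketch compares their solution (obtained by unrolling the recurrence into per-level passage times) against a direct simulation of the attacker's bit-flipping walk, for small illustrative parameters:

```python
import random

def G_analytic(m, d, c):
    # Unroll the recurrence G(2) = 0,
    # G(c) = m/d + (c/d) G(c-1) + ((d-c)/d) G(c+1),
    # into passage times T[j]: the expected number of error-count
    # changes needed to go from j errors down to j-1 errors.
    T = [0.0] * (d + 1)
    T[d] = 1.0                          # at d errors every change goes down
    for j in range(d - 1, 2, -1):
        T[j] = d / j + ((d - j) / j) * T[j + 1]
    # each error-count change takes m/d guesses on average
    return (m / d) * sum(T[3:c + 1])

def G_simulated(m, d, c0, trials=20000, seed=0):
    # Average number of guesses until the error count first reaches 2,
    # flipping one uniformly random bit position per guess.
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        c, steps = c0, 0
        while c > 2:
            steps += 1
            if rng.randrange(m) < d:        # hit a distinguishing position
                if rng.randrange(d) < c:    # ... one currently in error
                    c -= 1
                else:
                    c += 1
        total += steps
    return total / trials
```

For example, with $m=12$, $d=8$, and $c=4$ initial errors, the analytic and simulated values agree to within simulation noise (a few percent).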

Solving these linear equations at $c = d/2$ yields the expected number of Gaussian elimination steps before recovering the key, since a random feature descriptor contains $c = d/2$ errors in expectation. Moreover, after the Gaussian elimination step for each $b_i$, an average of $m(m-1)$ multiplications is required to test $b_i$. So the total cost of this attack is as shown in Figure 3. (In fact this is a conservative lower bound, since the cost of each Gaussian elimination step itself is not counted.)

Figure 3: A lower bound on the expected number of multiplications to exhaustively search for the key, assuming that the $d$ distinguishing features are uniformly distributed; see Section 4.4.
\begin{figure}
\begin{center}
\leavevmode \epsfxsize =3.2in \epsfbox{brute.eps}
\end{center}
\end{figure}

In the empirical evaluation that we perform in Section 5, we evaluate our approach at $m=60$, the smallest value of $m$ shown in Figure 3. Here, if $70\%$ of the features are distinguishing, then this attack conservatively requires an expected $2^{44}$ multiplications. If $80\%$ of the features are distinguishing, then it requires at least $2^{50}$ multiplications on average. The security of the scheme clearly improves as $m$ and $d/m$ increase, and increasing them is a goal of our ongoing work.
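The order of magnitude of these figures can be reproduced by solving the recurrence for $G$ numerically at $c = d/2$ and charging $m(m-1)$ multiplications per Gaussian elimination step. The exact accounting behind Figure 3 is not spelled out here, so this is only an approximate reconstruction under that assumption:

```python
from math import log2

def attack_cost(m, d):
    # Lower bound on expected multiplications: solve the recurrence
    # (G(2) = 0) via passage times T[j], evaluate at c = d/2, and charge
    # m(m-1) multiplications per Gaussian elimination step.
    T = [0.0] * (d + 1)
    T[d] = 1.0
    for j in range(d - 1, 2, -1):
        T[j] = d / j + ((d - j) / j) * T[j + 1]
    G = (m / d) * sum(T[3:d // 2 + 1])   # expected elimination steps
    return G * m * (m - 1)

# m = 60 with 70% / 80% of features distinguishing, as in Section 5
print(round(log2(attack_cost(60, 42)), 1))   # on the order of 2^44
print(round(log2(attack_cost(60, 48)), 1))   # on the order of 2^50
```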


fabian 2002-08-28