Attack 3 - Steal bulk IDs from the database

This attack requires a bit more intelligence. The counterfeiters will try to obtain legitimate IDs in bulk, before the real goods are publicly available, and attach those IDs to their fakes.


Assuming they're trying to sell to the FD market, even if they succeed, the problem they immediately face is where and when they can sell the fakes, given that they must pass ALL the tests.


They can only sell them in a location and at a time which must both be consistent with the chain of custody. And they must get rid of all their stock before the real thing hits the street and begins to be registered in its own right. In most cases this gives them, at best, a window of opportunity of a few hours, perhaps a day or two, to sell the fakes in the vicinity of the legitimate traders, who aren't going to be too happy to see the fakes materialising in their neighbourhood the day before they get hold of the authentic goods. At the very least we can see that this will be a pretty hostile environment to have to work in.


Nevertheless, we need to protect ourselves against such an attack and reduce, still further, the chances of its success.


There are only two places from which they can obtain the data. The first is the central database. This, obviously, will be accessible via the web and the web is notoriously insecure, so they will try to hack into the database. We will employ best practice and state-of-the-art defences against that as a matter of course. In addition, however, we realised that we have a means of removing, from the database, that which is of greatest value to the counterfeiter, viz. the actual unique identifiers.


What we will do is this:
We provide software (or the algorithms and protocols for those who want to amend their existing software) which creates the unique IDs. The first version has already been written. We call the unique IDs "Validation References" (VRs). The software also creates a 20 byte (160 bit) hash digest of the VR using the public domain SHA-1 one-way hashing algorithm (or a 32 byte, 256 bit digest using SHA-256). It is this hash value, not the VR itself, which we store on the Authentic One (A1) database.
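To make the idea concrete, here is a minimal sketch of the hashing step using Python's standard hashlib module. The function name vr_digest, the decision to strip the display spaces before hashing, and the choice to hash the whole printed string (check characters included) are our assumptions for illustration; the sample VR is the one shown later in this section.

```python
import hashlib

def vr_digest(vr: str, algorithm: str = "sha1") -> str:
    """Return the hex digest of a VR, ignoring the spaces used for display grouping."""
    normalised = vr.replace(" ", "").upper()  # assumption: spaces are not hashed
    return hashlib.new(algorithm, normalised.encode("ascii")).hexdigest()

example_vr = "3FE7Y QQ8TG BHOL4 UR88V PL92Z"  # sample string from this section
sha1_hex = vr_digest(example_vr, "sha1")
sha256_hex = vr_digest(example_vr, "sha256")

# Each byte of the digest is printed as two hex characters.
print(len(sha1_hex) // 2, "bytes:", sha1_hex)
print(len(sha256_hex) // 2, "bytes:", sha256_hex)
```

Only the digest would be written to the database; the VR itself stays on the product.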


Hash digests are irreversible largely because the multiple iterations of the function mix and compress the data, discarding information on every pass. Many different inputs map to each possible digest, and we don't store a record of what was discarded, so after sufficient iterations there is no known practical way of reassembling the original from the resulting hash, even if you try to run the algorithm backwards.


The only way to "decrypt" such a hash is to use so-called "brute force". If, to use a ludicrous example, the VR is just one character long, and we're allowed to use any 8-bit character value, then, to find a match, we just need to try each of the 256 options in turn until the resulting hash matches the target. This would take a decent programmer with a modern PC all of about 1/5th of a second.
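The ludicrous one-character case can be sketched in a few lines. This is an illustration only: the function name brute_force_single_byte and the secret value are hypothetical, and SHA-1 is used simply because it is the algorithm named above.

```python
import hashlib

def brute_force_single_byte(target_hex: str):
    """Try all 256 one-byte inputs and return the one whose SHA-1 matches."""
    for value in range(256):
        candidate = bytes([value])
        if hashlib.sha1(candidate).hexdigest() == target_hex:
            return candidate
    return None  # no one-byte input produces this digest

secret = b"K"  # hypothetical one-character VR
target = hashlib.sha1(secret).hexdigest()  # this is all the attacker sees
recovered = brute_force_single_byte(target)
print(recovered)
```

The attacker never reverses the hash; they simply hash every candidate and compare, which is why the size of the candidate space is what matters.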


In practice, however, we couldn't make use of all 256 character values because many of them aren't printable and most of them can't be easily entered by a consumer from a standard keyboard. These limitations - which will remain until electronic readers (another Bluetooth project?) become standard and ubiquitous - really mean that we're limited to the letters of the alphabet and digits 1 to 9. We don't want to use 0 because it's too similar to O.


And human beings don't like having to type things like "nK3fD" and make large numbers of errors when they are forced to do so. So we need to stick to capitals only. In the end, we're actually limited to just 35 characters: A-Z and 1-9.


Our PC can brute force a single character drawn from those 35 in about 1/35000th of a second. So we need to make the VR a little longer. At a length of 5 characters, the brute force takes about a minute to try out all possibilities. On average, it will find a match about half way through the search. So the average search for a match on a 5 character VR will be 30 seconds. And that's just on a PC. If we take it up to one of IBM's latest supercomputers, that 30 seconds drops back down to around 1/10,000th of a second.


However, increase the length of the VR up to 10 characters and each attempt to brute force a single hash match takes even Big Blue over one and a half hours. Given that this would only yield a single match for the counterfeiter, and that time on a computer like that is pretty expensive, and that not many counterfeiters have access to such computing power (yet), we might conclude that 10 characters is long enough.
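The arithmetic behind these figures can be reproduced with a short sketch. The two guess rates are our assumptions, back-derived from the figures quoted above (a full 5 character search in about a minute on the PC, and a supercomputer that turns 30 seconds into 1/10,000th of a second); they are not measurements.

```python
ALPHABET_SIZE = 35  # A-Z plus 1-9, as in the text

# Assumed rates, back-derived from the text's own figures.
PC_RATE = ALPHABET_SIZE ** 5 / 60   # full 5-character search in about 1 minute
SUPER_RATE = PC_RATE * 300_000      # 30 s on the PC becomes 1/10,000 s

def average_search_seconds(length: int, rate: float) -> float:
    """Average time to hit a match: half the keyspace divided by the guess rate."""
    return (ALPHABET_SIZE ** length / 2) / rate

for length in (1, 5, 10):
    pc = average_search_seconds(length, PC_RATE)
    sc = average_search_seconds(length, SUPER_RATE)
    print(f"{length:2d} chars: PC {pc:.3g} s, supercomputer {sc:.3g} s")
```

Under these assumptions the 10 character case comes out at roughly an hour and a half of supercomputer time per match, and every extra character multiplies the work by 35.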


However, Moore's Law suggests that in 18 months' time that same calculation will either be half the cost or twice the speed. In 15 years (ten doublings, a factor of about 1,000), that calculation will probably require just 5 seconds of computer time. In fact, we may, in that timescale, see the introduction of the quantum computer, which could perform calculations like that far more quickly still.
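A quick sketch of that projection, assuming a steady doubling every 18 months and taking one and a half hours as today's baseline for a 10 character VR. Both constants are illustrative assumptions taken from the figures above, not forecasts.

```python
DOUBLING_PERIOD_YEARS = 1.5     # "18 months", per the text
BASELINE_SECONDS = 1.5 * 3600   # about 1.5 hours for a 10-character VR today

def projected_seconds(years_ahead: float) -> float:
    """Time for the same search after `years_ahead` years of steady doubling."""
    doublings = years_ahead / DOUBLING_PERIOD_YEARS
    return BASELINE_SECONDS / (2 ** doublings)

for years in (0, 1.5, 15):
    print(f"after {years:>4} years: {projected_seconds(years):.1f} s")
```

Fifteen years is ten doublings, so the baseline shrinks by a factor of 1,024, which is where the figure of about 5 seconds comes from.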


When the QC comes along, we'll have a few more years before it is readily available and we are forced to abandon the mechanism. Nevertheless, for the time being, the method is secure and, with a VR of sufficient length, will prevent all currently practical attempts at brute forcing the data.


For reasons which will become clear shortly, the actual VR length we recommend is 20 characters. In current computing terms that requires a brute force session lasting several times the age of the universe (to crack just one VR). That certainly qualifies as a reasonable margin of error.

The 20 characters would also be accompanied by 5 check characters so that the resulting VR would actually be 25 characters in length and emerge looking like a Microsoft license key, e.g.:
3FE7Y QQ8TG BHOL4 UR88V PL92Z
where every 5th character would be a check character for the previous 4 (and a cumulative check) which would ease the manual entry task. The experience that Microsoft has with such strings (even without check digits) suggests that it is not an insurmountable hurdle for the average consumer. You can see a demonstration of how this might be implemented here: http://www.authentic1.com/a1/ptrentry.cfm
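The exact check-character algorithm is not specified here, so the following is purely a hypothetical sketch of one way a "check on the previous 4 plus a cumulative check" could work: a running weighted sum over all data characters so far, reduced modulo 35. The function names, the weighting and the alphabet ordering are all our assumptions, and the sample VR above is not expected to validate under this particular scheme.

```python
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789"  # 35 symbols, no zero

def add_check_characters(data: str) -> str:
    """Turn 20 data characters into five groups of 4 data + 1 check character."""
    if len(data) != 20 or any(c not in ALPHABET for c in data):
        raise ValueError("expected 20 characters from the 35-symbol alphabet")
    groups, running = [], 0
    for start in range(0, 20, 4):
        chunk = data[start:start + 4]
        for offset, char in enumerate(chunk):
            # Weight 2^position mod 35 is always coprime to 35, so any single
            # wrong data character changes the running sum.
            running += pow(2, start + offset, 35) * ALPHABET.index(char)
        # The sum is never reset, so each check also covers earlier groups.
        groups.append(chunk + ALPHABET[running % 35])
    return " ".join(groups)

def is_well_formed(vr: str) -> bool:
    """Check a 25-character VR (spaces optional) against its check characters."""
    compact = vr.replace(" ", "").upper()
    if len(compact) != 25 or any(c not in ALPHABET for c in compact):
        return False
    data = "".join(compact[i:i + 4] for i in range(0, 25, 5))
    return add_check_characters(data).replace(" ", "") == compact

printed = add_check_characters("ABCDEFGHIJKLMNOPQRST")  # hypothetical data
print(printed, is_well_formed(printed))
```

A scheme like this lets the entry form point to the group containing a typing error before the VR is ever sent to the database.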


We haven't said so but it should be evident that the VR will be randomly generated. We have not carried out any analysis of the required quality of randomness but if this is shown to be a potential problem we will need to ensure appropriate high quality randomness is available. There are a number of strong pseudo random number generators available but ultimately, for example, if required, we could even provide Perfect Security's ps_rand (http://www.onetimepad.net/en/PS-Rand.html) tool as part of our VR generating application. This produces very high quality randomness based on the Intel PIII on-chip randomness function. We have tested its output in the context of creating one time pads for other purposes and several hundred megabytes have passed all the Diehard tests with flying colours.
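As a minimal sketch of the generation step, assuming the operating system's cryptographic random source is good enough, Python's standard secrets module can draw each character directly from the 35 symbol alphabet. The function name generate_vr is hypothetical, and this stands in for, rather than reproduces, whatever generator the VR application finally uses.

```python
import secrets

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ123456789"  # 35 symbols, no zero

def generate_vr(length: int = 20) -> str:
    """Draw each character independently from the OS's cryptographic RNG."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

vr = generate_vr()
print(vr)
```

secrets.choice avoids the modulo bias that can creep in when raw random bytes are mapped onto an alphabet whose size is not a power of two.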


It is conceivable that an attacker might attempt to obtain valid VRs by writing a program to automate the consumer logon procedure, entering random VRs of their own and noting those which don't "fail".


This rates as a naïve attack in its own right as it clearly fails to appreciate the scale of the task. If there were, for instance, 10 billion unregistered legitimate VRs on the database, then, given that the potential number of VRs is a massive 10^31 (35 to the power 20), each attempt at finding a match would have an approximate chance of 1 in 10^21 of success. Ignoring the delays inherent in web transactions, let's assume that the attacker can attempt 1 million VRs per second. That means around 10^15 seconds of guessing on average, so it would still take tens of millions of years to find a single match.
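The estimate can be checked with a few lines of arithmetic. This sketch uses the exact keyspace of 35 to the power 20 rather than the rounded power of ten, and treats each guess as an independent trial; the variable names are ours.

```python
KEYSPACE = 35 ** 20              # all possible 20-character VRs, about 7.6e30
LIVE_VRS = 10_000_000_000        # 10 billion unregistered VRs (figure from the text)
GUESSES_PER_SECOND = 1_000_000   # attacker's assumed rate (figure from the text)
SECONDS_PER_YEAR = 365.25 * 24 * 3600

p_success = LIVE_VRS / KEYSPACE
expected_guesses = 1 / p_success  # mean of a geometric distribution
expected_years = expected_guesses / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"chance per guess: 1 in {expected_guesses:.2e}")
print(f"expected wait:    {expected_years:.2e} years")
```

Even with generous assumptions in the attacker's favour, the expected wait for a single valid VR is measured in tens of millions of years.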


So now the counterfeiter knows that there is simply no point attacking the central database; at least, not with a view to obtaining a bunch of valid usable VRs. It is worth emphasising that the VR hashes won't even appear in the database until the goods leave the factory gates. From that point, the counterfeiter has a matter of a few days to crack into the database, steal the hashes, brute-force them back into plaintext VRs, mark all his fakes with those VRs and distribute them to locations matching the distribution of the real products, arriving at those locations in time to be sold in the window between the first legitimately acceptable arrival time and the first registrations of the real thing. We modestly suggest that this attack will fail.