3.5.2 Store cryptographic keys securely in the fewest possible locations and forms.Hmm. Before we even get too far, there is a redundancy with 3.5.2 and 3.6.3. Why even have 3.5.2 if 3.6 covers the items in more detail? I digress ...
3.6 Fully document and implement all key-management processes and procedures for cryptographic keys used for encryption of cardholder data, including the following:
3.6.3 Secure cryptographic key storage.
3.6.6 Split knowledge and establishment of dual control of cryptographic keys.What the authors of the DSS were thinking was that PCI compliant merchants would implement cold war-esque missile silo techniques in which two military officers would each place a physical key into a control console and punch in their portion of the launch code sequence. This is technically possible to do with schemes like Adi Shamir's key splitting techniques. However, it rarely makes sense to do so.
Consider an automated e-commerce system. The notion of automation means it works on its own, without human interaction. If that e-commerce system needs to process or store credit card numbers, it will need to encrypt and decrypt them as transactions happen. In order to do those cryptographic functions, the software must have access to the encryption key. It makes no sense for the software to only have part of the key or to rely on a pair of humans to provide it a copy of the key. That defeats the point of automation.
If the pieces of the key have to be put together for each transaction, then a human would have to be involved with each transaction-- definitely not worth the expense! Not to mention an exploit of a vulnerability in the software could result in malicious software keeping a copy of the full key once it's unlocked anyway (because it's the software that does the crypto functions, not 2 people doing crypto in their heads or on pen and paper!).
If a pair of humans are only involved with the initial unlocking of the key, then the software gets a full copy of the key anyway. Any exploit of a vulnerability in the software could potentially read the key, because the key is in its running memory. So, on the one hand, there is no requirement for humans to be involved with each interaction, thus the e-commerce system can operate more cheaply than, say, a phone-order system or a brick-and-mortar retailer. However, each restart of the application software requires a set of 2 humans to be involved with getting the system back and online. Imagine the ideal low-overhead e-commerce retailer planning vacation schedules for its minimal staff around this PCI requirement! PCI essentially dictates that more staff must be hired! Or, that support staff that otherwise would NOT have access to a portion of the key (because they take level 1 calls or work in a different group) now must be trusted with a portion of it. More hands involved means more opportunity for collusion, which increases the risk by increasing the likelihood of an incident, which is NOT what the PCI folks are trying to accomplish!
The difference between a cold war missile silo and an e-commerce software application is the number of "secure" transactions each must have. Missile silos do not launch missiles at the rate of several hundred to several thousand an hour, but good e-commerce applications can take that many credit cards. When there are few (albeit more important) transactions like entering launch codes, it makes sense to require the attention of a couple different people.
So splitting the key such that an e-commerce software application cannot have the full key is stupid.
But then there is the coup-de-grace in the "Testing Procedures" of 3.5.2:
3.5.2 Examine system configuration files to verify that keys are stored in encrypted format and that key-encrypting keys are stored separately from data-encrypting keys.This is the ultimate in pointless PCI requirements. The real world analogue is taking valuables and stashing them in a safe that is unlocked with a key (the "data-encrypting key" or DEK in PCI parlance). Then, stash the key to the first safe into a second safe also unlocked by a key (the "key-encrypting key" or KEK in PCI parlance). Presumably, at this point, this second key will be like something out of a James Bond film where the key is actually in two parts, each possessed by one of two coordinating parties who reside in two geographically distinct locations. In practice, however, the second key is typically just slipped under the floor mat and the two safes are sitting right next to one another. It takes a little longer to get the valuables out of the safe, but does little to actually prevent a thief from doing so.
In an e-commerce system, it's no different. All of the same pointlessness of splitting keys (as described above) still applies, but now there is an additional point of failure and complexity: the KEK. Encryption does add overhead to a software application's performance, though the trade-off is normally warranted. However, if at each transaction the software needs to use the KEK to unlock the DEK and then perform an encrypt or decrypt operation on a Credit Card number, the overhead is now double what it was previously. As such, most software applications that use KEKs don't do that. Instead, they just use the KEK to unlock the DEK and keep the DEK in memory throughout the duration of operation, until presumably the software or server need to be taken offline for reconfiguration of some kind. With an encryption key in memory, there's still the plausible risk that an exploit of a vulnerability in the software application could result in disclosure of the key or the records that are supposedly protected by it. Even if the key was once in memory, recent research reminds us that data remanence of RAM retains sensitive data like keys for longer than we might expect.
And if a software application is allowed to initiate and unlock its keys without a human (or pair of humans in the case of key splitting), such as the case when the e-commerce application calls for a distributed architecture of multiple application servers in a "farm", then there really is no point of having a second key. If the application can read the second key to unlock the first key, then so could an attacker that gets the same rights as the application. The software might as well just read in the first key in an unencrypted form, which would at least be a simpler design.
If the threat model is the server administrator stealing the keys, then what's the point? Administrators have FULL CONTROL. The administrator could just as easily steal the DEK and the KEK. And no, the answer is not encrypting the KEK with a KEKEK (Key-Encrypting-Key-Encrypting Key), nor with a KEKEKEK, etc., because no matter how many keys are involved (or how many safes with keys in them), the last one has to be in plaintext (readable) form! The answer is, make sure the Admins are trustworthy (which means do periodic background checks, psych evaluations if they are legal, pay them well and do your best to monitor their actions).
If the threat model is losing (or having stolen) a backup tape of a server's configuration and data, then, again, a KEK offers little help, since an attacker who has physical access to a storage device has FULL CONTROL and can read the KEK and then decrypt the DEK and the credit card records.
It is also commonly suggested by PCI assessors that KEKs be stored on different servers to deal with the threat of an administrator stealing the DEK. But that is really just the same as the problems above, just rehashed all over again.
If the application on Server A can automatically read in the KEK on Server B, then (again) a vulnerability in the application software could potentially allow malware to steal the KEK from Server B or cache a copy of it once in use. If the admin of Server A also has admin access to Server B, then it's the same problem there, too; the admin can steal the KEK from Server B and unlock the DEK on Server A. If the admin does NOT have access to the KEK, then it's the two-humans-required-for-support-scenarios all over again, like key splitting. If the KEK cannot be read by Server A when it is authorized to do so (such as the mechanism for reading the KEK from Server B is malfunctioning), then Server B's admin must be called (Murphy's Law indicates it will be off-hours, too) to figure out why it's not working. And in a small to medium sized e-commerce group, like in an ideal low-overhead operation, it will almost always be the same person or couple of people that have admin access to both. In the large scale, an admin of Server A and an admin of Server B can just choose to collude together and share the ill gains from the theft of the KEK, the DEK, and the encrypted credit card records.
What about a small e-commerce application, where the web, application, and database tiers are contained in one physical server for cost reasons? In that case, perhaps the "scope" of PCI compliance would have previously been a single server, but using a secondary server to house a single KEK now introduces that secondary server into the littany of other PCI requirements, which will likely erode the cost benefit of a single server to house all tiers of the application in the first place.
In the off-chance that a backup copy of Server A's configuration and data is lost or stolen, there will be a false sense of temporary hope if Sever B's backup is not also lost or stolen. However, collusion and/or social engineering of an admininstrator of Server B still applies. Also, this will allow an attacker time to reverse engineer the software that is now in the hands of the attacker. Is there a remote code execution bug in the software that could allow an attacker to leak either the full encryption key or individual credit card records? Is there a flaw in the way the crypto keys were generated or used that reduces the brute-force keyspace from the safety of its lofty 2^128 bit perch? Did the developers forget to turn off verbose logging after the support incident last weekend? Are there full transaction details in the clear in a log file? A better approach would be to just use some sort of full storage volume encryption INSTEAD of transaction encryption, such that none of those possibilities would occur. But in practice, that is rarely done to servers (and somehow not mandated by the PCI DSS).
So storing the KEK on multiple servers just introduces more support complexity than it reduces risk from data theft.
And if we now know that these requirements don't guarantee anything of substance that would pass Kerckhoff's Principle (i.e. security by obscurity does not make these key storage techniques more "secure"), then we can also say that having multiple keys, separate KEK storage, and key-splitting all violate 3.5.2 because the keys are not stored "securely in the fewest possible locations and forms".
To Recap: Splitting keys is not feasible in most cases; it negates the benefit from having less people involved with e-commerce. Encrypting keys with other keys is an attempt to implement computer security perpetual motion machines. If you really are paranoid, implement full volume encryption on your servers. If you're not, well, ditch the transaction crypto unless you just happen to have CPU cycles to spare. If you must be "PCI Compliant" (whatever that means outside of "not paying fines"), then implement your e-commerce app to have an "auditor mode" by default, where it requires two people to each type in part of a key for the application to initiate. Then let it have a normal "back door" mode where it just uses the regular key for everything. [Most PCI Assessors are not skilled or qualified enough to validate the difference by inspecting the software program's behavior. They really just want a screen shot to go into their pretty report. Of course, this requires a detailed understanding of the ethics involved, and your mileage may vary.]
And remember: "you're only paranoid if you're wrong."
UPDATED: It was pointed out that there is even one more crazy method that PCI Assessors think can turn this polished turd into alchemist's gold: "encode" the KEK into the software's binary. By "encode" they mean "compile", as in have a variable (probably a text string) that contains the value of the KEK. Rather than have the software read in that KEK value from a file, have it just apply the KEK from the static or constant variable in the decryption operation that unlocks the DEK. This is equally stupid as the above. If the point of having a KEK and a DEK is to prevent someone who has access to the file system from unlocking credit card records, then the PCI folks completely missed "intro to pen testing 101" which describes the ultra l33t h4x0r tool called "strings". Any text strings (ASCII, UNICODE, what have you) can be extracted by that ages-old command line utility. So, if the threat model is somebody who stole the contents of a server's disks-- they win. If the treat model is a server administrator-- they win. Not to mention, the common practices of software developers is to store source code in a repository, presumably a shared repository. Any static variable that is "encoded" into a binary will live in source code (unless, I guess, the developer is sadistic enough to fire up a hex editor and find/replace the ASCII values in the binary after it's compiled) and source code lives in repositories, which means even more opportunities for collusion. This type of crypto fairydust magic is pure fiction-- it just doesn't work like they think it does.