If anyone has any suggestions on this topic, please comment or tweet or email me!
On page 10 of the PCI DSS v2.0 document, before the actual requirements, there is a section on determining the scope of an assessment, which includes these lines:
The first step of a PCI DSS assessment is to accurately determine the scope of the review. At least annually and prior to the annual assessment, the assessed entity should confirm the accuracy of their PCI DSS scope by identifying all locations and flows of cardholder data and ensuring they are included in the PCI DSS scope. To confirm the accuracy and appropriateness of PCI DSS scope, perform the following:
- The assessed entity identifies and documents the existence of all cardholder data in their environment, to verify that no cardholder data exists outside of the currently defined cardholder data environment (CDE)…
The key word in that whole excerpt is that pesky “should.” That one word is what keeps this an unnumbered non-requirement. In my case, my particular QSA has opted to make this a requirement of the scope, i.e., I need to scan my entire network for stray bits of cardholder data.
Let me say I completely agree with this need. There is everything to gain from a scan like this. Not only is it necessary, but the ability to perform a scan like this could be leveraged for other purposes: finding client-specific data, porn (conditionally), or anything else hiding in places it shouldn’t.
But this isn’t a small deal (Windows servers, Linux servers, file servers, encoded files, databases, workstations, email servers…), and I don’t know of any tools that actually do all of this short of buying into a DLP product whose first phase of implementation probably involves exactly this task: scan everything to see what needs protecting. That’s a heavy pill (full DLP licensing cost) to swallow for just one task (the initial scan). I’m actually quite amazed that DLP providers aren’t yet offering this as a standalone service/product.
I have stuck my fingers into a few tools, and so far none are satisfactory. Disclaimer: I have only done *extremely* limited testing, and have not even begun to tackle the database aspect.
PANBuster recently hit the blogs, though everyone regurgitates the same old intro blurbs without any real details. PANBuster is a small, install-free exe that you run from the command line to scan a target file or path for PAN data. The scan is quicker and more lightweight than other options, but the results haven’t been all that exciting: I find more hits with other tools (both false and potential positives). The biggest drawback, however, is the lack of any UNC or network path support. Extreme bummer. Scripting around that would probably mean interrogating servers for all physical drives and remotely executing the exe on each one. Really messy.
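For context on what all of these tools are doing under the hood, PAN discovery generally boils down to a digit-pattern regex plus a Luhn mod-10 check to weed out false positives. Here is a minimal sketch of that general technique in Python (this is not PANBuster’s or Spider’s actual logic, and real scanners use per-brand prefix patterns rather than this deliberately broad one):

```python
import re

# Candidate PAN: 13-16 digits, optionally separated by spaces or dashes.
# Real tools key on card-brand prefixes; this broad pattern is illustrative.
PAN_CANDIDATE = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

def luhn_valid(number: str) -> bool:
    """Return True if the digit string passes the Luhn mod-10 check."""
    digits = [int(d) for d in number]
    # Double every second digit from the right, subtracting 9 when > 9.
    for i in range(len(digits) - 2, -1, -2):
        digits[i] *= 2
        if digits[i] > 9:
            digits[i] -= 9
    return sum(digits) % 10 == 0

def find_pans(text: str) -> list[str]:
    """Return Luhn-valid candidate PANs found in a chunk of text."""
    hits = []
    for match in PAN_CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits
```

The Luhn check is what separates a usable scanner from a firehose of false positives: random 16-digit runs (invoice numbers, GUID fragments) fail it nine times out of ten, while every real PAN passes.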
Spider from Cornell (currently Spider4, aka Spider2008) is a tool that can be installed and run from a local GUI, but can also be driven from the command line. Executing a scan via the command line is a bit tricky, but it certainly can be done. A subsequent unattended scan will not succeed unless you do some magic each time (ok, you delete the locally saved scan state file). Configuration can be governed by an XML file, but the values are arcane at best (wtf does option 1048 mean?) and undocumented. Even when launched from the command line, the fat GUI app actually runs, then exits out; any strangeness and it’ll sit there waiting for an operator to click an “Ok” button.
On the plus side, Spider *can* technically be scripted, and I already have a plan of action to do so with PowerShell. It will save hits to a discrete log (the file names and paths, but not the actual hit data; that can be saved in an encrypted local database). It can also scan UNC paths, including admin shares with the proper permissions. That alone is a huge plus.
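My scripting plan amounts to a wrapper that clears Spider’s saved scan state before each unattended pass and then launches the scanner with a timeout so a hung scan doesn’t stall the whole run. A rough sketch of that wrapper follows, in Python rather than the PowerShell I’d actually use; the executable path and state-file name are placeholders, not Spider’s real ones:

```python
import os
import subprocess

def run_unattended(scanner_exe: str, state_file: str, args: list[str],
                   timeout_s: int = 3600) -> int:
    """Clear the saved scan-state file, then run one scan pass.

    scanner_exe and state_file are placeholders; Spider's actual
    executable and state-file locations would need to be confirmed.
    """
    # Spider refuses a second unattended run while its saved state
    # file exists, so delete it before every pass.
    if os.path.exists(state_file):
        os.remove(state_file)
    # The timeout guards against the scan hanging on a bad file or share.
    result = subprocess.run([scanner_exe, *args], timeout=timeout_s)
    return result.returncode
```

Looping that function over a list of UNC admin shares (\\host\C$) is what would turn a single-machine scanner into a network-wide one, permissions permitting.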
On the negative side, scans are long, can return tons of hits, there is no scan result management at all, and the whole thing doesn’t leave me feeling very warm. I’d expect a month of execution to scan my network, and I’d have to constantly check that it hasn’t hung on something.
SENF is a tool from UTexas that I’ve not tried extensively yet. Like Spider, it is made for educational institutions, where the institution holds system users responsible for the data on their machines and so provides the tool plus instructions so users can scan their own systems and send in reports. SENF is written in Java, which doesn’t excite me, and none of the literature appears to support UNC or network-bound scanning of any type. Examples of use are few and far between, and the tool does not come with predefined regular expressions…
Tools like CardRecon and IdentityFinder are commercial, but they fill the same need as the options above: scanning a discrete single machine and/or local drives. I’m not about to install an agent or tool on 500+ workstations and 200+ servers if I don’t have to.
DLP solutions pretty much universally tout their first phase of deployment to be automated discovery of sensitive information that then needs protection. I’ve not seen more than limited demos of DLP solutions, so I can’t comment on them, but the capital outlay for something to fill this need is annoying. Still, I’m close to actually going through the motions to get some ideas on how they solve this issue.
Forensics tools like EnCase can also help in this regard, but are expensive and also not specifically tailored for network scanning; again they’re a bit more suited to discrete system scanning.
Questions to peers have yielded zero actionable answers. The end result so far is my own conclusion that no one is actually scanning their whole network to validate their expected scope, and this need remains unfulfilled.