when production data is allowed to visit the slums

Adam over at EmergentChaos posted this blurb, which I’m also going to quote, regarding an Accenture data loss incident:

Connecticut hired Accenture to develop network systems that would allow it to consolidate payroll, accounting, personnel and other functions. Information related to Connecticut’s employees was contained on a data tape stolen from the car of an Accenture intern working on an unrelated, though similar project for the State of Ohio. (The tape also contained personal information on about 1.3 million Ohio residents.) The intern apparently had been using the Connecticut program as a template for the Ohio project.

Holy shit, do I hate it when developers insist on using production data in development environments. It is amazing how difficult the fight can be to get them to use test data, or to take production data and thoroughly scrub it on the first copy down. Of course, later on they want “refreshes” downward, or they start sharing amongst themselves when one wins the fight for their project…
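As a rough idea of what "scrub it on the first copy down" could look like, here is a minimal Python sketch. The column names, the secret key, and the `scrub_row` helper are all hypothetical, made up for illustration; a real scrub would be driven by your own schema and data classification. It swaps sensitive values for keyed tokens, so joins on the same value still line up but the original value is not sitting in the dev copy.

```python
import hashlib
import hmac

# Hypothetical column names; a real schema will differ.
SENSITIVE_COLUMNS = {"ssn", "name", "bank_account"}


def scrub_value(column: str, value: str, secret: bytes) -> str:
    """Replace a sensitive value with a keyed token (HMAC-SHA256, truncated)."""
    digest = hmac.new(secret, f"{column}:{value}".encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def scrub_row(row: dict, secret: bytes) -> dict:
    """Return a copy of row with every sensitive column tokenized."""
    return {
        col: scrub_value(col, str(val), secret) if col in SENSITIVE_COLUMNS else val
        for col, val in row.items()
    }


if __name__ == "__main__":
    # The secret stays on the production side; it never travels with the copy.
    secret = b"example-only-secret"
    row = {"employee_id": 1001, "name": "Jane Doe", "ssn": "123-45-6789"}
    print(scrub_row(row, secret))
```

The point of keying the hash is that someone holding only the scrubbed copy cannot simply hash a list of guessed SSNs and compare. This is only one approach, and free-text fields, dates, and rare values need more thought than this sketch gives them.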

Couple that with the fight over whether they can put such data on their laptops, and you get a lot of bad blood pretty quickly over just two out of a gazillion issues.

In coming years, companies that allow their data to be used by someone else are going to want written statements about who has access to that data and where it lives. Will it be on development systems in the squishy internal network, or available for an intern to query out and take home? Can you provide the names of everyone who will have access? If you have any DBA duties, start preparing for this storm now! These questions are being whispered now*, but often aren’t taken too seriously…yet.

1- Know who has access to what data, including queries as well as full database access.
2- Provide a process for requests and approvals for access to databases.
3- Know who ran what and took out what data. If an intern pulls a bunch out, you had better know it when they do. Know how to pull those logs and massage them for the answers.

These are just a few basic management questions that, if left unanswered, will leave management making uninformed decisions and taking uninformed actions.
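For item 3, "massaging the logs" might look something like the sketch below. It assumes you have already exported your database's audit log into a list of records with `user`, `table`, and `rows_returned` fields; those field names and the threshold are my own invention here, and every database exposes this differently (if at all).

```python
from collections import defaultdict


def rows_pulled_by_user(audit_entries):
    """Total the rows each user pulled across all audited queries."""
    totals = defaultdict(int)
    for entry in audit_entries:
        totals[entry["user"]] += entry["rows_returned"]
    return dict(totals)


def flag_large_pulls(audit_entries, threshold):
    """Return, sorted by name, the users whose total exceeds the threshold."""
    totals = rows_pulled_by_user(audit_entries)
    return sorted(user for user, count in totals.items() if count > threshold)


if __name__ == "__main__":
    # Hypothetical audit records for illustration only.
    entries = [
        {"user": "intern1", "table": "employees", "rows_returned": 900000},
        {"user": "intern1", "table": "payroll", "rows_returned": 400000},
        {"user": "dba1", "table": "employees", "rows_returned": 25},
    ]
    print(flag_large_pulls(entries, threshold=100000))
```

Summing per user rather than per query matters, since someone pulling a table out in many small chunks should still show up.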

As a side note, other questions are being asked, and should be asked, about the whole lifecycle of that data when it leaves the nest. Is the transmission of that data secure (SFTP, FTP, Web, Email…)? Is the first stop for that data secure and/or temporary (your contact’s email box, the ftp server…)? How does the data get to the desired location, or is it kept somewhere internally before being used (uploaded to a file server, sits in someone’s PST file, gets backed up to tape from the ftp server…)? When it is at the end location (a database, hopefully), who has access to it? When the work is done or the contract terminated, what is the data removal process (tapes, servers, databases, official backups, backups the developers have made…)? Yes, this involves more than the DBA, but really the easiest place to start is with the DBA duties.