Having discussed information classification in organizations and the various roles associated with it, in this section we discuss an interesting method for protecting sensitive information with techniques other than encryption. ‘Data obfuscation’ is one of the solutions for data theft. Data obfuscation is related to data encryption (cryptography). Although data encryption is a hot topic in the security domain, it is not a new subject; it has received an increasing amount of attention, largely owing to electronic commerce (e-commerce).
Protecting credit card numbers, medical data and other sensitive information has become more important than ever before. Encrypted data must be meaningless and unreadable in their encrypted form. They also must be reasonably secure, that is, they must not be easily decrypted without the proper key. Anything less than that will be referred to as obfuscation: data that are rendered unusable by some means but are not considered a serious form of encryption. A question that may arise is why you would want to merely obfuscate data rather than use a strong encryption algorithm. Let us understand this concept with an example.
Definition with Example
A good example would be an audit report on a medical system (medical systems are known to have high privacy concerns). This report may be generated for an external auditor and contains sensitive information. The auditor will be examining the report for information that indicates possible cases of fraud or abuse. Assume that the management has required that patient names, permanent account number (PAN) IDs (a taxpayer identification) and other personal information (PI) should not be made available to the auditor except on an as-needed basis. The data need to be presented to the auditor, but in a way that allows the examination of all data, as is familiar to data mining (DM) experts. Encryption would be a poor choice under these scenarios because the data would be rendered into byte values outside the range of printable American Standard Code for Information Interchange (ASCII) characters.
This would be impossible to read. A better choice might be to ‘obfuscate’ the data with a simple substitution cipher. While this is not considered encryption, obfuscation may be suitable for this situation. When the auditor finds a possible case of abuse, he will need the real name and PAN of the party involved. He could obtain these by calling a customer service representative at the insurance company that supplied the report and asking for the real information. The obfuscated data are read to the customer service representative, who then inputs them into an application that supplies the real data. The importance of using pronounceable characters becomes very clear here: strong encryption would render this impossible.
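The substitution idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key strings, the function names and the sample values are hypothetical and do not come from the report scenario. Letters map to letters and digits to digits, so the obfuscated value can still be read aloud over the telephone.

```python
import string

# Assumed substitution keys (hypothetical): any permutation of A-Z and 0-9 works.
# This is obfuscation, not encryption; the mapping is easy to recover from enough data.
LETTERS = string.ascii_uppercase
LETTER_KEY = "QWERTYUIOPASDFGHJKLZXCVBNM"
DIGITS = string.digits
DIGIT_KEY = "3709158246"

# Translation tables for both directions.
TO_OBFUSCATED = str.maketrans(LETTERS + DIGITS, LETTER_KEY + DIGIT_KEY)
TO_CLEAR = str.maketrans(LETTER_KEY + DIGIT_KEY, LETTERS + DIGITS)


def obfuscate(text: str) -> str:
    """Upper-case the text and replace each letter/digit using the key."""
    return text.upper().translate(TO_OBFUSCATED)


def reveal(text: str) -> str:
    """Reverse the substitution (the customer service application's job)."""
    return text.translate(TO_CLEAR)


if __name__ == "__main__":
    hidden = obfuscate("John Doe")
    print(hidden)          # what the auditor sees in the report
    print(reveal(hidden))  # what the representative's application returns
```

Spaces and punctuation pass through unchanged, which keeps the report layout readable but also leaks word lengths; that is one reason this counts as obfuscation rather than encryption.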
More Information with Example
To summarize, with data obfuscation (instead of encryption), it would not be very difficult to decipher the obfuscation scheme given enough data. A somewhat more effective method involves chopping the text into segments and rearranging them, as well as obfuscating them. Next, we discuss another technique for preventing sensitive data from falling into the hands of unauthorized persons. Sample source code to explain data obfuscation can be found at www.telport.com/-jkstill/util.
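As a rough illustration of the segment-and-rearrange variant, the sketch below chops the text into fixed-size segments, reverses their order, and then applies a substitution. It is not the code from the site mentioned above; the segment size, the reversal rule, the key strings and all function names are assumptions chosen for the example.

```python
import string

# Assumed substitution keys (hypothetical permutations of A-Z and 0-9).
CLEAR = string.ascii_uppercase + string.digits
KEY = "QWERTYUIOPASDFGHJKLZXCVBNM" + "3709158246"
TO_OBFUSCATED = str.maketrans(CLEAR, KEY)
TO_CLEAR = str.maketrans(KEY, CLEAR)


def chop(text: str, size: int) -> list[str]:
    """Split text into segments of `size` characters (the last may be shorter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def obfuscate(text: str, size: int = 3) -> str:
    """Reverse the segment order, then substitute each character."""
    rearranged = "".join(reversed(chop(text.upper(), size)))
    return rearranged.translate(TO_OBFUSCATED)


def reveal(text: str, size: int = 3) -> str:
    """Undo the substitution, then restore the original segment order."""
    rearranged = text.translate(TO_CLEAR)
    # After reversal, the short trailing segment (if any) sits at the front.
    head = len(rearranged) % size
    segments = ([rearranged[:head]] if head else []) + chop(rearranged[head:], size)
    return "".join(reversed(segments))


if __name__ == "__main__":
    hidden = obfuscate("John Doe")
    print(hidden)
    print(reveal(hidden))
```

Rearranging segments hides word boundaries and letter positions a little better than substitution alone, but the scheme is still recoverable by anyone who learns the segment size and the key.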
There is another way to ‘hide’ sensitive data: ‘data sanitization’. ‘Masking’ is one of the most commonly used methods for data sanitization. Data sanitization is the process of disguising sensitive information in test and development databases. Data are sanitized in order to protect valuable business information and also because there is, in most countries, a legal obligation to do so.
Real Life Example
This technique differs from the previous example in that the clear text cannot be reconstructed from the displayed data. It is useful in situations where only a portion of the data needs to be displayed. A good case for this method is the receipts printed at gas stations (petrol pumps) and convenience stores. When a purchase is made with a credit card, the last four digits of the card number are often displayed as clear text, while the rest of the number is masked with a series of Xs, as shown on the hypothetical receipt below. This method can be used for reports where the person reading the report requires only a portion of the sensitive data. It is also commonly used for the account numbers on printed transaction slips from ATMs.
Stop ‘n’ Shop
24/7/2017 8:58 P.M.
Veg Sandwich 1 12.58 12.58
Super Petrol 12.5 48.40 618.75
Amex 2/02 xxxx-xxxx-69048
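The masking shown on the receipt can be sketched as follows. This is a minimal example under assumptions: the function name, its parameters and the sample card number are hypothetical. Unlike the substitution approach, the masked digits are simply discarded, so the original number cannot be reconstructed from the output.

```python
def mask_number(number: str, visible: int = 4, mask_char: str = "x") -> str:
    """Replace every digit except the last `visible` digits with `mask_char`.

    Separators such as '-' or spaces are kept so the layout stays recognizable.
    """
    total_digits = sum(ch.isdigit() for ch in number)
    to_mask = max(total_digits - visible, 0)  # how many leading digits to hide

    masked = []
    for ch in number:
        if ch.isdigit() and to_mask > 0:
            masked.append(mask_char)
            to_mask -= 1
        else:
            masked.append(ch)
    return "".join(masked)


if __name__ == "__main__":
    # Made-up card number, used only for illustration.
    print(mask_number("1234-5678-9012-3456"))
```

A number with no more than `visible` digits is returned unchanged in this sketch; a real report generator would probably want to treat that case more strictly.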
There are a number of data sanitization techniques:
- Nulling out;
- Masking data;
- Shuffling records;
- Number variance;
- Gibberish generation;