Data Classification Guide and The NIST Classification Levels
One of the biggest challenges for a business with any sort of information security needs is ensuring proper handling of that information. With hundreds of data breaches, large and small, happening every single year, you don’t want to be a statistic. More than that, though, if you’re working on a government contract and using a framework like HITRUST, HIPAA, or FedRAMP, you need to adhere to high standards.
A breach, and the investigation that follows, can lead to a lot more than a loss of business continuity and the expense of disaster recovery. You might end up losing your contract and, in extreme cases, face civil or even criminal penalties.
Of course, not all information is created equal. A database containing a list of your executives or a document with the keywords you target in advertising is very different from a database that stores your customers’ personal information or government Controlled Unclassified Information (CUI).
Understanding how information is classified and how to apply that classification to your security is a key part of building and maintaining compliance within your systems.
BLUF - Bottom Line Up Front
Proper handling of information is key for security and compliance, especially with government contracts. Classify data by impact: public, private, internal, confidential, and restricted. Public data requires minimal security, while confidential and restricted data need strong protection. Use frameworks like NIST to define data impact levels. Stay informed on evolving classification practices for better security. Track your security posture with comprehensive tools to ensure compliance and protect sensitive data.
Understanding Data Classification
Before you can operate in a government contract, you need to pass an audit. Before you can pass that audit, you need to have implemented robust security and confidentiality requirements throughout your organization across a wide range of different processes and controls. Before you can do that, you need to know what those controls are and how they’re applicable to your business processes.
Any business process that handles information will require security for that information. However, as mentioned above, not all information needs to be treated equally. The government, fortunately, acknowledges this and allows you to have comparatively lax (though still robust) security in place for the least valuable and lowest-impact information in your systems.
So, how do you know? That’s what data classification is: identifying the data you handle, what categories it falls into, and what the impact level would be if that information were to fall into the wrong hands. The more damage information could do if it leaked, the higher the security surrounding it needs to be.
Considering Classification Levels
Depending on the standards and frameworks you’re using, the definitions and classification levels you’ll be using might vary. Broad categories tend to be similar between, say, ISO and NIST standards, but there can be some fuzziness around the edges.
Since we’re usually talking about businesses looking to work with the United States government in federal contracts, it’s generally the NIST standards we need to consider. These standards, defined by NIST in various documents, govern the classification of information, the categorization of that information, and the definition of the impact levels relevant to that information.
When you’re considering the classification of the information your business handles, it’s important to ask yourself a variety of questions.
- What kind of information do you handle, in general? Handling information, in this case, means collecting, storing, and processing that information. Even if all you do is harvest information and pass it on to another party unchanged, that’s still handling the information. Common versions of information categorization include PHI, PCI, and PII.
- Where is data in motion, and where is it at rest? That is, where and how does data enter your ecosystem, and how is that entry point both validated and secured? When you store information, how is the information protected from intrusion? When you forward information onwards as it’s accessed, processed, or requested, how is it secured in transit?
- Who owns and takes responsibility for the data in your organization? Generally, you will need to define an individual whose overall responsibility is to manage data security, which could be someone as part of your Security and Compliance team, or a Data Protection Officer, or another role.
- Who, within your organization, can access the data? Generally, you want to have as few people able to access your data as possible. When access is necessary, it should generally be auditable, logged, and verifiable.
To clarify the acronyms above: PHI is protected health information, PCI refers to Payment Card Industry cardholder data, and PII is personally identifiable information. PHI is healthcare information held and processed by healthcare providers, insurance companies, and the like, and is governed by HIPAA in particular. PCI data is cardholder information, including card numbers, expiration dates, validation codes, and more, and is governed primarily by the Payment Card Industry Data Security Standard (PCI DSS). PII is personal data that can identify an individual, including things like social security numbers, driver’s license numbers, and passport numbers.
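The questions above can be captured as a simple inventory record. The following is a minimal sketch, not a prescribed NIST structure: the `DataAsset` class, its field names, and the sample asset are all hypothetical and exist only to show how unanswered questions might be surfaced.

```python
from dataclasses import dataclass, field
from enum import Enum


class DataCategory(Enum):
    PHI = "protected health information"
    PCI = "payment card data"
    PII = "personally identifiable information"
    OTHER = "other"


@dataclass
class DataAsset:
    """One entry in a hypothetical data inventory."""
    name: str
    category: DataCategory
    entry_points: list[str]        # where the data enters (in motion)
    storage_locations: list[str]   # where the data is kept (at rest)
    owner: str                     # accountable role or person
    authorized_roles: set[str] = field(default_factory=set)

    def open_questions(self) -> list[str]:
        """Return the inventory questions this record leaves unanswered."""
        gaps = []
        if not self.entry_points:
            gaps.append("Where does this data enter the system?")
        if not self.storage_locations:
            gaps.append("Where is this data stored at rest?")
        if not self.owner.strip():
            gaps.append("Who owns this data?")
        if not self.authorized_roles:
            gaps.append("Who is allowed to access this data?")
        return gaps


# Hypothetical asset with two questions still unanswered.
asset = DataAsset(
    name="patient_intake_forms",
    category=DataCategory.PHI,
    entry_points=["web intake form"],
    storage_locations=[],
    owner="Data Protection Officer",
)
print(asset.open_questions())
```

A record that returns an empty list has at least a first answer to each question; whether those answers are adequate is still a judgment for your security team.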
General Classification Types
Although the National Institute of Standards and Technology defines standards and classifications for nearly everything, it doesn’t offer one clear definition of the different kinds of data and what classification each deserves. There are several possible resources and categories that can be relevant.
The most broadly accepted set of classifications puts all data into one of five categories.
Public Data
Public data is information that is freely available and accessible to anyone who asks for it. This can include things like a vehicle’s license plate number, the name of a company, an individual’s first name and surname, voter registration addresses, phone numbers, job descriptions, company organizational charts, and other such data. This data has essentially no value to an attacker; there are very few organizations where a list of members or employees is sensitive enough to pose a threat if it were made public, and most of the time that information is public already.
In some classification systems, this category of information is left out, primarily because it doesn’t need any particular special security or consideration. Note, as well, that some regions regulate this information; the most obvious example is Europe’s General Data Protection Regulation, or GDPR, which still applies to personal data even when that data is publicly available.
Private Data
The second classification is private data. This data is slightly more valuable than public data but still doesn’t require much specific security. The data being exposed could pose a small risk to individuals or organizations, but that risk is generally very minimal. Private data includes things like the personal contact information of employees, the browsing history of executives, the contents of general email inboxes or cell phones, the ID numbers on student ID cards for a university, and so on. It’s all information that itself isn’t necessarily damaging and generally doesn’t contain anything secret.
Examples such as an email inbox are variable; an inbox used as the public comments field for a web form might not be sensitive, but the email inbox for an executive’s business purposes, for the finance department, or another more critical role may be more important and have a higher risk than just private data.
Internal Data
Internal data is data that relates more directly to the company or organization and is usually restricted in terms of who can access it. This can range from business plans to internal emails to a company intranet, to budget spreadsheets, company data archives, payroll information, business IP ranges, and more. There’s an immense amount of internal company data, and this data can also fall into other classifications.
For example, when individuals apply to work for a company and are hired, they typically submit specific personal information, such as social security numbers, to validate their identities, submit to background checks, and more. This information is generally stored somewhere in the company’s HR system and requires a higher level of security than something like a company business memo.
Confidential Data
Confidential data is data that only a limited number of people can access and often requires special authorization or clearance to have access in the first place. That access may involve identity verification, a specialized password and authentication process, or something else. This is usually the most sensitive information a typical private business might handle, like social security numbers, medical records, insurance provider information, credit card information, financial records, and biometric information.
There are also likely secondary protections on confidential information, such as nondisclosure agreements signed by the people who can access that information and comprehensive logging of all access and changes made to that information.
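As an illustration of logged, authorized access, here is a minimal sketch using Python’s standard `logging` module. The role names, the `read_confidential` function, and the in-memory `store` are hypothetical; a real system would rely on its own identity provider and a tamper-resistant audit log.

```python
import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("data_access_audit")

# Hypothetical roles cleared for confidential records.
AUTHORIZED_ROLES = {"hr_manager", "compliance_officer"}


def read_confidential(store, record_id, user, role):
    """Return a record only for authorized roles, logging every attempt."""
    allowed = role in AUTHORIZED_ROLES
    # Log before acting so that denied attempts are recorded too.
    audit_log.info(
        "action=read record=%s user=%s role=%s allowed=%s",
        record_id, user, role, allowed,
    )
    if not allowed:
        raise PermissionError(f"role {role!r} may not read {record_id!r}")
    return store[record_id]


# Hypothetical in-memory store standing in for a real database.
store = {"emp-001": {"ssn_last4": "1234"}}
print(read_confidential(store, "emp-001", "alice", "hr_manager"))
try:
    read_confidential(store, "emp-001", "bob", "intern")
except PermissionError as err:
    print("denied:", err)
```

Writing the audit entry before the permission check means that failed attempts leave a trail as well, which is usually what an auditor wants to see.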
Restricted Data
The final classification covers the most sensitive information handled by organizations outside of the government itself. This is what might fall under the Controlled Unclassified Information heading, and it includes things like tax information, protected health information, and more. Sometimes, this is mixed up with confidential information, and there are certain kinds of information that are often found in or covered by multiple classifications.
As you can see, the classification system is a mess and isn’t entirely well-defined.
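One way to reason about the overlap is to treat the five categories as an ordered scale and let a mixed data set take the most sensitive label it contains. The sketch below assumes that ordering and a hypothetical set of HR record fields; the rule that logging starts at the confidential level is an illustration drawn from the descriptions above, not a formal requirement.

```python
from enum import IntEnum


class Classification(IntEnum):
    PUBLIC = 1
    PRIVATE = 2
    INTERNAL = 3
    CONFIDENTIAL = 4
    RESTRICTED = 5


def effective_classification(labels):
    """A data set that mixes categories takes the most sensitive label present."""
    return max(labels)


def needs_access_logging(level):
    """Assumed rule: confidential and restricted data require logged access."""
    return level >= Classification.CONFIDENTIAL


# Hypothetical fields in an HR record, each labeled individually.
HR_RECORD_FIELDS = {
    "job_description": Classification.PUBLIC,
    "personal_phone": Classification.PRIVATE,
    "payroll": Classification.INTERNAL,
    "social_security_number": Classification.CONFIDENTIAL,
}

record_level = effective_classification(HR_RECORD_FIELDS.values())
print(record_level.name, needs_access_logging(record_level))
```

Under these assumptions the HR record as a whole is handled as confidential, even though most of its fields are less sensitive.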
NIST, FIPS, and Data Impact Levels
Another means of classifying information is to define it across three different axes of influence: the confidentiality of that information, the integrity of that information, and the availability of that information.
Each of these has three possible impact levels: low, moderate, and high. Under FIPS 199, a loss that would have a limited adverse effect is low impact, a serious adverse effect is moderate impact, and a severe or catastrophic adverse effect is high impact. When a single overall level is needed, the highest level across the three axes is typically used, so information is low impact overall only if it is low on all three.
These are the same impact levels used for FedRAMP and are defined in Federal Information Processing Standards Publication 199 (FIPS 199). Once the information has been classified, it can then be mapped to security categories for individual controls according to this document.
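The three-axis categorization can be sketched in a few lines. FIPS 199 writes a security category as a set of (objective, impact) pairs, and the "high-water mark" used here to derive one overall level comes from the companion standard FIPS 200. The class and the sample values below are illustrative assumptions, not an official implementation.

```python
from dataclasses import dataclass
from enum import IntEnum


class Impact(IntEnum):
    LOW = 1
    MODERATE = 2
    HIGH = 3


@dataclass(frozen=True)
class SecurityCategory:
    confidentiality: Impact
    integrity: Impact
    availability: Impact

    def overall(self) -> Impact:
        """High-water mark: the highest impact across the three objectives."""
        return max(self.confidentiality, self.integrity, self.availability)

    def notation(self) -> str:
        """Render the category in the pair notation used by FIPS 199."""
        return (
            f"{{(confidentiality, {self.confidentiality.name}), "
            f"(integrity, {self.integrity.name}), "
            f"(availability, {self.availability.name})}}"
        )


# Hypothetical categorization of a customer records database.
customer_records = SecurityCategory(
    confidentiality=Impact.MODERATE,
    integrity=Impact.MODERATE,
    availability=Impact.LOW,
)
print(customer_records.notation())
print(customer_records.overall().name)
```

A single high rating on any one objective is enough to make the overall level high, which is why one sensitive field can raise the bar for an entire system.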
It’s worth mentioning that, in general, NIST doesn’t concern itself with information that is lower than low impact – the public information – or higher than high. Information that is classified by the government and is listed as officially Confidential, Secret, Top Secret, or another designation is generally controlled by its own policies, such as Executive Order 13526, Classified National Security Information. Similarly, defense information is controlled by different regulations relating to the departments of defense and national security.
The Future of Data Classification
If you’ve read through all of this and thought that it’s a mess, you’re not wrong. Even NIST agrees, which is why it has spent the last few years working on a new system. Specifically, the National Institute of Standards and Technology’s National Cybersecurity Center of Excellence, or NCCoE, has been collaborating with vendors, including ActiveNav, Adobe, GitLab, Google, Janusnet, JPMorgan, Quick Heal, Thales, Trellix, and Virtru, to develop a new process for classifying information.
The project, simply called Data Classification, is part of a push for a zero trust approach, focused on data and security management, regardless of the context of that data. In order to create that approach, there must be a framework for how to identify and classify information, so that’s what the NIST group has been working on. The end result, when it is made available, will be published as a NIST Cybersecurity Practice Guide.
So why haven’t we given you their framework? It’s not implemented yet. It’s not even publicly available yet. In April of last year, NIST made it available for limited public comment for those who registered to be part of the group giving feedback. The document – NIST SP 1800-39 – was only available in that context. An executive summary can be found here.
The public comment session has closed, and NIST and the workgroup are reading through those comments now. Once they have reviewed the comments, they will produce a revised draft. Unless significant roadblocks occur, they will likely also publish a roadmap covering how the draft will be refined into a final version, how it should be used, and when it will go into effect in relation to other security frameworks and special publications.
Keeping Ahead of the Curve
Knowledge is power, which is why you need to keep it safe. Whether your business handles little more than public information, or you intend to be working closely with the federal government and dealing with controlled unclassified information, knowing what you have on your hands is a big part of adhering to security standards.
Tracking your security posture, maintaining an audit state, and generally keeping on top of the situation requires a lot of dedicated bookkeeping, paperwork, information tracking, and more. To help keep track of all of that without relying on siloed software and spreadsheets, you can try the Ignyte Platform. We designed our platform to help track information across a range of different security frameworks. To see it in action, book a demo today!
Max Aulakh is a distinguished Data Security and Compliance leader, recognized for implementing DoD-tested security strategies and compliance measures that protect mission-critical IT operations. His expertise was shaped in the United States Air Force, where he was responsible for the InfoSec and ComSec of network hardware, software, and IT infrastructure across global classified and unclassified networks. He also developed strategic relationships with military units in Turkey, Afghanistan, and Iraq. After his tenure with the USAF, Max played a pivotal role in driving Information Assurance (IA) programs for the U.S. Department of Defense (DoD). As a Senior Consultant for a leading defense contracting firm, he led a team that ensured data centers met Air Force Level Security audits for regulatory requirements like HIPAA, SOX, and FISMA. Currently, as the CEO of Ignyte Assurance Platform, he is at the forefront of cyber assurance and regulatory compliance innovation, catering to defense, healthcare, and manufacturing sectors. Max is also an esteemed speaker, having presented at several conferences on topics including cybersecurity GRC, medical device security, and cybersecurity perspectives in vendor management. You can follow him on LinkedIn.