Good Data Quality is Better Security

Data quality is not a glamourous subject. It is not the type of topic that headlines a conference or becomes front-page news. It is more typically suited for help guides and reference manuals that few individuals relish reading. However, organizations that acknowledge the importance of data quality and have strong data quality programs significantly reduce privacy and security risks. They also lower the potential costs associated with data breaches, the legal risks, and potential size of business interruptions.

Data quality issues start when information is created. This includes incorrect information, data entry errors, and inaccurate document conversion such as conversion of text contained within image files (e.g., a screen shot from a patient management system). Data quality issues also arise as data is being processed, transferred or stored.

1. Build a foundation of knowledge and fluency about data.

“Understanding data” means moving deeper than simply understanding that a database stores records or that a file contains information. Knowledge of data means taking the time to understand that data exists in different layers and structures and can be readily transformed. Additionally, data can be defined as discreet elements (e.g., a data element that stores date time information) and have assigned roles and restrictions. Investment in the language of data can improve control over data and enable better decisions on information security and privacy.

2. Don’t leave data design and quality decisions to the development team or an IT group.

This could place data at significant risk including possible loss, misuse and insecurity. Development teams are often provided with high-level requirement such as “design a secure form to collect user data”. While this directive may appear clear, privacy and security risks reside in the implementation of this directive. To achieve better security and privacy, more attention must be directed to clarify the method of data form collection, transmission and storage of data. Further validations should be provided so that data is corrected before it is stored.

3. Articulate security and privacy concepts in terms that assist developers integrate better security.

Regulations and policies concerning privacy and information security often address data from a systems perspective. Terms such as “protect the perimeter” articulate protection of a network and the systems and data within the network. “Protect the perimeter” does not clearly translate design into a more secure system.

Developers and analysts work with data in the context of business and user requirements. Developers also work under tight budget constraints and significant systems complexity where one requirement may consist of several steps. As security and privacy requirements continue to mature, understanding the needs and workflow of developers will facilitate better “baked in” security and privacy.

4. Extend security and privacy requirements to how data is created, changed, stored, transmitted and deleted.

Security requirements typically speak at a high level and leave a substantial gap in clarity with respect to data. As an example, a business may have a requirement where social security numbers (SSNs) are encrypted at rest. At the same time, the company may display SSNs in a web application where the SSNs are partially hidden by form design but otherwise are present and unprotected.

5. Embed security analysis into the QA process.

Security testing is often the purview of InfoSec groups and external consultants who evaluate software that exists in an operations environment (also referred to DevOps or Production). This includes the use of tools and the knowledge to locate and remediate vulnerabilities. The pitfall with this approach to security testing is that vulnerabilities are not identified before software is released. Using tools such as Seeker (which analyzes software for vulnerabilities during the QA process) can improve overall application security by reducing the number of possible vulnerabilities in software design.

CASE: Data at Risk (by Design)

Organizations are at increased risk of security incidents due to un-defined or poorly specified software requirements. One such example is inadequate articulation of secure password storage. Poor design is initiated when developers or an IT group receive a directive to secure user passwords. However, securing passwords can mean many things including:

Storing clear text passwords in a secure database.
Using well-known mathematical formulae to convert passwords into what are called hash values.
Storing software code or algorithms to secure passwords in the same data file or directory as the password data.
Storing password hints with passwords.
Forgetting to secure the folders where data is stored (which leaves the door open to the risk of exfiltration)
Not requiring strong password rules for the creation of passwords.
Not validating passwords prior to storing passwords.
Leaving administrative passwords in the same location as customer data.
Creating a backdoor for developers as an easy means to administrate or perform corrections.
Not requiring or allowing time for developers who wrote the code for securing passwords to create documentation that explains the code.
Leaving design implementation to a developer who may not be available or reachable after code implementation

Accountability for data design, use and quality should exist across an organization. With less of a technical divide, organizations can improve the conversation on how to better protect data with the appropriate use of security to balance risk and cost. Attention to detail at the bottom (the data level) may also deliver secondary benefits such as cleaner customer data, reduction in time to resolve customer issues, or better disaster recovery.