AUSTIN, Texas--Big data will take center stage this week at the Strata Conference & Hadoop World in New York. The world’s largest gathering of the Apache Hadoop™ community will convene to discuss issues and opportunities shaping the business, finance, media and government sectors. A key conversation topic will be how organizations can improve data security for Hadoop and the applications that run on the platform.
1. Think about security before getting started – You don’t wait until after a burglary to put locks on your doors, and you should not wait until after a breach to secure your data. Make sure a serious data security discussion takes place before installing and feeding data into your Hadoop cluster.
2. Consider what data may get stored – If you are using Hadoop to store and run analytics against regulated data, you likely need to comply with specific security requirements. If the stored data does not fall under regulatory jurisdiction, keep in mind the risks to your public reputation and potential loss of revenue if data such as personally identifiable information (PII) were breached.
3. Encrypt data at rest and in motion – Add transparent data encryption at the file layer as a first step toward enhancing the security of a big data project. SSL encryption can protect big data as it moves between nodes and applications.
As Securosis analyst Adrian Lane wrote in a recent blog, “File encryption addresses two attacker methods for circumventing normal application security controls. Encryption protects in case malicious users or administrators gain access to data nodes and directly inspect files, and it also renders stolen files or disk images unreadable. It is transparent to both Hadoop and calling applications and scales out as the cluster grows. This is a cost-effective way to address several data security threats.”
4. Store the keys away from the encrypted data – Storing encryption keys on the same server as the encrypted data is akin to locking your house and leaving the key in your front door. Instead, use a key management system that separates the key from the encrypted data.
5. Institute access controls – Establishing and enforcing policies that govern which people and processes can access data stored within Hadoop is essential for keeping rogue users and applications off your cluster.
6. Require multi-factor authentication – Multi-factor authentication can significantly reduce the likelihood of an account being compromised or access to Hadoop data being granted to an unauthorized party.
7. Use secure automation – Beyond data encryption, organizations should look to DevOps tools such as Chef or Puppet for automated patch and configuration management.
8. Frequently audit your environment – Project needs, data sets, cloud requirements and security risks are constantly changing. It’s important to make sure you are closely monitoring your Hadoop environment and performing frequent checks to ensure performance and security goals are being met.
9. Ask tough questions of your cloud provider – Be sure you know what your cloud provider is responsible for. Will they encrypt your data? Who will store and have access to your keys? How is your data retired when you no longer need it? How do they prevent data leakage?
10. Centralize accountability – Centralizing the accountability for data security ensures consistent policy enforcement and access control across diverse organizational silos and data sets.
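As one illustration of tip 6, the sketch below shows how a second authentication factor might be checked using time-based one-time passwords (TOTP, RFC 6238, built on HOTP from RFC 4226). It is a minimal example under stated assumptions, not a description of any Gazzang product or of Hadoop's own authentication: the function names, the 30-second step, the one-step drift window and the shared secret are all hypothetical choices made for illustration, and a real deployment would pair this with a primary credential and a secure store for the per-user secret.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226) using HMAC-SHA1."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation: low nibble of last byte
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def totp(secret: bytes, at: Optional[float] = None,
         step: int = 30, digits: int = 6) -> str:
    """Time-based one-time password (RFC 6238): HOTP over a time counter."""
    now = time.time() if at is None else at
    return hotp(secret, int(now // step), digits)


def verify_second_factor(secret: bytes, submitted: str,
                         at: Optional[float] = None,
                         window: int = 1, step: int = 30) -> bool:
    """Accept a code from the current time step or `window` steps either side.

    Hypothetical helper: the drift window tolerates small clock differences
    between the user's device and the server.
    """
    now = time.time() if at is None else at
    counter = int(now // step)
    candidate = submitted.encode()
    for drift in range(-window, window + 1):
        if counter + drift < 0:
            continue  # no valid time step before the epoch
        expected = hotp(secret, counter + drift).encode()
        # Constant-time comparison avoids leaking how many digits matched.
        if hmac.compare_digest(expected, candidate):
            return True
    return False


if __name__ == "__main__":
    # Hypothetical shared secret (the RFC test key); real secrets are
    # random, per-user, and stored apart from the data they protect.
    demo_secret = b"12345678901234567890"
    code = totp(demo_secret)
    print("current code accepted:", verify_second_factor(demo_secret, code))
```

In this sketch a login would succeed only when both the usual credential and `verify_second_factor` pass, so a stolen password alone would not be enough to reach data on the cluster.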
“Hadoop and similar NoSQL data stores enable any organization – large or small – to collect, manage and analyze immense data sets, but these nascent technologies were not necessarily designed with comprehensive security in mind,” said Dustin Kirkland, chief technology officer at Gazzang. “As these repositories grow in popularity and size, the potential for sensitive data to get swept up and stored is significant. Our customers recognize this and trust Gazzang to keep their data safe.”
Gazzang provides data security solutions and operational diagnostics that help enterprises protect sensitive information and maintain performance in cloud environments. The company has over 200 customers across multiple industries including SaaS providers, financial services, technology, health care and public sector organizations. Gazzang is backed by Austin Ventures and Silver Creek Ventures. For more information, visit www.gazzang.com.