Securing a Successful Modern Data Analytics Platform in the Cloud
Every change introduces risk to an enterprise, and that risk must be assessed before the change is implemented. The growing number of ransomware attacks by malicious hackers is a real threat to traditional companies with little cloud experience. Learning how to secure a newly created cloud environment is an essential first step to plan and implement. This chapter discusses how to improve the security of the organization’s data against cyber-attacks while adding powerful new business analytics capabilities.
Introduction
In the previous chapters, we described the reasons to rethink the old data warehouse or data lake model for data analytics in a large enterprise organization; we broke it down into a three-tier model and outlined the main architectural building blocks. Many readers and companies that I’ve met since the publication of these chapters were excited to implement it and solve many of the problems in their current data architecture. Schema inflexibility, a lack of modern tools, scaling limits, and other issues prevent them from stepping up their analytical game, which is one of the top requests from both C-level executives and other managers in these companies. “We can’t compete with (insert name of technology-based newcomer competitor) without analytical tools” is a sentiment felt and voiced across almost every large traditional business.
Still, there are many barriers to successfully implementing a modern data analytics platform that allows these businesses to compete and continue growing. For example, the organization’s DNA has to change toward faster-moving, technology- and data-driven tools, which we will discuss in the following chapters. Nevertheless, the number one barrier is the security of the “data analytics platform in the cloud.” The organization’s CISO, if it already has one, tends to reach for the same familiar mechanisms, such as firewalls, DMZs, Data Loss Prevention (DLP), etc. These mechanisms work if your data environment is defined as “IN vs. OUT,” and the data sits mainly in a single location or even a single data system (SAP, for example). The proposed modern data platform, by contrast, allows restricted access to parts of the data for external data scientists and rapid development of business-oriented solutions based on big data from various sources. The platform also uses “public” cloud resources, which are largely a new environment for the organization. I put “public” in quotes, as I hear a lot of…