The problem facing businesses today is how to take advantage of their huge untapped data sources to find the hidden truths inside. Doing so leads to smarter decisions, more innovative products and services for customers, and optimized internal operations, which in turn bring efficiency and breakthroughs in their markets. One answer to this problem is the application of big data analytics and data lake technologies.
Every business holds many different types of data: structured databases, data about customer information and behavior, video recorded by camera systems, and raw data such as log files generated by devices in the IT and transmission infrastructure. All of this data can yield new information for the business when combined. However, the data is often scattered across many different systems, making it difficult to combine and to draw new information from these separate sources. In addition, the data is often generated at high speed while the capacity of each individual system is limited, so businesses end up discarding a lot of data that has not yet been exploited.
A data lake is the solution to this problem. A data lake is a centralized place that stores all types of enterprise data in its native format, so that analytics tools and solutions can access all of this data without having to reach into many separate systems. A data lake scales quickly to large capacities, so enterprises do not have to throw away untapped data, and storing data in its original format helps avoid losing the hidden information inside it.
In the current market, data analysis solutions and data lakes, both worldwide and in Vietnam, are often deployed on the distributed data platform Hadoop. The traditional Hadoop deployment model uses server infrastructure for all of the system's tasks, including both computation and storage. This model often suffers from disadvantages such as:
– The Hadoop cluster can only communicate over the HDFS protocol, so an additional intermediary system is needed to receive source data over other protocols before saving it to the HDFS data lake.
– Because compute and storage share the same servers, the system often cannot optimize resources: expanding storage means adding a server, which also adds compute capacity even when the system does not need it.
– Low usable storage rate: only about 30% of raw capacity.
Traditional Hadoop system architecture
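The usable rate of about 30% mentioned above is consistent with block replication: HDFS keeps three copies of each block by default, so at most one third of raw capacity holds unique data. The sketch below is illustrative arithmetic only. The function name and the 10% reserve for temporary and non-HDFS space are assumptions, not figures from any specific deployment.

```python
def usable_fraction(replication_factor: int, reserved_fraction: float = 0.0) -> float:
    """Fraction of raw capacity that holds unique data.

    replication_factor: number of copies kept of every block (HDFS default is 3).
    reserved_fraction:  hypothetical share of raw capacity set aside for
                        temporary files and non-HDFS use (an assumption).
    """
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    if not 0.0 <= reserved_fraction < 1.0:
        raise ValueError("reserved_fraction must be in [0, 1)")
    return (1.0 - reserved_fraction) / replication_factor


# Three-way replication with nothing reserved.
three_way = usable_fraction(3)

# Three-way replication with an assumed 10% of raw capacity reserved.
with_reserve = usable_fraction(3, reserved_fraction=0.10)

print(f"3x replication:              {three_way:.1%}")
print(f"3x replication, 10% reserve: {with_reserve:.1%}")
```

Under these assumptions the usable share lands between roughly 30% and 33% of raw capacity, which matches the order of magnitude quoted for the traditional architecture.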
To solve these problems, Dell EMC offers businesses PowerScale, a scale-out NAS storage solution. PowerScale addresses the problems of the traditional Hadoop architecture with the following capabilities:
– Built-in HDFS support, which offloads storage entirely to PowerScale so that the Hadoop cluster plays only a compute role. Compute and storage can scale independently, without the waste seen in the traditional architecture.
– Multi-protocol support: the same data can be accessed simultaneously via NAS and HDFS protocols, with no need for an intermediary system as in the traditional architecture.
– Data protection equal to or higher than that of the traditional approach, with a much higher usable storage rate of up to 85% of raw capacity.
– Very large capacity scalability with simple expansion operations.
– Compatibility with most versions of Hadoop, including certified compatibility with the Cloudera CDP solution.
System architecture of Hadoop with Dell EMC PowerScale
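To make the difference between the two architectures concrete, the following sketch compares the raw capacity needed at the two usable rates quoted in this article (about 30% and up to 85%) and the number of Hadoop servers needed when storage and compute are coupled versus separated. The workload size, per-server capacity, and core counts are hypothetical values chosen for illustration, not vendor sizing guidance.

```python
import math


def raw_tb_needed(usable_tb: float, usable_rate: float) -> float:
    """Raw capacity required to store usable_tb at a given usable rate."""
    return usable_tb / usable_rate


def coupled_servers(usable_tb: float, cores: int,
                    usable_tb_per_server: float, cores_per_server: int) -> int:
    """Servers needed when each server supplies both storage and compute.

    The larger of the two requirements decides the server count.
    """
    for_storage = math.ceil(usable_tb / usable_tb_per_server)
    for_compute = math.ceil(cores / cores_per_server)
    return max(for_storage, for_compute)


def compute_only_servers(cores: int, cores_per_server: int) -> int:
    """Servers needed when storage is handled by a separate platform."""
    return math.ceil(cores / cores_per_server)


# Hypothetical workload (assumed values).
USABLE_TB = 1000   # data to keep in the data lake
CORES = 640        # CPU cores the analytics jobs need

raw_traditional = raw_tb_needed(USABLE_TB, 0.30)  # rate quoted for traditional Hadoop
raw_powerscale = raw_tb_needed(USABLE_TB, 0.85)   # upper rate quoted for PowerScale

# Assumed server profile: 20 TB usable and 32 cores per server.
servers_coupled = coupled_servers(USABLE_TB, CORES,
                                  usable_tb_per_server=20, cores_per_server=32)
servers_compute = compute_only_servers(CORES, cores_per_server=32)

print(f"Raw TB at 30% usable: {raw_traditional:.0f}")
print(f"Raw TB at 85% usable: {raw_powerscale:.0f}")
print(f"Servers, coupled storage and compute: {servers_coupled}")
print(f"Servers, compute only: {servers_compute}")
```

In this assumed scenario the storage requirement drives the coupled cluster size, so most of the compute on those servers would sit idle. With storage moved to a separate platform, the server count follows the compute requirement alone.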
By combining Hadoop with Dell EMC PowerScale, businesses no longer need to worry about how data is stored in their data lake and can focus on developing data analysis applications that create value from their own data. NT&T Solution, an authorized distributor and service partner of Dell Technologies for over 16 years, has a team of professional, experienced engineers with international Dell Technologies certifications for PowerScale solutions, and provides the best quality services for businesses in Vietnam (https://nttsolution.com).