Articles

Applications in the Big Data field process huge amounts of information, often in real time. Naturally, such applications must be highly reliable, so that no error in the code can interfere with data processing. Achieving that reliability means keeping a wary eye on the code quality of projects developed for this area, and the PVS-Studio static analyzer is one way to do so. This time, the test subject for the analyzer is Apache Flink, a project developed by the Apache Software Foundation, one of the leaders in the Big Data software market.

So, what is Apache Flink? It is an open-source framework for distributed processing of large amounts of data, developed in 2010 at the Technical University of Berlin as an alternative to Hadoop MapReduce. The framework is built on a distributed execution engine for batch and streaming data processing applications, written in Java and Scala. Today, Apache Flink can be used from projects written in Java, Scala, Python, and even SQL.
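To give a sense of what working with Flink looks like, here is a minimal sketch of a streaming job using Flink's DataStream Java API. The in-memory input values and the job name are assumptions made purely for illustration; a real deployment would read from a source such as Kafka or files.

```java
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class UppercaseJob {
    public static void main(String[] args) throws Exception {
        // Entry point for any Flink program: the execution environment.
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // A tiny in-memory source standing in for a real stream (hypothetical data).
        DataStream<String> events = env.fromElements("batch", "and", "stream");

        // A simple transformation; Flink distributes such operators across the cluster.
        events.map((MapFunction<String, String>) String::toUpperCase)
              .print();

        // Nothing runs until the job graph is submitted for execution.
        env.execute("uppercase-job");
    }
}
```

The same pipeline shape (source, transformation, sink) applies whether the job processes a bounded batch or an unbounded stream, which is the unifying idea behind Flink's execution engine.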

Source of the article on DZONE

See how to leverage your Db2 skills with Big Data.

The Challenge

The traditional data center, centered on relational database technology, is quickly evolving. Many data sources exist today that did not exist as recently as five years ago. Devices such as active machine sensors on machinery, automobiles, and aircraft, medical sensors, RFID tags, as well as social media and web click-through activity are generating tremendous volumes of mostly unstructured data, which cannot possibly be stored or analyzed in traditional RDBMSs.

These new data sources are pushing companies to explore Big Data and Hadoop architectures, which in turn creates a new set of problems for corporate IT. Hadoop development and administration can be complicated and time-consuming. Writing the complex MapReduce programs needed to mine this data is a specialized skill, so companies must either invest in training their existing personnel or hire people who specialize in MapReduce programming and administration. This is the very reason many enterprises have been hesitant to invest in Big Data applications.
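To illustrate the kind of boilerplate involved, below is a sketch of the canonical word-count job written against Hadoop's MapReduce Java API. Even this minimal example needs a separate mapper, reducer, and driver; the class names and whitespace tokenization are illustrative assumptions, not a prescribed setup.

```java
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Mapper: emits (word, 1) for every token in its input split.
    public static class TokenizerMapper
            extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            for (String token : value.toString().split("\\s+")) {
                word.set(token);
                context.write(word, ONE);
            }
        }
    }

    // Reducer: sums the counts emitted for each word.
    public static class SumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    // Driver: wires the pieces together and submits the job to the cluster.
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(SumReducer.class);
        job.setReducerClass(SumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

Roughly sixty lines of code, plus cluster configuration and job submission, just to count words is precisely the overhead that makes MapReduce development a specialized skill.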

Source of the article on DZONE