Big / Bug Data: Analyzing the Apache Flink Source Code

Applications used in the field of Big Data process huge amounts of information, and this often happens in real time. Naturally, such applications must be highly reliable so that no error in the code can interfere with data processing. To achieve high reliability, one needs to keep a wary eye on the code quality of projects developed for this area. The PVS-Studio static analyzer is one of the solutions to this problem. Today, the Apache Flink project developed by the Apache Software Foundation, one of the leaders in the Big Data software market, was chosen as a test subject for the analyzer.

So, what is Apache Flink? It is an open-source framework for distributed processing of large amounts of data. It was developed as an alternative to Hadoop MapReduce in 2010 at the Technical University of Berlin. The framework is based on the distributed execution engine for batch and streaming data processing applications. This engine is written in Java and Scala. Today, Apache Flink can be used in projects written using Java, Scala, Python, and even SQL.

Source de l’article sur DZONE