Articles

Apache Spark supports many different data formats, such as the ubiquitous CSV format and web-friendly JSON format. Common formats used primarily for big data analytical purposes are Apache Parquet and Apache Avro.

In this post, we’re going to cover the properties of these four formats — CSV, JSON, Parquet, and Avro with Apache Spark.

Source de l’article sur DZONE