A Brief Introduction to Big Data Applications and Hadoop App Development
Big data refers to the set of techniques used to store and/or process large amounts of data.
Usually, big data applications are one of two types: data at rest and data in motion. The difference between them is the same as the difference between a lake and a river—a lake is a place that contains a lot of water while a river is a place where a lot of water is travelling through.
For this reason, data at rest applications are often referred to as data lakes, while data in motion applications are called data streams or data rivers.
Fun Fact
The Río de la Plata, located next to our office in Uruguay, is the widest river in the world, measuring 220 km (140 mi) across 🙂
We’re now going to give an overview of the most widely used big data technologies and tools, some of the concepts and challenges behind them, and a gentle introduction to developing and testing big data applications in a virtualized environment.
For this article, we’ll focus mainly on data at rest applications and on the Hadoop ecosystem specifically. Keep reading at the link!