Figure 14 - Company & Environment Eco-System (Source: Author's Chart)
5.3.1 Apache Hadoop – An Example for Integrating Big Data
Apache Hadoop is a Java-based framework that enables companies to handle large
amounts of data (ranging from tens or hundreds of gigabytes up to petabytes) by
splitting CPU-intensive processes and calculations across several computer
clusters that operate in parallel. How the data is structured makes no
difference, because the final integration into a target system (e.g. an ERP
system) is handled by purpose-written developer code. Many companies, such as
IBM, AOL, Facebook, and Twitter, use Hadoop for log-file monitoring, analysis,
and machine learning. Yahoo, one of Hadoop's co-founders, for example runs more
than 100,000 CPUs distributed across over 40,000 computers (Steinberg, 2012;
Dumbill, 2012).
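The splitting of a calculation across many machines that the text describes follows the MapReduce pattern Hadoop popularized: a map step emits key-value pairs from each input record, a shuffle step groups values by key, and a reduce step aggregates each group. The following single-process Python sketch illustrates the idea only; it is not Hadoop's API, and all function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are illustrative inventions.

```python
from collections import defaultdict
from itertools import chain

def map_phase(record):
    # Map step: emit a (word, 1) pair for every word in the input record,
    # regardless of how the raw data is structured.
    for word in record.split():
        yield (word.lower(), 1)

def shuffle(pairs):
    # Shuffle step: group all emitted values by their key. In a real
    # Hadoop cluster this grouping happens across the network.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Reduce step: aggregate the grouped values for one key.
    return (key, sum(values))

def run_job(records):
    # On a cluster, map and reduce tasks would run in parallel on
    # different nodes; here they run sequentially in one process.
    mapped = chain.from_iterable(map_phase(r) for r in records)
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())

counts = run_job(["Hadoop splits work", "Hadoop scales"])
# counts → {"hadoop": 2, "splits": 1, "work": 1, "scales": 1}
```

Because each map call touches only its own record and each reduce call only its own key, the framework can schedule these calls on thousands of machines independently, which is what lets Hadoop scale to cluster sizes like the one cited for Yahoo.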