Hadoop Matters Blog

Big Industries' blog

Running Cloudera on Microsoft Azure

In recent years, the cloud has offered a scalable, versatile, and efficient environment for big data workloads. In this blog post we explore how companies are deploying Cloudera Enterprise to run production Hadoop workloads (ETL, BI, advanced analytics, Spark) on Microsoft Azure, with enterprise-class data security and governance.

Read More →

Cloudera User Group: Big Data analytics

Belgium Cloudera User Group Meetup

Big Industries, as main sponsor of the Belgium Cloudera User Group, organised a Meetup on Wednesday, May 31st, 2017, in our offices at Cronos in Kontich, with Big Data Analytics as the central topic.

Read More →

Apache Spark market survey

[Image: top use cases for Spark]

Read More →

Why Do I Need a Data Lake?

What is a data lake?

A Data Lake is an enterprise-wide system for storing and analyzing disparate sources of data in their native formats. A Data Lake might combine sensor data, social media data, click-streams, location data, log files, and much more with traditional data from existing RDBMSes. The goal is to break the information silos in an enterprise by bringing all the data into a single place for analysis without the restrictions of schema, security, or authorization. Data Lakes are designed to store vast amounts of data, even petabytes, in local or cloud-based clusters consisting of commodity hardware.

Read More →

Data Governance in Hadoop environments

Belgium Cloudera User Group Meetup

Big Industries, as main sponsor of the Belgium Cloudera User Group, organised a Meetup on Wednesday, February 8th, 2017, in our offices at Cronos in Kontich, with Data Governance in Hadoop Environments as the central topic.

Read More →

Five common Hadoopable problems

Apache Hadoop has evolved into the standard platform solution for data storage and analysis. Large, successful companies are increasingly adopting Hadoop to perform powerful analysis of their ever-growing business data.

Two key aspects of Hadoop have driven its rapid adoption by companies hungry for improved insights into the data they collect:

  • Hadoop can store data of any type and from any source - inexpensively and at very large scale.
  • Hadoop enables the sophisticated analysis of even very large data sets, easily and quickly.

Read More →