Hadoop Matters Blog

Big Industries' blog

Running Cloudera on Microsoft Azure

In recent years, the cloud has offered a scalable, versatile, and efficient environment for big data workloads. In this blog post, we explore how companies are deploying Cloudera Enterprise to run production Hadoop workloads (ETL, BI, advanced analytics, Spark) on Microsoft Azure, with enterprise-class data security and governance.


How to read compressed MNIST data when using TensorFlow on Hadoop


Our team at Big Industries has recently been working on implementing the TensorFlow open-source deep learning system on a Cloudera Hadoop cluster. We found that one of the challenges was reading the compressed MNIST data files from the Hadoop Distributed File System (HDFS). The example code that ships with TensorFlow assumes that the GZIP-compressed files reside on a local filesystem.
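
As a rough illustration of the idea, and not the code from the post itself, the sketch below parses a GZIP-compressed MNIST image file from any binary file-like object instead of from a local path. The function name `read_idx_images` is hypothetical, and how the stream is opened against HDFS is left to whichever HDFS client is in use; that part is an assumption and is not shown.

```python
import gzip
import struct

# Magic number that identifies an MNIST IDX image file (0x00000803).
IDX_IMAGE_MAGIC = 2051


def read_idx_images(stream):
    """Parse a GZIP-compressed MNIST image file from a binary stream.

    `stream` is any binary file-like object: a local file opened with
    open(path, "rb"), or (assumption) a handle returned by an HDFS client.
    Returns (images, rows, cols), where each image is a bytes object of
    rows * cols grayscale pixel values.
    """
    with gzip.GzipFile(fileobj=stream) as gz:
        # The header holds four big-endian unsigned 32-bit integers.
        magic, count, rows, cols = struct.unpack(">IIII", gz.read(16))
        if magic != IDX_IMAGE_MAGIC:
            raise ValueError("not an MNIST image file: magic=%d" % magic)
        size = rows * cols
        data = gz.read(count * size)
    if len(data) != count * size:
        raise ValueError("file is truncated")
    # Slice the flat pixel buffer into one bytes object per image.
    images = [data[i * size:(i + 1) * size] for i in range(count)]
    return images, rows, cols
```

Because the function only depends on a readable binary stream, the same parsing logic applies whether the bytes come from a local disk or from a remote filesystem.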


Cloudera User Group: Big Data analytics




Belgium Cloudera User Group Meetup

Big Industries, as main sponsor of the Belgium Cloudera User Group, organised a Meetup on Wednesday, May 31st, 2017 at our offices at Cronos in Kontich, with Big Data Analytics as the central topic.


Apache Spark market survey


Infrastructure as Code: Managing Servers in the Cloud


Big Data Architectures: beyond Hadoop


