Big Industries' blog
In recent years, the cloud has offered a scalable, versatile, and efficient environment for big data workloads. In this blog post we explore how companies are deploying Cloudera Enterprise to run production Hadoop workloads (ETL, BI, advanced analytics, Spark) on Microsoft Azure, with enterprise-class data security and governance.
Our team at Big Industries has recently been working on implementing the TensorFlow open-source deep learning system on a Cloudera Hadoop cluster. We found that one of the challenges was reading the compressed MNIST data files from the Hadoop Distributed File System (HDFS). The example code that ships with TensorFlow assumes that the GZIP-compressed files reside on a local filesystem: