Analogue.cloud transforms your data investments into actionable business results. We partner with Amazon Web Services (AWS) and cover Belgium and Luxembourg. Our expertise ranges from system architecture and design, application development, deployment and security, data integration and business intelligence, through to applied machine learning, with a focus on Big Data, fast data and large-scale technology solutions that fully leverage the AWS Data Lakes and Analytics stack.
Our team of AWS-accredited and certified consultants, along with our proven project management methodology, provides a trusted combination for your next Data Lake initiative.
We Love Big Data
Big Data means lots of data in all kinds of forms: raw data, structured data, streaming data, and more. Many use cases drive the need for next-generation Data Lakes that can handle large volumes of (streaming) data, such as sensor readings, smart meter data and machine logs, combined with structured data offloaded from traditional Data Warehouses.
In short: Customers come to us when they want to start their Data Lake Journey.
In the current big data world, two main roles stand out for junior profiles: a DevOps role and a Data Engineering role. During the internship you will be immersed in both roles to give you a broad overall experience.
The assignment revolves around an end-to-end project that you will develop and deliver. The first step is creating a Data Lake and ingesting a large open dataset, which can be achieved with both streaming and batch technologies. While setting up the Data Lake infrastructure you will encounter several AWS networking tools as well as the concepts of serverless computing and streaming. The next step focuses on the data engineering role: you will explore, clean, structure and analyze the data. In a third step you will visualize the data.
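To give a flavour of the streaming ingestion step, here is a minimal sketch of a Lambda-style handler that decodes incoming Kinesis records before they are landed in the Data Lake. The event shape follows the standard Kinesis-to-Lambda integration; the JSON field names in the example are purely illustrative, and the actual write to S3 is left as a comment:

```python
import base64
import json


def decode_kinesis_records(event):
    """Decode the base64-encoded payloads of a Kinesis event.

    Kinesis delivers records to Lambda as base64-encoded blobs under
    event["Records"][i]["kinesis"]["data"]; here we assume each payload
    is a JSON document (field names are illustrative).
    """
    payloads = []
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))
    return payloads


def handler(event, context):
    """Lambda entry point: decode the batch and report how many records were seen."""
    payloads = decode_kinesis_records(event)
    # In the real project you would write these payloads to S3 here
    # (e.g. with boto3's s3.put_object), partitioned by arrival date.
    return {"processed": len(payloads)}
```

In the actual assignment the handler would of course do more than count records, but the decode-then-land pattern is the core of most serverless streaming pipelines.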
In the assignment there is room for improvisation and we would definitely love to see some personal touches in the project. The solution you develop will form the basis for a technical blog post on our website. In this way you can show off your work to the big data world.
During this internship you have the chance to work with cutting-edge technologies that matter today. You are going to use the AWS stack as the basis of your project. An overview of the technologies and tools you are going to interact with:
- Networking: VPC, Subnet, Security Groups
- Serverless: Lambda
- Data lake: S3
- Streaming: Kinesis
- Data exploration: Glue
- Data Analytics: Athena
- Data Visualization: QuickSight
- Data analytics/Batch processing: EMR (Hadoop/Spark/Zeppelin/PySpark)
- Monitoring: CloudWatch | SES
- Version control: Git (Gitlab)
- CI/CD: Gitlab Pipelines, Ansible, CloudFormation
The technologies and tools listed above are a good indication of what you are going to use, but they are definitely not set in stone. There is always room for alternatives, and your opinion is also important to us.
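As a small taste of how S3, Glue and Athena fit together, here is a sketch of how ingested records might be mapped to date-partitioned S3 object keys. The prefix and record id are purely illustrative; the actual layout is something you would design during the project:

```python
from datetime import datetime, timezone


def partitioned_key(prefix, event_time, record_id):
    """Build a Hive-style partitioned S3 object key (year=/month=/day=),
    the layout convention that Glue crawlers and Athena recognize as
    table partitions."""
    return (
        f"{prefix}/year={event_time.year}"
        f"/month={event_time.month:02d}"
        f"/day={event_time.day:02d}"
        f"/{record_id}.json"
    )


# A record ingested on 5 March 2024 would land under a key like this:
key = partitioned_key("raw", datetime(2024, 3, 5, tzinfo=timezone.utc), "abc123")
# key == "raw/year=2024/month=03/day=05/abc123.json"
```

Partitioning the lake this way keeps Athena queries cheap, because a query filtered on year/month/day only scans the matching prefixes.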
In our vision, the goal of an internship is to create a real-world working experience. To achieve this, you are going to deliver an end-to-end project that is approached in an agile way. This means that you, in close collaboration with your mentor, will tackle the project in sprints. Each sprint starts with a sprint planning, where we define the user stories and goals we want to achieve, and ends with a sprint retrospective.
What do we expect from you?
Interns are expected to be highly motivated, with a healthy appetite for big data and problem solving.
We expect you to be a self-starter who can think outside the box and manage a small project. And don't forget: the most important thing is that you learn a lot yourself.