In this course, you will learn about data engineering patterns and practices for working with batch and real-time analytical solutions. You will begin with the core compute and storage technologies used to build an analytical solution. You will explore how to design analytical serving layers and focus on data engineering considerations for working with source files. You will learn how to interactively explore data stored in files in a data lake. You will learn ingestion techniques used to load and transform data using the Apache Spark capability found in Azure Synapse Analytics or Azure Databricks, or using Azure Data Factory or Azure Synapse pipelines. You will learn how to monitor and analyze the performance of an analytical system in order to optimize data loads and queries issued against the system. You will understand the importance of implementing security to ensure data is protected at rest and in transit. Finally, you will learn how data in an analytical system can be used to create dashboards or build predictive models in Azure Synapse Analytics.
The primary audience for this course is data professionals, data architects, and business intelligence professionals who want to learn about data engineering and building analytical solutions using data platform technologies that exist on Microsoft Azure. The secondary audience for this course is data analysts and data scientists who work with analytical solutions built on Microsoft Azure.
Prerequisites
Successful students start this course with knowledge of cloud computing and core data concepts, as well as professional experience with data solutions.
The course cost listed does not include the cost of courseware. The course is subject to a minimum enrollment to run and may run as a Virtual Instructor-Led Training (VILT) class if minimum enrollment is not met. For more information, please email learn@vtec.org or call 207-775-0244.