Module1: Getting Started
Section Introduction
What is Big Data and Time Before Big Data
What is Hadoop
Benefits of Hadoop
Architecture of Hadoop and HDFS
YARN Architecture and Workflow
MapReduce and Its Drawbacks and Challenges
Module2: Spark Internals
Section Introduction
Introduction to Spark
Spark Terminology
Explanation of Spark Internal Architecture
Life cycle of Spark Application
Features of Spark
Spark Eco-System
Explanation of Spark and Databricks History
Explanation of Databricks
Module3: Setup Your Environment
Section Introduction
Creating Azure Free Account
Creating Azure Free Account Lab
Azure Portal Overview
Creating Azure Databricks Service or Databricks Workspace
Creating Databricks Community Edition Account
Module4: Introduction to Azure Databricks
Section Introduction
Azure Databricks Architecture
Explanation of Cluster and Its Types
Cluster Configurations and Modes
Access Modes and Cluster Policies
Creating Azure Databricks Cluster
Azure Databricks Cluster Pools
Azure Databricks Pricing Structure
Introduction to Databricks Notebook and Creating Notebook
Introduction to Markdown and Creating Markdown
Introduction to DBFS
Databricks Utilities (dbutils)
Demo - Databricks Utilities (dbutils)
Demo 02 - Databricks Utilities (dbutils) Notebook
Hive Metastore
Module5: RDD, DataFrame and Dataset
Section Introduction
What are RDD, DataFrame and Dataset
Spark SQL Engine or Catalyst Optimizer
Reading Files in Databricks
Demo-Read Data in Spark
InferSchema
Programmatic Schema Definition
DDL Schema
Demo-Schema Creation and Operations
Demo-Handling Corrupted Records
Module6: File Formats
Section Introduction
Row and Column Based Format
CSV Format
Demo-Read CSV File
JSON Format
Demo-Reading JSON File
Demo-Flatten a Nested JSON File
Parquet Format
Demo-Reading Parquet File
AVRO Format
Demo-Reading AVRO File
ORC Format
Demo-Reading ORC File
Difference between File Formats
Module7: Transformations and Actions Operations
01-Section Introduction
02-Introduction to Transformations and its Types
03-What are Jobs, Stages and Tasks
04-Transformation Create Two DataFrame
05- Transformation Filter
06-Transformation Filter Equality
07-Transformation Filter AND
08-Transformation Filter OR
09-Transformation Filter Startswith and Endswith
10-Transformation Filter Contains
11- Transformation Filter IsNull
12-Transformation Filter IsNotNull
13- Transformation Filter IsIn
14- Transformation Filter Inequality
15- Transformation Filter Like
16-Transformation SELECT
17- Transformation DROP
18- Transformation WithColumn
19- Transformation WithColumnRenamed
20- Transformation WithColumn Concatenate with Separator and Lit
21- Transformation Union and UnionAll
22- Transformation OrderBy
23- Transformation Sort Function
Demo-Basic Transformation Operations
Demo-Basic Transformation Operations 2
24- Transformation Joins
Demo-Joins
Demo-Union and UnionAll
25-Transformation Aggregate functions Count
26- Transformation GroupBy and Aggregations (SUM and AVG)
27- Transformation GroupBy and Aggregations (Count MIN and MAX)
Demo-Aggregate Functions
28- Transformation Distinct
29- Transformation Window Functions and Their Types
30- Transformation Window function Ranking Functions
Demo-Window Functions (Ranking Functions)
31- Transformation Window Function Analytic Functions (LAG)
32- Transformation Window Functions Analytic Functions (Example)
Demo-Window Functions (Analytics Functions)
33- Transformation Window Function Aggregate Functions (First and Last Value)
Demo-Window Functions (Aggregate Functions)
34- Transformation Fillna
35- Transformation Date and Timestamp Function 01
36- Transformation Date and Timestamp Function 02
37- Transformation Date and Timestamp Function 03
Demo-Date and Time Functions
38- Transformation String Manipulations 1
39- Transformation String Manipulations 2 (Concat)
40- Transformation String Manipulations 3 (Contains)
41- Transformation String Manipulations 4 (Startswith)
42- Transformation String Manipulations 5 Endswith
43- Transformation String Manipulations 6 (Initcap Upper and Lower)
44- Transformation String Manipulations 7 (Substring)
45- Transformation String Manipulations Length
46- Transformation String Manipulations (TRIM Ltrim and Rtrim)
47- Transformation String Manipulations (REGEX_EXTRACT)
48- Transformation String Manipulations (REGEX_REPLACE)
49- Transformation String Manipulations (RPAD)
Demo-String Manipulation
50- Transformation Pivot and Unpivot
Demo-Pivot and Unpivot
52- Transformation (Transform)
Demo-Transform Function
53-Transformation (Explode)
Demo-Explode Function
Module8: Widgets and Parameterization
Section Introduction
Introduction to Widgets Parameterization and Its Types
Demo-Widgets
Module9: Design and Implement Lakehouse Storages
Section Introduction
Introduction to Data Lake and Architecture
Introduction to Azure Storage Account
ADLS Integration with Databricks
Intro to ADLS access by Creating a Secret Scope
Demo-Access ADLS by Creating a Secret Scope
Mount Point using ADLS Access Key
Demo-Creating Mount Point using ADLS Access Key
Demo-ADLS Access by Direct Storage Key
ADLS Access By SAS Token Fourth Method
Demo-ADLS Access by SAS Token
Module10: Structured Streaming and AutoLoader
Section Introduction
Introduction to Spark Streaming and Standard Architecture
Spark Structured Streaming Architecture
Spark Structured Streaming Internal Working1
Demo-Spark Streaming
Types of Windows
Watermarking
Auto Loader Use Case
Auto Loader Architecture
Auto Loader Directory Listing Mode
Demo-Auto Loader Directory Listing
Auto Loader File Notification Mode
Demo-File Notification Mode
Schema Evolution
Schema Evolution (AddNewColumns)
Schema Evolution (Rescue)
Schema Evolution (FailOnNewColumns)
Schema Evolution (None)
Schema Inference
Demo-Schema Inference and Evolution
Module11: Deep Dive into Delta Lake
Section Introduction
Concept of Delta Lake and Its Advantages
Difference Between Data Warehouse, Data Lake and Delta Lake
Creating Delta Table (PySpark Method)
SQL Method
DataFrame Method
Demo-Delta Create Delta Tables
Features of Delta Table
Internal Architecture of Delta Table
Demo-Delta Internal Architecture
Spark Tables
Demo-Delta Spark Tables
Schema Evolution and Enforcement
Demo-Delta Schema Evolution
Time Travel and Data Versioning
Demo-Delta Time Travel and Versioning
Optimize Command
Demo-Delta Optimize
Restore Command
Demo-Delta Restore
Vacuum Command
Demo-Delta Vacuum
Merging Data into Delta Tables
Demo-Delta Upsert
Write Modes
Demo-Delta Write Modes
Updating Delta Tables
Delete Operations
Demo-Delta Delete
Module12: Databricks Performance Optimization Techniques
01-Section Introduction
02-Join Strategy
03-Types of Join Strategy
04-Broadcast Hash Join
05-Shuffle Hash Join
06-Shuffle Sort-Merge Join
07-Cartesian Product Join
08-Broadcast Nested Loop Join
09.1-Introduction to Adaptive Query Execution and Types
10.1-Dynamically Coalesce Shuffle Partition
11.1-Dynamically Switching Joins
12-Dynamically Optimizing Skew Join
13-Repartition and Coalesce
14-Coalesce
15-Introduction to Data Caching and Persist
16-Memory_Only Storage Level
17-Memory_Only_Ser and Memory_And_Disk
18-Memory_And_Disk_Ser and Disk_Only
19-Explanation of Shared Variables and Types
20-Accumulator Variable
21-Explanation of Spark Memory Management
22-Executor Memory
23-Spark Memory Manager and Types
24-Mitigation Strategies
25-PartitionBy
26-Partition Pruning and Predicate Pushdown
27-Dynamic Partition Pruning
Module13: Workflow
01-Section Introduction
02-Introduction to Azure Databricks Workflow
Demo-Workflow
Module14: Delta Live Tables
01-Section Introduction
02-Introduction to Delta Live Tables
03-Medallion Lakehouse Architecture
04-Delta Live Tables Architecture
05-Delta Live Tables Datasets (Streaming Tables)
06-Delta Live Tables (Materialized View)
07-Delta Live Tables (Views)
08-How to Create Delta Live Tables
09-When to use Datasets ST MV View
10-Explanation of Data Quality
11-Different Type Invalid Record Actions
12-Explanation of Change Data Capture (CDC)
Module15: Unity Catalog
01-Section Introduction
02-Introduction to Unity Catalog
03-Query Data in Unity Catalog
04-Volumes in Unity Catalog
05-Key Features of Unity Catalog
06-Administrative Roles
07-Unity Catalog Object Model
08-How to Enable Unity Catalog
09-Limitations of Unity Catalog
10-Storage Credentials and External Locations
11-Capture and View Lineage
Preview - Azure Databricks