Last updated 6 months ago
Databricks is ACID-compliant
Databricks Learning Library
Free courses
Docs: Optimization recommendations
Comprehensive Guide to Optimize Databricks, Spark and Delta Lake Workloads
Vector Search
Databricks for ML
Databricks for Data Engineering
(good) Introduction & Tutorial - cluster / notebook / table / SQL / DataFrame / connections
7 must-know concepts
RDD vs DataFrame vs Dataset
2016 official blog post
LinkedIn blog post
Comparison on YouTube
RDDs vs. Dataframes vs. Datasets – What is the Difference and Why Should Data Engineers Care?
Optimizations
Optimization recommendations on Databricks
How I Use Caching in Databricks to Increase Performance and Save Costs
Why and How: Partitioning in Databricks
Best Practices
official docs