Databases
Lakes & Warehouses
DataLakeHouse - Firebolt comparison with Snowflake vs Databricks.
Delta Lake is a data lake that can store raw unstructured, semi-structured, and structured data. When combined with Delta Engine it becomes a data lakehouse.
What is Snowflake, 2 - Snowflake decouples the storage and compute functions, which means organizations that have high storage demands but less need for CPU cycles, or vice versa, don't have to pay for an integrated bundle that covers both. Users can scale up or down as needed and pay only for the resources they use.
Data Mart
Talend on data marts + 3 types (dependent, independent, hybrid)
(good) NetSuite on data marts - the three types ^ + structures (star, snowflake, denormalized) + comparisons
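To make the "star" structure mentioned above concrete, here is a minimal sketch using Python's built-in `sqlite3`. It is not taken from the linked articles: the table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the sample rows are all hypothetical. It only illustrates the general shape, one central fact table holding measures plus foreign keys, surrounded by small dimension tables.

```python
import sqlite3


def build_star_schema(conn):
    """Create a tiny, hypothetical star schema and load sample rows."""
    conn.executescript(
        """
        -- Dimension tables: descriptive attributes, one row per entity.
        CREATE TABLE dim_product (
            product_id INTEGER PRIMARY KEY,
            name       TEXT,
            category   TEXT
        );
        CREATE TABLE dim_date (
            date_id INTEGER PRIMARY KEY,
            year    INTEGER,
            month   INTEGER
        );
        -- Fact table: one row per sale, with keys into each dimension.
        CREATE TABLE fact_sales (
            sale_id    INTEGER PRIMARY KEY,
            product_id INTEGER REFERENCES dim_product(product_id),
            date_id    INTEGER REFERENCES dim_date(date_id),
            amount     REAL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_product VALUES (?, ?, ?)",
        [(1, "laptop", "hardware"), (2, "ide license", "software")],
    )
    conn.executemany(
        "INSERT INTO dim_date VALUES (?, ?, ?)",
        [(1, 2023, 1), (2, 2023, 2)],
    )
    conn.executemany(
        "INSERT INTO fact_sales VALUES (?, ?, ?, ?)",
        [(1, 1, 1, 1000.0), (2, 2, 1, 200.0), (3, 1, 2, 1500.0)],
    )
    conn.commit()


def revenue_by_category(conn):
    """Typical mart query: join the fact table to one dimension and aggregate."""
    rows = conn.execute(
        """
        SELECT p.category, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.category
        ORDER BY p.category
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    print(revenue_by_category(connection))
```

A snowflake structure would further normalize the dimensions (for example, moving `category` into its own table), while a denormalized structure would fold the dimension columns into the fact table.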
Data Lake
Monitoring health status at scale using Great Expectations and Spark
Comparisons
Snowflake vs Delta Lake vs Firebolt - "Databricks Delta Lake and Delta Engine is a lakehouse. You choose it as a data lake, and for data lakehouse-based workloads including ELT for data warehouses, data science and machine learning, even static reporting and dashboards if you don't mind the performance difference and don't have a data warehouse.
Most companies still choose a data warehouse like Snowflake, BigQuery, Redshift or Firebolt for general-purpose analytics over a data lakehouse like Delta Lake and Delta Engine because they need performance.
But it doesn't matter. You need more than one engine. Don't fight it. You will end up with multiple engines for very good reasons. It's just a matter of when."
Snowflake Intro and demo
Use Cases
Hunters on their architecture: Airflow, Snowflake, Snowpipe, Flink, RocksDB, cluster optimization during ingestion, monitoring metrics, cost.
Snowflake
Getting started with Snowflake tasks - SQL statements or stored procedures, schedules, task trees.
ClickHouse
Open-source database for real-time apps and analytics - think of it as an open-source Snowflake
Feature engineering
Data Lake Table Formats
Apache Iceberg
Databricks Delta Lake
Vector Databases
Gartner - Innovation Insight: Vector Databases
Chroma - AI native open source embedding database, github
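As a rough illustration of what an embedding database does at its core, the sketch below stores vectors under keys and returns the nearest ones by cosine similarity. It is a brute-force toy under stated assumptions, not Chroma's API: `TinyVectorStore`, its methods, and the two-dimensional sample vectors are hypothetical, and production vector databases typically use approximate nearest-neighbor indexes rather than scanning every item.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class TinyVectorStore:
    """Hypothetical in-memory store: exact search by scanning all vectors."""

    def __init__(self):
        self._items = {}

    def add(self, key, vector):
        # Store a copy so later changes to the caller's list do not affect the store.
        self._items[key] = list(vector)

    def query(self, vector, k=1):
        # Score every stored vector, then keep the k most similar keys.
        scored = [
            (cosine_similarity(vector, stored), key)
            for key, stored in self._items.items()
        ]
        scored.sort(reverse=True)
        return [key for _, key in scored[:k]]


if __name__ == "__main__":
    store = TinyVectorStore()
    # Made-up 2-D "embeddings"; real embeddings have hundreds of dimensions.
    store.add("cat", [1.0, 0.0])
    store.add("dog", [0.9, 0.1])
    store.add("car", [0.0, 1.0])
    print(store.query([1.0, 0.05], k=2))
```

The linear scan costs one similarity computation per stored vector per query, which is the cost that dedicated vector databases are designed to avoid at scale.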