IBM CDC Databricks ETL pipeline

Bronze → Silver medallion architecture · IBM - CDC.ipynb · Delta Lake

IBM stock API: JSON · daily OHLCV data

Bronze layer: workspace.bronze.ibm · append-only · Delta
  Watermark check: MAX(Date) in bronze · skip already-seen rows
  Append new records: JSON → tabular · full history retained · new dates only

Silver layer: workspace.silver.ibm · latest-per-Date · Delta · reads bronze
  Clean & validate: cast types · drop nulls · de-duplicate on Date
  Delta MERGE (key: Date): insert new · update existing · curated rows

Libraries: requests · pandas · numpy · pyspark · delta
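
The flow above can be sketched in plain Python to make the watermark and merge semantics concrete. This is only an illustration, not the notebook's code: an in-memory list and dict stand in for the bronze and silver Delta tables, and the function names, the payload shape (a list of records), the column names other than Date, and the rule that the last duplicate wins are all assumptions.

```python
from datetime import date

# Hypothetical column names; only "Date" appears in the pipeline description.
PRICE_FIELDS = ("Open", "High", "Low", "Close")
FIELDS = ("Date",) + PRICE_FIELDS + ("Volume",)


def append_to_bronze(bronze, payload):
    """Watermark check + append: keep only rows newer than MAX(Date) in bronze."""
    # ISO "YYYY-MM-DD" strings sort chronologically, so string comparison is enough.
    watermark = max((row["Date"] for row in bronze), default=None)
    new_rows = [dict(r) for r in payload
                if watermark is None or r["Date"] > watermark]
    bronze.extend(new_rows)  # append-only: existing rows are never rewritten
    return len(new_rows)


def clean(rows):
    """Cast types, drop rows with nulls, and de-duplicate on Date."""
    latest = {}
    for r in rows:
        if any(r.get(k) is None for k in FIELDS):
            continue  # drop nulls
        try:
            cleaned = {"Date": date.fromisoformat(r["Date"]).isoformat()}
            cleaned.update({k: float(r[k]) for k in PRICE_FIELDS})
            cleaned["Volume"] = int(r["Volume"])
        except (TypeError, ValueError):
            continue  # drop rows whose values cannot be cast
        latest[cleaned["Date"]] = cleaned  # assumption: last row seen per Date wins
    return latest


def merge_into_silver(silver, cleaned):
    """Upsert keyed on Date: update existing dates, insert new ones."""
    inserted = updated = 0
    for day, row in cleaned.items():
        if day in silver:
            updated += 1
        else:
            inserted += 1
        silver[day] = row
    return inserted, updated
```

Running `append_to_bronze` twice with the same payload appends nothing the second time, which is the point of the watermark. On Delta tables, the last step would typically be a `MERGE INTO` statement matching on `Date`, with a matched-update clause and a not-matched-insert clause.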