Where do we store data?
Sagacity primarily stores data in AWS S3, a cloud object storage service. S3 can hold a wide variety of objects: in addition to raw data files, Databricks uses S3 to store metadata about workspaces, tables, and more.
Databricks accesses S3 through Unity Catalog storage credentials and external locations, which define where the data files backing tables and volumes are stored. Databricks does not require migrating data into a proprietary storage system; instead, it integrates with our existing S3 storage, so we manage and process data directly within our AWS account.
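As a rough illustration of that configuration, the sketch below builds the Databricks SQL statement that registers an S3 path as a Unity Catalog external location. The location name, bucket path, and credential name are hypothetical placeholders rather than Sagacity's actual values, and the helper function exists only for this example.

```python
def external_location_sql(name: str, s3_url: str, credential: str) -> str:
    """Build a CREATE EXTERNAL LOCATION statement for Unity Catalog.

    All three arguments are hypothetical placeholders for illustration.
    """
    # External locations backed by S3 must point at an s3:// URL.
    if not s3_url.startswith("s3://"):
        raise ValueError(f"expected an s3:// URL, got {s3_url!r}")
    return (
        f"CREATE EXTERNAL LOCATION IF NOT EXISTS `{name}` "
        f"URL '{s3_url}' "
        f"WITH (STORAGE CREDENTIAL `{credential}`)"
    )


# Hypothetical names; substitute the real bucket and credential.
statement = external_location_sql(
    name="example_raw_data",
    s3_url="s3://example-bucket/raw",
    credential="example_s3_credential",
)
print(statement)
```

In a Databricks notebook, a user with sufficient privileges could run the result with `spark.sql(statement)`. The storage credential itself must already exist, since it is what grants access to the bucket.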
This setup supports the Databricks Lakehouse architecture, in which Delta Lake files stored in S3 provide the data foundation. Ephemeral block storage holds temporary data during compute operations, but long-term data storage is managed through S3.
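To make "Delta Lake files stored in S3" a little more concrete: a Delta table is a directory of Parquet data files plus a `_delta_log/` folder of transaction log files. The sketch below sorts object keys into those groups. The keys and the `classify_key` helper are hypothetical, made up for illustration, and not taken from any Sagacity bucket.

```python
import re

# Delta commit files are 20-digit, zero-padded version numbers,
# e.g. _delta_log/00000000000000000000.json
COMMIT_FILE = re.compile(r"(^|/)_delta_log/\d{20}\.json$")


def classify_key(key: str) -> str:
    """Label an S3 object key by its role in a Delta table (illustrative only)."""
    if COMMIT_FILE.search(key):
        return "commit"
    # Anything else under _delta_log/ (checkpoints and similar) is log metadata.
    if key.startswith("_delta_log/") or "/_delta_log/" in key:
        return "log-other"
    if key.endswith(".parquet"):
        return "data"
    return "other"


# Hypothetical keys under a table's S3 prefix.
example_keys = [
    "sales/orders/_delta_log/00000000000000000000.json",
    "sales/orders/_delta_log/00000000000000000010.checkpoint.parquet",
    "sales/orders/part-00000-abc.snappy.parquet",
    "sales/orders/_SUCCESS",
]
labels = [classify_key(k) for k in example_keys]
print(labels)
```

The log is checked before the `.parquet` suffix because checkpoint files inside `_delta_log/` are also Parquet but are not table data.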
See Also: