Snowflake Architecture and Warehouse
Like most engineers and data folks hearing about Snowflake for the first time, I thought it was a schema for relating fact and dimension tables XD.
Snowflake is a cloud-hosted data warehouse built specifically for cloud platforms such as Azure, AWS and GCP. In simple words, it is a data warehouse that runs in the data centers of these three cloud providers.
This blog will help you understand the Snowflake warehouse architecture and how to create a virtual warehouse in Snowflake. A series of blogs on other Snowflake topics is coming soon.
Architecture
Snowflake is divided into three layers: Storage, Query Processing and Cloud Services.
Storage: hybrid columnar storage, saved in blobs
Query Processing: performs MPP (massively parallel processing) through virtual warehouses
Cloud Services: manages infrastructure, access control, security, monitoring, the query optimizer, etc.
Virtual Warehouse
A virtual warehouse lives in the Query Processing layer and handles query execution using massively parallel processing (MPP).
It is not a data warehouse. It is a cluster, or group, of compute resources that processes queries quickly and can be sized as per requirement.
Virtual warehouses in Snowflake come in different sizes.
The smallest is XS and the largest covered here is 4XL. Each size doubles the compute, and the credits billed per hour, of the one before it:
XS: 1
S: 2
M: 4
L: 8
XL: 16
2XL: 32
3XL: 64
4XL: 128
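To make the doubling concrete, here is a minimal Python sketch that turns the size list above into a rough credit estimate. It assumes the number next to each size is the credits billed per hour for one cluster. The function name credits_used is made up for this example, and real billing has details (such as per-second metering) that are ignored here.

```python
# Sizes and relative compute copied from the list above.
# Assumption: each number is the credits billed per hour for one cluster.
WAREHOUSE_CREDITS_PER_HOUR = {
    "XS": 1, "S": 2, "M": 4, "L": 8,
    "XL": 16, "2XL": 32, "3XL": 64, "4XL": 128,
}


def credits_used(size: str, hours: float, clusters: int = 1) -> float:
    """Rough estimate of credits for `clusters` clusters of `size` running for `hours`."""
    if size not in WAREHOUSE_CREDITS_PER_HOUR:
        raise ValueError(f"unknown warehouse size: {size!r}")
    if hours < 0 or clusters < 1:
        raise ValueError("hours must be >= 0 and clusters must be >= 1")
    return WAREHOUSE_CREDITS_PER_HOUR[size] * hours * clusters


# A Medium warehouse running for two and a half hours.
print(credits_used("M", 2.5))
```

Moving one size up doubles the estimate for the same running time, which is why it is worth checking whether a bigger warehouse actually finishes the work faster.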
Steps to create Virtual Warehouse in Snowflake:
Step 1: Connect to your Snowflake account.
Step 2: Click on the Warehouses icon.
Step 3: Click Create to create a virtual warehouse. A pop-up window will appear; fill in the required information and click Finish.
Step 4: You can configure your virtual warehouse by clicking the Configure button next to Create.
Step 5: Next to Configure, there are Suspend and Resume buttons to manually stop and start the cluster. The Drop button deletes the warehouse.
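The same create, suspend, resume and drop actions can also be done with SQL instead of the buttons. Below is a small Python sketch that only builds the statements as strings; it does not connect to Snowflake. The helper names and the warehouse name demo_wh are made up for this example, and the default settings (auto-suspend after 300 seconds, created in a suspended state) are my assumptions, not values from the UI steps above.

```python
import re

# Maps the short size names used in this blog to Snowflake's size keywords.
SIZE_KEYWORDS = {
    "XS": "XSMALL", "S": "SMALL", "M": "MEDIUM", "L": "LARGE",
    "XL": "XLARGE", "2XL": "XXLARGE", "3XL": "XXXLARGE", "4XL": "X4LARGE",
}


def _check_name(name: str) -> str:
    """Allow only simple unquoted identifiers so the SQL string stays safe."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_$]*", name):
        raise ValueError(f"invalid warehouse name: {name!r}")
    return name


def create_warehouse_sql(name: str, size: str = "XS",
                         auto_suspend_seconds: int = 300,
                         auto_resume: bool = True) -> str:
    """Build a CREATE WAREHOUSE statement (Step 3 in the UI)."""
    _check_name(name)
    if size not in SIZE_KEYWORDS:
        raise ValueError(f"unknown warehouse size: {size!r}")
    return (
        f"CREATE WAREHOUSE IF NOT EXISTS {name} "
        f"WITH WAREHOUSE_SIZE = '{SIZE_KEYWORDS[size]}' "
        f"AUTO_SUSPEND = {int(auto_suspend_seconds)} "
        f"AUTO_RESUME = {'TRUE' if auto_resume else 'FALSE'} "
        f"INITIALLY_SUSPENDED = TRUE"
    )


def suspend_sql(name: str) -> str:
    """Equivalent of the Suspend button."""
    return f"ALTER WAREHOUSE {_check_name(name)} SUSPEND"


def resume_sql(name: str) -> str:
    """Equivalent of the Resume button."""
    return f"ALTER WAREHOUSE {_check_name(name)} RESUME"


def drop_sql(name: str) -> str:
    """Equivalent of the Drop button."""
    return f"DROP WAREHOUSE IF EXISTS {_check_name(name)}"


for statement in (create_warehouse_sql("demo_wh", "M"),
                  suspend_sql("demo_wh"),
                  resume_sql("demo_wh"),
                  drop_sql("demo_wh")):
    print(statement + ";")
```

The printed statements can be pasted into a Snowflake worksheet, or sent through whatever client you already use to run SQL against your account.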
Scaling Virtual Warehouse
Scaling Out
Multi-clustering, also known as horizontal scaling
Additional clusters of the same size are added so that multiple users can run queries at the same time instead of having their queries wait in a queue.
Scaling Up
Vertical scaling: increasing the warehouse size to improve the performance of individual queries.
While creating a warehouse in Snowflake, there is an option to provide the minimum and maximum number of clusters.
Snowflake will auto-scale between these two numbers based on the warehouse scaling policy. If min and max are set to the same number, the warehouse will always resume with that number of clusters.
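The min/max rule above can be written down in a few lines of Python. This is only a sketch: the function names and the warehouse name demo_wh are made up, "maximized" and "auto-scale" are the labels I am using for the two cases, and the ALTER WAREHOUSE statement is built as a string rather than run against an account.

```python
def multi_cluster_mode(min_clusters: int, max_clusters: int) -> str:
    """Classify a min/max cluster setting.

    Same min and max: the warehouse always runs that many clusters ("maximized").
    Min below max: clusters are started and stopped with load ("auto-scale").
    """
    if min_clusters < 1 or max_clusters < min_clusters:
        raise ValueError("need 1 <= min_clusters <= max_clusters")
    return "maximized" if min_clusters == max_clusters else "auto-scale"


def set_cluster_range_sql(name: str, min_clusters: int, max_clusters: int) -> str:
    """Build the ALTER WAREHOUSE statement that sets the cluster range."""
    multi_cluster_mode(min_clusters, max_clusters)  # reuse the validation above
    return (
        f"ALTER WAREHOUSE {name} SET "
        f"MIN_CLUSTER_COUNT = {min_clusters} "
        f"MAX_CLUSTER_COUNT = {max_clusters}"
    )


print(multi_cluster_mode(1, 3))
print(multi_cluster_mode(2, 2))
print(set_cluster_range_sql("demo_wh", 1, 3) + ";")
```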
There is also a Scaling Policy field with two options: "Standard" (the default) and "Economy".
Standard:
Focuses on keeping queries out of the queue by scaling out the warehouse (horizontal scaling).
Immediately starts an additional cluster when a query is queued or the system detects that there is one more query than the running clusters can execute.
Economy:
Focuses on keeping the running clusters fully loaded, and scales out the warehouse (horizontal scaling) only if the system detects enough load to keep an additional cluster busy.
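To see how the two policies differ, here is a toy Python model of the "should another cluster start?" decision. It is a simplification for illustration, not Snowflake's actual algorithm: the function name, the inputs, and in particular the 6-minute threshold used for Economy are assumptions of this sketch.

```python
def should_start_cluster(policy: str,
                         running_clusters: int,
                         max_clusters: int,
                         queued_queries: int,
                         estimated_busy_minutes: float,
                         economy_threshold_minutes: float = 6.0) -> bool:
    """Toy model of the scale-out decision for a multi-cluster warehouse.

    estimated_busy_minutes: how long the pending load would keep a new
    cluster busy (a made-up input for this sketch).
    economy_threshold_minutes: assumed minimum busy time before Economy
    starts a new cluster.
    """
    if policy not in ("STANDARD", "ECONOMY"):
        raise ValueError(f"unknown scaling policy: {policy!r}")
    if running_clusters >= max_clusters:
        return False  # already at the configured maximum
    if policy == "STANDARD":
        # Standard: avoid queuing, so any queued query triggers a new cluster.
        return queued_queries > 0
    # Economy: only scale out when there is enough work to justify the cost.
    return queued_queries > 0 and estimated_busy_minutes >= economy_threshold_minutes


# Same situation, different policies: one queued query, about 2 minutes of work.
print(should_start_cluster("STANDARD", 1, 3, 1, 2.0))
print(should_start_cluster("ECONOMY", 1, 3, 1, 2.0))
```

With a short burst of queued work, Standard scales out right away while Economy lets the queries wait, which is the trade-off between query latency and credits.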
That covers the basics of the Snowflake cloud warehouse and how a user can create a virtual warehouse.
A series of Snowflake blogs is coming up soon. Stay connected!