Loading Data in Snowflake

Sarvesh Pandey
3 min read · Jun 24, 2022

So, in the previous blog we learned about the Snowflake architecture and how to create a database and a table.

As the table was empty, there are multiple ways to insert data into it.

But if the data already exists in a different source, in some file format, or on the same server, Snowflake has different options for these scenarios which are quite helpful. Let's look into this.

Snowflake provides two methods to load data into tables -

Bulk Loading and Continuous Loading

Bulk Load - Loading the whole data at one time

  1. Most frequently used load method
  2. Uses a warehouse
  3. Loads from stages
  4. Uses the COPY command
  5. Transformations are possible (a short sketch follows this list)
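
As a quick illustration of point 5: the COPY command can select, reorder, or transform columns while loading. A minimal sketch, assuming the AWS_STAGE and ORDERS table that are created later in this post:

// A SKETCH OF COPY WITH A TRANSFORMATION (stage and table are created later in this post)
COPY INTO OUR_FIRST_DB.PUBLIC.ORDERS
FROM (SELECT s.$1, s.$2, s.$3, s.$4, s.$5, s.$6 -- columns can be reordered or wrapped in functions here
      FROM @MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE s)
file_format = (type = csv field_delimiter=',' skip_header=1)
files = ('OrderDetails.csv');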

Continuous Load -

  1. Designed to load small volumes of data continuously
  2. Data is loaded automatically once files are added to a stage
  3. The latest data is always available for analysis
  4. Uses Snowpipe (a serverless feature) - see the sketch after this list
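
To give a feel for Snowpipe (a full walkthrough is beyond this post), here is a minimal sketch; the PIPES schema and pipe name are illustrative assumptions, and AUTO_INGEST additionally needs event notifications configured on the cloud storage side:

// A SKETCH OF A SNOWPIPE DEFINITION (PIPES schema and pipe name are illustrative assumptions)
CREATE OR REPLACE PIPE MANAGE_DB.PIPES.ORDERS_PIPE
AUTO_INGEST = TRUE -- requires event notifications on the storage bucket
AS
COPY INTO OUR_FIRST_DB.PUBLIC.ORDERS
FROM @MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE
file_format = (type = csv field_delimiter=',' skip_header=1);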

We need stages to perform loading in Snowflake -

Stages are database objects that hold the location where data files are placed so that the data can be loaded into tables.

Types of Stages -

External Stages

  • An external cloud provider's storage service is used
  • S3 from AWS
  • Blob Storage from Azure
  • Cloud Storage from GCP
  • Created as a database object within a schema
  • Created with the CREATE STAGE command

Note - Additional costs may apply if the region/platform differs.

Internal Stages

  • Local storage maintained by Snowflake (a small sketch follows below)
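
For completeness, a minimal sketch of an internal stage; the stage name and file path are illustrative assumptions (this post's examples all use an external stage):

// A SKETCH OF AN INTERNAL STAGE (stage name is an illustrative assumption)
// A stage created without a url is internal - Snowflake manages the storage
CREATE OR REPLACE STAGE MY_INTERNAL_STAGE;

// Files are uploaded with PUT from SnowSQL or another client (not the classic web worksheet):
// PUT file:///tmp/OrderDetails.csv @MY_INTERNAL_STAGE;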

Steps to create stages in Snowflake

Follow the commands below -

// CREATE A NEW DATABASE
CREATE OR REPLACE DATABASE MANAGE_DB;

// CREATE SCHEMA FOR EXTERNAL STAGES
CREATE OR REPLACE SCHEMA EXTERNAL_STAGE;

// CREATE EXTERNAL STAGE QUERY
CREATE OR REPLACE STAGE MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE -- FULLY QUALIFIED STAGE NAME
    url='s3://bucketsnowflakes3'
    credentials=(aws_key_id='ABCD_DUMMY_ID' aws_secret_key='1234abcd_key');

// DESCRIPTION OF STAGE
DESC STAGE MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE;


// ALTER STAGE
ALTER STAGE AWS_STAGE
SET credentials=(aws_key_id='ABCD_DUMMY_ID' aws_secret_key='your_key');

// PUBLICLY ACCESSIBLE STAGING AREA
CREATE OR REPLACE STAGE MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE
    url='s3://bucketsnowflakes3';

// LIST STAGE CONTENTS
LIST @AWS_STAGE; -- @ IS USED AS A PREFIX FOR STAGE OBJECTS

Note - To load data into any target, we first have to create a table in it. It's a very important step when loading data between the same or different services.

// Creating ORDERS table
CREATE OR REPLACE TABLE OUR_FIRST_DB.PUBLIC.ORDERS (
    ORDER_ID VARCHAR(30),
    AMOUNT INT,
    PROFIT INT,
    QUANTITY INT,
    CATEGORY VARCHAR(30),
    SUBCATEGORY VARCHAR(30));

SELECT * FROM OUR_FIRST_DB.PUBLIC.ORDERS; -- EMPTY TABLE

Performing a bulk load of data from the stage into a Snowflake table

// THE COPY COMMAND NEEDS FULLY QUALIFIED NAMES

// First COPY command
COPY INTO OUR_FIRST_DB.PUBLIC.ORDERS -- TARGET
FROM @MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE -- SOURCE
file_format = (type = csv field_delimiter=',' skip_header=1);


-- This will throw an error, as there are multiple files present in the AWS stage and some files cannot be mapped to the ORDERS table

LIST @MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE;

// COPY COMMAND WITH SELECTIVE FILES
COPY INTO OUR_FIRST_DB.PUBLIC.ORDERS -- TARGET
FROM @MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE -- SOURCE
file_format = (type = csv field_delimiter=',' skip_header=1)
files = ('OrderDetails.csv');

SELECT * FROM OUR_FIRST_DB.PUBLIC.ORDERS;


-- What if we load the data again? Try it yourself and let me know in the comments. It will be great practice for you!

-- TRUNCATE TABLE OUR_FIRST_DB.PUBLIC.ORDERS;

// COPY COMMAND WITH A PATTERN
COPY INTO OUR_FIRST_DB.PUBLIC.ORDERS
FROM @MANAGE_DB.EXTERNAL_STAGE.AWS_STAGE
file_format = (type = csv field_delimiter=',' skip_header=1)
pattern='.*Order.*';
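
After a load, it is worth checking what actually got copied. A minimal sketch using Snowflake's COPY_HISTORY table function (not covered in this post; the one-hour window is an arbitrary choice, and it should be run with OUR_FIRST_DB as the current database):

// A SKETCH - CHECK RECENT LOADS INTO ORDERS (the time window is an arbitrary choice)
SELECT *
FROM TABLE(INFORMATION_SCHEMA.COPY_HISTORY(
    TABLE_NAME => 'ORDERS',
    START_TIME => DATEADD(hour, -1, CURRENT_TIMESTAMP())));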

In the next blog we will see "How to handle errors in Bulk Load".

Some of my other blogs -

  1. Database in Snowflake | by Sarvesh Pandey | Medium
  2. Snowflake Architecture and Warehouse | by Sarvesh Pandey | Medium
  3. Parameter Passing — Move multiple Files from Blog Storage to Azure SQL DB | by Sarvesh Pandey | Medium
  4. Database Backup Location is AZURE! | by Sarvesh Pandey | Medium
  5. Starting Machine Learning with Python! | by Sarvesh Pandey | Medium


Sarvesh Pandey

Hi everyone, let me give you a quick introduction: I am an Azure Data Engineer, a Microsoft Certified Azure DBA, and a Trainer.