Finding all the Snowflake Best Practices & Recommendations

by | May 17, 2023 | BlogPosts, Snowflake | 0 comments

I have been reading the docs.snowflake.com site for over 4 years now.  I read it when trying to get things to work, I read it as a reference book for correct syntax, and I even read it to see what’s new!  Things do change fast in the Snowflake world so it’s important to keep up.

When working with Snowflake clients, a question that always comes up is: Are we using Snowflake correctly? The usual answer is “it depends”, but a better answer would be to perform a health check and find out for sure.  However, once a health check has been performed, how do you keep up to date with Snowflake? 

This got me thinking – what if I could pull all the documented best practices from docs.snowflake.com, put them into an Excel sheet, and then use this as a best practice checklist that is kept up to date.

So, I went ahead and did just that.

Firstly, how do I know what are the Snowflake best practices?  Easy – Search the Snowflake docs with this Google query:

https://www.google.com/search?q=”as+a+best+practice”+site%3Adocs.snowflake.com 

This will find about 43 results where the phrase “as a best practice” appears in the docs site.

If you click on the very first link:

ALL_USER_NAMES | Snowflake Documentation 

You will be rewarded with this excellent piece of advice:

As a best practice, username and login name values should be different.

Did you know that? No, me neither…but it makes total sense from a security point of view.

Here’s the SQL Query to check your own Snowflake account for this best practice:

 

-- Setup and context
USE ROLE ACCOUNTADMIN;
USE DATABASE SNOWFLAKE;
USE WAREHOUSE <Your Warehouse>;
-- Show all users in Snowflake account
SHOW USERS;
-- Get the Query ID to use later and help speed this up a bit
SET QUERY_ID = (
  SELECT QUERY_ID
  FROM TABLE(INFORMATION_SCHEMA.QUERY_HISTORY())
  WHERE QUERY_TEXT = 'SHOW USERS;'
  ORDER BY START_TIME DESC
  LIMIT 1
); 
-- Find all users where name = login_name
SELECT *
FROM TABLE(RESULT_SCAN($QUERY_ID))
WHERE "name" = "login_name";

I then did the same for the phrase “Snowflake recommends” which has around 242 results:

https://www.google.com/search?q=”Snowflake+recommends”+site%3Adocs.snowflake.com

The first result will be about Table Design Considerations:

Table Design Considerations | Snowflake Documentation

And your reward will be 4 recommendations (Here is the first recommendation from that page):

When defining columns to contain dates or timestamps, Snowflake recommends choosing a date or timestamp data type rather than a character data type. Snowflake stores DATE and TIMESTAMP data more efficiently than VARCHAR, resulting in better query performance. Choose an appropriate date or timestamp data type, depending on the level of granularity required. 

So, let’s go find our tables (in my current database) that are not following the best practice. (I had to create an example one because this is a best practice that I do know about!)

-- Setup and context
USE ROLE ACCOUNTADMIN;
CREATE DATABASE BADPRACTICE_DB;
USE DATABASE BADPRACTICE_DB;
CREATE SCHEMA BADPRACTICES;
USE SCHEMA BADPRACTICES;
USE WAREHOUSE <Your Warehouse>;

-- Example bad practices in a table
CREATE TABLE BADPRACTICETABLE (
    ID   NUMBER,
    DATE VARCHAR,
    TIME VARCHAR,
    LDTS VARCHAR,
    TS   VARCHAR,
    AMOUNT VARCHAR,
    DESCRIPTION VARCHAR
) COMMENT = 'This is a table to demonstrate bad practices!';
 
-- Find all tables (in your database) that have this bad
practice of defining date/time columns as VARCHAR
SELECT COL.TABLE_SCHEMA,
    COL.TABLE_NAME,
    COL.ORDINAL_POSITION AS COL_ID,
    COL.COLUMN_NAME,
    COL.DATA_TYPE,
    COL.CHARACTER_MAXIMUM_LENGTH AS MAX_LENGTH
FROM INFORMATION_SCHEMA.COLUMNS AS COL
    JOIN INFORMATION_SCHEMA.TABLES AS TAB 
    ON TAB.TABLE_SCHEMA = COL.TABLE_SCHEMA
    AND TAB.TABLE_NAME = COL.TABLE_NAME
    AND TAB.TABLE_TYPE = 'BASE TABLE'
WHERE COL.DATA_TYPE = 'TEXT'
AND COL.TABLE_SCHEMA != 'INFORMATION_SCHEMA'
    AND COL.COLUMN_NAME ILIKE ANY 
('%DATE%', '%TIME%', '%LDTS%', '%TS%') -- Change this search list to suit your own environment ORDER BY COL.TABLE_SCHEMA,   COL.TABLE_NAME,     COL.ORDINAL_POSITION;

This example table shows 4 results:

There are many more Snowflake best practices & recommendations to look for, and if you would like Mphasis Datalytyx to do the work on your behalf or for a copy of our spreadsheet then please contact us.

Know about our Snowflake consulting services and schedule a meeting with us.

This blog is written by Sunny Sharma.

0 Comments

Submit a Comment

Your email address will not be published. Required fields are marked *