Clean SQL Tricks
I'm not great at analytical SQL, but when I try to treat my SQL the way I treat code by breaking out intermediate values, prioritizing tight feedback loops, and using clear constant values, I find that I'm able to write better, more maintainable SQL, faster. I consistently use a few techniques to make this happen:
create temporary table
: creating lots of small, focused tables has been the most important technique to improve my SQL queries. Smaller queries are much simpler to understand, and by building these tables up one by one, subsequent queries that build on those tables can be much faster. If you often write large queries with lots ofjoin
s, you're probably familiar with the struggle of debugging and maintaining these queries! (The temporary keyword is a Redshift keyword to create a table only for your current session)- Use
select
without afrom
to quickly test out behavior. You can run queries likeselect extract('year' from GETDATE());
orselect 1 != null, 1 is not null
to quickly check your understanding. There's no need for afrom
. - Rather than writing constants multiple times, setting up a constants table, or a
with constants
clause, can make queries more maintainable. Realizing you updated a field in one spot but not in another can be frustrating!with constants as (select '2020-01-01' as start, '2020-02-01' as end)
(Shoutout to Ben Haley for showing me this trick!) - Use
limit
to explore data as you go. Building up complex analytics queries should be a pretty iterative process. Ideally, you create focused, easy-to-understand tables, query them to make sure their data looks like you expect, and then later combine those tables together to answer your question. Running lots of small queries usinglimit
to see if things are working correctly as you go can make the whole process a lot easier!