Advantages of Using dbt (Data Build Tool)

Published

April 25, 2020

Keywords

dbt, beginner

In this article, we go over the reasons why someone might want to use dbt. If you are interested in learning dbt, check out this article. Some common questions from data engineers about dbt are:

"It is not very clear to me why I would use dbt instead of running SQL queries on Airflow."

"Why would I switch from SQL scripts to dbt scripts, considering the learning curve?"

dbt is designed to solve the T (transform) part of ETL by working on raw data already present in a data warehouse. It provides less functionality than other OSS ETL orchestration tools such as Airflow and Luigi, but in exchange it is much simpler to understand and run, especially for a non-engineer.

In recent years, data warehouses have become extremely flexible (UDFs, etc.) and powerful, with features like separation of storage and processing, elastic scaling, and machine learning capabilities (for example, BigQuery ML). This has led many companies to load raw data first and then perform the transformation inside the data warehouse itself (a pattern known as ELT). This is where dbt shines, as it provides an easy, version-controlled way of writing transformations using just SQL. It also provides data quality checks natively.
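To make the idea of a data quality check concrete, here is a minimal sketch in Python, using an in-memory SQLite database as a stand-in for a warehouse. It is not how dbt is implemented; it only illustrates the general pattern behind checks such as dbt's built-in `not_null` and `unique` tests, where a query selects the offending rows and the check passes when nothing comes back. The table name, column names, and helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical table and data, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (customer_id INTEGER, email TEXT);
    INSERT INTO customers VALUES
        (1, 'a@example.com'),
        (2, NULL),
        (2, 'c@example.com');
    """
)


def not_null_failures(conn, table, column):
    """Return rows where `column` is NULL; an empty list means the check passes."""
    query = f"SELECT * FROM {table} WHERE {column} IS NULL"
    return conn.execute(query).fetchall()


def unique_failures(conn, table, column):
    """Return (value, count) pairs for values of `column` that appear more than once."""
    query = (
        f"SELECT {column}, COUNT(*) FROM {table} "
        f"WHERE {column} IS NOT NULL "
        f"GROUP BY {column} HAVING COUNT(*) > 1"
    )
    return conn.execute(query).fetchall()


null_rows = not_null_failures(conn, "customers", "email")
duplicate_ids = unique_failures(conn, "customers", "customer_id")

print("rows failing not_null(email):", null_rows)
print("values failing unique(customer_id):", duplicate_ids)
```

In dbt, checks like these are declared in a YAML file next to the model rather than written by hand, which is a large part of why they are easy for non-engineers to add.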

The key reasons why someone would want to use dbt are:

  1. Easy to use for non-engineers (shared data knowledge between engineering and non-engineering teams)

  2. Extremely flexible data model (data is easy to recreate, and backfills are easy)

  3. If most of your transformations happen at the data warehouse level, dbt makes them extremely easy to write and run

  4. Built-in testing for data quality

  5. Online, searchable data catalog and lineage

  6. Reusable macros

  7. Shockingly low learning curve

  8. Production runs using dbt Cloud or through an Airflow trigger
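The lineage and dependency handling mentioned in the list can be illustrated with a small sketch. In dbt, a model refers to another model with the `ref()` function inside its SQL, and dbt uses those references to work out what depends on what. The Python below is a toy version of that idea, not dbt's actual implementation: it pulls `ref()` calls out of SQL strings with a regular expression and orders the models with the standard library's `graphlib` (Python 3.9+). The model names, SQL, and helper functions are hypothetical.

```python
import re
from graphlib import TopologicalSorter

# Hypothetical models: name -> SQL text, for illustration only.
models = {
    "daily_revenue": "select * from {{ ref('customer_orders') }}",
    "customer_orders": (
        "select * from {{ ref('stg_customers') }} c "
        "join {{ ref('stg_orders') }} o on c.id = o.customer_id"
    ),
    "stg_orders": "select * from raw.orders",
    "stg_customers": "select * from raw.customers",
}

# Matches {{ ref('model_name') }} with single or double quotes.
REF_PATTERN = re.compile(r"\{\{\s*ref\(\s*['\"]([^'\"]+)['\"]\s*\)\s*\}\}")


def build_lineage(models):
    """Map each model to the set of models it references via ref()."""
    return {name: set(REF_PATTERN.findall(sql)) for name, sql in models.items()}


def run_order(models):
    """Return model names ordered so that every model comes after its dependencies."""
    return list(TopologicalSorter(build_lineage(models)).static_order())


lineage = build_lineage(models)
order = run_order(models)

print("lineage:", lineage)
print("run order:", order)
```

The point of the sketch is that the author of `daily_revenue` never states a run order. Declaring what a model reads from is enough for the tool to derive both the execution order and the lineage graph.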

Conclusion

If you are building a data pipeline where multiple engineers and non-engineers are stakeholders in how the data is transformed, and you have a powerful data warehouse to support it, dbt is a very competitive choice. It frees you from having to manage dependencies between transformations, supports tests natively, and has a very low learning curve, which lets both engineers and non-engineers contribute to the transformation logic.

If you are interested in learning how to set up and run dbt, check out this dbt tutorial. Let me know if you have any questions or comments in the comments section below.
