6
Watch
42
Star
15
Fork
27
Issue

dbt adapter for Azure Synapse SQL Dedicated Pools

dbt-msft
dbt-msft
pushedAt 12 hours ago

dbt-msft/dbt-synapse

dbt-synapse

custom dbt adapter for Azure Synapse. Major credit due to @mikaelene and his dbt-sqlserver custom adapter.

related packages

To get additional functionality, check out:

major differences b/w dbt-synapse and dbt-sqlserver

  • macros use only Azure Synapse T-SQL. Relevant GitHub issue
  • use of Create Table as Select (CTAS) means you don't need post-hooks to create indices (see Table Materializations section below for more info)
  • rewrite of snapshots because Synapse doesn't support MERGE.

status & support

as of now, only support for dbt 0.18.0

Passing all tests in dbt-adapter-tests, except test_dbt_ephemeral_data_tests

outstanding work:

  • ephemeral materializations (workaround for non-recursive CTEs) see #25
  • officially rename the adapter from sqlserver to synapse see #40
  • Make seed creation more fault-tolerant #36

Installation

Easiest install is to use pip (not yet registered on PyPI).

First install ODBC Driver version 17.

pip install dbt-synapse

Authentication

Please see the Authentication section of dbt-sqlserver's README.md

Table Materializations

CTAS allows you to materialize tables with indices and distributions at creation time, which obviates the need for post-hooks to set indices.

Example

You can also configure index and dist in dbt_project.yml.

`models/stage/absence.sql

{{
    config(
        index='HEAP',
        dist='ROUND_ROBIN'
        )
}}

select *
from ...

is turned into the relative form (minus __dbt's _backup and _tmp tables)

  CREATE TABLE ajs_stg.absence_hours
    WITH(
      DISTRIBUTION = ROUND_ROBIN,
      HEAP
      )
    AS (SELECT * FROM ajs_stg.absence_hours__dbt_tmp_temp_view)

Indices

  • CLUSTERED COLUMNSTORE INDEX (default)
  • HEAP
  • CLUSTERED INDEX ({COLUMN})
  • CLUSTERED COLUMNSTORE INDEX ORDER({{COLUMN}}) # see docs for performance suggestions

Distributions

  • ROUND_ROBIN (default)
  • HASH({COLUMN})
  • REPLICATE

Changelog

See CHANGELOG.md

ucloud ads