Transformers

A step-by-step guide to starting your journey with Dadosfera

Deterministic Transformers

Overview

The Conversor DDF uses deterministic transformers to convert DataStage jobs and Tableau Prep workflows to Snowflake SQL without requiring LLMs for core transformation logic.

Summary

  • 18 Transformer Classes - Convert source metadata to Snowflake SQL
  • 11 DataStage Transformers - Handle IBM DataStage stage types
  • 7 Tableau/DSX Transformers - Handle Tableau Prep node types
  • 2 Hybrid Transformers (included in the counts above) - Mostly deterministic with selective LLM assistance for complex expressions

Transformer Categories

Fully Deterministic (16 transformers)

These transformers convert source metadata to SQL using only template-based logic, without any LLM calls:

DataStage:

  • LookupStageTransformer
  • ModifyStageTransformer
  • JoinStageTransformer
  • AggregatorStageTransformer
  • RemoveDuplicatesTransformer
  • CopyStageTransformer
  • FunnelStageTransformer
  • InputStageTransformer
  • ImportStageTransformer
  • OutputOracleTransformer

Tableau:

  • AggregateTransformer (SuperAggregate)
  • JoinTransformer (SuperJoin)
  • UnionTransformer (SuperUnion)
  • OutputTransformer
  • LoadExcelTransformer
  • LoadSqlProxyTransformer
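
To illustrate what "template-based logic" can look like, here is a minimal sketch that renders a funnel-like stage as UNION ALL over its inputs. The stage dictionary layout, the `SELECT_TEMPLATE` constant, and the `funnel_to_sql` function are hypothetical names chosen for this example; they are not the Conversor DDF's actual templates or API.

```python
from string import Template

# Hypothetical template; the real Conversor DDF templates are not shown here.
SELECT_TEMPLATE = Template("SELECT $columns FROM $table")


def funnel_to_sql(stage: dict) -> str:
    """Render a funnel-like stage as UNION ALL over its inputs, using templates only."""
    columns = ", ".join(stage["columns"])
    selects = [
        SELECT_TEMPLATE.substitute(columns=columns, table=table)
        for table in stage["inputs"]
    ]
    return "\nUNION ALL\n".join(selects)


# Assumed metadata shape: a list of input tables and a shared column list.
stage = {"inputs": ["orders_2023", "orders_2024"], "columns": ["id", "amount"]}
sql = funnel_to_sql(stage)
print(sql)
```

Because the output depends only on the stage metadata and a fixed template, the same input always yields the same SQL, which is the property that separates these transformers from the LLM-assisted ones.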

Hybrid Transformers (2 transformers)

These transformers are mostly deterministic but use LLMs selectively for complex expressions:

  • TransformerStageTransformer (DataStage) - Deterministic for simple column passthroughs and constraints; uses an LLM for complex C transformation code
  • ContainerTransformer (Tableau) - Deterministic for column renames, removals, type changes, and merges; uses an LLM for filter expressions and derived columns
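
The hybrid pattern described above can be sketched as a deterministic fast path with an LLM fallback. This is an illustration under assumptions: the rule for what counts as a "simple" expression, the `derive_column` function, and the `llm_translate` callback are all hypothetical, and `fake_llm` merely stands in for a real LLM client.

```python
import re
from typing import Callable

# Assumption: a "simple" derivation is a bare column reference,
# such as "amount" or "in.customer_id".
SIMPLE_COLUMN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?")


def derive_column(expression: str, alias: str, llm_translate: Callable[[str], str]) -> str:
    """Return a SELECT-list item, calling the LLM only for non-passthrough expressions."""
    expression = expression.strip()
    if SIMPLE_COLUMN.fullmatch(expression):
        # Deterministic path: drop the link prefix and keep the column name.
        column = expression.split(".")[-1]
        return f"{column} AS {alias}"
    # Fallback path: delegate the complex expression to the LLM callback.
    return f"{llm_translate(expression)} AS {alias}"


calls = []


def fake_llm(expression: str) -> str:
    # Stand-in for a real LLM client; records the call and returns a placeholder.
    calls.append(expression)
    return "/* translated by LLM */ NULL"


simple = derive_column("in.customer_id", "customer_id", fake_llm)
complex_ = derive_column("If IsNull(in.amt) Then 0 Else in.amt * 1.1", "amount", fake_llm)
print(simple)
print(f"LLM calls made: {len(calls)}")
```

Passing the LLM in as a callback keeps the deterministic path testable on its own and makes it easy to count how often the fallback is actually used.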

How Transformers Work

All transformers inherit from the DeterministicTransformer base class and implement two key methods:

  1. can_transform() - Returns True if the transformer can handle the given stage/node
  2. transform() - Converts the stage/node to SQL and returns a TransformResult object
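
The two-method contract above might look like the following sketch. The method signatures, the `"type"` discriminator key, and the `pick_transformer` helper are assumptions made for illustration, and `TransformResult` is trimmed here to the two fields the example uses.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TransformResult:
    # Trimmed to the fields used in this sketch.
    success: bool
    sql: Optional[str] = None


class DeterministicTransformer(ABC):
    """Sketch of the base interface; the real signatures may differ."""

    @abstractmethod
    def can_transform(self, stage: dict) -> bool:
        """Return True if this transformer can handle the given stage/node."""

    @abstractmethod
    def transform(self, stage: dict) -> TransformResult:
        """Convert the stage/node to SQL."""


class CopyStageTransformer(DeterministicTransformer):
    def can_transform(self, stage: dict) -> bool:
        # Hypothetical discriminator key; the real metadata layout may differ.
        return stage.get("type") == "Copy"

    def transform(self, stage: dict) -> TransformResult:
        columns = ", ".join(stage["columns"])
        return TransformResult(success=True, sql=f"SELECT {columns} FROM {stage['input']}")


def pick_transformer(
    stage: dict, transformers: Sequence[DeterministicTransformer]
) -> Optional[DeterministicTransformer]:
    """Return the first transformer that claims the stage, or None."""
    return next((t for t in transformers if t.can_transform(stage)), None)


stage = {"type": "Copy", "input": "orders", "columns": ["id", "amount"]}
transformer = pick_transformer(stage, [CopyStageTransformer()])
if transformer is not None:
    print(transformer.transform(stage).sql)
```

Separating `can_transform()` from `transform()` lets a caller probe a list of transformers cheaply before committing to one.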

Transform Result

Each transformation returns a TransformResult containing:

  • success - Whether transformation succeeded
  • sql - Generated SQL query
  • schema - Output schema (column names and types)
  • error - Error message if transformation failed
  • field_info - Human-readable description of transformation
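
As a rough picture of this structure, the fields listed above could be modeled as a dataclass. The field types and defaults are assumptions (in particular, `schema` as a mapping from column name to type); only the field names come from the list above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class TransformResult:
    success: bool                                          # whether transformation succeeded
    sql: Optional[str] = None                              # generated SQL query
    schema: Dict[str, str] = field(default_factory=dict)   # column name -> type (assumed shape)
    error: Optional[str] = None                            # error message on failure
    field_info: str = ""                                   # human-readable description


ok = TransformResult(
    success=True,
    sql="SELECT id, amount FROM orders",
    schema={"id": "NUMBER", "amount": "NUMBER(10,2)"},
    field_info="Passthrough of 2 columns from orders",
)
failed = TransformResult(success=False, error="Unsupported stage type")

# Callers can branch on `success` instead of catching exceptions.
for result in (ok, failed):
    print(result.sql if result.success else f"failed: {result.error}")
```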

Browse Transformers

[DataStage Transformers](./IBM Data Stages/)

Documentation for all DataStage stage transformers with real input/output examples.

Tableau Transformers

Documentation for all Tableau Prep node transformers with real input/output examples.

Architecture

DeterministicTransformer (Base Class)
├── DataStage Transformers
│   ├── LookupStageTransformer - Multi-table JOINs
│   ├── ModifyStageTransformer - Column renames/filters
│   ├── JoinStageTransformer - SQL JOINs
│   ├── AggregatorStageTransformer - GROUP BY
│   ├── RemoveDuplicatesTransformer - DISTINCT/ROW_NUMBER()
│   ├── CopyStageTransformer - SELECT from datasets/stages
│   ├── FunnelStageTransformer - UNION ALL
│   ├── InputStageTransformer - Oracle SQL conversion
│   ├── ImportStageTransformer - File reads to tables
│   ├── OutputOracleTransformer - SELECT passthrough
│   └── TransformerStageTransformer - Column derivations (hybrid)
└── Tableau Transformers
    ├── AggregateTransformer - GROUP BY
    ├── JoinTransformer - SQL JOINs (including anti-joins)
    ├── UnionTransformer - UNION ALL
    ├── OutputTransformer - SELECT passthrough
    ├── LoadExcelTransformer - Excel to table mapping
    ├── LoadSqlProxyTransformer - Database table reads
    └── ContainerTransformer - Column operations (hybrid)

Next Steps

  • Explore [DataStage Transformers](./IBM Data Stages/) for detailed documentation
  • Explore Tableau Transformers for detailed documentation
  • See How It Works for the overall conversion architecture