ContainerTransformer

Passo a passo de como iniciar sua jornada na Dadosfera

ContainerTransformer

Overview

Transforms Tableau Container nodes to SQL SELECT statements with multiple transformations. Container nodes group multiple operations (add columns, remove columns, filters, etc.) into a single logical unit.

Node Type: .v1.Container Output: SQL SELECT with derived columns and filters

Capabilities

  • Add calculated columns (AddColumn)
  • Remove columns (RemoveColumns)
  • Rename columns (RenameColumn)
  • Filter rows (FilterOperation, ValueFilter)
  • Merge columns (MergeColumns)
  • Type casting (ChangeColumnType)
  • Sequential operation processing

Tableau Node Example

Example 1: Add Date Components and Remove Original

Input (Tableau format)

{
  "id": "e3f5b8c4-9093-4214-9ba5-4da0697a365c",
  "name": "Clean Date Fields",
  "node_type": ".v1.Container",
  "base_type": "container",
  "loomContainer": {
    "nodes": {
      "0d180902-5195-4437-8f6e-98c0a111284f": {
        "id": "0d180902-5195-4437-8f6e-98c0a111284f",
        "name": "Add Year",
        "nodeType": ".v1.AddColumn",
        "columnName": "Year",
        "expression": "YEAR([Invoice Date])"
      },
      "76a04765-f5fa-415a-a1cb-9f2ad241bbc5": {
        "id": "76a04765-f5fa-415a-a1cb-9f2ad241bbc5",
        "name": "Add Month",
        "nodeType": ".v1.AddColumn",
        "columnName": "Month",
        "expression": "MONTH([Invoice Date])"
      },
      "854e5241-a7d4-4569-8651-d5bebe9e2aa3": {
        "id": "854e5241-a7d4-4569-8651-d5bebe9e2aa3",
        "name": "Remove Invoice Date",
        "nodeType": ".v1.RemoveColumns",
        "columnNames": ["Invoice Date"]
      }
    },
    "initialNodes": ["0d180902-5195-4437-8f6e-98c0a111284f"]
  }
}

Input Schema

[
  {"name": "division", "type": "string"},
  {"name": "sales_amount", "type": "real"},
  {"name": "order_type", "type": "string"},
  {"name": "document_number", "type": "string"},
  {"name": "invoice_date", "type": "date"}
]

Generated SQL Output

SELECT
  division,
  sales_amount,
  order_type,
  document_number,
  YEAR(CAST(invoice_date AS DATE)) AS year,
  MONTH(CAST(invoice_date AS DATE)) AS month
FROM previous_stage_V0S12

Output Schema

{
  "division": "string",
  "sales_amount": "real",
  "order_type": "string",
  "document_number": "string",
  "year": "integer",
  "month": "integer"
}

Example 2: Add Calculated Column

Input (Tableau format)

{
  "id": "f4e5d6c7-b8a9-0c1d-2e3f-4a5b6c7d8e9f",
  "name": "Calculate Profit",
  "node_type": ".v1.Container",
  "base_type": "container",
  "loomContainer": {
    "nodes": {
      "a1b2c3d4-e5f6-7890-abcd-ef1234567890": {
        "id": "a1b2c3d4-e5f6-7890-abcd-ef1234567890",
        "name": "Add Profit",
        "nodeType": ".v1.AddColumn",
        "columnName": "Profit",
        "expression": "[Sales Amount] - [Cost Amount]"
      }
    },
    "initialNodes": ["a1b2c3d4-e5f6-7890-abcd-ef1234567890"]
  }
}

Generated SQL Output

SELECT
  product_code,
  sales_amount,
  cost_amount,
  (sales_amount - cost_amount) AS profit
FROM products_V0S5

Example 3: Filter Rows

Input (Tableau format)

{
  "id": "9a8b7c6d-5e4f-3a2b-1c0d-9e8f7a6b5c4d",
  "name": "Filter Active Orders",
  "node_type": ".v1.Container",
  "base_type": "container",
  "loomContainer": {
    "nodes": {
      "b2c3d4e5-f6a7-8901-bcde-f12345678901": {
        "id": "b2c3d4e5-f6a7-8901-bcde-f12345678901",
        "name": "Filter Status",
        "nodeType": ".v1.FilterOperation",
        "expression": "[Status] = 'Active'"
      }
    },
    "initialNodes": ["b2c3d4e5-f6a7-8901-bcde-f12345678901"]
  }
}

Generated SQL Output

SELECT
  order_id,
  customer_id,
  status,
  order_date
FROM orders_V0S8
WHERE status = 'Active'

Example 4: Multiple Operations (Add, Rename, Remove)

Input (Tableau format)

{
  "id": "8c7d6e5f-4a3b-2c1d-0e9f-8a7b6c5d4e3f",
  "name": "Transform Customer Data",
  "node_type": ".v1.Container",
  "base_type": "container",
  "loomContainer": {
    "nodes": {
      "node1": {
        "nodeType": ".v1.AddColumn",
        "columnName": "Full Name",
        "expression": "[First Name] + ' ' + [Last Name]"
      },
      "node2": {
        "nodeType": ".v1.RemoveColumns",
        "columnNames": ["First Name", "Last Name"]
      },
      "node3": {
        "nodeType": ".v1.RenameColumn",
        "columnName": "Customer ID",
        "newColumnName": "ID"
      }
    },
    "initialNodes": ["node1"]
  }
}

Generated SQL Output

SELECT
  customer_id AS id,
  email,
  phone,
  (first_name || ' ' || last_name) AS full_name
FROM customers_V0S3

How It Works

1. Identify Container Structure

Reads loomContainer object with internal nodes:

{
  "loomContainer": {
    "nodes": { ... },
    "initialNodes": ["first-node-id"]
  }
}

2. Process Internal Nodes Sequentially

Executes operations in order defined by nextNodes relationships:

  1. Start with initialNodes
  2. Apply each operation
  3. Follow nextNodes chain
  4. Build cumulative transformation

3. Supported Operations

AddColumn

Adds calculated columns using expressions:

"expression": "YEAR([Invoice Date])"

Converts to:

YEAR(CAST(invoice_date AS DATE)) AS year

RemoveColumns

Excludes columns from output:

"columnNames": ["Invoice Date", "Temp Field"]

RenameColumn

Renames columns in SELECT:

"columnName": "Old Name"
"newColumnName": "New Name"

Converts to:

old_name AS new_name

FilterOperation / ValueFilter

Adds WHERE clause conditions:

"expression": "[Status] = 'Active'"

Converts to:

WHERE status = 'Active'

MergeColumns

Concatenates columns:

"expression": "[First] + ' ' + [Last]"

Converts to:

(first || ' ' || last) AS merged_column

ChangeColumnType

Casts column types:

"fields": {
  "Amount": {"type": "integer"}
}

Converts to:

CAST(amount AS INTEGER) AS amount

4. Generate SQL

Creates single SELECT statement combining all operations:

SELECT
  <existing_columns>,
  <added_columns>,
  <renamed_columns AS aliases>
FROM input_table
WHERE <filter_conditions>

Expression Support

Container nodes use Tableau Prep expressions that are converted to SQL:

Date Functions

Tableau ExpressionSQL Output
YEAR([Date])YEAR(CAST(date AS DATE))
MONTH([Date])MONTH(CAST(date AS DATE))
DAY([Date])DAY(CAST(date AS DATE))
DATEPART('quarter', [Date])QUARTER(CAST(date AS DATE))

String Functions

Tableau ExpressionSQL Output
[First] + ' ' + [Last](first || ' ' || last)
UPPER([Name])UPPER(name)
LOWER([Name])LOWER(name)
TRIM([Name])TRIM(name)

Math Functions

Tableau ExpressionSQL Output
[Sales] - [Cost](sales - cost)
[Sales] * 1.1(sales * 1.1)
ROUND([Amount], 2)ROUND(amount, 2)

Conditional Logic

Tableau ExpressionSQL Output
IF [Status] = 'A' THEN 'Active' ELSE 'Inactive' ENDCASE WHEN status = 'A' THEN 'Active' ELSE 'Inactive' END

Operation Execution Order

Container nodes process operations sequentially:

1. AddColumn "Year"      → Adds year column
2. AddColumn "Month"     → Adds month column (can reference year)
3. RemoveColumns         → Removes invoice_date
4. RenameColumn          → Renames columns
5. FilterOperation       → Applies WHERE clause

Later operations can reference columns created by earlier operations.

Hybrid Transformer

ContainerTransformer is a hybrid transformer that:

  • Uses deterministic conversion for simple operations (RenameColumn, RemoveColumns)
  • Uses LLM assistance for complex expressions (AddColumn with calculations)
  • Combines both approaches for optimal results

Limitations

  • ⚠️ Complex expressions may require LLM - Very complex Tableau expressions may need AI assistance for accurate conversion
  • ⚠️ Expression syntax differences - Some Tableau functions may not have direct SQL equivalents
  • ⚠️ Performance consideration - Multiple operations in one container are combined into a single SELECT for efficiency

Best Practices

When to Use Container Transformer

  • Multiple related transformations on the same dataset
  • Calculated columns from existing fields
  • Data cleanup operations (remove, rename, filter)
  • Date component extraction from timestamp fields

Simplifying Containers

If a container has many operations:

  1. Review for redundancy - Remove unnecessary operations
  2. Group logically - Keep related operations together
  3. Test incrementally - Verify each operation's output

Related Transformers

  • AggregateTransformer - For grouping and aggregations after transformations
  • JoinTransformer - For combining transformed datasets
  • OutputTransformer - For final output after transformations