LoadExcelTransformer

Passo a passo de como iniciar sua jornada na Dadosfera

LoadExcelTransformer

Overview

Transforms Tableau LoadExcel nodes to SQL SELECT statements. Maps Excel sources to pre-loaded Snowflake tables.

Node Type: .v1.LoadExcel Output: SQL SELECT from mapped Snowflake table

Recommended Approach

For LoadExcel nodes reading Excel files, we strongly recommend:

  1. Ingest the Excel file into Snowflake using Dadosfera Ingestion module
  2. Map the Excel node ID to the target Snowflake table in the configuration
  3. The converter will generate a SELECT from the target table

This approach ensures data is already loaded in Snowflake before the converted SQL runs.

Capabilities

  • Maps Excel sources to Snowflake tables
  • Preserves column schema from Excel definition
  • Supports column renames (RenameColumn action)
  • Supports column removal (RemoveColumns action)
  • Supports type casting (ChangeColumnType action)
  • Field name sanitization (spaces to underscores, lowercase)

Tableau Node Example

Input (Tableau format)

{
  "id": "33d0c8d7-be35-4b36-9f34-f3e658eeefc1",
  "name": "Sales Data",
  "node_type": ".v1.LoadExcel",
  "base_type": "input",
  "fields": [
    {
      "name": "Region Code",
      "type": "string",
      "ordinal": 0
    },
    {
      "name": "Sales Rep Number",
      "type": "integer",
      "ordinal": 1
    },
    {
      "name": "Sales Rep Name",
      "type": "string",
      "ordinal": 2
    },
    {
      "name": "Product Number",
      "type": "string",
      "ordinal": 3
    },
    {
      "name": "Product Description",
      "type": "string",
      "ordinal": 4
    },
    {
      "name": "Customer Number",
      "type": "integer",
      "ordinal": 5
    },
    {
      "name": "Sales Amount",
      "type": "real",
      "ordinal": 6
    },
    {
      "name": "Order Date",
      "type": "date",
      "ordinal": 7
    }
  ],
  "actions": []
}

Excel to Snowflake Mapping Configuration

{
  "33d0c8d7-be35-4b36-9f34-f3e658eeefc1": "PUBLIC.SALES_DATA"
}

Generated SQL Output

SELECT region_code,
       sales_rep_number,
       sales_rep_name,
       product_number,
       product_description,
       customer_number,
       sales_amount,
       order_date
FROM PUBLIC.SALES_DATA

Output Schema

{
  "region_code": "string",
  "sales_rep_number": "integer",
  "sales_rep_name": "string",
  "product_number": "string",
  "product_description": "string",
  "customer_number": "integer",
  "sales_amount": "real",
  "order_date": "date"
}

How It Works

1. Map Excel Node to Snowflake Table

Uses excel_table_mapping to find corresponding Snowflake table:

{
  "<node_id>": "SCHEMA.TABLE_NAME"
}

2. Extract Field Schema

Reads column definitions from node.fields:

{
  "name": "Region Code",
  "type": "string",
  "ordinal": 0
}

3. Apply Actions (if any)

Processes actions in order:

  • RenameColumn: Renames columns in SELECT
  • RemoveColumns: Excludes columns from SELECT
  • ChangeColumnType: Adds CAST expressions

4. Sanitize Field Names

Converts Tableau field names to SQL identifiers:

  • Spaces → underscores
  • Lowercase
  • Remove special characters
"Sales Amount" → sales_amount
"Region Code" → region_code

5. Generate SELECT Statement

Creates SQL selecting columns from mapped table:

SELECT <sanitized_columns>
FROM <mapped_table>

Actions Support

RenameColumn Action

Input

{
  "actions": [
    {
      "actionType": ".v1.RenameColumn",
      "columnName": "Sales Amount",
      "newColumnName": "Total Sales"
    }
  ]
}

Generated SQL

SELECT region_code,
       sales_amount AS total_sales,
       order_date
FROM PUBLIC.SALES_DATA

RemoveColumns Action

Input

{
  "actions": [
    {
      "actionType": ".v1.RemoveColumns",
      "columnNames": ["Product Description", "Customer Number"]
    }
  ]
}

Generated SQL

SELECT region_code,
       sales_rep_number,
       sales_rep_name,
       product_number,
       sales_amount,
       order_date
FROM PUBLIC.SALES_DATA

ChangeColumnType Action

Input

{
  "actions": [
    {
      "actionType": ".v1.ChangeColumnType",
      "fields": {
        "Sales Amount": {
          "type": "integer"
        }
      }
    }
  ]
}

Generated SQL

SELECT region_code,
       sales_rep_number,
       CAST(sales_amount AS INTEGER) AS sales_amount,
       order_date
FROM PUBLIC.SALES_DATA

Field Type Mapping

Tableau TypeSnowflake Type
stringVARCHAR
integerINTEGER
realFLOAT
dateDATE
datetimeTIMESTAMP
booleanBOOLEAN

Excel Mapping Examples

Simple Mapping

{
  "node-id-001": "PUBLIC.SALES_TABLE"
}

With Schema Prefix

{
  "node-id-001": "ANALYTICS.SALES_DATA"
}

Multiple Nodes (Different Excel Sources)

{
  "node-id-001": "PUBLIC.SALES_2023",
  "node-id-002": "PUBLIC.CUSTOMERS",
  "node-id-003": "ANALYTICS.PRODUCTS"
}

Each node ID maps to exactly one Snowflake table.

Limitations

  • ⚠️ Requires table mapping - Excel node IDs must be mapped to Snowflake tables
  • ⚠️ Pre-ingestion required - Excel files must be loaded into Snowflake using Dadosfera Ingestion module before SQL execution
  • ⚠️ No file format conversion - Excel parsing and data type conversions must be done during ingestion, not in SQL
  • ⚠️ Supported actions only - Only RenameColumn, RemoveColumns, and ChangeColumnType actions are supported

Best Practices

When to Use LoadExcel Transformer

  • Simple Excel reads with pre-defined schema
  • Files already ingested into Snowflake tables
  • Straightforward column mappings (1:1 from Excel to table)

Pre-Ingestion with Dadosfera

Always use Dadosfera Ingestion module to:

  1. Load Excel files into Snowflake tables
  2. Handle schema mapping and data type conversions
  3. Configure error handling for malformed records
  4. Set up incremental loads if needed

Then configure excel_table_mapping to point to the ingested tables.

Related Transformers

  • LoadSqlProxyTransformer - For database inputs (not Excel files)
  • ContainerTransformer - For post-load transformations
  • UnionTransformer - For combining multiple Excel sources