ModifyStageTransformer
Passo a passo de como iniciar sua jornada na Dadosfera
ModifyStageTransformer
Overview
Transforms DataStage MODIFY stages to SQL SELECT statements. Handles column renames, column selection (keep), and column exclusion (drop).
Stage Type: MODIFY
Output: SQL SELECT with column operations
Capabilities
- Column renames (
NEW_NAME = OLD_NAME) - Column drops (DROP clause)
- Column keep/selection (implicit from rename list)
- Pass-through columns (no modification)
- Type conversions (when specified)
DataStage Stage Example
Input (.dsx format)
#### STAGE: ModifyColumns
## Operator
modify '
## Operator options
DROP SOURCE_ID
;
CODE_VALUE = CODE_VALUE
;
'
## General options
[ident('ModifyColumns'); jobmon_ident('ModifyColumns')]
## Inputs
0< [] 'InputData:In_Modify.v'
## Outputs
0> [] 'ModifyColumns:Output.v'
Input Schema (from upstream stage)
[
{"name": "CODE_VALUE", "type": "string"},
{"name": "SOURCE_ID", "type": "string"},
{"name": "REFERENCE_KEY", "type": "number"}
]Generated SQL Output
SELECT
CODE_VALUE,
REFERENCE_KEY
FROM inputdata_V0S37Output Schema
[
{"name": "CODE_VALUE", "type": "string"},
{"name": "REFERENCE_KEY", "type": "number"}
]How It Works
1. Parse Modify Options
Extracts operations from the modify string:
DROP SOURCE_ID ← Column to exclude
;
CODE_VALUE = CODE_VALUE ← Column rename (or passthrough)
;
2. Apply DROP Operations
Removes specified columns from output:
DROP SOURCE_ID→SOURCE_IDexcluded from SELECT
3. Apply Renames
Processes rename statements:
CODE_VALUE = CODE_VALUE→ Passthrough (no rename needed)NEW_COL = OLD_COL→OLD_COL AS NEW_COL
4. Include Remaining Columns
Columns not mentioned in DROP or rename are passed through automatically:
REFERENCE_KEY→ Included in SELECT (not dropped, not renamed)
5. Generate SELECT
SELECT
<renamed_columns>,
<passthrough_columns>
FROM upstream_stageModify Operations
DROP Column
DROP TEMP_FIELD
;
Excludes the column from output.
Rename Column
NEW_NAME = OLD_NAME
;
Renames OLD_NAME to NEW_NAME in output.
Passthrough (Identity)
FIELD_NAME = FIELD_NAME
;
Keeps column with same name (redundant but valid).
Multiple Operations
DROP FIELD_A
;
NEW_B = OLD_B
;
NEW_C = OLD_C
;
Type Conversions
The MODIFY stage can also convert types:
VALUE_STR = STRING(VALUE_NUM, 10)
;
Note: Complex type conversions may require LLM assistance for expression translation.
Schema Inference
Output schema is built from:
- Input schema (from upstream stage)
- Minus dropped columns
- Plus renamed columns (with new names)
- Types preserved from input (unless conversion specified)
Limitations
- Complex expressions - Only simple renames supported deterministically; complex derivations require LLM
- Type conversions - Limited to simple CAST operations; complex conversions need LLM
- Conditional logic - Not supported (use TRANSFORMER stage)
- Multiple outputs - MODIFY supports only one output link
Common Use Cases
1. Drop Temporary Columns
DROP TEMP_COL1
;
DROP TEMP_COL2
;
2. Rename for Clarity
ID_FIELD = ID_COL
;
DATE_FIELD = DATE_COL
;
3. Clean Schema
DROP AUDIT_TIMESTAMP
;
DROP ETL_BATCH_ID
;
FINAL_VALUE = VALUE_FIELD
;
Related Transformers
- CopyStageTransformer - For reading data with column filtering
- TransformerStageTransformer - For complex transformations with C code
- InputStageTransformer - Can include modify operations in orchestrate code
Updated 5 days ago
