Pipelines
What is a pipeline?
A data pipeline is the process of directing data from a source to a destination, with or without prior processing and transformation. At Dadosfera, we use the EL(T) paradigm.
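The EL(T) paradigm can be illustrated with a minimal sketch: raw data is Extracted and Loaded first, and the Transformation step runs later, inside the destination. The function and variable names below are illustrative only, not the Dadosfera API.

```python
# Minimal EL(T) sketch. In EL(T), raw records land in the destination
# untransformed; transformation is applied afterwards, in the destination.

def extract(source_rows):
    """Extract: read raw records from the source as-is."""
    return list(source_rows)

def load(destination, raw_rows):
    """Load: land the untransformed records in the destination."""
    destination.extend(raw_rows)

def transform(destination):
    """(T): transform after loading (here, normalize the 'name' field)."""
    return [{**row, "name": row["name"].strip().lower()} for row in destination]

source = [{"name": "  Alice "}, {"name": "BOB"}]
warehouse = []
load(warehouse, extract(source))   # raw data lands first, untouched
result = transform(warehouse)      # transformation happens at the destination
```

Note that `warehouse` still holds the raw records after the transform: in EL(T) the untransformed data remains available for later reprocessing.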
A Pipeline at Dadosfera has its monitoring metrics and specific properties, such as name, description, status, and execution history.
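The properties listed above could be sketched as a simple record; the field and method names here are assumptions for illustration, not the actual Dadosfera data model.

```python
# Hypothetical sketch of a pipeline record with the properties named in
# the text: name, description, status, and execution history.
from dataclasses import dataclass, field

@dataclass
class Pipeline:
    name: str
    description: str
    status: str                                  # e.g. "active", "paused"
    execution_history: list = field(default_factory=list)

    def record_run(self, outcome: str) -> None:
        """Append one execution result to the pipeline's history."""
        self.execution_history.append(outcome)

p = Pipeline("orders-sync", "Loads orders from the ERP", "active")
p.record_run("success")
```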
Recurrent batch data collection is set up by creating a pipeline from a selected source; the collected data then evolves within the Platform through the stages below.
Stages
Loading data into the Platform basically consists of:
- Registering or choosing a registered data source;
- Defining the general information of the pipeline;
- Inserting the pipeline settings (which vary according to the type of source);
- Defining the entities, columns, and synchronization mode (which varies according to the type of source);
- Creating micro-transformations (optional);
- Choosing the collection frequency.
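The steps above can be sketched as one configuration object. Every key and value here is illustrative, not the Platform's actual schema.

```python
# Hypothetical pipeline configuration mirroring the creation steps above.
pipeline_config = {
    # 1. Register or choose a registered data source
    "source": {"type": "postgres", "connection_id": "my-registered-source"},
    # 2. General information of the pipeline
    "name": "orders-pipeline",
    "description": "Collects order data in batches",
    # 3. Settings that vary by source type
    "settings": {"schema": "public"},
    # 4. Entities, columns, and synchronization mode
    "entities": [
        {"table": "orders", "columns": ["id", "total"], "sync_mode": "incremental"},
    ],
    # 5. Optional micro-transformations
    "micro_transformations": [],
    # 6. Collection frequency
    "schedule": {"frequency": "daily"},
}
```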
 
Supported Data
| Classification | Data Type | 
|---|---|
| Numeric | number, decimal, numeric, int, integer, bigint, smallint, byteint, float, float4, float8, double, double precision, real | 
| String and Binary | varchar, char, character, string, text, binary, varbinary | 
| Logical | boolean | 
| Date and Time | date, datetime, time, timestamp, timestamp_ltz, timestamp_ntz, timestamp_tz | 
| Semi-structured | variant, object, array | 
| Geospatial | geography | 
Unsupported Data
| Classification | Data Type | 
|---|---|
| LOB (Large Object) | blob, clob | 
| Others | enum, user-defined data type | 
To learn more about supported data types, see the "Data types" topic in the Snowflake documentation.