Pipeline Overview

Pipeline Parts

A pipeline is a process defined in Data Composer. It consists of three parts:

  1. Source - Where does the data originate?
  2. Transforms - How do you need to change the data to fit the destinations?
  3. Destinations - Where do you want to send the data?

Every pipeline consists of a single source (a reader), zero or more transforms, and zero or more destinations (writers). Most pipelines can be described in one sentence: "I would like to get something from a file | database | website | device | API | email and send it to a file | database | website | device | API | email | SMS | etc."
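
The shape described above (one source, zero or more transforms, zero or more destinations) can be sketched as a small data structure. This is only an illustration in Python, not Data Composer's actual configuration format or API: the `Pipeline` class and the `Reader`, `Transform`, and `Writer` aliases are hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List

# Hypothetical type aliases; Data Composer's real module interfaces may differ.
Reader = Callable[[], Any]        # produces the initial data
Transform = Callable[[Any], Any]  # takes data and returns changed data
Writer = Callable[[Any], None]    # delivers data to a destination


@dataclass
class Pipeline:
    """Hypothetical model of a pipeline definition."""
    source: Reader                                             # exactly one
    transforms: List[Transform] = field(default_factory=list)  # zero or more
    destinations: List[Writer] = field(default_factory=list)   # zero or more


# "Get something from a text source and send it to an in-memory list."
received = []
pipeline = Pipeline(
    source=lambda: "a,b,c",
    transforms=[lambda text: text.split(",")],
    destinations=[received.append],
)
```

Because the transform and destination lists default to empty, `Pipeline(source=...)` on its own is still a valid definition, which mirrors the "zero or more" wording.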

Pipeline Execution

Once a pipeline has been configured and is running, each execution follows a fixed order:

  1. Trigger - A set interval, a schedule, an external event, or a reader module indicating that new data is available initiates a pipeline execution.
  2. Read - The source (reader) primes the pipeline with data.
  3. Transforms - Each transform is applied in order, and the result of each transform is passed on to the next.
  4. Destinations - Each writer is called with the final output of the transforms.
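
The execution order above can be sketched as a single function. This is a minimal illustration, assuming that readers, transforms, and writers can be modeled as plain Python callables. `run_once` is a hypothetical name, not a Data Composer function, and calling it stands in for the trigger.

```python
from typing import Any, Callable, Iterable


def run_once(
    read: Callable[[], Any],
    transforms: Iterable[Callable[[Any], Any]],
    writers: Iterable[Callable[[Any], None]],
) -> Any:
    """Run one pipeline execution (hypothetical helper)."""
    # Read: prime the pipeline with data.
    data = read()
    # Transforms: apply in order, feeding each result to the next transform.
    for transform in transforms:
        data = transform(data)
    # Destinations: every writer receives the same final output.
    for write in writers:
        write(data)
    return data


sink_a, sink_b = [], []
result = run_once(
    read=lambda: [3, 1, 2],
    transforms=[sorted, lambda xs: [x * 10 for x in xs]],
    writers=[sink_a.append, sink_b.append],
)
print(result)  # [10, 20, 30]
```

Note that both writers receive the output of the last transform, not the raw data from the reader. With no transforms configured, the writers would receive the reader's data unchanged.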

Along the way, a pipeline can publish events that other pipelines can use as triggers. In this way, you can chain pipelines to create compound pipelines.
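
Chaining through events can be sketched with a small in-process publish/subscribe helper. Everything here is an assumption made for illustration: the `EventBus` class, the event name `names.cleaned`, and the idea that an event carries a data payload are not taken from Data Composer, whose actual event mechanism may work differently.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List


class EventBus:
    """Hypothetical in-process event bus used only for this sketch."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[Any], None]]] = (
            defaultdict(list)
        )

    def subscribe(self, event_name: str, handler: Callable[[Any], None]) -> None:
        self._subscribers[event_name].append(handler)

    def publish(self, event_name: str, payload: Any) -> None:
        # Each subscriber is triggered by the published event.
        for handler in list(self._subscribers[event_name]):
            handler(payload)


bus = EventBus()
archive = []


def first_pipeline() -> None:
    data = [" alice ", " bob "]                 # read
    data = [name.strip() for name in data]      # transform
    bus.publish("names.cleaned", data)          # publish an event when done


def second_pipeline(names: List[str]) -> None:
    data = [name.upper() for name in names]     # transform
    archive.append(data)                        # destination


# The second pipeline uses the first pipeline's event as its trigger.
bus.subscribe("names.cleaned", second_pipeline)
first_pipeline()
```

Running `first_pipeline()` triggers `second_pipeline` through the event, so the two behave as one compound pipeline while remaining separately defined.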