Introduction to Data Workflow
Estimated Reading Time: 7 minutes
TIP To learn more about specific Data Workflow use cases, see the How-to Guides: Operator Library section of our How-to Guides.
Overview
Your Unqork applications rely on data to function. But data by itself doesn't create functionality. Data processing does. Data processing is the collection and manipulation of data to produce a desired result.
Most components process data in specific ways. For example, the Calculator component makes calculations, while input components can store strings and numbers. The Data Workflow component has broader functionality. It's an Extract, Transform, Load (ETL) tool that lets you transform, manipulate, and move data. With the Data Workflow component, you can process data in almost-infinite ways. Use this component to manipulate data from one or more inputs and map the data to an output.
What You'll Learn
In this article, you'll learn:
Canvas
The first thing you notice about the Data Workflow component is that it has its own canvas. This is separate from the Module Editor canvas. Use this canvas to add, connect, and group Data Workflow operators.
To begin, drag and drop the Data Workflow component onto your Module Editor canvas. After adding the Data Workflow, you can see the canvas and Data Workflow operators:
Connecting Operators
Operators are the individual building blocks of your Data Workflow. By connecting operators, you can manipulate large and complex data structures from multiple sources.
Common uses of operators include:
- Inputting data.
- Unwinding complex data structures.
- Filtering data to obtain specific items.
- Viewing data at different points in the Data Workflow.
- Appending data items to create new structures.
- Outputting data.
Here's an extract of a Data Workflow canvas showing an orchestration of operators. You can find this Data Workflow configuration in the Address Parsing snippet.
TIP To learn more about the Address Parsing snippet, visit our Address Parsing Snippet article.
As you can see, the Input operator brings in the data from an Address Search component. The Unwind operator separates the individual data points, and the other operators take each data point and output it to a different field.
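The Input, Unwind, and output flow described above can be approximated in plain Python. This is only a conceptual sketch, not Unqork configuration: the `unwind` helper, the record shape, and the field names are all hypothetical stand-ins for what the operators do on the canvas.

```python
# Illustrative only: plain Python standing in for Data Workflow operators.
# The record shape and field names below are hypothetical.

def unwind(records, key):
    """Emit one row per element of the list stored under `key`."""
    rows = []
    for record in records:
        for item in record.get(key, []):
            row = {k: v for k, v in record.items() if k != key}
            row[key] = item
            rows.append(row)
    return rows

# "Input": data as it might arrive from an address search (made-up shape).
address_search = [
    {
        "id": 1,
        "parts": [
            {"type": "street", "value": "123 Main St"},
            {"type": "city", "value": "Springfield"},
            {"type": "zip", "value": "12345"},
        ],
    }
]

# "Unwind": separate the individual data points into one row each.
rows = unwind(address_search, "parts")

# "Output": route each data point to its own field.
fields = {row["parts"]["type"]: row["parts"]["value"] for row in rows}
print(fields)
```

Each step hands its result to the next, which mirrors how data moves from one operator to the next along the connecting lines.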
Ports
When you look closer at Data Workflow operators, you see small extensions around the edges. These are operator ports. Lines drawn between these ports connect your operators.
The 3 types of ports are:
- Input (left port)
- Argument (top port)
- Output (right port)
An operator can have more than one Input or Output port, and some operators also have Argument ports. Below are the ports associated with a Filter operator.
Input Port
The Input port is the entry point into your operator. The previous operator in your Data Workflow connects to the next operator's Input port.
Some data operations have two inputs. For example, the Merge operator combines two inputted data sets. In this instance, a data set connects to the upper Input port, and another data set connects to the lower Input port.
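To picture an operation with two inputs, here is a rough Python sketch. This article does not describe how the Merge operator matches records, so the sketch assumes the two data sets are joined on a shared key. The `merge_on_key` helper and the sample data are hypothetical.

```python
# Illustrative only: a two-input operation, assuming a join on a shared key.
# This is not a description of how the Merge operator is configured.

def merge_on_key(upper, lower, key):
    """Combine rows from two data sets that share the same value for `key`."""
    lower_by_key = {row[key]: row for row in lower}
    merged = []
    for row in upper:
        match = lower_by_key.get(row[key])
        if match is not None:
            # Fields from both inputs end up in a single row.
            merged.append({**row, **match})
    return merged

# Data set connected to the upper Input port (made-up data).
customers = [
    {"customerId": 1, "name": "Ada"},
    {"customerId": 2, "name": "Grace"},
]

# Data set connected to the lower Input port (made-up data).
accounts = [
    {"customerId": 1, "account": "savings"},
    {"customerId": 2, "account": "checking"},
]

merged = merge_on_key(customers, accounts, "customerId")
print(merged)
```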
Argument Port
Arguments add a dynamic aspect to data processing. They can create conditions, rules, and more functionality for an operator. When you connect an operator to another operator's top port, you bring it in as an argument.
It's important to distinguish between an argument and an expression. Most operators have an Expression or Key field that controls what the operator does. For example, suppose your data includes savings accounts and checking accounts, and you want a Filter operator to separate the two account types. Without an argument, your expression might be account="savings". The savings accounts then pass through to an output port, and anything that isn't a savings account is filtered out. The operator performs a single, fixed operation. If you want an operator to always perform the same function, there's no need for an argument.
Use an argument when you want different functions based on dynamic data. Let's use the example from earlier, where you have data sets of savings and checking accounts. With an argument, you can make the filter more complex. You can add more filtering options and sort accounts by size. You can add logic to your filter, calculating multiple account sizes. Or, you can set up decisions based on the accounts your end-user selects.
NOTE Unqork references an argument as _arg in an expression.
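The difference between a fixed expression and one that uses an argument can be sketched in Python. This is not Unqork expression syntax. The function names and sample records are hypothetical, and only the name `_arg` is taken from the note above.

```python
# Illustrative Python only; this is not Unqork expression syntax.
# The sample records are hypothetical.
accounts = [
    {"owner": "Ada", "account": "savings"},
    {"owner": "Grace", "account": "checking"},
    {"owner": "Alan", "account": "savings"},
]

def filter_fixed(rows):
    """Fixed expression: always keeps savings accounts."""
    return [row for row in rows if row["account"] == "savings"]

def filter_with_arg(rows, _arg):
    """Expression that refers to _arg: the kept type depends on the argument."""
    return [row for row in rows if row["account"] == _arg]

savings_only = filter_fixed(accounts)

# The argument could come from another operator, such as an end-user's choice.
selected = filter_with_arg(accounts, "checking")

print([row["owner"] for row in savings_only])
print([row["owner"] for row in selected])
```

The fixed version behaves the same on every run, while the second version changes its result whenever the value wired in as the argument changes.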
Output Port
Outputs are where you send the final result of the data operation. You connect outputs using the right ports.
Some data operations have two outputs. For example, the Filter operator separates data and sends it to more than one output. The upper Output port might carry the results for which the expression evaluates to true, while the lower Output port carries the results for which it evaluates to false.
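A two-output filter amounts to partitioning a data set by a true/false test. The sketch below shows that idea in plain Python. The `filter_two_outputs` helper, the balance threshold, and the sample records are assumptions made for illustration.

```python
# Illustrative only: partition rows into a "true" output and a "false" output.

def filter_two_outputs(rows, predicate):
    """Return (upper, lower): rows where predicate is true, then the rest."""
    upper, lower = [], []
    for row in rows:
        if predicate(row):
            upper.append(row)
        else:
            lower.append(row)
    return upper, lower

# Hypothetical account records.
accounts = [
    {"owner": "Ada", "balance": 5000},
    {"owner": "Grace", "balance": 250},
    {"owner": "Alan", "balance": 1200},
]

upper, lower = filter_two_outputs(accounts, lambda row: row["balance"] >= 1000)
print([row["owner"] for row in upper])
print([row["owner"] for row in lower])
```

Every input row lands in exactly one of the two outputs, so nothing is lost between them.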
Connecting Operators
Remember, you connect operators to each other with lines drawn from an Output (right) port to an Input (left) or Argument (top) port. This is commonly referred to as "wiring" your operators.
In the following example, an Input operator's Output (right) port connects to a Col2Array operator's Input (left) port:
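The wiring rule can also be written down as a small data model. The Python sketch below is a hypothetical representation, not how Unqork stores a Data Workflow. The `Port`, `Wire`, and `connect` names are invented here to show that a wire starts at an Output port and ends at an Input or Argument port.

```python
# Illustrative only: a hypothetical model of ports and wires.
from dataclasses import dataclass

@dataclass(frozen=True)
class Port:
    operator: str  # label of the operator that owns the port
    kind: str      # "input", "argument", or "output"

@dataclass(frozen=True)
class Wire:
    source: Port
    target: Port

def connect(source, target):
    """Create a wire, enforcing Output -> Input or Output -> Argument."""
    if source.kind != "output":
        raise ValueError("a wire must start at an Output (right) port")
    if target.kind not in ("input", "argument"):
        raise ValueError("a wire must end at an Input (left) or Argument (top) port")
    return Wire(source, target)

# An Input operator's Output port wired to a Col2Array operator's Input port.
wire = connect(Port("Input", "output"), Port("Col2Array", "input"))
print(wire)
```

Calling `connect` with the ports reversed raises a `ValueError`, which reflects that data flows out of a right port and into a left or top port.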
Arranging and Grouping Operators
As you dive into more complex uses of the Data Workflow component, you may find yourself with a Data Workflow canvas crowded with operators and connecting lines. Grouping operators together can help you organize your Data Workflow and provide a clearer process flow. You can also use the canvas's Enable Manual Layout setting to arrange operators as you please. Labeling your grouped operators can also help you keep track of what functions your operators perform.
Arranging Operators
To arrange Data Workflow operators:
1. On the Data Workflow canvas, set Enable Manual Layout to ON.
2. In the pop-up modal, click OK.
3. Grab the operators individually and rearrange them on the canvas. Or, hold down the Shift key when clicking the operators to select multiple operators to rearrange.
4. Click Save.
Below is the Data Workflow component of the Address Parsing snippet when Enable Manual Layout is set to OFF:
Below is the Data Workflow component of the Address Parsing snippet when Enable Manual Layout is set to ON (notice how the operators are rearranged in a tighter layout):
Grouping Operators
To group Data Workflow operators together:
1. Hold down the Shift key when clicking the operators you want to group.
2. Press the Control + G keys (PC) or Command + G keys (Mac). The selected operators group together, then collapse into an expandable frame.
TIP You can expand or collapse the frame by clicking the plus (+) or minus (-) sign in the frame's top bar.
3. In the Info window, enter a Label for your group of operations.
4. Click Save.
Below is the Data Workflow component of the Address Parsing snippet without grouping:
Below is the Data Workflow component of the Address Parsing snippet after grouping the operators:
Best Practices
- Data Workflows time out after 5 minutes in all environments. Build Data Workflows to complete operations within 5 minutes to prevent timeouts.
- If you don't plan to use disabled components in your application, remove them to ensure optimal performance. Remember to check all active components that connect to disabled components. Ensure active components still function properly after you remove the disabled ones.
- Add labels to all Data Workflow operators to describe their function. These labels make it easier to know the purpose of an operator without having to open the Info window.
- Select the Do Not Sanitize setting in all your operators to improve application performance.
- Organize Data Workflow components based on their function in your application.
- Use the component's Notes tab to comment on complex data processes. Add notes to explain what components are being triggered, trigger types, and the importance of each component.
Resources