Overview
Create a Batch Loop in a Data Workflow to perform one action on a large group of data at once. Think of working with a massive data set that you want to turn into Unqork submissions. You'll use one of Unqork's internal APIs (application programming interfaces) to do that. You know that the API call will be the same to create each submission. But a Create Submissions API call has a limit, allowing for the creation of only 50 submissions at a time. So, you'll need to group your data into sets of 50 before sending it through the API call. To do that, you'll build a batch loop Data Workflow.
What is a Batch Loop?
A batch loop Data Workflow works by separating your data into groups of a certain size. You can set the number of items you want in each batch while configuring your Data Workflow. Then, the Data Workflow separates your data and assigns each group its own index. It's this group index the looping Data Workflow then references while running.
Let's look more closely at the Create Submissions API call example above. Say you have a data set of Fortune 500 companies. That data set includes each company's name, rank, revenue, and profit. You want to turn each of these companies into its own Unqork submission. The easiest way to do this is to send your data through an API call in batches. So, you'll configure one Data Workflow to separate out 50 companies at a time. Then, you'll configure a second Data Workflow to send each batch of 50 through a Create Submissions API call.
Here's how your module will look in the Module Builder:
Your completed use case won't show anything in Express View because the loop happens behind the scenes. So, you'll only see activity in the DevTools Console. Here's how that will look:
What You'll Need
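For this use case, you need: 1 Data Table component, 2 Hidden components, 2 Data Workflow components, and 1 Initializer component.
To set up your first Data Workflow, you need: 2 Input operators, 4 Console operators, 3 Formula operators, 4 Output operators, 1 Create Index operator, 1 Create Field operator, 1 Size operator, 1 Decision operator, 1 Convert Value operator, and 1 Filter operator.
To set up your second Data Workflow, you need: 1 Input operator, 1 Console operator, and 1 Output operator.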
These instructions assume you have a new module open, saved, and with a title.
Configuration
Configure the Data Table Component
The Data Table you'll use to build this use case is quite long. Normally, you'd use a batch loop Data Workflow to handle a large number of submissions. To simulate that, this sample Data Table has 500 rows. To make things easier, we've created one you can copy into your own module.
Open the sample data module here: https://training.unqork.io/#/form/6086c37bb08e7e0a26942e1f/edit.
Hover over the dtFortuneFive Data Table component.
A 5-button toolbar appears above the component on hover-over.
Holding Command or Control on your keyboard, click the (Settings) button. This copies the component to your clipboard.
In the upper-right corner of your own module, click the (Options) button.
Click Paste Module Definition.
In the Module Definition field, type Command + V or Control + V. This pastes the component from your clipboard.
Click Paste.
You'll see the dtFortuneFive Data Table added to your canvas. You can open this component to verify the data copied successfully. Your Data Table will have 500 rows and should look like this:
Configure the indexLoopBulk Hidden Component
Each row in your table has a number assigned to it. That number is the index. In a simple looping Data Workflow, this is the index you'd reference in your Data Workflow. When looping in batches, you'll assign one index to several rows of your data. This is how your Data Workflow processes batches of data at once. So, you'll use a Hidden component to keep track of which group was last processed. Just like with a simple looping Data Workflow, you'll set the Default Value to 0. And later, you'll add logic that adds to this as your submissions are created.
Drag and drop a Hidden component onto your canvas, placing it below the dtFortuneFive Data Table component.
In the Property ID field, enter indexLoopBulk.
In the Canvas Label Text field, enter indexLoopBulk.
In the component's configuration menu, click Data.
In the Default Value field, enter 0.
Click Save & Close.
Configure the dwfLoopGroup Data Workflow
Next, let's start building out your loop. You'll manage that with a Data Workflow component. The configuration here may look complex, but we'll explain what each operator does as you add it.
Drag and drop a Data Workflow component onto your canvas, placing it below the indexLoopBulk Hidden component.
In the Canvas Label Text and Property Name fields, enter dwfLoopGroup.
This image shows how your Data Workflow will look at the end of this configuration. You'll add some operators shown here later.
Configure the First Input Operator
This Input operator references your indexLoopBulk Hidden component. This is how your Data Workflow knows which group of data needs to be processed next.
Drag and drop an Input operator onto the Data Workflow canvas.
Configure the Input operator's Info window as follows:
Setting | Value
Category | Input
Component | indexLoopBulk
Required | No
Source | Default
Configure the First Console Operator
You'll add several Console operators during this configuration. These come in handy so you can see what's happening behind the scenes. You can reference these in the DevTools Console in Express View for troubleshooting purposes. This Console lets you view the current index being processed.
Drag and drop a Console operator onto the Data Workflow canvas.
Configure the Console operator's Info window as follows:
Setting | Value
Category | Console
Label | Current Index
Connect the output port (right) of the indexLoopBulk Input operator to the input port (left) of the Current Index Console operator.
Configure the First Formula Operator
This Formula operator adds 1 to the index of the last group processed. This is how your Data Workflow notes that it's processed each group for when the loop comes back around. The formula =SUM(A,1) takes the value passed to the operator (A) and adds 1 to it. Here, A is an alias for the value entering the operator's input port.
Drag and drop a Formula operator onto your Data Workflow canvas.
Configure the Formula operator's Info window as follows:
Setting | Value
Category | Formula Value
Label | Iterate
Formula/Expression | =SUM(A,1)
Connect the output port (right) of the indexLoopBulk Input operator to the input port (left) of the Iterate Formula operator.
Configure the First Output Operator
This Output operator replaces the value in your indexLoopBulk Hidden component with the result of your Formula operator. If you didn't do this, your Data Workflow would always process the index group 0 because that's the Hidden component's Default Value. Now with each loop of the Data Workflow, the value coming into the indexLoopBulk Input operator will increase by 1.
Drag and drop an Output operator onto the Data Workflow canvas.
Configure the Output operator's Info window as follows:
Setting | Value
Category | Output
Component | indexLoopBulk
Action | value
Connect the output port (right) of the Iterate Formula operator to the input port (left) of the indexLoopBulk Output operator.
Configure the Second Input Operator
This Input operator is what brings the data into the Data Workflow.
Drag and drop another Input operator onto the Data Workflow canvas.
Configure the Input operator's Info window as follows:
Setting | Value
Category | Input
Component | dtFortuneFive
Required | No
Source | Default
Configure the Create Index Operator
When processing items one at a time, you can use the index of each row in your table. But to process items as a group, you'll need to create a new index, one that refers to a group of items together. This process takes a few operators to complete. To start, let's create a new index that you can manipulate. For that, you'll add a Create Index operator.
Drag and drop a Create Index operator onto the Data Workflow canvas.
Configure the Create Index operator's Info window as follows:
Setting | Value
Category | Create Index
Label | Indexer
Index Name | indexer
Starting Index | 0
Keys | (blank)
Connect the output port (right) of the dtFortuneFive Input operator to the input port (left) of the Indexer Create Index operator.
Configure the Create Field Operator
Once your data has passed through the Create Index operator, each row now has a new field called indexer. This gives you a version of your index you can work with. So, let's assign each row in your table to a group. You'll store this index assignment in a new field using a Create Field operator.
Remember, each group can hold 50 items. So, you'll divide the value in the indexer field by 50. You'll also want to round that down to the nearest integer so the same value is assigned to 50 items. You'll store the result of this equation in a new field called indexGrp. So, your entire formula will be indexGrp=INT(indexer/50).
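The index assignment described above can be sketched outside of Unqork. This is only an illustration in Python, not how the platform runs the operators: the rows list and the assign_index_groups helper are hypothetical stand-ins for the dtFortuneFive Data Table and for the Create Index and Create Field operators.

```python
# Hypothetical stand-in for the 500-row dtFortuneFive Data Table.
rows = [{"company": f"Company {n + 1}", "rank": n + 1} for n in range(500)]

BATCH_SIZE = 50  # batch size used throughout this article


def assign_index_groups(rows, batch_size=BATCH_SIZE):
    """Mimic Create Index (indexer) followed by Create Field (indexGrp)."""
    grouped = []
    for indexer, row in enumerate(rows):  # Starting Index of 0
        # Integer division rounds down, like INT(indexer/50).
        grouped.append({**row, "indexer": indexer, "indexGrp": indexer // batch_size})
    return grouped


grouped = assign_index_groups(rows)
# Rows 0-49 land in group 0, rows 50-99 in group 1, and row 499 in group 9.
print(grouped[49]["indexGrp"], grouped[50]["indexGrp"], grouped[499]["indexGrp"])
```

Rounding down is what makes 50 consecutive rows share one indexGrp value.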
Drag and drop a Create Field operator onto the Data Workflow canvas.
Configure the Create Field operator's Info window as follows:
Setting | Value
Category | Formula
Label | Index Group
Do Not Sanitize Formula | Yes (checked)
Field 1 | indexGrp=INT(indexer/50)
Field 2 | (blank)
Field 3 | (blank)
Field 4 | (blank)
Field 5 | (blank)
Connect the output port (right) of the Indexer Create Index operator to the input port (left) of the Index Group Create Field operator.
Configure the Second Console Operator
This is the second Console in this configuration. This Console lets you see the new group indexes assigned to your data.
Drag and drop another Console operator onto the Data Workflow canvas.
Configure the Console operator's Info window as follows:
Setting | Value
Category | Console
Label | Index Group
Connect the output port (right) of the Index Group Create Field operator to the input port (left) of the Index Group Console operator.
Configure the Size Operator
To split your data into groups of 50, you'll first need to determine how many total groups you'll have. The first step in doing that is to find how large your data table is. You'll use a Size operator to find that.
Drag and drop a Size operator onto the Data Workflow canvas.
Configure the Size operator's Info window as follows:
Setting | Value
Category | Size
Label | Total Size of Array
Connect the output port (right) of the dtFortuneFive Input operator to the input port (left) of the Total Size of Array Size operator.
Configure the Second Formula Operator
Next, you'll need to divide the size of your table by the intended size of your groups. In this case, we want each group to hold 50 pieces of data. So, you'll use the formula =INT(A/50). Here, A refers to the data passed through the operator's input port. And INT rounds the result down to the nearest integer. You'll want an integer because a group index is always a whole number. For the 500-row sample table, the result is 10.
Drag and drop a Formula operator onto the Data Workflow canvas.
Configure the Formula operator's Info window as follows:
Setting | Value
Category | Formula Value
Label | Grouped Index
Formula/Expression | =INT(A/50)
Connect the output port (right) of the Total Size of Array Size operator to the input port (left) of the Grouped Index Formula operator.
Configure the Third Formula Operator
Just like in a simple looping Data Workflow, you need a way to know if there is more data to process or if you've reached the end of your data set. To do that, you'll use another Formula operator with the formula =CONCATENATE(A,"<=",_arg). Here, A is the value in your indexLoopBulk Hidden component. And _arg is the value found by your Grouped Index Formula operator. For example, on the first loop over the 500-row sample table, the formula produces the statement 0<=10. Later, you'll use a Decision operator to find whether this statement is true or not.
Drag and drop another Formula operator onto the Data Workflow canvas.
Configure the Formula operator's Info window as follows:
Setting | Value
Category | Formula Value
Label | Decision Argument
Formula/Expression | =CONCATENATE(A,"<=",_arg)
Connect the output port (right) of the indexLoopBulk Input operator to the input port (left) of the Decision Argument Formula operator.
Connect the output port (right) of the Grouped Index Formula operator to the argument port (top) of the Decision Argument Formula operator.
Configure the Third Console Operator
This is the third Console in this configuration. This Console lets you view the result of your Decision Argument Formula operator.
Drag and drop another Console operator onto your Data Workflow canvas.
Configure the Console operator's Info window as follows:
Setting | Value
Category | Console
Label | Index Argument
Connect the output port (right) of the Decision Argument Formula operator to the input port (left) of the Index Argument Console operator.
Configure the Decision Operator
The Decision operator decides whether there are more submissions to create. It does this by looking at the expression created by your Decision Argument Formula operator. If this expression is true, the Decision passes your data through the upper output port. If this expression is false, the Decision passes your data through the lower output port. Your Create Field operator serves as the input here, passing along data from your dtFortuneFive Data Table. And your Decision Argument Formula operator serves as the argument.
Let's take a moment to explain why you're having the Decision operator check whether the statement A<=_arg is true. You need a way to tell your Data Workflow when to stop the loop. It should stop when there are no more rows to process in your Data Table. Or in this case, no more batches to process. So, you're creating a way for the Data Workflow to check if the number stored in the indexLoopBulk Hidden component is less than or equal to your total number of grouped indexes. That's what A<=_arg represents here.
After the first loop of the Data Workflow, the value stored in the indexLoopBulk Hidden component is 1. So, on the next loop, the Data Workflow processes the second group (index 1). Your formula includes <= so it can account for any data that may not fit evenly into your batches of 50. If you have a number of rows that don't divide neatly into your batches, you'll want to go one index further during your loop. And including that equals sign here lets you do that. Once the statement returns as false, you have a way to stop the Data Workflow.
Drag and drop a Decision operator onto the Data Workflow canvas.
Configure the Decision operator's Info window as follows:
Setting | Value
Category | Decision
Input List | (blank)
Condition | _arg
Connect the output port (right) of the Index Group Create Field operator to the input port (left) of the Decision operator.
Connect the output port (right) of the Decision Argument Formula operator to the argument port (top) of the Decision operator.
Configure the Convert Value Operator
Next, we want to filter the data based on the index group currently being processed. To do that, you'll need to reference that index group number. You can do this by converting the value in your indexLoopBulk Hidden component to a number. You'll use a Convert Value operator for that.
Drag and drop a Convert Value operator onto the Data Workflow canvas.
Configure the Convert Value operator's Info window as follows:
Setting | Value
Category | Convert to Value
Label | Index to Number
Cast To | Number
Connect the output port (right) of the indexLoopBulk Input operator to the input port (left) of the Index to Number Convert Value operator.
Configure the Filter Operator
Now, let's add your Filter operator. Here, you'll isolate the data in the index group you're currently processing. You'll set the upper output port of your Decision operator as the input. Then, you'll set the Convert Value operator as the argument. Finally, you'll set your Expression as indexGrp=_arg so the operator looks for the data that has an indexGrp value matching the argument. Any data that matches passes through the upper output port of the Filter operator. And any data that doesn't match passes through the lower output port.
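As a rough illustration of the filtering step, the Python sketch below splits rows by their indexGrp value. The filter_by_group helper and the 120-row sample are hypothetical. The int() call stands in for the Convert Value operator, on the assumption that the index arrives as text and must become a number before it can match indexGrp.

```python
def filter_by_group(rows, current_group):
    """Split rows the way the indexGrp=_arg expression does."""
    matching = [row for row in rows if row["indexGrp"] == current_group]      # upper output port
    non_matching = [row for row in rows if row["indexGrp"] != current_group]  # lower output port
    return matching, non_matching


# Hypothetical 120-row table that already carries an indexGrp field.
rows = [{"indexer": i, "indexGrp": i // 50} for i in range(120)]

# Cast the stored index to a number first, as the Convert Value operator does.
selected, remaining = filter_by_group(rows, int("2"))
print(len(selected), len(remaining))  # 20 rows in group 2, 100 rows elsewhere
```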
Drag and drop a Filter operator onto the Data Workflow canvas.
Configure the Filter operator's Info window as follows:
Setting | Value
Category | Filter
Label | indexGrp=_arg
Do Not Sanitize Formula | Yes (checked)
Expression | indexGrp=_arg
Connect the upper output port (right) of the Decision operator to the input port (left) of the indexGrp=_arg Filter operator.
Connect the output port (right) of the Index to Number Convert Value operator to the argument port (top) of the indexGrp=_arg Filter operator.
Click Save.
Configure the selectedGroup Hidden Component
Before we configure the rest of your Data Workflow, let's add a place to store its result. You'll use another Hidden component for that. This Hidden component will store each group of data as it's retrieved by your Filter operator. Your loop overwrites this each time it runs. So, this will only ever hold the group of data you're currently processing.
Drag and drop a Hidden component onto your canvas, placing it below the dwfLoopGroup Data Workflow component.
In the Property ID and Canvas Label Text fields, enter selectedGroup.
Click Save & Close.
Update the First Data Workflow Component
Now that you have a Hidden component to hold your output, let's add that to your Data Workflow. This tells your Filter operator where to store the data it retrieves. You'll later reference this data in a separate Data Workflow. That second Data Workflow processes your actual operation. In this use case, that's where you'd actually create your submissions. This first Data Workflow only manages your loop.
Hover over the dwfLoopGroup Data Workflow component.
A 5-button toolbar appears above the component on hover-over.
Using the toolbar, click the (Settings) button.
Configure the Second Output Operator
This Output operator shows your Filter operator where to store the data it retrieves. So, you'll select the selectedGroup Hidden component you just added.
Drag and drop an Output operator onto the Data Workflow canvas.
Configure the Output operator's Info window as follows:
Setting | Value
Category | Output
Component | selectedGroup
Action | value
Connect the upper output port (right) of the indexGrp=_arg Filter operator to the input port (left) of the selectedGroup Output operator.
Configure the Fourth Console Operator
This is the fourth Console operator in this configuration. This Console operator shows the data separated out by the Filter operator. You can use this to view what's passed to your selectedGroup Hidden component.
Drag and drop another Console operator onto your Data Workflow canvas.
Configure the Console operator's Info window as follows:
Setting | Value
Category | Console
Label | Rows
Connect the upper output port (right) of the indexGrp=_arg Filter operator to the input port (left) of the Rows Console operator.
Click Save.
Configure the Second Data Workflow
Now that you have your loop configured, you can set up a Data Workflow to perform your operation. In this use case, this is where you would configure the creation of your submissions.
This article is focused on the looping aspect of a Data Workflow. So, we won't build the entire logic to create submissions. Instead, we'll focus on what you need in this second Data Workflow so your loop runs.
Drag and drop a Data Workflow component onto your canvas, placing it below the selectedGroup Hidden component.
In the Canvas Label Text and Property Name fields, enter dwfOperationGroup.
Configure the Input Operator
This Input operator brings in the data stored in your selectedGroup Hidden component. When creating your submissions, you would reference this instead of your larger data table. That's because this Hidden component holds only the data for one group of 50 entries at a time.
Drag and drop an Input operator onto the Data Workflow canvas.
Configure the Input operator's Info window as follows:
Setting | Value
Category | Input
Component | selectedGroup
Required | Yes
Source | Default
Configure the Console Operator
This is the only Console operator you'll add in this Data Workflow. This Console lets you see the data pulled into your Data Workflow from the selectedGroup Hidden component.
Drag and drop a Console operator onto your Data Workflow canvas.
Configure the Console operator's Info window as follows:
Setting | Value
Category | Console
Label | Create Module Submissions
Connect the output port (right) of the selectedGroup Input operator to the input port (left) of the Create Module Submissions Console operator.
Configure the Output Operator
Next, you'll add an Output operator here to trigger your first Data Workflow to run again once your submissions are created.
Drag and drop an Output operator onto the Data Workflow canvas.
Configure the Output operator's Info window as follows:
Setting | Value
Category | Output
Component | dwfLoopGroup
Action | trigger
Connect the output port (right) of the selectedGroup Input operator to the input port (left) of the dwfLoopGroup Output operator.
Click Save.
Update the First Data Workflow
Now that you have a Data Workflow configured to run your operation, let's configure your first Data Workflow to trigger it. In this step, we'll also add a way to stop your loop once you've created all of your submissions.
Hover over the dwfLoopGroup Data Workflow component.
A 5-button toolbar appears above the component on hover-over.
Using the toolbar, click the (Settings) button.
Configure the Third Output Operator
You'll need to tell your second Data Workflow that the data it needs is ready. So, you'll use an Output operator to tell the second Data Workflow to run.
Drag and drop another Output operator onto your Data Workflow canvas.
Configure the Output operator's Info window as follows:
Setting | Value
Category | Output
Component | dwfOperationGroup
Action | trigger
Connect the upper output port (right) of the indexGrp=_arg Filter operator to the input port (left) of the dwfOperationGroup Output operator.
Configure the Fourth Output Operator
Remember that your Decision operator knows whether there are more submissions to create. If there aren't, the Decision passes data through its lower output port. So, let's add an Output operator there to stop the looping process.
Drag and drop another Output operator onto the Data Workflow canvas.
Configure the Output operator's Info window as follows:
Setting | Value
Category | Output
Component | _self
Action | value
Connect the lower output port (right) of the Decision operator to the input port (left) of the _self Output operator.
Click Save.
Configure the Initializer Component
Finally, let's add an Initializer component to start the whole operation. You'll set this component to trigger your looping Data Workflow.
Drag and drop an Initializer component onto your canvas, placing it below the dtFortuneFive Data Table.
In the Property ID and Canvas Label Text fields, enter initLoop.
In the component's configuration menu, click Actions.
From the Trigger Type drop-down, select New Submission.
In the Outputs table, enter the following:
# | Property ID | Type | Value
1 | dwfLoopGroup | trigger | GO
Click Save & Close.
Save your module.
Now you can check your work. Preview your module in Express View and open the DevTools Console. You'll see a lot of activity from your various Consoles. That shows your loop is working properly. Let's look at some key points.
First, you'll see your Index Group Console. This is where your Data Workflow assigns the index group. So, the first 50 items show an indexGrp of 0, the next 50 show an indexGrp of 1, and so on. You'll see this in the image below:
From there, you'll see your Current Index Console. This shows which indexGrp is being processed. In the below image, you'll see the current indexGrp is 0. Next, you'll see your Index Argument and your Rows Console. These show how your Data Workflow checked to see if there was more data to process, as well as the data as it's passed through your Filter operator. Here, you'll only see 50 pieces of data at a time. That's because you're only seeing each group as it passes through this portion of your Data Workflow.
And finally, you'll see your Create Module Submissions Console. This shows that your batch of data has made it to your second Data Workflow. You'll see each of these steps repeated until all your data has been processed.
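The repeating sequence you see in the Console can be approximated with a short Python simulation. This is a sketch under stated assumptions, not Unqork's runtime: run_batch_loop and the process_batch callback are hypothetical names, a plain while loop stands in for the two Data Workflows triggering each other, and the 120-row input is an assumed example.

```python
def run_batch_loop(rows, batch_size=50, process_batch=lambda batch: None):
    """Simulate dwfLoopGroup handing batches to dwfOperationGroup until the check fails."""
    indexed = [{**row, "indexGrp": i // batch_size} for i, row in enumerate(rows)]
    last_group = len(rows) // batch_size   # Grouped Index: =INT(A/50)
    index_loop_bulk = 0                    # indexLoopBulk Default Value
    batch_sizes = []

    while True:
        current = index_loop_bulk          # Current Index Console
        index_loop_bulk = current + 1      # Iterate: =SUM(A,1)
        if not current <= last_group:      # Decision: A<=_arg is false
            break                          # lower output port, the loop stops
        selected_group = [row for row in indexed if row["indexGrp"] == current]
        process_batch(selected_group)      # stand-in for the work in dwfOperationGroup
        batch_sizes.append(len(selected_group))
    return batch_sizes


sizes = run_batch_loop([{"rank": n + 1} for n in range(120)])
print(sizes)  # [50, 50, 20]
```

In a real build, process_batch is where the Create Submissions call would go. In this simulation, a row count that divides evenly by the batch size ends with one empty batch, so the operation step may need to tolerate an empty group.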
Lab
You can view this complete use case here: https://training.unqork.io/#/form/6081961fd455ab3509979026/edit.
Best Practices
Data Workflows time out after 5 minutes in all environments. Build batch loop Data Workflows to complete operations within 5 minutes to prevent timeouts.
If you don't plan to use disabled components in your application, remove them to ensure optimal performance. Remember to check all active components that connect to disabled components. Ensure active components still function properly after you remove the disabled ones.
Add labels to all Data Workflow operators to describe their function. These labels make it easier to know the purpose of an operator without having to open the Info window.
Select the Do Not Sanitize setting in all your operators to improve application performance.
Organize Data Workflow components based on their function in your application.
Use the component's Notes tab to comment on complex data processes. Add notes to explain what components are being triggered, trigger types, and the importance of each component.