Processing large volumes of data
When processing data in high volumes, core computing principles point toward an asynchronous approach. The need for this varies, but on Tray it is often driven by a time constraint on the total execution time of the workflow. An example would be a data set that needs to be updated once per day, where a synchronous update would take more than 24 hours.
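To make that time constraint concrete, here is a quick back-of-the-envelope calculation. The record count and per-record latency are assumed figures for illustration, not Tray benchmarks:

```python
# Rough estimate of synchronous processing time.
# Both numbers below are illustrative assumptions.
records = 500_000
seconds_per_record = 0.5

total_hours = records * seconds_per_record / 3600
print(f"{total_hours:.0f} hours")  # roughly 69 hours -- nearly three days
```

At even half a second per record, a single synchronous pass blows well past a daily update window, which is exactly the situation that calls for batching.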
The logic behind choosing an asynchronous method comes down to decreasing execution time, as well as modularizing your workflow for more trackable logs and a more readable design. Other developers on your team may reuse these modularized flows in larger flows of their own, or clone your flow to test a new logical sequence in Sandbox mode.
"I have an extremely large Salesforce instance. Every day, I want to enrich all of my records to ensure they are up-to-date with data that lives in another program."
We call this process "Enrichment", and it's a very popular Tray use-case. If you've ever worked with a database of hundreds of thousands of items, you'll agree it's valuable to keep each record up-to-date by enriching it with other data sources.
Processing hundreds of thousands of records Synchronously takes time. Depending on your workflow and the total number of logical steps, this could even take multiple days.
Break your data into batches and process each batch asynchronously. The core architecture of Tray lets you run as many instances of a single process as you like at the same time. Each instance of that process creates a new execution thread on the platform, so concurrent processes won't slow each other down.
Note: Depending on your Tray Edition, there may be a per-second limit on triggers associated with your account. This is the only factor that can slow an asynchronous batch of workflows. Contact Tray CS if you are interested in increasing your trigger limit or have questions about your current limit.
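The batching idea itself is general and worth seeing outside of Tray. Here is a minimal sketch using Python's standard library, where `process_batch` is a stand-in for whatever work your per-batch workflow would do, and each thread is analogous to a separate workflow execution:

```python
from concurrent.futures import ThreadPoolExecutor

def process_batch(batch):
    # Stand-in for the work a per-batch workflow would do
    # (enrich each record, write it back, etc.).
    return [record.upper() for record in batch]

records = ["alpha", "beta", "gamma", "delta", "epsilon", "zeta"]
batch_size = 2

# Split the full record list into fixed-size batches.
batches = [records[i:i + batch_size] for i in range(0, len(records), batch_size)]

# Each batch runs on its own thread, much as each Call Workflow
# invocation starts a fresh execution of the processing workflow.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(process_batch, batches))
```

The total wall-clock time approaches the time of the slowest batch rather than the sum of all batches, which is the entire point of fanning out.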
What this looks like on Tray
Workflow #1: Create Batches
1. Using a Salesforce connector, calculate a total count of all your opportunities.
2. Paginate the total count. Add a List Helpers connector with Operation: Get List of Page Numbers. In the Per Page field, enter the number of records you'd like in each batch. This splits the total record count into pages of the size you select.
3. Loop through each page returned by the List Helpers connector.
4. Grab a batch of Salesforce records matching the batch size you set in (2). The Salesforce connector returns a pagination token for each query if the query's size limit is smaller than the total number of records the query could return.
5. Use Data Storage to store the pagination token returned from Salesforce so it can be passed into your next batch. Name the key salesforce_offset.
6. For subsequent iterations of the loop, pass the pagination token you just stored into your Salesforce batching step by performing a GET operation with the same key you used in (5).
7. Finally, reset the pagination token to ZERO before each run, so that each run starts from the first record.
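Stripped of connector details, the control flow of the steps above looks roughly like this. The `fetch_page` function and `data_store` dict are stand-ins for the Salesforce connector and Data Storage; all names and figures here are illustrative:

```python
import math

# Stand-ins for the Salesforce instance and Data Storage.
OPPORTUNITIES = list(range(95))   # pretend record set
data_store = {}
dispatched = []                   # batches sent to the processing workflow

def fetch_page(offset, limit):
    """Return one batch plus a pagination token for the next batch,
    mimicking the connector's behavior when the size limit is smaller
    than the total number of matching records."""
    batch = OPPORTUNITIES[offset:offset + limit]
    next_token = offset + limit if offset + limit < len(OPPORTUNITIES) else None
    return batch, next_token

PER_PAGE = 20
total = len(OPPORTUNITIES)                        # step 1: total count
pages = range(math.ceil(total / PER_PAGE))        # step 2: page numbers

data_store["salesforce_offset"] = 0               # step 7: reset before each run
for _page in pages:                               # step 3: loop through pages
    offset = data_store["salesforce_offset"]      # step 6: GET the stored token
    batch, token = fetch_page(offset, PER_PAGE)   # step 4: grab one batch
    data_store["salesforce_offset"] = token or 0  # step 5: store the token
    dispatched.append(batch)  # ...here you would Call Workflow #2 with `batch`...
```

With 95 records and 20 per page, this dispatches five batches (the last holding 15 records) and leaves the offset reset to zero, ready for the next run.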
Workflow #2: Process Batches
Create Workflow #2 with a Callable Trigger. Once this is configured, we can select it from the dropdown in the Call Workflow step in Workflow #1.
The final step in Workflow #1 is to send the entire payload of the batch you've created to a secondary flow. To do this, select your Workflow #2 in the Call Workflow connector's dropdown and pass in the entire payload of your Salesforce call.
Now, in Workflow #2, use a Loop Collection connector to process each data element. Pass in the entire Trigger as the List, which at this point will contain the data payload you sent to the workflow in the previous step.
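Conceptually, Workflow #2 is just a function applied to the payload it was called with. A Python stand-in, where `enrich` is an assumed placeholder for your per-record logic:

```python
def enrich(record):
    # Placeholder for the real per-record enrichment logic
    # (look up the record in another system, merge fields, etc.).
    return {**record, "enriched": True}

def workflow_2(trigger_payload):
    # The Callable Trigger's payload is the batch sent by Workflow #1;
    # the Loop Collection connector corresponds to this for-loop.
    return [enrich(record) for record in trigger_payload]

batch = [{"id": 1}, {"id": 2}]
results = workflow_2(batch)
```

Because each Call Workflow invocation starts its own execution of Workflow #2, many copies of this "function" run in parallel, one per batch.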
Process your Data
At this point, you have successfully configured a set of workflows that will process your data asynchronously in multiple batches. Feel free to run the flow and experiment. In Workflow #2, you can open the Debug tab to see multiple instances of the same flow run at one time, in parallel.