CSV queue management
Overview
To illustrate how to use the CSV Editor to manage the rate at which you make requests to a service API, the following imagines a scenario in which you are receiving hundreds of applicants to a scheme via a webhook-triggered workflow.
You then want to upload these applicants to Google Sheets, but you will need to do so in accordance with Google's API request limits, which state that the rate limit is 100 requests per 100 seconds per user.
So we will be dividing this process into two workflows:
Workflow 1 will receive the applicants via webhook and create an account-level CSV to store the hundreds of applicants being received
Workflow 2 will be scheduled to pick up applicants from the account-level CSV and send appropriately-sized batches to Google Sheets and delete applicants from the CSV as they are processed
The overall result will be to build a 'first-in-first-out' (FIFO) queue whereby the first results to appear in Sheets will be the first applicants that were received.
Workflow 1 looks like this:
The steps involved here are:
Receive a webhook payload of several hundred applicants to a trial scheme
Parse this data into readable JSON with an Object Helper
Loop through each applicant, and then for each one:
Check if an account-level CSV has already been created (its id will be stored in the data storage connector)
If not, then create a new account-level CSV and store its id
Get the CSV id
Add the applicant as a new row to the CSV
The end result of this is that you will have an account-level CSV populated with several hundred applicants, ready to be accessed by other workflows which can use data storage to retrieve the CSV id.
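The logic of workflow 1 can be summarised with a short sketch. This is purely illustrative - the objects, methods and column names below are hypothetical stand-ins for the Data Storage and CSV Editor connector operations, not an actual implementation:

```python
# Illustrative sketch only - data_storage and csv_editor are hypothetical
# stand-ins for the Data Storage and CSV Editor connector operations, and the
# column names are just examples.

KEY_NAME = "csv-<workflow-url-suffix>"  # e.g. csv-9c3a4eaff4ce

def handle_webhook(applicants, data_storage, csv_editor):
    """For each applicant received via the webhook, make sure the account-level
    CSV exists, then append the applicant as a new row."""
    for applicant in applicants:
        # Check if an account-level CSV has already been created
        csv_id = data_storage.get_value(KEY_NAME, scope="account")
        if csv_id is None:
            # First run only: create the CSV and store its id for other workflows
            csv_id = csv_editor.create_csv(columns=["Emp ID", "First Name", "Last Name"])
            data_storage.set_value(KEY_NAME, csv_id, scope="account")
        # Add the applicant as a new row in the shared, account-level CSV
        csv_editor.add_row(csv_id, applicant)
```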
The next step is to process these applicants at a controlled rate - in this case 100 per hour (this will vary according to the API limits of the service you are working with).
The processing workflow (workflow 2) is then kicked off by a scheduled trigger:
The steps involved here are:
Run a scheduled trigger. In this scenario we are saying that every Monday we want to process the applications that have built up over the previous week, so we set the trigger to run every half hour on Mondays (you can adapt this to any scenario - you might want to run this workflow every hour of every day, depending on your needs)
Get the id of the account-level CSV from Data Storage
Get a set number of rows from the CSV. In this case we choose 50, to make extra sure that we don't attempt to process at a rate of more than 100 requests in 100 seconds
Check if any rows are found (i.e. there are applicants to be processed)
If so then loop through each row and:
Parse the row data to JSON
Add to a new row in Google Sheets
Finally, for each run of the workflow, we delete all the rows that have been processed
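The steps above can also be summarised as a sketch. Again this is purely illustrative - data_storage, csv_editor and sheets are hypothetical stand-ins for the Data Storage, CSV Editor and Google Sheets connector operations:

```python
# Illustrative sketch only - the objects and methods are hypothetical stand-ins
# for the Data Storage, CSV Editor and Google Sheets connector operations.

KEY_NAME = "csv-<workflow-url-suffix>"   # the same key that workflow 1 set
BATCH_SIZE = 50                          # rows processed per scheduled run

def process_batch(data_storage, csv_editor, sheets):
    """Runs on each scheduled trigger: drain up to BATCH_SIZE rows from the
    account-level CSV into Google Sheets, then delete the processed rows."""
    csv_id = data_storage.get_value(KEY_NAME, scope="account")
    rows = csv_editor.get_rows(csv_id, start_row=0, row_count=BATCH_SIZE)
    if not rows:
        return  # nothing left to process, so end this run
    for row in rows:
        # One Google Sheets request per row - well within 100 requests / 100 seconds
        sheets.create_row(spreadsheet_id="<your-sheet-id>", values=row)
    # Deleting the processed rows keeps the queue first-in-first-out:
    # the next run starts again from the top of the CSV
    csv_editor.delete_rows(csv_id, start_row=0, row_count=len(rows))
```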
Setting up the main workflow 1
1 - Parse JSON data from the webhook trigger
The JSON Parse step uses the $.steps.trigger.body jsonpath to get the applicant data from the webhook:
Exactly what jsonpath you would use to access this data will depend on how the data is formatted by the service sending the webhook data.
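As a purely hypothetical example of what this might look like (the real structure depends entirely on the sending service):

```python
# Hypothetical example of the trigger output - the real structure depends
# entirely on the service sending the webhook.
trigger_output = {
    "steps": {
        "trigger": {
            "body": [
                {"Emp ID": "101", "First Name": "Ada", "Last Name": "Lovelace"},
                {"Emp ID": "102", "First Name": "Alan", "Last Name": "Turing"},
            ]
        }
    }
}

# The $.steps.trigger.body jsonpath resolves to the list of applicants,
# which the loop connector can then iterate over:
applicants = trigger_output["steps"]["trigger"]["body"]
```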
2 - Loop through each applicant
We then loop through each applicant to begin the process of adding them as a row to the CSV:
3 - Get the CSV id
For each applicant we need to get the id of the account-level CSV. This is done by using the Data Storage 'Get Value' operation:
Note that this is set to Account Scope and that it uses the best practice of adding the final suffix of the workflow url to the name of the key - for my test workflow it is csv-9c3a4eaff4ce - to identify the workflow source of the key.
If you're wondering where the name of this key actually comes from - it is set with the Set CSV id step in the True branch of the first run of the workflow!
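As a rough illustration of this naming convention (the workflow URL below is just an assumed example), the key is simply a fixed prefix plus the final segment of the workflow's url:

```python
# Assumed example URL - the point is that the key embeds the workflow url's
# final segment, so keys set by different workflows never collide.
workflow_url = "https://app.tray.io/workflows/9c3a4eaff4ce"
key_name = "csv-" + workflow_url.rsplit("/", 1)[-1]  # -> "csv-9c3a4eaff4ce"
```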
4 - Check if CSV id is set
On the first run of the workflow, there will be no CSV id, so we need to use a boolean connector to check if it is present. Here we are checking for a null value:
5 - Create CSV on first workflow run
If it is true that the CSV id is not set, then we create the CSV with the relevant columns, setting the scope to Account:
6 - Set CSV id
We then have to use the $.steps.csv-1.id jsonpath to store the CSV id at account level using Data Storage:
Note again that we are using the best practice of adding the final suffix of the workflow url to the name of the key - for my test workflow it is csv-9c3a4eaff4ce - to identify the workflow source of the key.
7 - Get CSV id
Before adding the applicant to a new row, we have to get the id of the CSV:
8 - Add applicant to row
The final step is to add the applicant to a new row in the CSV:
9 - Enable the workflow
Make sure to enable the workflow!
Setting up a workflow to replicate a webhook
This manually-triggered workflow only involves one step - an HTTP client to send the dummy data to workflow 1:
Note the URL you enter here is the webhook url from workflow 1. This is found in workflow 1 by going to workflow settings in the top-left corner of the builder:
In the raw body you can enter the JSON applicants data to be sent to the main workflow (a sample json file is available to download - you can open it, copy and paste its contents into the raw body, and it will work with the main workflow setup)
You can then hit 'run workflow':
If you return to workflow 1 and open the debug panel you should see results coming in!
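If you prefer, you can replicate this step outside the builder altogether. The sketch below sends a hypothetical applicants payload to the workflow 1 webhook using Python's requests library - the URL and the field names are placeholders, so substitute your own webhook url and sample data:

```python
import requests

# Placeholder URL - copy the real webhook url from workflow 1's workflow settings
WEBHOOK_URL = "https://<your-workflow-webhook-url>"

# Hypothetical sample applicants; the sample json file mentioned above works too
applicants = [
    {"Emp ID": "101", "First Name": "Ada", "Last Name": "Lovelace"},
    {"Emp ID": "102", "First Name": "Alan", "Last Name": "Turing"},
]

response = requests.post(WEBHOOK_URL, json=applicants, timeout=30)
response.raise_for_status()
print("Webhook accepted with status", response.status_code)
```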
Setting up the rate-controlled processing workflow 2
1 - Set the scheduled trigger
Here we use the Cron Schedule operation for the scheduled trigger, saying we want to run every half hour on Mondays by using the */30 * * * 1 cron expression (see crontab guru for a handy tool to help you generate more advanced cron expressions if the basic operations in our scheduled trigger do not suffice).
Note on testing: When testing this setup, you could make use of a manual trigger, or you could just make sure the trigger is scheduled to run on the day you are working on - e.g. */30 * * * 3 if you are working on a Wednesday (the final digit in the cron formula uses numbers to set the day of the week - where Sun-Sat is 0-6)
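For reference, reading */30 * * * 1 from left to right, the five standard cron fields are minute, hour, day of month, month and day of week - so this expression means every 30 minutes, any hour, any day of the month, any month, on Mondays.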
2 - Get the CSV id
We then need to get the account-level CSV id:
3 - Get a batch of 50 rows
In order to control the rate of processing we have decided that every half hour we are going to process 50 rows from the CSV:
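As a rough check on this choice: each row results in one Google Sheets request, so even if an entire run of 50 rows completed within 100 seconds it would still only make 50 requests - comfortably inside the 100-requests-per-100-seconds limit.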
4 - Check there are rows to process
If there are no rows left to process, you can check for this and terminate the workflow:
5 - Loop through each row
We can then loop through each CSV row to be processed into Google Sheets:
6 - Add row in Google Sheets
You should already have created an empty Google Sheet, with the appropriate columns set, and retrieved it for this step:
Then the row can be added to Google Sheets using the Create Row operation.
Note that the jsonpaths pick up the value from the loop connector, and how they deal with spaces in the column name by using [' '], as in $.steps.loop-1.value.['Emp ID']
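In plain Python terms (with a purely hypothetical row value), this is the same reason a dictionary key containing a space has to be accessed with brackets and quotes rather than dot-style access:

```python
# Hypothetical shape of one iteration's loop value
loop_value = {"Emp ID": "101", "First Name": "Ada", "Last Name": "Lovelace"}

emp_id = loop_value["Emp ID"]          # brackets + quotes needed because of the space
first_name = loop_value["First Name"]
```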
7 - Delete processed rows from CSV
The final step is then to use the CSV Editor to delete the 50 rows that have just been processed:
8 - Enable the workflow
Don't forget to enable the workflow!
The end result
When your processing workflow is running, you should then see the applicants appearing in your Google Sheet: