
CSV queue management

Overview

To illustrate how the CSV Editor can be used to manage the rate at which you make requests to a service API, the following imagines a scenario in which you are receiving hundreds of applicants to a scheme via a webhook-triggered workflow.

You then want to upload these applicants to Google Sheets, but you will need to do so in accordance with its API request limits, which state that the rate limit is 100 requests per 100 seconds per user.

So we will be dividing this process into two workflows:

  1. Workflow 1 will receive the applicants via webhook and create an account-level CSV to store the hundreds of applicants being received

  2. Workflow 2 will be scheduled to pick up applicants from the account-level CSV and send appropriately-sized batches to Google Sheets and delete applicants from the CSV as they are processed

The overall result will be to build a 'first-in-first-out' (FIFO) queue whereby the first results to appear in Sheets will be the first applicants that were received.
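The mechanics of this queue are easy to see in plain code. Here is a minimal Python sketch of the pattern (illustrative only - the account-level CSV plays the role of the in-memory queue, and none of the function names below are Tray operations):

```python
from collections import deque

# Stand-in for the account-level CSV: workflow 1 appends new applicants to
# the back, workflow 2 consumes the oldest rows from the front.
queue = deque()

BATCH_SIZE = 50  # rows processed per scheduled run of workflow 2

def enqueue_applicant(applicant: dict) -> None:
    """Workflow 1: append each incoming applicant to the back of the queue."""
    queue.append(applicant)

def process_batch() -> list:
    """Workflow 2: take up to BATCH_SIZE of the oldest rows, in arrival order."""
    batch = [queue.popleft() for _ in range(min(BATCH_SIZE, len(queue)))]
    # Each row in `batch` would be written to Google Sheets here; because the
    # rows are removed as they are taken, the next run resumes where this one
    # left off.
    return batch
```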

Workflow 1 looks like this:

The steps involved here are:

  1. Receive a webhook payload of several hundred applicants to a trial scheme

  2. Parse this data into readable JSON with an Object Helper

  3. Loop through each applicant and, for each one:

  • Check if an account-level CSV has already been created (its id will be stored in the data storage connector)

  • If not, create a new account-level CSV and store its id

  • Get the CSV id

  • Add the applicant as a new row to the CSV

The end result of this is that you will have an account-level CSV populated with several hundred applicants, ready to be accessed by other workflows which can use data storage to retrieve the CSV id.
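For reference, the get-or-create logic at the heart of the loop can be sketched in a few lines of Python (the `storage` dict and `create_csv` function are illustrative stand-ins for the Data Storage and CSV Editor operations, and the column names are examples only):

```python
storage = {}  # stand-in for account-level Data Storage

def create_csv(columns):
    """Stand-in: pretend to create an account-level CSV and return its id."""
    return "csv-9c3a4eaff4ce"

def get_or_create_csv_id(key):
    csv_id = storage.get(key)        # check if an id is already stored
    if csv_id is None:               # first run: no CSV exists yet
        csv_id = create_csv(["First Name", "Last Name", "Emp ID"])
        storage[key] = csv_id        # store the id for subsequent applicants
    return csv_id                    # the row-append step uses this id
```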

The next step is to process these applicants at a controlled rate - in this example, 50 every half hour, i.e. 100 per hour (the exact rate will vary according to the API limits of the service you are working with).

The processing workflow (workflow 2) is then kicked off by a scheduled trigger:

The steps involved here are:

  1. Run a scheduled trigger. In this scenario, every Monday we want to process the applications that have built up over the previous week, so we set the trigger to run every half hour on Mondays (you can adapt this to any schedule - you might want to run this workflow every hour of every day, depending on your needs)

  2. Get the id of the account-level CSV from Data Storage

  3. Get a set number of rows from the CSV. In this case we choose 50, to make extra sure that we never attempt to process at a rate of more than 100 requests in 100 seconds (see the sanity check after this list)

  4. Check if any rows are found (i.e. there are applicants to be processed)

  5. If so then loop through each row and:

  • Parse the row data to JSON

  • Add to a new row in Google Sheets

  6. Finally, on each run of the workflow, we delete all the rows that have been processed
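As a quick sanity check on those numbers (assuming, conservatively, one Sheets request per row):

```python
batch_size = 50       # rows per scheduled run (step 3 above)
quota_requests = 100  # Google Sheets quota: 100 requests...
quota_window_s = 100  # ...per 100 seconds per user

# Worst case, a run writes its whole batch back-to-back, i.e. 50 requests
# inside a single 100-second window - half the quota. And since runs are 30
# minutes apart, two runs can never share the same 100-second window.
assert batch_size <= quota_requests
print(f"{batch_size} requests per run vs a quota of "
      f"{quota_requests} per {quota_window_s}s")
```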

Setting up the main workflow 1

1 - Parse JSON data from the webhook trigger

The JSON Parse step uses the $.steps.trigger.body jsonpath to get the applicant data from the webhook:

Exactly which jsonpath you use to access this data will depend on how it is formatted by the service sending the webhook.
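If you want to experiment with a jsonpath outside the builder, one option is the Python jsonpath-ng library (the payload shape below is only an example - in Tray the path is evaluated for you):

```python
# pip install jsonpath-ng
from jsonpath_ng import parse

run_data = {
    "steps": {
        "trigger": {
            "body": [
                {"Emp ID": "101", "First Name": "Ada"},
                {"Emp ID": "102", "First Name": "Grace"},
            ]
        }
    }
}

# $.steps.trigger.body selects the list of applicants for the loop step.
applicants = parse("$.steps.trigger.body").find(run_data)[0].value
print(len(applicants))  # 2
```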

2 - Loop through each applicant

We then loop through each applicant to begin the process of adding them as a row to the CSV:

3 - Get the CSV id

For each applicant we need to get the id of the account-level CSV. This is done using the Data Storage 'Get Value' operation:

Note that this is set to Account Scope and that it uses the best practice of adding the final suffix of the workflow url to the name of the key - for my test workflow it is csv-9c3a4eaff4ce - to identify the workflow source of the key.

If you're wondering where the name of this key actually comes from - it is set with the Set CSV id step in the True branch of the first run of the workflow!

4 - Check if CSV id is set

On the first run of the workflow, there will be no CSV id, so we need to use a boolean connector to check if it is present. Here we are checking for a null value:

5 - Create CSV on first workflow run

If it is true that the CSV id is not set, then we create the CSV with the relevant columns, setting the scope to Account:

6 - Set CSV id

We then have to use the $.steps.csv-1.id jsonpath to store the CSV id at account level using Data Storage:

Note again that we are using the best practice of adding the final suffix of the workflow url to the name of the key - for my test workflow it is csv-9c3a4eaff4ce - to identify the workflow source of the key.

7 - Get CSV id

Before adding the applicant to a new row, we have to get the id of the CSV:

8 - Add applicant to row

The final step is to add the applicant to a new row in the CSV:

9 - Enable the workflow

Make sure to enable the workflow!

Setting up a workflow to replicate a webhook

This manually-triggered workflow involves only one step - an HTTP client to send the dummy data to workflow 1:

Note the URL you enter here is the webhook url from workflow 1. This is found in workflow 1 by going to workflow settings in the top-left corner of the builder:

In the raw body you can enter the JSON applicant data to be sent to the main workflow (a sample JSON file is available to download; copy and paste its contents into the raw body and it will work with the main workflow setup).

You can then hit 'run workflow':

If you return to workflow 1 and open the debug panel you should see results coming in!
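Alternatively, rather than building this helper workflow, you could send the sample payload from any HTTP client. A minimal Python sketch (the URL and filename here are placeholders):

```python
import json

import requests

WEBHOOK_URL = "https://<your-workflow-1-webhook-url>"  # from workflow settings

with open("applicants.json") as f:  # the sample applicant data
    applicants = json.load(f)

response = requests.post(WEBHOOK_URL, json=applicants)
print(response.status_code)  # expect a 2xx acknowledgement from the webhook
```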

Setting up the rate-controlled processing workflow 2

1 - Set the scheduled trigger

Here we use the Cron Schedule operation for the scheduled trigger, saying we want to run every half hour on Mondays by using the */30 * * * 1 cron expression (see crontab guru for a handy tool to help you generate more advanced cron expressions if the basic operations in our scheduled trigger do not suffice).

Note on testing: When testing this setup, you could make use of a manual trigger, or you could simply make sure the trigger is scheduled to run on the day you are working - e.g. */30 * * * 3 if you are working on a Wednesday (the final field in the cron expression sets the day of the week as a number, where Sun-Sat is 0-6)
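If you would like to double-check a cron expression locally before entering it, one option is the Python croniter library (an optional aid, nothing to do with the workflow itself):

```python
# pip install croniter
from datetime import datetime
from croniter import croniter

runs = croniter("*/30 * * * 1", datetime(2024, 1, 1))  # start from a Monday
for _ in range(3):
    print(runs.get_next(datetime))
# 2024-01-01 00:30:00, 01:00:00, 01:30:00 - every half hour, Mondays only
```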

2 - Get the CSV id

We then need to get the account-level CSV id:

3 - Get a batch of 50 rows

In order to control the rate of processing, we have decided that every half hour we will process 50 rows from the CSV:

4 - Check there are rows to process

We check whether any rows were returned; if there are none left to process, the workflow terminates:

5 - Loop through each row

We can then loop through each CSV row to be processed into Google Sheets:

6 - Add row in Google Sheets

You should already have created an empty Google Sheet, with the appropriate columns set, and selected it for this step:

Then the row can be added to Google Sheets using the Create Row operation.

Note that the jsonpaths pick up the value from the loop connector, and that they handle spaces in the column name by using bracket notation, as in $.steps.loop-1.value.['Emp ID']
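As a rough plain-Python analogy (with illustrative data), the bracket lookup resolves just like a dictionary lookup, which is why the space in the column name is not a problem:

```python
# The equivalent of $.steps.loop-1.value.['Emp ID'] on a plain dict:
row = {"steps": {"loop-1": {"value": {"Emp ID": "101", "First Name": "Ada"}}}}
emp_id = row["steps"]["loop-1"]["value"]["Emp ID"]  # "Emp ID" needs brackets
print(emp_id)  # 101
```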

7 - Delete processed rows from CSV

The final step is then to use the CSV Editor to delete the 50 rows that have just been processed:

8 - Enable the workflow

Don't forget to enable the workflow!

The end result

When your processing workflow is running, you should see the applicants appearing in your Google Sheet: