Redshift

Amazon Redshift is a fully managed, petabyte-scale data warehouse service.

Overview

The Redshift connector allows you to query your Redshift database directly within tray.io, without having to worry about data pipelines or servers.

Connecting to your Redshift instance

To allow Tray.io to connect to your Redshift database, you'll need to either make your database publicly accessible or whitelist ALL of the following static IP addresses for access:

  • 52.40.200.248
  • 52.39.10.61
  • 52.26.59.155

Note: when whitelisting these IP addresses, be sure that your Redshift database is only accessible with a strong username and password combination.

To do this:

  1. Open up the "Security" section of your Redshift console in AWS. Open the "default" security group.
  2. Click "Add Connection Type".
  3. Add a new "Connection Type", using the "CIDR/IP" connection type and one of the above IP addresses as the IP address to authorize. Repeat this step for each of the three IP addresses.

Authentication

Assuming that you already have a Redshift cluster set up, and have configured the access settings mentioned above correctly, we can proceed with the authentication process.

When using the Redshift connector, the first thing you will need to do is go to your Tray.io account page and select the workflow you wish to work on. Once in the workflow builder itself, search for the Redshift connector in the connectors panel (on the left hand side) and drag it onto your workflow.

With the new Redshift connector step highlighted, in the properties panel on the right, click on the Authenticate tab and 'Add new authentication' (located under the 'Authentication' field).

This will result in a Tray.io authentication pop-up window. The first page will ask you to name your authentication, and state which type of authentication you wish to create ('Personal' or 'Organisational').

As you can see, the next page asks you for details about your Redshift cluster ('Host', 'Port', 'Database', and 'Schema') and your credentials ('User' and 'Password').

In order to get details about your Redshift cluster, log in to AWS and head over to the Clusters screen in your Amazon Redshift account. From the available list of clusters, select the one that you're trying to connect to.

Firstly, it asks for 'Host'. The Host name is the 'Endpoint' listed within the 'General information' section. You will need to edit the given URL before adding it to your Tray.io authentication modal.

For example, if your 'Endpoint' URL is redshift-cluster-1.chjrudn4xw74.us-east-2.redshift.amazonaws.com:5427/dev, you will need to remove the port and database name (i.e., everything from :5427 onwards) so that only the hostname remains as valid information for the 'Host' field.
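Using the example endpoint above, the edit would look like this:

Endpoint: redshift-cluster-1.chjrudn4xw74.us-east-2.redshift.amazonaws.com:5427/dev
Host: redshift-cluster-1.chjrudn4xw74.us-east-2.redshift.amazonaws.com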

Once this is done, copy and paste the edited URL into the 'Host' field.

To get the 'Port' and 'Database' fields, head to your AWS account. On the same page from which you just copied your 'Host' URL, select the 'Properties' tab. This should be below the 'General information' section.

Scroll down to the 'Database configurations' area. Here, you will find your cluster's 'Database name' and 'Port'. Copy and paste this information into your Tray.io authentication pop-up modal.

The 'Schema' field will depend on your individual use case, and the schemas you have available.

If you have created a variety of schemas (but are not sure which ones are available for your use), copy and paste the following query into the Redshift Query Editor: select * from pg_namespace;.

This query retrieves a full list of available schemas from the database, including the public schema.

You can then perform basic search queries to identify which tables are grouped under each schema, and enter the appropriate schema name in the 'Schema' field.
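For example, the following query lists the tables grouped under a given schema (a minimal sketch; 'public' is an illustrative schema name):

select table_schema, table_name from information_schema.tables where table_schema = 'public';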

Finally, please provide your user credentials.

PLEASE NOTE: These user credentials are specific to the Redshift cluster and not AWS services.

In this example, the primary user's credentials are displayed, i.e., the credentials that were created when the cluster was first configured.

If you want to use another individual's credentials for authentication, run the following query to retrieve their user details: select * from pg_user;.

Remember that users will need certain access rights in order to query & manipulate the database.
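For example, a primary user could grant another user the rights needed to query and insert into a table with a statement similar to this (a sketch; the table and user names are illustrative):

grant select, insert on table client_info to example_user;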

Once you have added these fields to your Tray.io authentication pop-up window, click on the 'Create authentication' button. Go back to the 'Authentication' field (within the workflow builder properties panel), and select the recently added authentication from the dropdown options now available.

Your connector authentication setup should now be complete.

Notes on using Redshift

Run SQL query

The Run SQL query operation communicates with the Redshift database by executing the raw SQL commands you specify. These commands can be used to perform actions like creating and manipulating database objects, running queries, loading tables, and modifying the data in tables.

A sample query to create a table in the Redshift database would be something similar to this: Create table purchase_details (order_id int, email varchar, total_price float, currency varchar)
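Once the table exists, the same operation can be used to load and modify its data, for example (a sketch; the values shown are illustrative):

insert into purchase_details values (1001, 'jane@example.com', 25.50, 'USD');
update purchase_details set total_price = 27.00 where order_id = 1001;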

Amazon Redshift is based on PostgreSQL. Amazon Redshift and PostgreSQL have several very important differences that you must be aware of while working with the Redshift database. For information about how Amazon Redshift SQL differs from PostgreSQL, see the Amazon Redshift and PostgreSQL document.

For ease of working with Redshift's PostgreSQL-style queries, refer to Redshift's SQL commands document.

Available Operations

The examples below show one or two of the available connector operations in use.

Please see the Full Operations Reference at the end of this page for details on all available operations for this connector.

Example usage

There are various ways in which you can upload (insert/update) data into Redshift tables based on your requirements and scenarios. Below are a few example workflows that will help you gain a better understanding of these ways:

Please note that the demos which follow do not represent a 'fixed' way of working with Tray.io. They only show possible ways of working with Tray.io and the Redshift connector. Once you've finished working through this example, please see our Introduction to working with data and jsonpaths page and Data Guide for more details.

In these examples, we assume that you have received some customer data via the Webhook trigger and now need to process it into Redshift. However, you may have received this data in any number of other ways, such as from the Salesforce or SurveyMonkey connector, through the CSV Editor, and so on.

Insert records

Below is an example of how you could potentially use the Redshift connector to insert new records into the Redshift database and read all the records from said database.

Upon receiving the records, the workflow checks whether the table in which we want to add these records already exists in the Redshift database. If it does not exist, then the workflow creates a new table and then adds the received records to that table. Once the records are added, the workflow continues to read the records from the selected Redshift table.

This example demonstrates the following operations of the Redshift connector:

  • Run SQL query: Executes a raw specified SQL query on the chosen database.
  • Insert new rows: Inserts one or more rows into your chosen Redshift database.
  • Find rows: Finds and reads the rows from your chosen database based on the specified criteria.

The steps will be as follows:

  1. Pull the client records from the source (a webhook trigger in this example) and create the table (if it doesn't exist) in the Redshift database.
  2. Push the records one by one into the Redshift database using the Loop connector.
  3. Read the records from the Redshift database.

The final outcome should look like this:

1 - Pull records & create table

Select the Webhook trigger from the trigger options available, and set the operation to 'When webhook is received, auto respond with HTTP 200'. Be sure to click the 'Enable' button before continuing. This makes the workflow ready to receive incoming calls.

The records received by the Webhook are in JSON format, as follows:

[
  {
    "client_id": 82,
    "name": "Renaldo Castaignet",
    "email": "rcastaignet0@slashdot.org",
    "contact_no": "578-493-4455"
  },
  {
    "client_id": 79,
    "name": "Arlen Grumbridge",
    "email": "agrumbridge1@instagram.com",
    "contact_no": "292-491-7708"
  },
  {
    "client_id": 57,
    "name": "Veronika Fitter",
    "email": "vfitter2@google.com.au",
    "contact_no": "457-411-5533"
  },
  {
    "client_id": 68,
    "name": "Tirrell Gladbach",
    "email": "tgladbach3@discuz.net",
    "contact_no": "714-514-0108"
  },
  {
    "client_id": 90,
    "name": "Emmalynne Gibbard",
    "email": "egibbard4@accuweather.com",
    "contact_no": "625-905-3040"
  }
]

Remember that the format in which you receive data will vary depending on where it is coming from - pulling records from Salesforce, Webform, etc., will have different results / views.

Next, add the Redshift connector and set the operation to 'Run SQL query'. Add the following query in the 'SQL query' field.

Create table if not exists client_info (client_id int, name varchar, email varchar, contact_no varchar)

The above query checks whether the specified table (in this case, client_info) exists in the Redshift database. If it exists, the workflow proceeds. If it doesn't, the table is created based on the column definitions provided in the above SQL query.

2 - Add records

Add a Loop connector with the 'List' field set to $.steps.trigger.body. You can use the connector-snake to generate this automatically.

JSONPATHS: For more information on what jsonpaths are and how to use jsonpaths with Tray.io, please see our Intro page and Data Guide for more details.
CONNECTOR-SNAKE: The simplest and easiest way to generate your jsonpaths is to use our feature called the Connector-snake. Please see the main page for more details.

Next, add a Redshift connector inside the Loop connector and set the operation to 'Insert new rows'.

In the 'Rows to insert' field, add all the property names using the 'Add property to Row' button for the selected table.

Using the connector-snake once more, find the jsonpath for each field from the previous step.
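In effect, each loop iteration performs the equivalent of an SQL insert such as this (a sketch based on the first record of the sample data above):

insert into client_info (client_id, name, email, contact_no) values (82, 'Renaldo Castaignet', 'rcastaignet0@slashdot.org', '578-493-4455');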

3 - Read records

Add another Redshift connector and set the operation as 'Find rows'.

Under the 'Fields' section, select the field names (from the drop-down options) that you would like to retrieve from the chosen table.
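In effect, this step runs a query equivalent to the following (a sketch; it assumes all four columns of the client_info table are selected and no filter criteria are set):

select client_id, name, email, contact_no from client_info;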

Adding a record to a particular table

This workflow demonstrates how to insert records into multiple Redshift tables based on the received data.

The Redshift database considered in this example contains three tables, namely 'account', 'prospect', and 'sequenceState'. Depending on the tableName specified in the received data, the workflow will add the data to the appropriate table.

The steps will be as follows:

  1. Pull the records from the source (a webhook trigger in this example) and extract the table name from the value of the tableName attribute.
  2. Navigate the course of the workflow to the appropriate branch using the Branch connector based on the value (table name) received from the previous step.
  3. Insert the new records into the appropriate Redshift table.

The final outcome should look like this:

1 - Pull records and extract the table name

Select the Webhook trigger from the trigger options available, and set the operation to 'When webhook is received, auto respond with HTTP 200'.

Be sure to click the 'Enable' button before continuing. This makes the workflow ready to receive incoming calls.

The records received by the Webhook are in JSON format, as follows:

{
  "data": {
    "id": "20",
    "first_name": "Blake",
    "last_name": "Mosley",
    "phone": "+1 (956) 577-2515"
  },
  "meta": {
    "tableName": "create.prospect"
  }
}

Remember that the format in which you receive data will vary depending on where it is coming from - pulling records from Salesforce, Outreach, etc., will have different results/views.

Next, add the Text Helper connector and set the operation to 'Get text after'.

The Text Helper connector using the 'Get text after' operation returns a substring based on the inputs provided.

As you can see in the image below, the 'String' field is given a jsonpath using the connector-snake ($.steps.trigger.body.meta.tableName), and the 'Pattern' field is given the value create.. Based on the 'Match number' provided (in this case, 2), the operation returns the substring that comes after the point where the pattern was found.

To understand this operation, consider the example JSON provided above. Based on the provided jsonpath and values for each field, the retrieved values are:

  • String: create.prospect
  • Pattern: create.
  • Match number: 2

So when the values of the 'String' and 'Pattern' fields are compared, the match found is create.. The operation then provides the substring that comes after the match. In this case, as you can see in the image below, the substring returned by the operation is prospect.

Based on this step's output, the workflow will navigate to the branch with the Label as 'Prospect'.

For more clarification on the pathways you have available, open the Debug panel to view your step's Input and Output.

2 - Navigate and add records

Add a Branch connector with the 'Value To Test' field set to $.steps.text-helpers-1.result. You can use the connector-snake to generate this automatically.

Set the 'Value' and 'Label' pair for each branch with the Redshift table names, as shown in the image below.

The Branch connector will navigate the workflow execution based on the input received from the previous step. The possible values you could receive from the previous step are 'prospect', 'account', and 'sequenceState', thus the three branches.

The fourth branch is an auto-generated 'Default' branch, which will terminate the workflow if none of the three values mentioned above is received as an input.

For the workflow to terminate, add the Terminate connector under this fourth branch, i.e., the 'DEFAULT' branch.

Next, add the Redshift connector under each branch except the 'DEFAULT' branch, and set the operation to 'Insert new rows'.

As you can see, the 'Table' and the 'Rows to insert' fields are mandatory.

Select the appropriate table name from the drop-down options based on the branch your Redshift connector is on. As you can see we have selected the table name for the first branch as prospect.

Using a connector-snake find the jsonpath for the 'Row' field from the Trigger step. It should appear similar to this: $.steps.trigger.body.data.

Similarly, add values for the 'Table' and the 'Rows to insert' fields for the Redshift connectors under the 'ACCOUNT' and 'SEQUENCE STATE' branches as well.

The Insert new rows operation will insert the records received through the Webhook trigger in the selected Redshift table.
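For the sample data above, the step under the 'PROSPECT' branch would perform the equivalent of the following insert (a sketch; it assumes the prospect table's columns match the attribute names in the received data):

insert into prospect (id, first_name, last_name, phone) values ('20', 'Blake', 'Mosley', '+1 (956) 577-2515');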

Add or update record

The below example demonstrates how you could potentially use the Redshift connector to insert and update records in the Redshift database.

In this workflow, the record to be uploaded to the Redshift database is received through the Webhook trigger. The received record includes the event attribute, which specifies whether the record is to be created or updated.

The steps will be as follows:

  1. Pull the records from the source (a webhook trigger in this example) and navigate the course of the workflow to the appropriate branch using the Branch connector based on the value (event) received from the previous step.
  2. Insert or update the record to the Redshift database.

The final outcome should look like this:

1 - Pull records and navigate

Select the Webhook trigger from the trigger options available, and set the operation to 'When webhook is received, auto respond with HTTP 200'.

Be sure to click the 'Enable' button before continuing. This makes the workflow ready to receive incoming calls.

The records received by the Webhook are in JSON format, as follows:

{
  "data": {
    "id": "20",
    "first_name": "Blake",
    "last_name": "Mosley",
    "phone": "+1 (956) 577-2515"
  },
  "meta": {
    "event": "create"
  }
}

Remember that the format in which you receive data will vary depending on where it is coming from - pulling records from Salesforce, Outreach, etc., will have different results/views.

Next, add the Branch connector with the 'Value To Test' field set to $.steps.trigger.body.meta.event. You can use the connector-snake to generate this automatically.

Set the 'Value' and 'Label' pair for the first two branches as 'create' and 'update' as shown in the image below.

The Branch connector will navigate the workflow execution based on the value of the event attribute received in the JSON data from the trigger. The possible values you could receive are 'create' and 'update', thus the first two branches.

The third branch is an auto-generated 'Default' branch, which will terminate the workflow if neither of the two values mentioned above is received as an input.

For the workflow to terminate, add the Terminate connector under this third branch, i.e., the 'DEFAULT' branch.

2 - Insert or update the record

Now, add the Redshift connector under each branch except the 'DEFAULT' branch.

As you can see in the image below, the first Redshift connector under the 'CREATE' branch will insert a new record into the Redshift database. The second Redshift connector under the 'UPDATE' branch will update the existing record in the Redshift database.

The two Redshift connectors for the insert and update operations can be configured as follows:

Redshift connector to insert a record:

To insert the record, set the operation to 'Insert new rows'.

As you can see, the 'Table' and the 'Rows to insert' fields are mandatory.

Select the table name of your choice from the available drop-down options. As you can see, in this example, we have selected the table name as customer_details.

Using a connector-snake find the jsonpath for the 'Row' field from the Trigger step. It should appear similar to this: $.steps.trigger.body.data.

Now, based on the selected table and the jsonpath for the received data, the workflow will insert the received record into the selected Redshift table.

Redshift connector to update a record:

To update the record, set the operation to 'Update rows'.

Set the values for the 'Table' and the 'Rows to insert' fields similarly to how we did above.

Before updating the record, the Update rows operation matches the existing 'id' from the selected Redshift table with the ID of the data received from the trigger: $.steps.trigger.body.data.id.

Thus, as you can see in the image below, the condition is set to specify the id of the existing record that needs to be updated.

Now, based on the specified table, jsonpath, and the condition, the workflow will locate and update the record if it already exists.
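For the sample data above, this step would perform the equivalent of the following update (a sketch; it assumes the customer_details table selected earlier):

update customer_details set first_name = 'Blake', last_name = 'Mosley', phone = '+1 (956) 577-2515' where id = '20';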

Advanced topics

Processing dynamic data

Sometimes, depending on the data source, your workflow may receive dynamic data.

Dynamic data is a piece of information that may change regularly, i.e., the number of attributes within the data received may vary for each execution of your workflow.

To understand the concept of dynamic data, let's assume an example where you wish to insert some data into the Redshift database. The data that you have received for the first execution of your workflow is in this format:

{
  "data": {
    "id": 129273,
    "attributes": {
      "updatedAt": "2021-01-29T07:52:27.000Z",
      "engagedAt": "2021-01-29T07:52:22.000Z"
    }
  },
  "meta": {
    "deliveredAt": "2021-01-29T07:52:28.094+00:00"
  }
}

For the second execution, your workflow may receive the data with more attributes like this:

{
  "data": {
    "id": 129273,
    "attributes": {
      "updatedAt": "2021-01-29T07:52:27.000Z",
      "engagedAt": "2021-01-29T07:52:22.000Z",
      "addressCity": "london",
      "addressCountry": "uk",
      "addressZip": "JU59KL"
    }
  },
  "meta": {
    "deliveredAt": "2021-01-29T07:52:28.094+00:00"
  }
}
PLEASE NOTE: It is essential to understand that the attributes received in the JSON data correspond to the column names in your Redshift database table.

In such a situation, where it is difficult to gauge which attributes you may receive in your data for each execution, we cannot map each attribute to its respective column name in order to upload the data to the Redshift table. That is, something like the mapping shown in the image below is not possible:

To understand the mapping of each attribute with the column name, please refer to the Insert records workflow explained above.

So as a solution, you have to upload the data in bulk, i.e., in one go. This is possible only if the received JSON data has a flat structure.

The best way to transform a nested JSON structure into a flat structure is to use a script.

That is what the following workflow demonstrates: how you could manage dynamic data using a script that flattens the nested JSON and uploads this data into your Redshift table.

In this example, we show data being received via the Webhook trigger - i.e., a scenario whereby you have configured an external service to automatically send new record data to your Tray.io workflow.

In practice, you can receive data in any structure, and in other ways, such as from any pre-built Tray.io service trigger or from a previous workflow step which has pulled data from another source (e.g., Salesforce 'Find Records', CSV Reader, etc.)

  1. The Webhook trigger with operation as 'Auto respond with HTTP 200' listens for the incoming data.
  2. The Process dynamic data step is a Script connector step which flattens the received JSON data in order to handle the incoming dynamic data (explained in detail below).
  3. The Insert new rows step inserts new rows into the selected Redshift table, based on the flattened JSON data received from the previous step, $.steps.script-1.result (explained in detail below).

Process dynamic data step explained

The Script connector, using the 'Execute Script' operation, flattens the received JSON data in order to handle the incoming dynamic data and upload it to the database.

The operation accepts the jsonpath for the data received from the trigger as an input value: $.steps.trigger.data.

Using the following script the Script connector flattens the received JSON data:

// You can reference the input variables using input.NAME
// Parsed JSON files could be referenced as fileInput
exports.step = function (input, fileInput) {
  // Each entry maps a nested 'attributes.*' key to its flat Redshift column name.
  const mappings = [
    {"transformer": [
      {"from": "id", "to": "id"},
      {"from": "attributes.addedAt", "to": "added_at"},
      {"from": "attributes.addressCity", "to": "address_city"},
      {"from": "attributes.addressCountry", "to": "address_country"},
      {"from": "attributes.addressState", "to": "address_state"},
      {"from": "attributes.addressStreet", "to": "address_street"},
      {"from": "attributes.addressStreet2", "to": "address_street2"},
      {"from": "attributes.addressZip", "to": "address_zip"},
      {"from": "attributes.angelListUrl", "to": "angel_list_url"},
      {"from": "attributes.availableAt", "to": "available_at"},
      {"from": "attributes.callOptedOut", "to": "call_opted_out"},
      {"from": "attributes.callsOptStatus", "to": "calls_opt_status"},
      {"from": "attributes.callsOptedAt", "to": "calls_opted_at"},
      {"from": "attributes.campaignName", "to": "campaign_name"},
      {"from": "attributes.clickCount", "to": "click_count"},
      {"from": "attributes.company", "to": "company"},
      {"from": "attributes.companyFollowers", "to": "company_followers"},
      {"from": "attributes.companyFoundedAt", "to": "company_founded_at"},
      {"from": "attributes.companyIndustry", "to": "company_industry"},
      {"from": "attributes.companyLinkedIn", "to": "company_linkedin"},
      {"from": "attributes.companyLinkedInEmployees", "to": "company_linkedin_employees"},
      {"from": "attributes.companyLocality", "to": "company_locality"},
      {"from": "attributes.companyNatural", "to": "company_natural"},
      {"from": "attributes.companySize", "to": "company_size"},
      {"from": "attributes.companyType", "to": "company_type"},
      {"from": "attributes.contactHistogram", "to": "contact_histogram"},
      {"from": "attributes.createdAt", "to": "created_at"},
      {"from": "attributes.dateOfBirth", "to": "date_of_birth"},
      {"from": "attributes.degree", "to": "degree"},
      {"from": "attributes.emailOptedOut", "to": "email_opted_out"},
      {"from": "attributes.emails", "to": "emails"},
      {"from": "attributes.emailsOptStatus", "to": "emails_opt_status"},
      {"from": "attributes.emailsOptedAt", "to": "emails_opted_at"},
      {"from": "attributes.engagedAt", "to": "engaged_at"},
      {"from": "attributes.engagedScore", "to": "engaged_score"},
      {"from": "attributes.eventName", "to": "event_name"},
      {"from": "attributes.externalId", "to": "external_id"},
      {"from": "attributes.externalOwner", "to": "external_owner"},
      {"from": "attributes.externalSource", "to": "external_source"},
      {"from": "attributes.facebookUrl", "to": "facebook_url"},
      {"from": "attributes.firstName", "to": "first_name"},
      {"from": "attributes.gender", "to": "gender"},
      {"from": "attributes.githubUrl", "to": "github_url"},
      {"from": "attributes.githubUsername", "to": "github_username"},
      {"from": "attributes.googlePlusUrl", "to": "google_plus_url"},
      {"from": "attributes.graduationDate", "to": "graduation_date"},
      {"from": "attributes.homePhones", "to": "home_phones"},
      {"from": "attributes.jobStartDate", "to": "job_start_date"},
      {"from": "attributes.lastName", "to": "last_name"},
      {"from": "attributes.linkedInConnections", "to": "linkedin_connections"},
      {"from": "attributes.linkedInId", "to": "linkedin_id"},
      {"from": "attributes.linkedInSlug", "to": "linkedin_slug"},
      {"from": "attributes.linkedInUrl", "to": "linkedin_url"},
      {"from": "attributes.middleName", "to": "middle_name"},
      {"from": "attributes.mobilePhones", "to": "mobile_phones"},
      {"from": "attributes.name", "to": "name"},
      {"from": "attributes.nickname", "to": "nickname"},
      {"from": "attributes.occupation", "to": "occupation"},
      {"from": "attributes.openCount", "to": "open_count"},
      {"from": "attributes.optedOut", "to": "opted_out"},
      {"from": "attributes.optedOutAt", "to": "opted_out_at"},
      {"from": "attributes.otherPhones", "to": "other_phones"},
      {"from": "attributes.updatedAt", "to": "updated_at"},
      {"from": "attributes.voipPhones", "to": "voip_phones"},
      {"from": "attributes.websiteUrl1", "to": "website_url1"},
      {"from": "attributes.websiteUrl2", "to": "website_url2"},
      {"from": "attributes.websiteUrl3", "to": "website_url3"},
      {"from": "attributes.workPhones", "to": "work_phones"},
      {"from": "relationships.account.data.id", "to": "rel_account_id"},
      {"from": "relationships.creator.data.id", "to": "creator_id"},
      {"from": "relationships.owner.data.id", "to": "owner_id"},
      {"from": "relationships.stage.data.id", "to": "stage_id"},
      {"from": "relationships.updater.data.id", "to": "updater_id"}
    ]}
  ];
  const transformer = mappings.filter(m => Object.keys(m)[0] === "transformer")[0]["transformer"];
  const output = {};
  output.id = input.data.id;
  // Copy each received attribute across under its mapped column name.
  Object.keys(input.data.attributes).forEach(k => {
    const mapping = transformer.filter(m => m.from === `attributes.${k}`)[0];
    // Skip any attribute that has no column mapping rather than throwing an error.
    if (mapping) {
      output[mapping.to] = input.data.attributes[k];
    }
  });
  return output;
};

The above script transforms the nested structure (i.e., attributes.[column_name]) into a flat structure (i.e., just [column_name]).

Doing so helps the Redshift connector map the flat attributes from the JSON to the column names in the Redshift table.

Once the script is executed, the flattened JSON data will look similar to this:
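For the second sample payload shown earlier, the flattened output would be (a sketch derived from the mappings in the script):

{
  "id": 129273,
  "updated_at": "2021-01-29T07:52:27.000Z",
  "engaged_at": "2021-01-29T07:52:22.000Z",
  "address_city": "london",
  "address_country": "uk",
  "address_zip": "JU59KL"
}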

The above-mentioned script can be altered based on your requirements and data source.

The Insert new rows step explained

The Insert new rows step accepts $.steps.script-1.result as the input from the previous step for the 'Rows to insert' field of the 'Insert new rows' operation.

All Operations

Latest version:

1.3