Merge DataSets

Merge one DataSet into another. The source DataSet is called the Head DataSet, and the destination is called the Base DataSet. The merged DataSet will have the same columns as the Base, and a set of rows representing a merge of the Head into the Base. See Usage section for more details on how the merge is performed.

Inputs

Key

Value

baseDataSetUrl

Required

The URL of the DataSet file to be used as the merge destination, otherwise known as the Base DataSet. The Head DataSet (headDataSetUrl) will be merged into this DataSet.

baseKeyColumn

Required

The name of the file column in the Base DataSet (baseDataSetUrl) that should be used as the key column for merging (see Usage section for details).

headDataSetUrl

Required

The URL of the DataSet file to be used as the merge source, otherwise known as the Head Data Set. This DataSet will be merged into the Base DataSet (baseDataSetUrl).

headKeyColumn

Required

The name of the file column in the Base DataSet (baseDataSetUrl) that should be used as the key column for merging (see Usage section for details).

columnMapping

Required

A specification of how Base DataSet columns map to Head DataSet columns, using the following syntax: <head_column~base_column|head_column~base_column|...>. For instance, suppose the columnMapping value is SKU~SKUNumber|Description~Description|PurchaseCost~Cost. This means that:

  • the Head DataSet's SKU column maps to the Base DataSet's SKUNumber column;
  • the Head's Description column maps to the Base's Description column;
  • and the Head's PurchaseCost column maps to the Base's Cost column.

Head DataSet columns that are not included in the mapping will be ignored. Base DataSet columns that are not included in the mapping will be left as is for existing columns, or set to null for new columns.

Outputs

Key

Value

dataSetUrl

The URL of the DataSet JSON file new merged DataSet.

storageFileName

The name of the DataSet JSON file containing the new merged DataSet.

Usage

This action is useful when you have two sets of data that may have some overlap and need to be merged into a single set of data. This can occur in any number of circumstances, but before we go into a use case, let us first go over the mechanics of how the merge works.

This action performs a DataSet merge according to the following rules:

  • The resulting merged DataSet will contain the same columns as the Base DataSet.
  • A Head DataSet row is said to match a Base DataSet row when they have identical values in the key column.
  • All Base DataSet rows with no matches in the Head DataSet are included in the merged DataSet as is.
  • All Head DataSet rows with no matches in the Base DataSet will have their mapped columns included in the merged DataSet. Unmapped columns will be set to null.
  • For each Base DataSet row that DOES have a match in the Head DataSet, the Head version of each column will be included for mapped columns, and the Base version will be included for unmapped columns.

Let us illustrate these rules with an example. Below is a Base DataSet, a Head DataSet, and the resulting merged DataSet. "Contact" is used as the Key column in both DataSets, and columnMapping is set to Phone Number~Phone|Email~Email.

Base DataSet

Customer

Contact

Phone

Email

Acme

Jane Jones

123-555-1234

[email protected]

Acme

John Quincy

123-555-5678

[email protected]

Beverly Healthcare

Kate Smith

456-555-9753

[email protected]

Head DataSet

Contact

Phone Number

Email

Address

Jane Jones

456-555-4321

[email protected]

123 Main Street

Kate Smith

456-555-3333

[email protected]

11 10th Avenue

Greg Gold

123-555-1010

[email protected]

33 Spring Lane

Merged DataSet

Customer

Contact

Phone

Email

Acme

Jane Jones

456-555-4321

[email protected]

Acme

John Quincy

123-555-5678

[email protected]

Beverly Healthcare

Kate Smith

456-555-3333

[email protected]

null

Greg Gold

123-555-1010

[email protected]

Note the following:

  • The "Address" column from the Head DataSet is not used, because it was omitted from the mapping.
  • The "Jane Jones" and "Kate Smith" rows have the updated "Phone" and "Email" columns from the Head DataSet, but retain the "Customer" column value from the Base DataSet since that column was not mapped.
  • The "John Quincy" row is identical to the one in the Base DataSet, because it did not have a Head DataSet match.
  • The "Greg Gold" row has the "Phone" and "Email" from the Head DataSet, and null for the "Customer" column, because that column was not mapped.

One use case for this action is to update an external database. Suppose you keep a customer database in GoFormz as well as in Salesforce. The Salesforce database serves as your system of record, but the GoFormz database sometimes gets new information from forms — and while the GoFormz database is not as complete, the data it does have is more up-to-date. In this scenario, we would want to periodically update the Salesforce database from GoFormz. The Merge DataSets action can be used to do this.

10391039

In this example, we use the Salesforce: Query action to get the customer list from Salesforce and the Get Data Source action to get the GoFormz customer list, merge the DataSets using the Salesforce list as the Base and the GoFormz list as the Head. The resulting merged DataSet, representing the updated customer list, is then uploaded to Google Drive, from where it can be used to update the Salesforce data.