4.1.5. JSON file format

Tale of Data can read JSON files as input (whatever their storage system or location).

By default, JSON files are flattened to obtain tabular datasets. Each JSON path containing a (text, number or Boolean) literal value will become a column in the tabular dataset.

Note

JSON path

Lets you select and extract data from a JSON document. JSON paths start from the node or root item and end at the required item in the JSON structure, from which the data is extracted. Items in a JSON path are separated by /.

Tale of Data offers three keys for customizing how JSON files are converted to table format:

  • Key 1: Included JSON paths (by default, all paths are included).

  • Key 2: Paths to be kept in JSON.

  • Key 3: Excluded JSON paths.

Note

Each key must contain a list of JSON paths separated by line breaks, commas or semi-colons.

In the example of the JSON file below:

 1{
 2"responseStatus": "OK",
 3"responseBody": [
 4   {
 5      "name": "John Smith",
 6      "activities": [
 7      "Football",
 8      "Windsurfing"
 9      ]
10   },
11   {
12      "name": "Kathy Johnson",
13      "activities": [
14      "Tennis",
15      "Karate"
16      ]
17   }
18] }

If no key is entered, the connector will produce the following table:

responseBody_activities

responseBody_name

responseStatus

Football

John Smith

OK

Windsurfing

John Smith

OK

Tennis

Kathy Johnson

OK

Karate

Kathy Johnson

OK

If, on the other hand, you are interested only in data below the responseBody path, you need only specify this in the source node configuration, in the JSON paths included field. responseBody will then become a root path from which JSON data can be extracted:

image88

image89

Note

Paths can either be entered in the field or selected from a table by clicking the little button to the right of the field. image90

Note

If input is manual, you can use the * special character. For example, the Persons/Person/T* JSON path will retrieve all JSON items with a path that starts with Persons, then Person followed by any field starting with ‘T’.

Tip

Multiple paths can be selected.

By selecting the responseBody path in the Included JSON paths field, the following result is obtained:

responseBody_activities

responseBody_name

Football

John Smith

Windsurfing

John Smith

Tennis

Kathy Johnson

Karate

Kathy Johnson

If you want one row per person instead of one row per activity, select the responseBody/activities path in the Paths to be kept in JSON field.

image91

With this setting, the responseBody_activities column will contain the JSON table of activities for a given person:

responseBody_activities

responseBody_name

[“Football”,”Windsurfing”]

John Smith

[“Tennis”,”Karate”]

Kathy Johnson

Finally, the 3rd key (Excluded JSON paths) lets you exclude from the results all JSON items in the paths you have specified.

By adding the responseBody path to the list of paths to be excluded, e.g.

image92

All JSON data in any path descending from responseBody will be ignored. The result will therefore be as shown below:

responseStatus

OK