4.1.5. JSON file format
Tale of Data can read JSON files as input (whatever their storage system or location).
By default, JSON files are flattened to obtain tabular datasets. Each JSON path containing a (text, number or Boolean) literal value will become a column in the tabular dataset.
Note
- JSON path
Lets you select and extract data from a JSON document. JSON paths start from the node or root item and end at the required item in the JSON structure, from which the data is extracted. Items in a JSON path are separated by
/
.
Tale of Data offers three keys for customizing how JSON files are converted to table format:
Key 1: Included JSON paths (by default, all paths are included).
Key 2: Paths to be kept in JSON.
Key 3: Excluded JSON paths.
Note
Each key must contain a list of JSON paths separated by line breaks, commas or semi-colons.
In the example of the JSON file below:
1{
2"responseStatus": "OK",
3"responseBody": [
4 {
5 "name": "John Smith",
6 "activities": [
7 "Football",
8 "Windsurfing"
9 ]
10 },
11 {
12 "name": "Kathy Johnson",
13 "activities": [
14 "Tennis",
15 "Karate"
16 ]
17 }
18] }
If no key is entered, the connector will produce the following table:
responseBody_activities |
responseBody_name |
responseStatus |
Football |
John Smith |
OK |
Windsurfing |
John Smith |
OK |
Tennis |
Kathy Johnson |
OK |
Karate |
Kathy Johnson |
OK |
If, on the other hand, you are interested only in data below the responseBody
path, you need only specify this in the source node configuration, in the JSON paths included field. responseBody
will then become a root path from which JSON data can be extracted:
Note
Paths can either be entered in the field or selected from a table by clicking the little button to the right of the field.
Note
If input is manual, you can use the *
special character. For example, the Persons/Person/T*
JSON path will retrieve all JSON items with a path that starts with Persons, then Person followed by any field starting with ‘T’.
Tip
Multiple paths can be selected.
By selecting the responseBody path in the Included JSON paths field, the following result is obtained:
responseBody_activities |
responseBody_name |
Football |
John Smith |
Windsurfing |
John Smith |
Tennis |
Kathy Johnson |
Karate |
Kathy Johnson |
If you want one row per person instead of one row per activity, select the responseBody/activities path in the Paths to be kept in JSON field.
With this setting, the responseBody_activities column will contain the JSON table of activities for a given person:
responseBody_activities |
responseBody_name |
[“Football”,”Windsurfing”] |
John Smith |
[“Tennis”,”Karate”] |
Kathy Johnson |
Finally, the 3rd key (Excluded JSON paths) lets you exclude from the results all JSON items in the paths you have specified.
By adding the responseBody path to the list of paths to be excluded, e.g.
All JSON data in any path descending from responseBody will be ignored. The result will therefore be as shown below:
responseStatus |
OK |