Formats Library
Generic Read and Write Functions
The Formats library contains read and write functions for serializing and deserializing data in other formats, e.g. XML or CSV. The XML and CSV plugins are built into the standard DataSonnet distributions, while other formats can be supported by implementing the DataFormatPlugin interface.
These functions can be used if the data contains embedded data in other formats, for example:
{ "embeddedXMLValue": "<test>Hello</test>" }
If the payload or a variable is itself in a format other than JSON, DataSonnet detects that automatically, so calling read on it will cause an error. The input and output formats of the payload and variables are controlled by the headers.
read(string input, string inputMimeType, object params={})
Reads input data in the specified mime type and accepts additional parameters which override default plugin behavior. The list and format of parameters is specific to a plugin implementation.
Example
ds.read(payload.embeddedXMLValue, "application/xml")
Example
local params = { "NamespaceDeclarations": { "datasonnet": "http://www.modusbox.com" }, "NamespaceSeparator": "%", "TextValueKey": "__text", "AttributeCharacter": "*" }; ds.read(payload.embeddedXMLValue, "application/xml", params)
write(string output, string outputMimeType, object params={})
Writes the data in the specified format and accepts additional parameters which override the default plugin behavior. The list and format of parameters is specific to a plugin implementation.
Example
ds.write(payload.embeddedXMLValue, "application/xml")
Example
local params = { "NamespaceDeclarations": { "datasonnet": "http://www.modusbox.com" }, "NamespaceSeparator": "%", "TextValueKey": "__text", "AttributeCharacter": "*", "XmlVersion": "1.1", "AutoEmptyElements": false }; ds.write(payload.someObj, "application/xml", params)
XML Format
read
Converts the input XML string to a JSON object using the BadgerFish convention:
- Element names become object properties.
- Text content of elements goes in the $ property of an object.
- Nested elements become nested properties.
- Multiple elements with the same name at the same level become array elements.
- Attributes go in properties whose names begin with @.
- Active namespaces for an element go in the element’s @xmlns property.
- The default namespace URI goes in @xmlns.$.
- Other namespaces go in other properties of @xmlns.
- Elements with namespace prefixes become object properties, too.
- The @xmlns property goes only in the object for the element where the namespace was declared.
- CDATA sections go in properties named #1, #2, etc.
- Text fragments in mixed content (elements and text) go in properties named $1, $2, etc.
Example
<?xml version="1.0" encoding="UTF-8"?> <test:root xmlns:test="http://www.modusbox.com"> <test:datasonnet version="1.0">Hello World</test:datasonnet> </test:root>
ds.read(payload.embeddedXMLValue, "application/xml")
{ "test:root": { "@xmlns": { "test":"http://www.modusbox.com" }, "test:datasonnet": { "@version": "1.0", "$": "Hello World" } } }
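To make the convention concrete, the following is a minimal Java sketch of a BadgerFish-style conversion built on the JDK DOM parser. It is not the DataSonnet XML plugin: the class name BadgerFishSketch and its methods are hypothetical, and it deliberately covers only element names, attributes, namespace declarations, element text, and repeated sibling elements. CDATA sections and numbered mixed-content fragments are left out.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical illustration of the BadgerFish rules; not DataSonnet's implementation.
final class BadgerFishSketch {

    // Parses an XML string and returns a map with a single key: the root element name.
    static Map<String, Object> parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            Map<String, Object> result = new LinkedHashMap<>();
            result.put(root.getNodeName(), convert(root));
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    @SuppressWarnings("unchecked")
    private static Map<String, Object> convert(Element element) {
        Map<String, Object> obj = new LinkedHashMap<>();

        // Attributes become "@name"; namespace declarations are grouped under "@xmlns".
        NamedNodeMap attrs = element.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Attr attr = (Attr) attrs.item(i);
            String name = attr.getName();
            if (name.equals("xmlns") || name.startsWith("xmlns:")) {
                Map<String, Object> ns = (Map<String, Object>) obj.computeIfAbsent(
                        "@xmlns", k -> new LinkedHashMap<String, Object>());
                // The default namespace goes in "$", prefixed ones under their prefix.
                ns.put(name.equals("xmlns") ? "$" : name.substring("xmlns:".length()),
                        attr.getValue());
            } else {
                obj.put("@" + name, attr.getValue());
            }
        }

        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child instanceof Element) {
                String name = child.getNodeName();
                Object converted = convert((Element) child);
                Object existing = obj.get(name);
                if (existing == null) {
                    obj.put(name, converted);
                } else if (existing instanceof List) {
                    ((List<Object>) existing).add(converted);
                } else {
                    // A second sibling with the same name turns the property into an array.
                    List<Object> list = new ArrayList<>();
                    list.add(existing);
                    list.add(converted);
                    obj.put(name, list);
                }
            } else if (child.getNodeType() == Node.TEXT_NODE) {
                text.append(child.getNodeValue());
            }
        }

        // Simplification: all text is concatenated and trimmed into a single "$" property.
        String trimmed = text.toString().trim();
        if (!trimmed.isEmpty()) {
            obj.put("$", trimmed);
        }
        return obj;
    }
}
```

Calling BadgerFishSketch.parse on the XML document above yields a nested map with the same shape as the JSON result shown.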
Providing an optional params object allows for additional control over JSON generation. The params object is a JSON object in which the following properties can be set:
| Parameter | Description | Default value |
|---|---|---|
| NamespaceSeparator | Separator between the prefix and the local name | : |
| TextValueKey | Key prefix for the text value property | $ |
|  | Key prefix for the CDATA value property | # |
| AttributeCharacter | Property key prefix which designates an XML element attribute | @ |
| NamespaceDeclarations | Map of internal prefixes to the namespaces which overrides namespace declarations in the input. Multiple values are allowed, for example: |  |
|  | If set, the output will be wrapped in a root element with the given name |  |
Example
<?xml version="1.0" encoding="UTF-8"?> <test:root xmlns:test="http://www.modusbox.com"> <test:datasonnet version="1.0">Hello World</test:datasonnet> </test:root>
local params = { "NamespaceDeclarations": { "datasonnet": "http://www.modusbox.com" }, "NamespaceSeparator": "%", "TextValueKey": "__text", "AttributeCharacter": "*" }; ds.read(payload.embeddedXMLValue, "application/xml", params)
{ "datasonnet%root": { "*xmlns": { "datasonnet": "http://www.modusbox.com" }, "datasonnet%datasonnet": { "*version": "1.0", "__text": "Hello World" } } }
write
Converts the input JSON object into XML using the BadgerFish convention.
The input JSON must have a single key, which is mapped to the root element of the resulting XML.
Correct:
{ "person": { "firstName": "John", "lastName": "Doe", "title": "Rookie DataSonnet mapper" } }
Incorrect:
{ "firstName": "John", "lastName": "Doe", "title": "Rookie DataSonnet mapper" }
Incorrect:
{ "person": { "firstName": "John", "lastName": "Doe", "title": "Rookie DataSonnet mapper" }, "anotherKey": "anotherValue" }
Example
{ "test:root": { "@xmlns": { "test":"http://www.modusbox.com" }, "test:datasonnet": { "@version": "1.0", "$": "Hello World" } } }
{ embeddedXMLValue: ds.write(payload, "application/xml") }
{ "embeddedXMLValue": "<?xml version=\"1.0\" encoding=\"UTF-8\"?> <test:root xmlns:test=\"http://www.modusbox.com\"> <test:datasonnet version=\"1.0\">Hello World</test:datasonnet> </test:root>" }
Providing a params object allows for more control over the generated XML. In addition to the parameters described in the read section, the following XML output-only parameters are supported:
| Parameter | Description | Default value |
|---|---|---|
| XmlVersion | XML version in the XML declaration | 1.0 |
|  | XML encoding |  |
|  | If set to |  |
|  | If set to |  |
|  | If set to |  |
{ "test%root": { "*xmlns": { "test":"http://www.modusbox.com" }, "test%datasonnet": { "*version": "1.0", "__text": "Hello World" }, "test%empty": {} } }
local params = { "NamespaceDeclarations": { "datasonnet": "http://www.modusbox.com" }, "NamespaceSeparator": "%", "TextValueKey": "__text", "AttributeCharacter": "*", "XmlVersion": "1.1", "AutoEmptyElements": false }; { embeddedXMLValue: ds.write(payload, "application/xml", params) }
{ "embeddedXMLValue": "<?xml version=\"1.1\" encoding=\"UTF-8\"?> <datasonnet:root xmlns:test=\"http://www.modusbox.com\"> <datasonnet:datasonnet version=\"1.0\">Hello World</datasonnet:datasonnet> <datasonnet:empty/> </datasonnet:root>" }
CSV Format
read
Parses the CSV and converts it to a JSON array of objects. It expects the CSV input to be in the default format: first row as column headers, comma as the separator, double quote as the quote character, backslash as the escape character, and \n as the newline character. CSV headers are used as keys for the corresponding JSON object values.
Example
"First Name","Last Name",Phone
William,Shakespeare,"(123)456-7890"
Christopher,Marlow,"(987)654-3210"
local csvInput = ds.read(payload.embeddedCSVValue, "application/csv"); { name: csvInput[0]["First Name"] + " " + csvInput[0]["Last Name"] }
{ "name": "William Shakespeare" }
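The default-format behavior described above (header row as keys, comma separator, double quote, backslash escape, \n newline) can be sketched in plain Java as follows. This is an illustration under those stated defaults only; CsvReadSketch is a hypothetical name, and the code is not the parser DataSonnet uses internally.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of reading default-format CSV into a list of objects.
final class CsvReadSketch {

    static List<Map<String, String>> read(String csv) {
        List<List<String>> rows = new ArrayList<>();
        List<String> row = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;

        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '\\' && i + 1 < csv.length()) {
                // Escape character: take the next character literally.
                field.append(csv.charAt(++i));
            } else if (c == '"') {
                // Quote character: toggles quoted mode and is not part of the value.
                inQuotes = !inQuotes;
            } else if (c == ',' && !inQuotes) {
                row.add(field.toString());
                field.setLength(0);
            } else if (c == '\n' && !inQuotes) {
                row.add(field.toString());
                field.setLength(0);
                rows.add(row);
                row = new ArrayList<>();
            } else {
                field.append(c);
            }
        }
        // Flush the last row when the input does not end with a newline.
        if (field.length() > 0 || !row.isEmpty()) {
            row.add(field.toString());
            rows.add(row);
        }

        // The first row supplies the keys for every following row.
        List<Map<String, String>> result = new ArrayList<>();
        if (rows.isEmpty()) {
            return result;
        }
        List<String> headers = rows.get(0);
        for (List<String> values : rows.subList(1, rows.size())) {
            Map<String, String> obj = new LinkedHashMap<>();
            for (int i = 0; i < headers.size() && i < values.size(); i++) {
                obj.put(headers.get(i), values.get(i));
            }
            result.add(obj);
        }
        return result;
    }
}
```

With the sample CSV above as input, the first element of the returned list maps "First Name" to William and "Last Name" to Shakespeare, which is what the mapping concatenates.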
Providing an optional params object allows more control over the format of the input CSV. The params object is a JSON object in which the following properties can be set:
| Parameter | Description | Default value |
|---|---|---|
| UseHeader | If set to |  |
| Quote | Specifies the quote character | " |
| Separator | CSV separator character | , |
| Escape | CSV escape character (only used for parsing CSV) | \ |
| NewLine | New line character combination | \n |
Example
'William'|'Shakespeare'|'(123)456-7890'
'Christopher'|'Marlow'|'(987)654-3210'
local params = { "UseHeader": false, "Quote": "'", "Separator": "|", "Escape": "\\", "NewLine": "\n" }; local csvInput = ds.read(payload.embeddedCSVValue, "application/csv", params); { name: csvInput[0][0] + " " + csvInput[0][1] }
{ "name": "William Shakespeare" }
write
Creates a CSV out of an array of JSON objects, using the default quote, separator, escape and new line characters. The keys of the JSON objects are used as CSV headers.
Example
[ { "First Name": "William", "Last Name": "Shakespeare", "Phone": "(123)456-7890" }, { "First Name": "Christopher", "Last Name": "Marlow", "Phone": "(987)654-3210" } ]
{ embeddedCSVValue: ds.write(payload, "application/csv") }
{ "embeddedCSVValue": "\"First Name\",\"Last Name\",Phone\nWilliam,Shakespeare,\"(123)456-7890\"\nChristopher,Marlow,\"(987)654-3210\"\n" }
Providing an optional params object allows for more control over the format of the output CSV. Quote, separator, and new line characters can be specified, and the CSV can be created without headers, in which case the input can be an array of arrays. A list of columns can also be specified to override the JSON object names. In addition to the parameters described in the read section, the following CSV output-only parameters are supported:
| Parameter | Description | Default value |
|---|---|---|
|  | If set to |  |
|  | An array of strings to use as column names (has no effect if |  |
Example
[ [ "William", "Shakespeare", "(123)456-7890" ], [ "Christopher", "Marlow", "(987)654-3210" ] ]
local params = { "UseHeader": false, "Quote": "'", "Separator": "|", "NewLine": "\n" }; { embeddedCSVValue: ds.write(payload, "application/csv", params) }
{ "embeddedCSVValue": "'William'|'Shakespeare'|'(123)456-7890'\n'Christopher'|'Marlow'|'(987)654-3210'\n" }
Java Objects
read
Converts a POJO to JSON using the Jackson ObjectMapper. Also reads binary data into a byte array (if the input MIME type is application/octet-stream).
The following read parameters are supported:
| Parameter | Description | Default value |
|---|---|---|
|  | Converts POJO date / time fields to JSON strings using the specified date format. See SimpleDateFormat for details. |  |
|  | If set to |  |
|  | If set to |  |
write
Converts JSON objects to Java POJOs using Jackson ObjectMapper.
The following write parameters are supported:
| Parameter | Description | Default value |
|---|---|---|
|  | Converts POJO date / time fields to JSON strings using the specified date format. See SimpleDateFormat for details. |  |
| OutputClass | Produces an instance of the specified class. If the parameter is not set, the following conversion rules are used: |  |
|  | Adds a map of classes and their mix-ins to customize the DataSonnet / Jackson deserialization behavior. See the example of polymorphic deserialization below. |  |
| PolymorphicTypes | A comma-separated list of abstract classes that have multiple sub-classes. See the example of polymorphic deserialization below. |  |
| PolymorphicTypeIdProperty | A property of the JSON object that contains the name of the deserialized class. |  |
JAXBElement serialization and deserialization
When serializing or deserializing a Java object with fields of type JAXBElement, the mapping has to include the additional fields value, name and declaredType. For example, a JAXB-annotated class may look like this:
@XmlRootElement(name = "WsdlGeneratedObj")
public class WsdlGeneratedObj {
    @XmlElementRef(name = "testField", namespace = "http://com.datasonnet.test", type = JAXBElement.class, required = true)
    protected JAXBElement<TestField> testField;
    ...
}
The mapping from JSON to WsdlGeneratedObj is:
/** DataSonnet
version=1.0
output.application/java.OutputClass=com.datasonnet.javatest.WsdlGeneratedObj
*/
{
    "testField": {
        "name": "{http://com.datasonnet.test}testField",
        "declaredType": "com.datasonnet.test.TestField",
        "value": { test: "Hello World" }
    }
}
Polymorphic deserialization
When a property of the deserialized Java class is of an abstract type with multiple extending classes, additional information is needed so that DataSonnet can instantiate the correct class. Consider the following mapping:
/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household;
*/
{
    family: [
        { "name": "Joe", "employer": "ModusBox" },
        { "name": "Jane", "school": "Elk Grove Middle School" }
    ]
}
package com.foo.bar; public class Household { List<Person> family; ... }
package com.foo.bar; public abstract class Person { String name; }
package com.foo.bar; public class Adult extends Person { ... String employer; }
package com.foo.bar; public class Child extends Person { ... String school; }
This mapping will fail, because DataSonnet doesn’t know which classes it should instantiate for the elements of the family list.
To fix this, first we need to create a mixin class, e.g.:
package com.foo.bar;

import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

@JsonTypeInfo(
    use = JsonTypeInfo.Id.NAME,
    include = JsonTypeInfo.As.PROPERTY,
    property = "@type")
@JsonSubTypes({
    @JsonSubTypes.Type(value = Adult.class, name = "adult"),
    @JsonSubTypes.Type(value = Child.class, name = "child")
})
public abstract class PersonMixIn {
}
This class maps the value of the @type property in a JSON object to the class to which the object should be deserialized.
Now change the mapping to the following:
/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household; mixins="{"com.foo.bar.Person":"com.foo.bar.PersonMixIn"}"
*/
{
    family: [
        { "@type": "adult", "name": "Joe", "employer": "ModusBox" },
        { "@type": "child", "name": "Jane", "school": "Elk Grove Middle School" }
    ]
}
The value of the @type property will be matched to one of the annotations in the mix-in class.
For the typical use case of an abstract class with concrete subtypes, where the type names are in a property of the JSON objects, you do not need to write your own mix-in; DataSonnet can handle it for you. In this case, the PolymorphicTypes header must be set, and optionally the PolymorphicTypeIdProperty header, e.g.:
/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household; polymorphictypes=com.foo.bar.Person
*/
{
    family: [
        { "@class": "com.foo.bar.Adult", "name": "Joe", "employer": "ModusBox" },
        { "@class": "com.foo.bar.Child", "name": "Jane", "school": "Elk Grove Middle School" }
    ]
}
/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household; polymorphictypes=com.foo.bar.Person; polymorphictypeidproperty=__clazz
*/
{
    family: [
        { "__clazz": "com.foo.bar.Adult", "name": "Joe", "employer": "ModusBox" },
        { "__clazz": "com.foo.bar.Child", "name": "Jane", "school": "Elk Grove Middle School" }
    ]
}
Binary data
To read binary data, set the input type to application/octet-stream. The resulting payload is an array of signed integers representing the bytes of the input, e.g. [-119,80,78,71,13,10,26,10,0,0,0,13,73 …etc… ]. Writing the above array as application/octet-stream produces a byte array.
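The negative values in the array come from Java's signed byte type: a byte above 0x7F appears as a negative integer. The sketch below illustrates that round trip using the eight-byte PNG signature; SignedBytesSketch is a hypothetical helper written for this illustration and is not part of DataSonnet.

```java
// Hypothetical illustration of how raw bytes map to signed integers and back.
final class SignedBytesSketch {

    // Widening a Java byte to int keeps its sign, so 0x89 becomes -119.
    static int[] toIntArray(byte[] bytes) {
        int[] values = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            values[i] = bytes[i];
        }
        return values;
    }

    // Narrowing back to byte restores the original bit pattern.
    static byte[] toByteArray(int[] values) {
        byte[] bytes = new byte[values.length];
        for (int i = 0; i < values.length; i++) {
            bytes[i] = (byte) values[i];
        }
        return bytes;
    }
}
```

For the PNG signature bytes 0x89 0x50 0x4E 0x47 0x0D 0x0A 0x1A 0x0A, toIntArray returns -119, 80, 78, 71, 13, 10, 26, 10, matching the start of the example array.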
Multipart Form Data
read
Reads a byte array of multipart form data into an array of objects. Each object represents a part and has the following properties:
- name - the name of the part;
- contentType - the content type of the part;
- content - the content of the part. If the part is binary, the content will be a byte array;
- fileName (optional) - if the part is a file attachment, this is the file name.
For example:
[ { "name": "textPart", "contentType": "text/plain; charset=UTF-8", "content": "Hello World" }, { "name": "binaryPart", "contentType": "image/png", "fileName": "DataSonnet.png", "content": [ -119, 80, 78, 71, 13, 10, 26, ... ] } ]
The following parameters are supported:
| Parameter | Description | Default value |
|---|---|---|
|  | Explicitly sets the parts boundary (normally it’s automatically detected by the plugin itself) |  |
write
Creates a byte array containing multipart form data. The input must be an array of objects, each representing a part. For example, the following DataSonnet mapping creates multipart data:
[ { name: "textPart", contentType: "text/plain; charset=UTF-8", content: "Hello World" }, { name: "binaryPart", fileName: "DataSonnet.png", contentType: "image/png", content: payload.image } ]
The following parameters are supported:
| Parameter | Description | Default value |
|---|---|---|
|  | Explicitly sets the parts boundary (normally it’s automatically generated by the plugin itself) |  |
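For readers unfamiliar with the wire format, here is a rough Java sketch of how an array of part objects and a boundary could be assembled into a multipart/form-data body as laid out in RFC 7578. It is an assumption-laden illustration, not the plugin's code: MultipartWriteSketch is a hypothetical name, the boundary is always passed explicitly, and part names and file names are written without any escaping.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the multipart/form-data layout; not DataSonnet's implementation.
final class MultipartWriteSketch {

    // Each part map uses the keys described above: name, contentType, content, fileName (optional).
    static byte[] write(List<Map<String, Object>> parts, String boundary) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Map<String, Object> part : parts) {
            StringBuilder head = new StringBuilder();
            head.append("--").append(boundary).append("\r\n");
            head.append("Content-Disposition: form-data; name=\"")
                .append(part.get("name")).append('"');
            if (part.containsKey("fileName")) {
                head.append("; filename=\"").append(part.get("fileName")).append('"');
            }
            head.append("\r\n");
            head.append("Content-Type: ").append(part.get("contentType")).append("\r\n\r\n");
            byte[] headBytes = head.toString().getBytes(StandardCharsets.UTF_8);
            out.write(headBytes, 0, headBytes.length);

            // Binary content is written as-is; anything else is written as UTF-8 text.
            Object content = part.get("content");
            byte[] body = content instanceof byte[]
                    ? (byte[]) content
                    : String.valueOf(content).getBytes(StandardCharsets.UTF_8);
            out.write(body, 0, body.length);
            out.write('\r');
            out.write('\n');
        }
        // The closing delimiter carries two trailing dashes.
        byte[] tail = ("--" + boundary + "--\r\n").getBytes(StandardCharsets.UTF_8);
        out.write(tail, 0, tail.length);
        return out.toByteArray();
    }
}
```

The boundary must not occur inside any part's content, which is one reason a generated boundary is usually preferable to a fixed one.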
YAML Format
read
Reads input YAML structure and converts it to the internal DataSonnet representation.
No additional read parameters are supported.
write
Creates YAML structure from the provided input.
The following write parameters are supported:
| Parameter | Description | Default value |
|---|---|---|
|  | If set to |  |
|  | If set to |  |