Formats Library

Generic Read and Write Functions

The Formats library contains read, and write functions for serializing and deserializing data in other formats, e.g. XML or CSV. The XML and CSV plugins are built into the standard DataSonnet distributions while other formats can be supported by implementing DataFormatPlugin interface.

These functions can be used if the data contains embedded data in other formats, for example:

{
    "embeddedXMLValue": "<test>Hello</test>"
}

If the payload or variable itself is in the format other than JSON, it will be automatically detected by the DataSonnet, therefore using the read will cause an error. The input and output of payload and variables is controlled by the headers.

read(string input, string inputMimeType, object params={})

Reads input data in the specified mime type and accepts additional parameters which override default plugin behavior. The list and format of parameters is specific to a plugin implementation.

Example

ds.read(payload.embeddedXMLValue, "application/xml")

Example

local params = {
    "NamespaceDeclarations" : {
        "datasonnet" : "http://www.modusbox.com"
    },
    "NamespaceSeparator": "%",
    "TextValueKey": "__text",
    "AttributeCharacter": "*"
};

ds.read(payload.embeddedXMLValue, "application/xml", params);

write(string output, string outputMimeType, object params={})

Outputs the data into specified data format and accepts additional parameters which override default plugin behavior. The list and format of parameters is specific to a plugin implementation.

Example

ds.write(payload.embeddedXMLValue, "application/xml")

Example

local params = {
    "NamespaceDeclarations" : {
        "datasonnet" : "http://www.modusbox.com"
    },
    "NamespaceSeparator": "%",
    "TextValueKey": "__text",
    "AttributeCharacter": "*",
    "XmlVersion" : "1.1",
    "AutoEmptyElements": false
};

ds.write(payload.someObj, "application/xml", params);

XML Format

MIME types and identifiers

  • application/xml

  • xml

read

Converts input XML string to a JSON object using BadgerFish convention:

  • Element names become object properties

  • Text content of elements goes in the $ property of an object.

  • Nested elements become nested properties

  • Multiple elements at the same level become array elements.

  • Attributes go in properties whose names begin with @.

  • Active namespaces for an element go in the element’s @xmlns property.

  • The default namespace URI goes in @xmlns.$.

  • Other namespaces go in other properties of @xmlns.

  • Elements with namespace prefixes become object properties, too.

  • The @xmlns property goes only in object relative to the element where namespace was declared.

  • CDATA sections go in properties named #1, #2, etc.

  • Text fragments in mixed contents (elements and text) goes in properties named $1, $2, etc.

Example

Embedded XML value
<?xml version="1.0" encoding="UTF-8"?>
<test:root xmlns:test="http://www.modusbox.com">
    <test:datasonnet version="1.0">Hello World</test:datasonnet>
</test:root>
DataSonnet map:
ds.read(payload.embeddedXMLValue, "application/xml")
Result
{
    "test:root": {
        "@xmlns": {
            "test":"http://www.modusbox.com"
        },
        "test:datasonnet": {
            "@version": "1.0",
            "$": "Hello World"
        }
    }
}

Providing an optional params object allows for additional control over JSON generation. The params is a JSON object where following properties can be set:

Parameter Description Default value

NamespaceSeparator

Separator between the prefix and the local name

:

TextValueKey

Key prefix for the text value property

$

CdataValueKey

Key prefix for the CDATA value property

#

AttributeCharacter

Property key prefix which designates an XML element attribute

@

NamespaceDeclarations

Map of internal prefixes to the namespaces which overrides namespaces declarations in the input. Multiple values are allowed, for example:

"NamespaceDeclarations" : {
    "datasonnet" : "http://www.datasonnet.com",
    "test" : "urn:com.foo.bar",
    "": "http://www.modusbox.com"
}

RootElement

if set, the output will be wrapped in a root element with the given name

Example

Embedded XML value
<?xml version="1.0" encoding="UTF-8"?>
<test:root xmlns:test="http://www.modusbox.com">
    <test:datasonnet version="1.0">Hello World</test:datasonnet>
</test:root>
DataSonnet map:
local params = {
    "NamespaceDeclarations" : {
        "datasonnet": "http://www.modusbox.com"
    },
    "NamespaceSeparator": "%",
    "TextValueKey": "__text",
    "AttributeCharacter": "*"
};

ds.read(payload.embeddedXMLValue, "application/xml", params);
Result
{
    "datasonnet%root": {
        "*xmlns": {
            "datasonnet": "http://www.modusbox.com"
        },
        "datasonnet%datasonnet": {
            "*version": "1.0",
            "__text": "Hello World"
        }
    }
}

write

Converts the input JSON object into XML using the Badgerfish convention.

The input JSON must have a single key which will be mapped to the root element of the resulting XML.

Correct:
{
  "person": {
    "firstName": "John",
    "lastName": "Doe",
    "title": "Rookie DataSonnet mapper"
  }
}
Incorrect:
{
  "firstName": "John",
  "lastName": "Doe",
  "title": "Rookie DataSonnet mapper"
}
Incorrect:
{
  "person": {
    "firstName": "John",
    "lastName": "Doe",
    "title": "Rookie DataSonnet mapper"
  },
  "anotherKey": "anotherValue"
}

Example

Payload
{
    "test:root": {
        "@xmlns": {
            "test":"http://www.modusbox.com"
        },
        "test:datasonnet": {
            "@version": "1.0",
            "$": "Hello World"
        }
    }
}
DataSonnet map:
{
    embeddedXMLValue: ds.write(payload, "application/xml")
}
Result
{
    "embeddedXMLValue": "<?xml version=\"1.0\" encoding=\"UTF-8\"?> <test:root xmlns:test=\"http://www.modusbox.com\"> <test:datasonnet version=\"1.0\">Hello World</test:datasonnet> </test:root>"
}

Providing a params object allows for more control over generated XML. In addition to the parameters described in the read section, the following XML output-only parameters are supported:

Parameter Description Default value

XmlVersion

XML version in the XML declaration

1.0

Encoding

XML encoding

UTF-8

AutoEmptyElements

If set to true, empty elements are mapped to self-closing tags. If set to false, start- and end tags are generated.

true

NullAsEmptyElement

If set to true, element with null value is treated as empty element. Otherwise null values are skipped.

true

OmitXmlDeclaration

If set to true, XML declaration is not written in the resulting output.

false

Payload
{
    "test%root": {
        "*xmlns": {
            "test":"http://www.modusbox.com"
        },
        "test%datasonnet": {
            "*version": "1.0",
            "__text": "Hello World"
        },
        "test%empty": {}
    }
}
DataSonnet map:
local params = {
    "NamespaceDeclarations" : {
        "datasonnet" : "http://www.modusbox.com"
    },
    "NamespaceSeparator": "%",
    "TextValueKey": "__text",
    "AttributeCharacter": "*",
    "XmlVersion" : "1.1",
    "AutoEmptyElements": false
};

{
    embeddedXMLValue: ds.write(payload, "application/xml")
}
Result
{
    "embeddedXMLValue": "<?xml version=\"1.1\" encoding=\"UTF-8\"?> <datasonnet:root xmlns:test=\"http://www.modusbox.com\"> <datasonnet:datasonnet version=\"1.0\">Hello World</datasonnet:datasonnet> <datasonnet:empty/> </datasonnet:root>"

CSV Format

MIME types and identifiers

  • application/csv

  • text/csv

  • csv

read

Parses the CSV and converts it to a JSON array of objects. It expects the CSV input to be in a default format, with first row as column headers, comma separator, double quote, backslash escape character and \n newline character. CSV headers are used as keys for the corresponding JSON object values.

Example

Embedded CSV value
"First Name","Last Name",Phone
William,Shakespeare,"(123)456-7890"
Christopher,Marlow,"(987)654-3210"
DataSonnet map:
{
    local csvInput = ds.read(payload.embeddedCSVValue, "application/csv");

    {
        name: csvInput[0]["First Name"] + " " + csvInput[0]["Last Name"]
    }
}
Result
{
    "name": "William Shakespeare"
}

Providing an optional params object allows more control over the format of the input CSV. The params is a JSON object where following properties can be set:

Parameter Description Default value

UseHeader

If set to true, the first row of CSV will be interpreted as a list of column headers and will map to the JSON object property names

true

Quote

specifies the quote character

"

Separator

CSV separator character

,

Escape

CSV escape character (only used for parsing CSV)

\\

NewLine

New line character combination

\n

Example

Embedded CSV value
'William'|'Shakespeare'|'(123)456-7890'
'Christopher'|'Marlow'|'(987)654-3210'
DataSonnet map:
local params = {
    "UseHeader": false,
    "Quote": "'",
    "Separator": "|",
    "Escape": "\\",
    "NewLine": "\n"
};

local csvInput = ds.read(payload.embeddedCSVValue, "application/csv", params);

{
    name: csvInput[0][0] + " " + csvInput[0][1]
}
Result
{
    "name": "William Shakespeare"
}

write

Creates a CSV out of an array of JSON objects, using default quote, separator, escape and new line characters. The keys of JSON object values are used as a CSV headers.

Example

Payload
[
  {
    "First Name": "William",
    "Last Name": "Shakespeare",
    "Phone": "(123)456-7890"
  },
  {
    "First Name": "Christopher",
    "Last Name": "Marlow",
    "Phone": "(987)654-3210"
  }
]
DataSonnet map:
{
    embeddedCSVValue: ds.write(payload, "application/csv")
}
Result
{
    "embeddedCSVValue": "\"First Name\",\"Last Name\",Phone\nWilliam,Shakespeare,\"(123)456-7890\"\nChristopher,Marlow,\"(987)654-3210\"\n"
}

Providing an optional params object allows for more control over the format of the output CSV. Quote, separator, and new line characters can be specified, CSV can be created without headers - in this case the input can be an array of arrays. In addition, a list of columns can be specified to override the JSON object names. In addition to the parameters described in the read section, the following CSV output-only parameters are supported:

Parameter Description Default value

DisableQuotes

If set to true, CSV quotes will not be used and the value of the Quote parameter will be ignored

false

Headers

an array of strings to use as column names (has no effect if UseHeader is set to false)

"

Example

Payload
[
  [
    "William",
    "Shakespeare",
    "(123)456-7890"
  ],
  [
    "Christopher",
    "Marlow",
    "(987)654-3210"
  ]
]
DataSonnet map:
local params = {
    "UseHeader": false,
    "Quote": "'",
    "Separator": "|",
    "NewLine": "\n"
};

{
    embeddedCSVValue: ds.write(payload, "application/csv", params)
}
Result
{
    "embeddedCSVValue": "'William'|'Shakespeare'|'(123)456-7890'\n'Christopher'|'Marlow'|'(987)654-3210'\n"
}

Java Objects

MIME types and identifiers

  • application/x-java-object

  • application/octet-stream

read

Converts POJO to JSON format using Jackson ObjectMapper. Also reads binary data into a byte array (if the input MIME type is application/octet-stream)

The following read parameters are supported:

Parameter Description Default value

DateFormat

Converts POJO date / time fields to JSON strings using specified date format. See SimpleDateFormat for details.

yyyy-MM-dd’T’HH:mm:ss.SSSZ

FailOnEmptyBeans

If set to true, an exception is thrown if no serializer is found for a Java type. Setting it to false instructs DataSonnet to ignore the field that cannot be serialized. See Jackson Serialization Features for details.

true

FindAndRegisterModules

If set to true, any additional Jackson serialization modules classpath will be autodiscovered by Jackson. See Jackson Serialization Features for details.

false

write

Converts JSON objects to Java POJOs using Jackson ObjectMapper.

The following write parameters are supported:

Parameter Description Default value

DateFormat

Converts POJO date / time fields to JSON strings using specified date format. See SimpleDateFormat for details.

yyyy-MM-dd’T’HH:mm:ss.SSSZ

OutputClass

Produces an instance of specified class. If parameter is not set, the following conversion rules are used:

  • Objects → java.util.HashMap

  • Arrays → java.util.ArrayList

  • String values → java.lang.String

  • Boolean values → java.lang.Boolean

  • Numerical values → java.lang.Number

MixIns

Adds a map of classes and their mix-ins to customize the Datasonnet / Jackson deserialization behavior. See the example below for the example of polymorphic deserialization.

PolymorphicTypes

A comma-separated list of abstract classes that have multiple sub-classes. See the example below for the example of polymorphic deserialization.

PolymorphicTypeIdProperty

A property of the JSON object that contains the name of the deserialized class.

@class

JAXBElement serialization and deserialization

When serializing or deserializing Java object with fields of type JAXBElement, the mapping has to include additional fields value, name and declaredType. For example, a JAXB-annotated class may look like this:

@XmlRootElement(name = "WsdlGeneratedObj")
public class WsdlGeneratedObj {
    @XmlElementRef(name = "testField", namespace = "http://com.datasonnet.test", type = JAXBElement.class, required = true)
    protected JAXBElement<TestField> testField;
...
}

Mapping from JSON to the WsdlGeneratedObj is:

/** DataSonnet
version=1.0
output.application/java.OutputClass=com.datasonnet.javatest.WsdlGeneratedObj
*/
{
    "testField": {
        "name": "{http://com.datasonnet.test}testField",
        "declaredType": "com.datasonnet.test.TestField",
        "value": {
            test: "Hello World"
        }
    }
}

Polymorphic deserialization

In a situation where the property of the deserialized Java class is of an abstract type with multiple extending classes, it is necessary to provide an additional information so that Datasonnet can instantiate correct class. Consider the following mapping:

/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household;
*/
{
    family: [
       {
            "name": "Joe",
            "employer": "ModusBox"
       },
       {
            "name": "Jane",
            "school": "Elk Grove Middle School"
       }
    ]
}
package com.foo.bar;
public class Household {
    List<Person> family;
...
}
package com.foo.bar;
public abstract class Person {
    String name;
}
package com.foo.bar;
public class Adult extends Person {
...
    String employer;
}
package com.foo.bar;
public class Child extends Person {
...
    String school;
}

This mapping will fail, because Datasonnet doesn’t know which classes it should instantiate for the elements of the family list. To fix this, first we need to create a mixin class, e.g.:

package com.foo.bar;

import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

@JsonTypeInfo(
        use = JsonTypeInfo.Id.NAME,
        include = JsonTypeInfo.As.PROPERTY,
        property = "@type")
@JsonSubTypes({
        @JsonSubTypes.Type(value = Adult.class, name = "adult"),
        @JsonSubTypes.Type(value = Child.class, name = "child") })
public abstract class PersonMixIn {
}

This class maps the value of the property @type in a JSON object to the class type to which it should be deserialized.

Now change the mapping to the following:

/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household; mixins="{"com.foo.bar.Person":"com.foo.bar.PersonMixIn"}"
*/
{
    family: [
       {
            "@type": "adult",
            "name": "Joe",
            "employer": "ModusBox"
       },
       {
            "@type": "child",
            "name": "Jane",
            "school": "Elk Grove Middle School"
       }
    ]
}

The value of the property @type will be matched to one of the annotations in the mix-in class.

For the typical use case of an abstract class with concrete subtypes, where the type names are in a property on the JSON objects, you do not need to write your own Mixin, DataSonnet can handle it for you. In this case, the PolymorphicTypes header must be set, and optionally the PolymorphicTypeIdProperty, e.g.:

/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household; polymorphictypes=com.foo.bar.Person
*/
{
    family: [
       {
            "@class": "com.foo.bar.Adult",
            "name": "Joe",
            "employer": "ModusBox"
       },
       {
            "@class": "com.foo.bar.Child",
            "name": "Jane",
            "school": "Elk Grove Middle School"
       }
    ]
}
/** DataSonnet
version=2.0
output application/x-java-object; dateformat=yyyy-MM-dd; outputclass=com.foo.bar.Household; polymorphictypes=com.foo.bar.Person; polymorphictypeidproperty=__clazz
*/
{
    family: [
       {
            "__clazz": "com.foo.bar.Adult",
            "name": "Joe",
            "employer": "ModusBox"
       },
       {
            "__clazz": "com.foo.bar.Child",
            "name": "Jane",
            "school": "Elk Grove Middle School"
       }
    ]
}

Binary data

To read a binary data, set input type as application/octet-stream. The resulting payload will be an array of integers representing the bytes of the input, e.g. [-119,80,78,71,13,10,26,10,0,0,0,13,73 …​etc…​ ]. Writing the above array as application/octet-stream will produce a byte array.

Multipart Form Data

MIME types and identifiers

  • multipart/form-data

  • multipart/mixed

  • multipart/related

read

Reads a byte array of multipart form data into an internal structure of array of objects. Each object represents a part and has a following properties:

  • name - the name of the part;

  • contentType - the part content type;

  • content - the content of the part. If part is binary, the content will be a byte array;

  • fileName (optional) - if the part is file attachment, this is a file name.

For example:

[
  {
    "name": "textPart",
    "contentType": "text/plain; charset=UTF-8",
    "content": "Hello World"
  },
  {
    "name": "binaryPart",
    "contentType": "image/png",
    "fileName": "DataSonnet.png",
    "content": [
      -119,
      80,
      78,
      71,
      13,
      10,
      26,
      ...
    ]
  }
]

The following parameters are supported:

Parameter Description Default value

Boundary

Explicitly sets the parts boundary (normally it’s automatically detected by the plugin itself)

write

Creates a byte array containing multipart form data. The input structure must be in form of the array of objects each representing a part. For example, the following DataSonnet mapping will create a multipart data:

[
  {
    name: "textPart",
    contentType: "text/plain; charset=UTF-8",
    content: "Hello World"
  },
  {
    name: "binaryPart",
    fileName: "DataSonnet.png",
    contentType: "image/png",
    content: payload.image
  }
]

The following parameters are supported:

Parameter Description Default value

Boundary

Explicitly sets the parts boundary (normally it’s automatically generated by the plugin itself)

YAML Format

MIME types and identifiers

  • application/x-yaml

read

Reads input YAML structure and converts it to the internal DataSonnet representation.

No additional read parameters are supported.

write

Creates YAML structure from the provided input.

The following write parameters are supported:

Parameter Description Default value

MarkerLine

If set to false, the resulting YAML will not contain the three-dashes boundary markers (---)

true

DisableQuotes

If set to true, output values will be unquoted, i.e. will not be wrapped in quote characters

false