XML and JSON Data Formats

Applications today are complex and are no longer developed as a single program. Most consist of multiple programs that work together to provide the complete solution, each handling its own part. While all the programs may be written and maintained by the same development team, they may also use the services of programs developed by a third party. Such an application architecture requires data to be passed between programs.

If every program were to use its own arbitrary data format, the exchange of data between programs would be very cumbersome and error-prone. To avoid this problem, common data exchange formats were defined and are widely used today to allow programs to interact with each other efficiently.

Three of the most widely accepted data formats are:

  • Character-Separated Values
  • Extensible Markup Language (XML)
  • JavaScript Object Notation (JSON)

Character-Separated Values

This is a simple format where the values are written in sequence, separated by a character, with one record per line. Any character that is not expected to be a part of the data can be used as a separator. Two of the more commonly used separators are commas and pipes, and the formats are referred to as comma- or pipe-separated values.

Character-separated values can be used where the data is structured in a pre-defined sequence that all programs in the solution are aware of. Since there is no additional formatting information, it is easy to write and read data in this format, and the overall size of the data being transferred is low. The challenge is that every program sending or receiving this data must be programmed to understand the exact sequence, and if the sequence changes, every program needs to be modified. Character-separated values are simple structures, as shown below.

Comma-separated Values: 20, 21.4, 22, 21.9, 22, 23, 22.5

Pipe-separated Values: 20|21.4|22|21.9|22|23|22.5
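As a minimal sketch of how a receiving program might read either line above, the following Python example uses the standard csv module. The function name parse_values is hypothetical, and the sketch assumes every field is numeric, which holds for the sample values but is not stated as a general rule.

```python
import csv
import io

# The two sample records from the text above.
comma_line = "20, 21.4, 22, 21.9, 22, 23, 22.5"
pipe_line = "20|21.4|22|21.9|22|23|22.5"


def parse_values(line, separator):
    """Split one record on the separator and convert each field to float.

    Hypothetical helper for illustration; assumes all fields are numeric.
    """
    reader = csv.reader(io.StringIO(line), delimiter=separator,
                        skipinitialspace=True)
    return [float(field) for field in next(reader)]


comma_values = parse_values(comma_line, ",")
pipe_values = parse_values(pipe_line, "|")

print(comma_values)                 # [20.0, 21.4, 22.0, 21.9, 22.0, 23.0, 22.5]
print(comma_values == pipe_values)  # True
```

Note that the parser only recovers a list of positions. What each position means is knowledge that the sender and receiver must share outside the data itself.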

JSON

JavaScript Object Notation (JSON) is a very widely used format to structure data. It is more complex than a character-separated format, but the added structure makes it easier for programs to exchange data with each other. Data in JSON format can be converted into structures that can be used easily by several high-level programming languages. Most high-level programming languages come with libraries or functions that make it very easy to convert data structures used by the language to a JSON data stream or file, and vice-versa.

JSON uses the key-value pair design pattern, where every data value is assigned to a key and can be retrieved using the key. The key-value pairs can be in sequence, they can be nested, and you can have arrays of similar key-value pairs. This approach also makes the data reasonably human-readable.
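The following Python sketch illustrates key-value pairs in sequence, nested key-value pairs, and an array of similar key-value pairs, using the standard json module to convert between a language data structure and JSON text. The record itself (its keys and values) is invented purely for illustration.

```python
import json

# Hypothetical record; the keys and values are illustrative only.
record = {
    "sensor": "s1",                              # simple key-value pair
    "location": {"building": "A", "floor": 2},   # nested key-value pairs
    "readings": [                                # array of similar pairs
        {"hour": 9, "value": 21.4},
        {"hour": 10, "value": 22.0},
    ],
}

# Language data structure -> JSON text.
text = json.dumps(record, indent=2)
print(text)

# JSON text -> language data structure; values are retrieved by key.
restored = json.loads(text)
print(restored["location"]["floor"])      # 2
print(restored["readings"][1]["value"])   # 22.0
```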

JSON structures data so that multiple variations of it can be sent without reprogramming the sending and receiving systems for each one. Both systems still need a shared understanding of what kind of data is being exchanged, specifically which keys are available. Within that defined structure, however, several variations are possible, unlike in a character-separated format, where the fixed sequence leaves no room for variation.
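One way to picture this tolerance for variation is a receiver that looks values up by key rather than by position. The sketch below is an assumption-laden illustration: the function describe and the keys "name", "values", and "unit" are hypothetical, with "name" treated as the one key both systems have agreed is always present.

```python
import json


def describe(message_text):
    """Summarise a message, tolerating missing or extra keys.

    Hypothetical receiver: "name" is assumed to be required, while
    "values" and "unit" are assumed to be optional.
    """
    message = json.loads(message_text)
    name = message["name"]
    values = message.get("values", [])
    unit = message.get("unit", "unknown")
    return f"{name}: {len(values)} value(s), unit={unit}"


# Same receiver, three variations of the message: all keys present,
# keys in a different order with one missing, and an unexpected extra key.
print(describe('{"name": "temperature", "values": [20, 21.4], "unit": "C"}'))
print(describe('{"values": [20], "name": "temperature"}'))
print(describe('{"name": "temperature", "extra": true}'))
```

A character-separated receiver would need to be changed for each of these variations, because it identifies fields only by their position in the record.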

XML

A markup language is a data structuring language that uses tags to format data. Extensible Markup Language (XML) is a markup language that allows data to be structured using any number of custom-defined tags.

Tags are keywords that enclose the data and identify what the data is. Tags are used in pairs, an opening tag and a closing tag with the same name, with the data in between. Tags can be nested, so that a pair of opening and closing tags can contain multiple nested pairs. Tags are written as <tag> for the opening tag and </tag> for the closing tag.

<tag>data value</tag>
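As a rough illustration of paired and nested tags, the Python sketch below parses a small XML document with the standard xml.etree.ElementTree module. The tag names (sensor, name, location, building, floor) are made up for this example. The last part shows that a parser rejects a closing tag whose name differs from its opening tag.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the tag names are illustrative only.
xml_text = """
<sensor>
  <name>s1</name>
  <location>
    <building>A</building>
    <floor>2</floor>
  </location>
</sensor>
""".strip()

root = ET.fromstring(xml_text)

# Data is retrieved by tag name; nested tags are reached with a path.
name = root.find("name").text
floor = int(root.find("location/floor").text)
print(root.tag, name, floor)   # sensor s1 2

# The closing tag must carry the same name as the opening tag.
try:
    ET.fromstring("<reading>20</value>")
    rejected = False
except ET.ParseError:
    rejected = True
print(rejected)                # True
```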

XML pre-dates JSON, and both serve the same purpose and are well supported by high-level programming languages. Either format can be used, depending on the application. JSON is a simpler structure than XML and requires less effort to design and parse.

JSON and XML Compared

A comparison of the same data in JSON and XML is shown below: