File Information Schemas

Javascript Object Notation (JSON)

CC-BY-SA
This work is licensed under the Creative Commons Attribution-ShareAlike License 4.0
.

Extension(s):
.json
Media Type(s):
application/json, text/json [Deprecated].
UTI:
public.json
Latest Version:
https://ns.sm0tvi.net/finfo/application/json/index.shtml

Introduction

JSON (JavaScript Object Notation, pronounced [ˈdʒeɪsən/]; also [ˈdʒeɪˌsɒn]) is an open standard file format and data interchange format that uses human-readable text to store and transmit data objects consisting of attribute–value pairs and arrays (or other serializable values). It is a common data format with diverse uses in electronic data interchange, including that of web applications with servers.

JSON is a language-independent data format. It was derived from JavaScript, but many modern programming languages include code to generate and parse JSON-format data. JSON filenames use the extension .json.

Douglas Crockford originally specified the JSON format in the early 2000s. He and Chip Morningstar sent the first JSON message in April 2001.

JSON became a strict subset of ECMAScript as of the language's 2019 revision.

History

JSON grew out of a need for a real-time server-to-browser session communication protocol without using browser plugins such as Flash or Java applets, the dominant methods used in the early 2000s.

Crockford first specified and popularized the JSON format. The acronym originated at State Software, a company co-founded by Crockford and others in March 2001. The co-founders agreed to build a system that used standard browser capabilities and provided an abstraction layer for Web developers to create stateful Web applications that had a persistent duplex connection to a Web server by holding two Hypertext Transfer Protocol (HTTP) connections open and recycling them before standard browser time-outs if no further data were exchanged. The co-founders had a round-table discussion and voted whether to call the data format JSML (JavaScript Markup Language) or JSON (JavaScript Object Notation), as well as under what license type to make it available. The JSON.org website was launched in 2001. In December 2005, Yahoo! began offering some of its Web services in JSON.

JSON was based on a subset of the JavaScript scripting language (specifically, Standard ECMA-262 3rd Edition—December 1999) and is commonly used with JavaScript, but it is a language-independent data format. Code for parsing and generating JSON data is readily available in many programming languages. JSON's website lists JSON libraries by language.

In October 2013, Ecma International published the first edition of its JSON standard ECMA-404. That same year, RFC 7158 used ECMA-404 as a reference. In 2014, RFC 7159 became the main reference for JSON's Internet uses, superseding RFC 4627 and RFC 7158 (but preserving ECMA-262 and ECMA-404 as main references). In November 2017, ISO/IEC JTC 1/SC 22 published ISO/IEC 21778:2017 as an international standard. On 13 December 2017, the Internet Engineering Task Force obsoleted RFC 7159 when it published RFC 8259, which is the current version of the Internet Standard STD 90.

Crockford added a clause to the JSON license stating that “The Software shall be used for Good, not Evil”, in order to open-source the JSON libraries while mocking corporate lawyers and those who are overly pedantic. On the other hand, this clause led to license compatibility problems of the JSON license with other open-source licenses, as open-source software and free software usually imply no restrictions on the purpose of use.

Syntax

[Image]
Figure #: JSON Value railroad diagram.

Whitespace is allowed and ignored around or between syntactic elements (values and punctuation, but not within a string value). Four specific characters are considered whitespace for this purpose: space, horizontal tab, line feed, and carriage return. In particular, the byte order mark must not be generated by a conforming implementation (though it may be accepted when parsing JSON). JSON does not provide syntax for comments.

[Image]
Figure #: JSON white space railroad diagram.

Early versions of JSON (such as specified by RFC 4627) required that a valid JSON text must consist of only an object or an array type, which could contain other types within them. This restriction was dropped in RFC 7158, where a JSON text was redefined as any serialized value.

Comments were intentionally excluded from JSON. In 2012, Douglas Crockford described his design decision thus: "I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability."

JSON disallows "trailing commas", a comma after the last value inside a data structure. Trailing commas are a common feature of JSON derivatives to improve ease of use.

Array

[Image]
Figure #: JSON Array railroad diagram.

An ordered list of zero or more elements, each of which may be of any type. Arrays use square bracket notation with comma-separated elements.

Boolean

[Image]
Figure #: JSON Boolean railroad diagram.

Either of the values true or false.

null

An empty value, using the word null

Number

[Image]
Figure #: JSON Number railroad diagram.

A signed decimal number that may contain a fractional part and may use exponential E notation, but cannot include non-numbers such as NaN. The format makes no distinction between integer and floating-point. JavaScript uses IEEE-754 double-precision floating-point format for all its numeric values (later also supporting BigInt), but other languages implementing JSON may encode numbers differently.

Numbers in JSON are agnostic with regard to their representation within programming languages. While this allows for numbers of arbitrary precision to be serialized, it may lead to portability issues. For example, since no differentiation is made between integer and floating-point values, some implementations may treat 42, 42.0, and 4.2E+1 as the same number, while others may not.

The JSON standard makes no requirements regarding implementation details such as overflow, underflow, loss of precision, rounding, or signed zeros, but it does recommend expecting no more than IEEE 754 binary64 precision for "good interoperability". There is no inherent precision loss in serializing a machine-level binary representation of a floating-point number (like binary64) into a human-readable decimal representation (like numbers in JSON), and back, since there exist published algorithms to do this exactly and optimally.

Object

[Image]
Figure #: JSON Object railroad diagram.

A collection of name–value pairs where the names (also called keys) are strings. The current ECMA standard states: "The JSON syntax does not impose any restrictions on the strings used as names, does not require that name strings be unique, and does not assign any significance to the ordering of name/value pairs." Objects are delimited with curly brackets and use commas to separate each pair, while within each pair the colon ':' character separates the key or name from its value.

String

[Image]
Figure #: JSON String railroad diagram.

A sequence of zero or more Unicode characters. Strings are delimited with double quotation marks and support a backslash escaping syntax.

Character encoding

Although Crockford originally asserted that JSON is a strict subset of JavaScript and ECMAScript, his specification actually allows valid JSON documents that are not valid JavaScript; JSON allows the Unicode line terminators U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR to appear unescaped in quoted strings, while ECMAScript 2018 and older do not. This is a consequence of JSON disallowing only "control characters". For maximum portability, these characters should be backslash-escaped.

JSON exchange in an open ecosystem must be encoded in UTF-8. The encoding supports the full Unicode character set, including those characters outside the Basic Multilingual Plane (U+0000 to U+FFFF). However, if escaped, those characters must be written using UTF-16 surrogate pairs. For example, to include the Emoji character U+1F610 😐 NEUTRAL FACE in JSON:

{ "face": "😐" }
// or
{ "face": "\uD83D\uDE10" }

Semantics

While JSON provides a syntactic framework for data interchange, unambiguous data interchange also requires agreement between producer and consumer on the semantics of specific use of the JSON syntax. One example of where such an agreement is necessary is the serialization of data types defined by the JavaScript syntax that are not part of the JSON standard, e.g., Date, Function, Regular Expression (RegExp), and undefined.

Derivatives

Glossary

Words and terms introduced in this document.

railroad diagram · syntax diagram:
A railroad diagram (or syntax diagram) is a way to represent a context-free grammar. They represent a graphical alternative to Backus–Naur form, EBNF, Augmented Backus–Naur form, and other text-based grammars as metalanguages.
See Also: English Wikipedia article on “Syntax Diagram”.

References and Further Reading