SQL Server 2016: JSON Support

 This is not the JSON you're looking for.

I was at MS Ignite last week, and attended the Foundational Keynote, entitled "The SQL Server Evolution." In that session they announced a new feature that will be available in the next version of SQL Server (and that you will get to play with in some capacity when CTP2 is released later this year): JSON.

Yes, you read that right: JSON.

I can't get into too many technical details just yet, and I'm not even sure how much functionality will actually be available in the forthcoming CTP, but I can share a little bit from what I've been able to play with so far. And I want to do that because, even though I don't have a direct need for JSON support myself, I've heard that as an argument against SQL Server a number of times (though it is often difficult to gauge how serious people are when they say that).

First of all, this is going to be standard functionality available in all editions, even Express. There is not going to be a JSON data type or any special kind of index; you would just store it as a traditional string (or extract relational data as a JSON string). But there will be extensions to T-SQL to support moving JSON data in and out of relational forms, much like we already have for XML - like FOR JSON PATH, FOR JSON AUTO, ISJSON(), JSON_QUERY(), JSON_VALUE(), JSON_MODIFY(), and OPENJSON(). It will be case sensitive, collation unaware, there will be no strict schema support, and it will be substantially less elaborate than, say, PostgreSQL's JSON support. (Just remember that this is a V1, and like many other V1 features, should get better as it matures.)
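
As a rough idea of the shape of these functions (a sketch based on the announced syntax; details could easily change before release):

DECLARE @json NVARCHAR(MAX) = N'{"Order":{"Qty":5}}';
 
SELECT ISJSON(@json);                      -- 1 if the string is valid JSON
SELECT JSON_VALUE(@json, '$.Order.Qty');   -- extracts the scalar value 5
SELECT JSON_QUERY(@json, '$.Order');       -- extracts the object {"Qty":5}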

JSON will be supported in Hekaton In-Memory OLTP (even in natively compiled stored procedures) right out of the gate, though when used there you will have the same no-LOB and 8,060 byte limits we've grown to know and love. Compression, encryption, and both types of Columnstore indexes will be supported. And you'll be able to persist and index computed columns based on values extracted from the JSON string that is stored. Almost all data types are supported, with the exception of complex CLR types (geometry, geography, and any custom CLR types you use). As a first glimpse of the syntax and a demonstration of this limitation:

DECLARE @g GEOGRAPHY;
SELECT g = @g FOR JSON PATH;

Results:

Msg 13604, Level 16, State 1
FOR JSON does not support CLR types – cast CLR types explicitly into one of the supported types in FOR JSON queries.
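
The workaround suggested by the error message is simple enough - convert the value to one of the supported types first. Presumably something like this would work (using the existing STAsText() method to get well-known text):

DECLARE @g GEOGRAPHY = geography::Point(47.65, -122.36, 4326);
SELECT g = @g.STAsText() FOR JSON PATH;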

However, this does not apply to all CLR types; hierarchyid, for example, works fine:

DECLARE @h HIERARCHYID = '/1/2/';
SELECT [My_first_JSON_query] = @h FOR JSON PATH;

Results (note that forward slashes are escaped with a backslash, and that the result set column containing the JSON string gets an auto-generated name - "JSON_" followed by a GUID):

[
    {
        "My_first_JSON_query": "\/1\/2\/"
    }
]

Other characters are escaped as well: quotes, tabs, carriage returns, line feeds, and backslashes:

DECLARE @x NVARCHAR(256) = N'tab: ' + CHAR(9)
 + N'cr/lf: ' + CHAR(13) + CHAR(10)
 + N'slashes: /\'
 + N'quote: "';
 
SELECT x = @x FOR JSON PATH;

Result:

[
    {
        "x": "tab: \tcr\/lf: \r\nslashes: \/\\quote: \""
    }
]

Your columns have to have an alias. If they don't:

SELECT 1,'foo' FOR JSON PATH;

Result:

Msg 13605, Level 16, State 1
Unnamed tables cannot be used as JSON identifiers as well as unnamed columns cannot be used as key names. Add alias to the unnamed column/table.
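
Presumably, adding aliases is all it takes, and a query like this should succeed:

SELECT a = 1, b = 'foo' FOR JSON PATH;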

You need to use a table structure of some kind to use FOR JSON AUTO. If you try:

SELECT f = 1 FOR JSON AUTO;

You'll get this:

Msg 13600, Level 16, State 1
FOR JSON AUTO requires at least one table for generating JSON objects. Use FOR JSON PATH or add a FROM clause with a table name.

This works:

SELECT f FROM (SELECT 'foo') AS x(f) FOR JSON AUTO;

Result:

[
    {
        "f": "foo"
    }
]

You can't use SELECT INTO with either method. For example, I'd expect a statement along these lines (my own hypothetical repro) to fail:
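
SELECT name INTO #dbs FROM sys.databases FOR JSON PATH;

Result: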

Msg 13602, Level 16, State 1
The FOR JSON clause is not allowed in a SELECT INTO statement.

There are some quirks involving string lengths - it seems that if your JSON string (including delimiters) is longer than 2033 characters, it will get spread over multiple rows in the output.

DECLARE @t TABLE(x NVARCHAR(4000));
INSERT @t(x) SELECT REPLICATE(N'a',2023);
SELECT x FROM @t FOR JSON PATH;
 
DECLARE @u TABLE(y NVARCHAR(4000));
INSERT @u(y) SELECT REPLICATE(N'b',2027);
SELECT y FROM @u FOR JSON PATH;

Results:

[screenshot: JSON output - the first string is returned as a single row, while the second is split across multiple rows]

I'm not sure if that's a symptom of very early builds or how it is intended to work; if the latter, your consuming applications will need to account for the way the data is returned.
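
If it behaves like FOR XML (an assumption on my part; I haven't confirmed it in this build), assigning the result to a variable - or wrapping the query in a scalar subquery - should return the entire string as a single NVARCHAR(MAX) value:

DECLARE @w TABLE(y NVARCHAR(4000));
INSERT @w(y) SELECT REPLICATE(N'b',2027);
 
DECLARE @json NVARCHAR(MAX) = (SELECT y FROM @w FOR JSON PATH);
SELECT LEN(@json); -- the full 2,037 characters, in one value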


Okay, let's get serious

Now, how about some tangible examples of extracting JSON strings from existing relational data? The whole point of embracing JSON is to provide an additional data exchange format in and out of SQL Server - whether you are interfacing with existing applications, other platforms, or even other Microsoft offerings (like DocumentDB). So, naturally, there are methods for turning relational data into JSON output, and for turning JSON data into a relational format. This is how the Microsoft presentation illustrated it, on a slide entitled "Data exchange with JSON" (just ignore the Customer vs. AccountNumber discrepancy):

[slide: "Data exchange with JSON" from the Microsoft presentation]

(Note that OPENJSON() is not available in current builds, so I'm going to focus on the top half of that slide - extracting relational data into JSON format. Also note that in current builds there are no carriage returns or indenting, so I'll be manually making the output of these examples a little more readable - these may not look exactly the same when you are able to try them yourself.)
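
Just for a taste of the bottom half of the slide, based solely on the announced syntax (I haven't been able to run this, so treat it as a sketch), OPENJSON() is expected to shred a JSON string into rows and columns along these lines:

DECLARE @json NVARCHAR(MAX) = N'[
  {"Number":"SO43659","Date":"2011-05-31T00:00:00","AccountNumber":"AW29825","Price":59.99,"Quantity":1},
  {"Number":"SO43661","Date":"2011-06-01T00:00:00","AccountNumber":"AW73565","Price":24.99,"Quantity":3}
]';
 
SELECT Number, Date, AccountNumber, Price, Quantity
FROM OPENJSON(@json)
WITH (
    Number        VARCHAR(200),
    Date          DATETIME,
    AccountNumber VARCHAR(200),
    Price         MONEY,
    Quantity      INT
);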

Let's take a very simple example, like a few columns from sys.databases. A normal query would look like this:

SELECT TOP (2) 
  name, database_id, source_database_id, create_date 
FROM sys.databases 
ORDER BY database_id;

Output:

name    database_id   source_database_id   create_date
------  -----------   ------------------   -----------------------
master  1             NULL                 2003-04-08 09:13:36.390
tempdb  2             NULL                 2015-04-30 09:00:57.587

As a refresher, and because some of the methodologies are surprisingly similar, let's look at a couple of different ways to output this data using existing XML methods:

SELECT TOP (2) 
  name, database_id, source_database_id, create_date 
FROM sys.databases 
ORDER BY database_id
FOR XML AUTO, ROOT('Databases');
 
SELECT TOP (2) 
  name, database_id, source_database_id, create_date 
FROM sys.databases 
ORDER BY database_id
FOR XML PATH('Database'), ROOT('Databases');

The following two XML fragments are rendered:

<Databases>
	<sys.databases name="master" database_id="1" create_date="2003-04-08T09:13:36.390"/>
	<sys.databases name="tempdb" database_id="2" create_date="2015-04-30T09:00:57.587"/>
</Databases>

<Databases>
	<Database>
		<name>master</name>
		<database_id>1</database_id>
		<create_date>2003-04-08T09:13:36.390</create_date>
	</Database>
	<Database>
		<name>tempdb</name>
		<database_id>2</database_id>
		<create_date>2015-04-30T09:00:57.587</create_date>
	</Database>
</Databases>

Notice that source_database_id is not present in any of the output. I included that column on purpose, to make you think about what happens to NULL values (which may or may not be desirable, depending on how the output is being used). One way to force them to be present is to add ELEMENTS XSINIL to the query:

SELECT TOP (2) 
  name, database_id, source_database_id, create_date 
FROM sys.databases 
ORDER BY database_id
FOR XML AUTO, ROOT('Databases'), ELEMENTS XSINIL;

This time the XML output is:

<Databases xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
	<sys.databases>
		<name>master</name>
		<database_id>1</database_id>
		<source_database_id xsi:nil="true"/>
		<create_date>2003-04-08T09:13:36.390</create_date>
	</sys.databases>
	<sys.databases>
		<name>tempdb</name>
		<database_id>2</database_id>
		<source_database_id xsi:nil="true"/>
		<create_date>2015-04-30T09:00:57.587</create_date>
	</sys.databases>
</Databases>

The source_database_id "data" is now included as an empty element with xsi:nil="true", which is one way for the consumer of this data to know that the entity exists but in this case did not contain any data.

Now, let's look at a couple of very similar queries using JSON:

SELECT TOP (2) name
     , database_id
     , source_database_id
     , create_date
FROM sys.databases
ORDER BY database_id
FOR JSON AUTO;

SELECT TOP (2) name
     , database_id
     , source_database_id
     , create_date
FROM sys.databases
ORDER BY database_id
FOR JSON PATH
     , ROOT('Databases');

The first query (FOR JSON AUTO) produces this JSON string:

[{
        "name": "master",
        "database_id": 1,
        "create_date": "2003-04-08T09:13:36.390"
    }, {
        "name": "tempdb",
        "database_id": 2,
        "create_date": "2015-04-30T09:00:57.587"
    }
]

The second one, with FOR JSON PATH, ROOT('Databases'), produces the following, very similar output:

{
    "Databases":
    [{
            "name": "master",
            "database_id": 1,
            "create_date": "2003-04-08T09:13:36.390"
        }, {
            "name": "tempdb",
            "database_id": 2,
            "create_date": "2015-04-30T09:00:57.587"
        }
    ]
}

Here, again, source_database_id is not included in the output. With JSON it is much easier to include NULL values in the output, using the INCLUDE_NULL_VALUES option. Adding that to the first example:

SELECT TOP (2) name
     , database_id
     , source_database_id
     , create_date
FROM sys.databases
ORDER BY database_id
FOR JSON AUTO
     , INCLUDE_NULL_VALUES;

This produces the following output:

[
  {
     "name":"master",
     "database_id":1,
     "source_database_id":null,
     "create_date":"2003-04-08T09:13:36.390"
  },
  {
     "name":"tempdb",
     "database_id":2,
     "source_database_id":null,
     "create_date":"2015-04-30T09:00:57.587"
  }
]

A few other things that have been discussed revolve around how to integrate JSON data with relational data structures. One way is to store JSON strings in NVARCHAR columns, and then there are extensions to make that more useful. For example, you can have a check constraint that ensures the contents are a valid JSON document, using ISJSON(); you can also extract values into computed columns using JSON_VALUE(). The example they gave last week:

CREATE TABLE dbo.Orders (
     OrderID INT PRIMARY KEY
     , OrderDetails NVARCHAR(4000)
     , CONSTRAINT chk_OrderDetails_IsJSON CHECK (ISJSON(OrderDetails) = 1)
     , Quantity AS (CONVERT(INT, JSON_VALUE(OrderDetails, '$.Order.Qty')))
     );

-- you could even index the computed column:
CREATE INDEX ix_Ord_Q ON dbo.Orders (Quantity);

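To round that out, here's a quick usage sketch of my own (not from the presentation): insert a document and filter on the extracted value, which can now be supported by the index:

INSERT dbo.Orders (OrderID, OrderDetails)
  VALUES (1, N'{"Order":{"Qty":5}}');
 
SELECT OrderID, Quantity
  FROM dbo.Orders
  WHERE Quantity >= 5;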

There is a lot more to it than this, but unfortunately this is about all I can share right now. The capabilities will be much more substantial later this year, and I will revisit with more extensive and tangible examples soon (with data exchange in both directions, I promise). You can also check out Jovan Popovic's blog post for more details.

Until then, happy JSONing!