Simulating TRY_PARSE for dates in SQL Server 2008

By:   |   Updated: 2023-03-09   |   Comments   |   Related: > Dates


Problem

A while back, I wrote a couple of tips about simulating TRY_CONVERT – a function added in SQL Server 2012 – in earlier versions (see Part 1 and Part 2). Recently, someone in the community had a similar requirement: they wanted to make TRY_PARSE work for dates in SQL Server 2008. Now, I'm not one to help people stay on versions of SQL Server that are now 15 years old, but I do acknowledge that hands may be tied, and they can't always control the version they're stuck with.

Solution

We can't make TRY_PARSE just work in SQL Server 2008; if you want to use the function directly, you'll need to upgrade. What we can do is make our own version of the function. That said, we can't make the syntax fully compatible with TRY_PARSE so that, when we upgrade, the change is minimal. The syntax for TRY_PARSE is:

TRY_PARSE(string_input AS data_type [ USING culture ] )

If we write our own function (even in CLR), we won't be able to call it with that syntax exactly, and we won't be able to have optional arguments. It would be more like this (including a type-specific suffix to avoid future naming conflicts):

dbo.try_parse_date(@string_input, @culture) 

TRY_PARSE is complex, and if we're going to write our own version, we should set some ground rules:

  • Let's stay away from CLR (mostly because it's not compatible with Edge, Azure, etc.);
  • Let's handle dates only and not try to parse strings into any types that also include time;
  • Let's handle numeric dates only and not try to determine the validity of, say, foreign month names; and,
  • Let's keep one eyebrow raised for two-digit years; two decades after Y2K, these should be fossils.

When we think about the problem at a high level, we want to provide a generic way to take a "date" in one of many different formats and interpret it as accurately as possible, using rules from potentially any culture. If we can't do that reliably, we should return NULL instead of an error like we would see from PARSE or CONVERT. However:

  • As a human, I can't tell by looking at 12/8/13 whether you meant December 8th or August 12th (and further, whether you meant 2012 or 2013, or 1912 or 1913, for that matter). This can make programmatically parsing strings very hard, and we're not even talking about bringing in regional-specific elements like day names or month names.
  • A regional string like m/d/y or d/m/y may have 2- or 4-digit years and may or may not have leading zeros on the day or month. Software is very lenient with some of these aspects – I haven't used Microsoft Access in over two decades, but I remember it being forgiving to a fault. If you typed 12/8/2022, it would store December 8th, but if you entered 13/8/2022, it would assume you messed up that value only and would happily store August 13th – without warning you that it changed its interpretation. This is dangerous behavior and leads to what I consider data corruption. We don't want to do this here.

With those things in mind, let's look at a few formats, how we want to interpret them based on the specified culture, and how we can silently fail when a value can't be parsed correctly. The whole point of TRY_PARSE is that if it can't evaluate the input string as a date according to the culture, it should return NULL instead. So consider this input, how we can translate it to a universally safe and unambiguous string format, and what the output should be.

Input string Specified culture Safe format (style #) ISDATE(Safe format) Output (date)
2/30/2022 en-US 20220230 (112) 0 (no Feb 30th) NULL
2.7.2022 en-US 20220207 (112) 1 2022-02-07
2-14-2022 en-GB 20221402 (112) 0 (no 14th month) NULL
12/8/22 en-US 221208 (12) 1 2022-12-08
12/8/22 en-GB 220812 (12) 1 2022-08-12

We can't convert the last two to a safer format (yyyyMMdd) because we don't want to blindly add '20' and make assumptions about 2-digit year cutoffs (otherwise, we'd have to add logic for when we consider a two-digit year to be from last century).

For pseudo-code of how to interpret these correctly, we can think about it this way, where | represents any of the three most common separators (/, -, or .):

  • For any culture:
  • If the culture is en-US or the few others that assume MDY:
    • if the format is ##|##|#### then we know it is MDY
    • if the format is ##|##|## then we assume MDY and rely on 2-digit year handling
  • If the culture is, basically, anything except en-US and a few others:
    • If the format is ##|##|#### then we know it is DMY
    • If the format is ##|##|## then we assume DMY and rely on 2-digit year handling
  • If any of the above is true, rearrange the string to yyyyMMdd (112) or yyMMdd (12)
  • If none of those formats match, return NULL
  • Even if a format matches, leave the result out if it can't be converted to a date, according to ISDATE

To implement this logic, I created two tables, one to store the various formats and how to parse each one for extracting the intended year, month, and day, and another for storing the regions/cultures we want to support and which formats belong to each one. Storing this information in a table means that our code to deal with any given format can be that much simpler (and not involve a bunch of CASE expressions and other redundant code).

Since SQL Server doesn't support RegEx natively, I stored a row in our lookup table for every possible format combination.

CREATE TABLE dbo.DateFormats
(
  DateStyle char(3),     -- 'all', 'mdy', 'dmy'
  Pattern   varchar(64),
  ys        tinyint,     -- character where year starts
  yl        tinyint,     -- length of year
  ms        tinyint,     -- character where month starts
  ml        tinyint,     -- length of month
  ds        tinyint,     -- character where day starts
  dl        tinyint,     -- length of day
  INDEX CIX_DateFormats_DateStyle CLUSTERED(DateStyle)
);

Now, all the potential formats and the positional markers to help extract the date parts given the format:

INSERT dbo.DateFormats(DateStyle,Pattern,ys,yl,ms,ml,ds,dl) VALUES
-- universal 112/yyyyMMdd
('all', '[0-9][0-9][0-9][0-9][0-1][0-9][0-3][0-9]'           ,1,4,5,2,7,2),
-- universal 120/yyyy-[M]M-[d]d or 102/yyyy.[M]M.[d]d or 111/yyyy/[M]M/[d]d
('all', '[0-9][0-9][0-9][0-9][./-][0-1][0-9][./-][0-3][0-9]' ,1,4,6,2,9,2),
('all', '[0-9][0-9][0-9][0-9][./-][0-9][./-][0-9]'           ,1,4,6,1,8,1),
('all', '[0-9][0-9][0-9][0-9][./-][0-9][./-][0-3][0-9]'      ,1,4,6,1,8,2),
('all', '[0-9][0-9][0-9][0-9][./-][0-1][0-9][./-][0-9]'      ,1,4,6,2,9,1),
-- MDY 1/101 [m]m/[d]d/[yy]yy or [m]m.[d]d.[yy]yy or [m]m-[d]d-[yy]yy
('mdy', '[0-1][0-9][./-][0-3][0-9][./-][0-9][0-9][0-9][0-9]' ,7,4,1,2,4,2),
('mdy', '[1-9][./-][0-3][0-9][./-][0-9][0-9][0-9][0-9]'      ,6,4,1,1,3,2),
('mdy', '[0-1][0-9][./-][1-9][./-][0-9][0-9][0-9][0-9]'      ,6,4,1,2,4,1),
('mdy', '[1-9][./-][1-9][./-][0-9][0-9][0-9][0-9]'           ,5,4,1,1,3,1),
('mdy', '[0-1][0-9][./-][0-3][0-9][./-][0-9][0-9]'           ,7,2,1,2,4,2),
('mdy', '[1-9][./-][0-3][0-9][./-][0-9][0-9]'                ,6,2,1,1,3,2),
('mdy', '[0-1][0-9][./-][1-9][./-][0-9][0-9]'                ,6,2,1,2,4,1),
('mdy', '[1-9][./-][1-9][./-][0-9][0-9]'                     ,5,2,1,1,3,1),
-- DMY 3/103 [d]d/[m]m/[yy]yy or [d]d.[m]m.[yy]yy or [d]d-[m]m-[yy]yy
('dmy', '[0-3][0-9][./-][0-1][0-9][./-][0-9][0-9][0-9][0-9]' ,7,4,4,2,1,2),
('dmy', '[1-9][./-][0-1][0-9][./-][0-9][0-9][0-9][0-9]'      ,6,4,3,2,1,1),
('dmy', '[0-3][0-9][./-][1-9][./-][0-9][0-9][0-9][0-9]'      ,6,4,4,1,1,2),
('dmy', '[1-9][./-][1-9][./-][0-9][0-9][0-9][0-9]'           ,5,4,3,1,1,1),
('dmy', '[0-3][0-9][./-][0-1][0-9][./-][0-9][0-9]'           ,7,2,4,2,1,2),
('dmy', '[1-9][./-][0-1][0-9][./-][0-9][0-9]'                ,6,2,3,2,1,1),
('dmy', '[0-3][0-9][./-][1-9][./-][0-9][0-9]'                ,6,2,4,1,1,2),
('dmy', '[1-9][./-][1-9][./-][0-9][0-9]'                     ,5,2,3,1,1,1);

Then the table that maps these regional formats to specific cultures, which I determined using the following query:

DECLARE @t table(culture char(5)), @d char(10) = '02/07/2022';

INSERT @t(culture) VALUES
('en-US'),('de-DE'),('fr-FR'),('ja-JP'),('da-DK'),('es-ES'),('it-IT'),
('nl-NL'),('nn-NO'),('pt-PT'),('fi-FI'),('sv-SE'),('Cs-CZ'),('Hu-HU'),
('Pl-PL'),('Ro-RO'),('hr-HR'),('Sk-SK'),('Sl-SI'),('El-GR'),('bg-BG'),
('Ru-RU'),('Tr-TR'),('en-GB'),('Et-EE'),('lv-LV'),('lt-LT'),('pt-BR'),
('zh-TW'),('Ko-KR'),('zh-CN'),('ar-SA'),('Th-TH');

SELECT culture, [month] = DATENAME(MONTH, TRY_PARSE(@d AS date USING culture))
FROM @t ORDER BY [month], culture;

Results:

[month] = NULL [month] = February (MDY) [month] = July (DMY)
ar-SA en-US bg-BG lv-LV
  Hu-HU Cs-CZ nl-NL
  ja-JP da-DK nn-NO
  Ko-KR de-DE Pl-PL
  lt-LT El-GR pt-BR
  sv-SE en-GB pt-PT
  zh-CN es-ES Ro-RO
  zh-TW Et-EE Ru-RU
    fi-FI Sk-SK
    fr-FR Sl-SI
    hr-HR Th-TH
    it-IT Tr-TR

So, ignoring ar-SA for the time being, we can put these values in:

CREATE TABLE dbo.CultureMapping
(
  Culture    varchar(5), 
  DateStyle  char(3)
  CONSTRAINT PK_CDFM PRIMARY KEY(Culture, DateStyle)
);

INSERT dbo.CultureMapping(Culture, DateStyle) VALUES
('en-US', 'mdy'), ('bg-BG', 'dmy'), ('lv-LV', 'dmy'),
('Hu-HU', 'mdy'), ('Cs-CZ', 'dmy'), ('nl-NL', 'dmy'),
('ja-JP', 'mdy'), ('da-DK', 'dmy'), ('nn-NO', 'dmy'),
('Ko-KR', 'mdy'), ('de-DE', 'dmy'), ('Pl-PL', 'dmy'),
('lt-LT', 'mdy'), ('El-GR', 'dmy'), ('pt-BR', 'dmy'),
('sv-SE', 'mdy'), ('en-GB', 'dmy'), ('pt-PT', 'dmy'),
('zh-CN', 'mdy'), ('es-ES', 'dmy'), ('Ro-RO', 'dmy'),
('zh-TW', 'mdy'), ('Et-EE', 'dmy'), ('Ru-RU', 'dmy'),
                  ('fi-FI', 'dmy'), ('Sk-SK', 'dmy'),
                  ('fr-FR', 'dmy'), ('Sl-SI', 'dmy'),
                  ('hr-HR', 'dmy'), ('Th-TH', 'dmy'),
                  ('it-IT', 'dmy'), ('Tr-TR', 'dmy');

Then we create an inline table-valued function that looks at the string and, based on the defined culture, checks which pattern it matches, reconstructs the string in a safe format, and then checks to see if that value is a date:

CREATE FUNCTION dbo.try_parse_date
(
    @DateString varchar(10),
    @Culture    varchar(5)
)
RETURNS TABLE WITH SCHEMABINDING 
AS
  RETURN 
  (
    SELECT TheDate = CONVERT(date, DateString, DestStyleNumber)
     FROM
     (
        SELECT TOP (1)
          DestStyleNumber = CASE WHEN yl = 2 THEN 12 ELSE 112 END,
          DateString    = SUBSTRING(@DateString, ys, yl)
            + RIGHT('0' + SUBSTRING(@DateString, ms, ml),2)
            + RIGHT('0' + SUBSTRING(@DateString, ds, dl),2)
        FROM dbo.DateFormats AS df
        INNER JOIN dbo.CultureMapping AS cm
          ON df.DateStyle IN (cm.DateStyle, 'all')
        WHERE @DateString LIKE df.Pattern
          AND (@Culture   LIKE cm.Culture)
        ORDER BY df.DateStyle -- pick 'all' first
     ) AS Translated
    WHERE ISDATE(DateString) = 1
  );

Now all the complicated logic is hidden away in our tables and function, and we can write queries like this:

DECLARE @table TABLE(string varchar(10), culture char(5));

INSERT @table(string, culture) VALUES
('2/30/2022', 'en-US'), -- invalid date
('2.7.2022',  'en-US'), -- perfectly fine
('2-14-2022', 'en-GB'), -- no 14th month
('12/8/22',   'en-US'), -- Aug 12
('12/8/22',   'en-GB'); -- Dec 8

SELECT t.string, t.culture, f.TheDate
  FROM @table AS t 
  OUTER APPLY dbo.try_parse_date(t.string, t.culture) AS f;

Output:

string        culture   TheDate
===========   =======   ==========
2/30/2022     en-US     NULL
2.7.2022      en-US     2022-02-07
2-14-2022     en-GB     NULL
12/8/22       en-US     2022-12-08
12/8/22       en-GB     2022-08-12

Checking my work more thoroughly, I ran this query to spot-check a few formats against every culture and compared the output to TRY_PARSE.

WITH src(DateString) AS 
(
  SELECT d FROM 
  (
    VALUES ('2/30/2022'), ('2.7.2022'), ('2-14-2022'),
           ('12/8/22'),   ('2022/4/5'), ('2022/04/19'),
           ('20220419'),  ('2/15/2023'),('2/29/2020'),
           ('2021-31-07')
  ) AS d(d)
)
SELECT DateString, Culture, [Function], TryParse
FROM 
(
  SELECT
    src.DateString, 
    m.Culture, 
    [Function] = f.TheDate, 
    TryParse = TRY_PARSE(src.DateString AS date USING m.Culture)
  FROM src 
  CROSS JOIN dbo.CultureMapping AS m
  OUTER APPLY dbo.try_parse_date(src.DateString, m.Culture) AS f
) AS x  
WHERE [Function] <> TryParse
  OR ([Function] IS NOT NULL AND TryParse IS NULL)
  OR ([Function] IS NULL     AND TryParse IS NOT NULL)
  ORDER BY DateString, TryParse, Culture;

Most of the output from the 320 rows returned by the function (277, or 86.6%) matched TRY_PARSE, but I did see some interesting results – 43 rows came back where the output was different:

  • [7 rows]: For all 7 of the non-en-US cultures where TRY_PARSE interprets 2/7/2022 as MDY (Feb. 7th, 2022), 12/8/22 is interpreted as DMY instead (August 22nd, 2012). So, yes, two-digit years are a problem.
  • [4 rows]: Using Th-TH, TRY_PARSE converted values like 2.7.2022 -> 1479-07-02, 12/8/22 -> 1979-08-12, and 2022/04/19 -> 1479-04-19. This was unexpected, and the function treats these correctly (the first two as DMY and the third not mysteriously subtracting 543 years).
  • [32 rows]: TRY_PARSE refused to recognize yyyyMMdd as a date in any culture, even though this is a universally safe format for using CAST or CONVERT.

If you want behavior perfectly congruent with TRY_PARSE, this function won't quite do that. I don't know if we can work around these exceptions in a tidy way, so unless these are critical formats and cultures to support, I would probably remove two-digit year formats from mdy, and – like ar-SA – just skip Th-TH completely.

DELETE dbo.DateFormats WHERE yl = 2 AND DateStyle = 'mdy';
DELETE dbo.CultureMapping WHERE Culture = 'Th-TH';

This also doesn't provide perfect congruence, so which type of different behavior you want is up to you. An easier decision, I think, is to let the function treat yyyyMMdd correctly instead of matching the incorrect (IMHO) behavior of TRY_PARSE. However, if you want to match the behavior where these return NULL(which would increase the compatibility to about 96.5%), delete that single row from dbo.DateFormats:

DELETE dbo.DateFormats WHERE Pattern = '[0-9][0-9][0-9][0-9][0-1][0-9][0-3][0-9]';

There may be other obscure format variations that cause issues with specific cultures, as it is tedious to test this thoroughly. But this should be a good starting point if you have nothing better to do than wait for someone to approve an upgrade.

Next Steps
  • This is clearly incomplete – for simplicity, I ignored two cultures altogether, discarded some two-digit year patterns, and stuck to numeric-only date formats. You can extend this with all cultures and any other date formats you need to support, though I caution against trying to interpret month names, for example, in every single language SQL Server supports. If this is where you want to be, you should think about CLR (if Azure isn't on your horizon), upgrading to a version that supports TRY_PARSE, or not handling this type of fragile string interpretation inside SQL Server in the first place. One way to avoid the problem is to make sure your applications and forms use standard date pickers or calendar controls, so you're not ever trying to interpret free text input.


sql server categories

sql server webinars

subscribe to mssqltips

sql server tutorials

sql server white papers

next tip



About the author
MSSQLTips author Aaron Bertrand Aaron Bertrand (@AaronBertrand) is a passionate technologist with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com, and also blogs at sqlblog.org.

This author pledges the content of this article is based on professional experience and not AI generated.

View all my tips


Article Last Updated: 2023-03-09

Comments For This Article

















get free sql tips
agree to terms