Updating a table with HTML data from a file
On the #SQLHelp hash tag today, @zippy1981 asked the following question:
Is there an easy one-liner (powershell,sqlcmd/SSMS/etc) to load a single html file on the local filesystem to a field in the DB?
While typing with one raised eyebrow (wanting to ask, but it wouldn't fit in 140 characters, "why would you want to do that?"), my quick answer was yes. Well, there isn't necessarily a one-liner, and I'm not sure about doing this through PowerShell, SQLCMD or from within SSMS. I do know you can do this with a stored procedure call, but there are a few of caveats to the solution below.
One caveat is that my old-school approach requires xp_cmdshell to be enabled, which may not be an option in some shops. When you try to use xp_cmdshell, by default, you should get this error:
Msg 15281, Level 16, State 1, Procedure xp_cmdshell, Line 1
SQL Server blocked access to procedure 'sys.xp_cmdshell' of component 'xp_cmdshell' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'xp_cmdshell' by using sp_configure. For more information about enabling 'xp_cmdshell', see "Surface Area Configuration" in SQL Server Books Online.
To enable xp_cmdshell, you can run the following code…
EXEC sp_configure 'show advanced options', 1; RECONFIGURE WITH OVERRIDE; EXEC sp_configure 'xp_cmdshell', 1; RECONFIGURE WITH OVERRIDE; EXEC sp_configure 'show advanced options', 0; RECONFIGURE WITH OVERRIDE;
Another caveat is that, because the type command will break the HTML file up line by line, we rely on undocumented string concatenation – so you need to make assumptions that the SELECT will pull the "rows" out in the same order. I tested with rather small files, and if you publish minimized HTML files where everything is all on a single line, it may not be an issue. But I can't guarantee that this will work for all files in all environments. For these reasons, I strongly recommend that you don't consider this a long-term solution, but rather investigate using an alternate solution. One would be to use CLR inside the database to handle filesystem-related tasks; another would be to have an external application read the file contents and then call a stored procedure which passes the data along; yet another would be to investigate storing your HTML files using the new FILESTREAM feature introduced in SQL Server 2008.
With those disclaimers out of the way, let's pretend I have an HTML file on my hard drive, called C:\test.html. Here is how I would go about updating an existing row in an arbitrary table with the HTML contents from that file (and no, single quotes in the file will not screw anything up with this, however single quotes in the path or filename might):
CREATE TABLE dbo.Foo ( FooID INT PRIMARY KEY, HTML NVARCHAR(MAX) ); GO INSERT dbo.Foo(FooID, HTML) SELECT 1, NULL; GO CREATE PROCEDURE dbo.Foo_UpdateHTML @FooID INT, @HTMLFilePath VARCHAR(8000) AS BEGIN SET NOCOUNT ON; DECLARE @txt NVARCHAR(MAX), @cmd VARCHAR(8000) = 'type "' + @HTMLFilePath + '"'; CREATE TABLE #f(t NVARCHAR(MAX)); BEGIN TRY INSERT #f EXEC master..xp_cmdshell @cmd; IF @@ROWCOUNT > 0 BEGIN DELETE #f WHERE t IS NULL; IF EXISTS ( SELECT 1 FROM #f WHERE t LIKE '%The system cannot find the file specified%' ) BEGIN RAISERROR('File not found.', 11, 1); END SELECT @txt = COALESCE(@txt + CHAR(13) + CHAR(10), '') + t FROM #f; UPDATE dbo.Foo SET HTML = @txt WHERE FooID = @FooID; END END TRY BEGIN CATCH SET @txt = ERROR_MESSAGE(); RAISERROR('HTML file error (%s)', 11, 1, @txt); END CATCH DROP TABLE #f; END GO EXEC dbo.Foo_UpdateHTML @FooID = 1, @HTMLFilePath = 'C:\test.html'; GO SELECT FooID, HTML FROM dbo.Foo; GO DROP PROCEDURE dbo.Foo_UpdateHTML; DROP TABLE dbo.Foo; GO
In this case if the file is not valid you will receive the following error message:
Msg 50000, Level 11, State 1, Procedure Foo_UpdateHTML, Line 44 HTML file error (File not found.)
Of course you are more than welcome to adjust the error handling to your liking. But I will recommend again that, while you may want to use this as a stop-gap solution, you are going to be better off in the long run using a solution that requires several assumptions to be correct.
EDIT: From Adam Machanic's suggestion, here is a different version of the procedure which uses BULK INSERT, avoiding both the xp_cmdshell requirement as well as the quirky string concatenation:
CREATE PROCEDURE dbo.Foo_UpdateHTML @FooID INT, @HTMLFilePath VARCHAR(8000) AS BEGIN SET NOCOUNT ON; DECLARE @txt NVARCHAR(MAX); CREATE TABLE #f(t NVARCHAR(MAX)); BEGIN TRY SET @txt = 'BULK INSERT #f FROM ''' + @HTMLFilePath + ''' WITH ( ROWTERMINATOR = ''[email protected]#$%^&*()'', FIELDTERMINATOR = ''[email protected]#$%^&*()'' );'; EXEC sp_executeSQL @txt; UPDATE dbo.Foo SET HTML = (SELECT t FROM #f) WHERE FooID = @FooID; END TRY BEGIN CATCH DECLARE @txt NVARCHAR(MAX) = ERROR_MESSAGE(); RAISERROR('HTML file error (%s)', 11, 1, @txt); END CATCH DROP TABLE #f; END GO
Your suggestion duly noted. Filesystem is probably ideal for this. I wold not rule out FILESTREAM backed varbinary columns either.
I also think this might be more of a mongodb job. This database is only used by a service that sends confirmation emails It is called by other applications that have their own, more complex, databases. Since the data will be readonly, I don't have to worry about the durability issues with mongo.
With all of that taken into consideration. The service ain't broke now. A little script-fu would be the quickest way to skin this cat.
As you can see there are many ways to skin this cat. I might suggest though, since you are already protecting your e-mail content with SVN, that you change the logic in your database to pull the HTML contents from the file at runtime (just reference the path the HTML file, instead of storing the HTML data). Then you don't need to deploy anything, and it will also keep your database much slimmer.
Thanks for the answer. Gotta love the SQL server community. The short answer to "why would I want to do that" is I store email templates in the database. The email text is stored in SVN. I'm trying to automate deployment of updated email texts.
I'll definitely report back my progress..
How about using OPENROWSET(BULK)?:
SET HTML=(SELECT * FROM OPENROWSET(BULK 'C:\Test.HTML', SINGLE_NCLOB) X)
(If the HTML file is not actually in Unicode, then you'd declare it above as a SINGLE_CLOB).
Or, in the procedure that accepts @HTMLFilePath as a parameter:
DECLARE @sql NVARCHAR(MAX)
SET @sql =
SET HTML=(SELECT * FROM OPENROWSET(BULK "'[email protected]+"', SINGLE_NCLOB) X)
WHERE [email protected]'
EXEC sp_executesql @sql, N'@FooID INT', @FooID
Yeah, I didn't even think of doing it that way, so used to using BULK INSERT for log files that this didn't occur to me. Makes perfect sense and I'll update the post.
How about using BULK INSERT with field and row terminators that won't exist in the file (e.g. "!!!—THISISTHETERMINATOR—!!!")?