Updating a table with HTML data from a file
Is there an easy one-liner (powershell,sqlcmd/SSMS/etc) to load a single html file on the local filesystem to a field in the DB?
While typing with one raised eyebrow (wanting to ask, but it wouldn't fit in 140 characters, "why would you want to do that?"), my quick answer was yes. Well, there isn't necessarily a one-liner, and I'm not sure about doing this through PowerShell, SQLCMD or from within SSMS. I do know you can do this with a stored procedure call, but there are a few of caveats to the solution below.
One caveat is that my old-school approach requires xp_cmdshell to be enabled, which may not be an option in some shops. When you try to use xp_cmdshell, by default, you should get this error:
Msg 15281, Level 16, State 1, Procedure xp_cmdshell, Line 1
SQL Server blocked access to procedure 'sys.xp_cmdshell' of component 'xp_cmdshell' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'xp_cmdshell' by using sp_configure. For more information about enabling 'xp_cmdshell', see "Surface Area Configuration" in SQL Server Books Online.
To enable xp_cmdshell, you can run the following code…
EXEC sp_configure 'show advanced options', 1; RECONFIGURE WITH OVERRIDE; EXEC sp_configure 'xp_cmdshell', 1; RECONFIGURE WITH OVERRIDE; EXEC sp_configure 'show advanced options', 0; RECONFIGURE WITH OVERRIDE;
Another caveat is that, because the type command will break the HTML file up line by line, we rely on undocumented string concatenation – so you need to make assumptions that the SELECT will pull the "rows" out in the same order. I tested with rather small files, and if you publish minimized HTML files where everything is all on a single line, it may not be an issue. But I can't guarantee that this will work for all files in all environments. For these reasons, I strongly recommend that you don't consider this a long-term solution, but rather investigate using an alternate solution. One would be to use CLR inside the database to handle filesystem-related tasks; another would be to have an external application read the file contents and then call a stored procedure which passes the data along; yet another would be to investigate storing your HTML files using the new FILESTREAM feature introduced in SQL Server 2008.
With those disclaimers out of the way, let's pretend I have an HTML file on my hard drive, called C:\test.html. Here is how I would go about updating an existing row in an arbitrary table with the HTML contents from that file (and no, single quotes in the file will not screw anything up with this, however single quotes in the path or filename might):
CREATE TABLE dbo.Foo ( FooID INT PRIMARY KEY, HTML NVARCHAR(MAX) ); GO INSERT dbo.Foo(FooID, HTML) SELECT 1, NULL; GO CREATE PROCEDURE dbo.Foo_UpdateHTML @FooID INT, @HTMLFilePath VARCHAR(8000) AS BEGIN SET NOCOUNT ON; DECLARE @txt NVARCHAR(MAX), @cmd VARCHAR(8000) = 'type "' + @HTMLFilePath + '"'; CREATE TABLE #f(t NVARCHAR(MAX)); BEGIN TRY INSERT #f EXEC master..xp_cmdshell @cmd; IF @@ROWCOUNT > 0 BEGIN DELETE #f WHERE t IS NULL; IF EXISTS ( SELECT 1 FROM #f WHERE t LIKE '%The system cannot find the file specified%' ) BEGIN RAISERROR('File not found.', 11, 1); END SELECT @txt = COALESCE(@txt + CHAR(13) + CHAR(10), '') + t FROM #f; UPDATE dbo.Foo SET HTML = @txt WHERE FooID = @FooID; END END TRY BEGIN CATCH SET @txt = ERROR_MESSAGE(); RAISERROR('HTML file error (%s)', 11, 1, @txt); END CATCH DROP TABLE #f; END GO EXEC dbo.Foo_UpdateHTML @FooID = 1, @HTMLFilePath = 'C:\test.html'; GO SELECT FooID, HTML FROM dbo.Foo; GO DROP PROCEDURE dbo.Foo_UpdateHTML; DROP TABLE dbo.Foo; GO
In this case if the file is not valid you will receive the following error message:
Msg 50000, Level 11, State 1, Procedure Foo_UpdateHTML, Line 44 HTML file error (File not found.)
Of course you are more than welcome to adjust the error handling to your liking. But I will recommend again that, while you may want to use this as a stop-gap solution, you are going to be better off in the long run using a solution that requires several assumptions to be correct.
EDIT: From Adam Machanic's suggestion, here is a different version of the procedure which uses BULK INSERT, avoiding both the xp_cmdshell requirement as well as the quirky string concatenation:
CREATE PROCEDURE dbo.Foo_UpdateHTML @FooID INT, @HTMLFilePath VARCHAR(8000) AS BEGIN SET NOCOUNT ON; DECLARE @txt NVARCHAR(MAX); CREATE TABLE #f(t NVARCHAR(MAX)); BEGIN TRY SET @txt = 'BULK INSERT #f FROM ''' + @HTMLFilePath + ''' WITH ( ROWTERMINATOR = ''[email protected]#$%^&*()'', FIELDTERMINATOR = ''[email protected]#$%^&*()'' );'; EXEC sp_executeSQL @txt; UPDATE dbo.Foo SET HTML = (SELECT t FROM #f) WHERE FooID = @FooID; END TRY BEGIN CATCH DECLARE @txt NVARCHAR(MAX) = ERROR_MESSAGE(); RAISERROR('HTML file error (%s)', 11, 1, @txt); END CATCH DROP TABLE #f; END GO