If you ever had the problem where you need to extract files from a SharePoint Content Database or normal SQL Database stored as binary, this post will help you.
The script I have included will export all the content from the SharePoint Content Database to the file structure set by SharePoint. This script should work as it is on your SharePoint Database. If you modify this script a little you can use it to extract and binary data from SQL to files.
Problem: The SharePoint Content Database got corrupted because of a third-party add-on. This caused all kinds of problems as the files could not be found anymore. The Content Database was still accessible through SQL Server.
Solution: To fix this problem we had to extract all the current images and documents to a file system. The first thing you will have to do is enable the Ole Automation Procedures. This will allow SQL Server to write to the file system. You will also have to check that you have the correct permissions and access to write to the file system from SQL Server.
Enabling file writing: Run the following script in SQL:
sp_configure 'show advanced options', 1; GO RECONFIGURE; GO sp_configure 'Ole Automation Procedures', 1; GO RECONFIGURE; GO |
Now that you have access to the file system you can run the following script. Extracting files from database:
--DECLARING THE CURSOR, THIS IS THE ITEMS YOU WANT TO RUN THE EXTRACTING ON DECLARE CURSOR_Images CURSOR FOR (SELECT Id FROM [dbo].[AllDocs]) --DECLARE THE TYPE OF THE COLUMN YOU SELECTED ABOVE DECLARE @ImageID uniqueidentifier; --START THE CURSOR AND RUN THROUGH ALL THE ITEMS IN CURSOR_Images OPEN CURSOR_Images FETCH NEXT FROM CURSOR_Images INTO @ImageID WHILE (@@FETCH_STATUS <> -1) BEGIN --DECLARE THE VARIABLE THAT WILL KEEP THE BINARY DATA DECLARE @ImageData varbinary(MAX); --SELECT THE BINARY DATA AND SET IT TO @ImageData. THE BINARY DATA FOR ALLDOCS ARE LOCATED IN ALLDOCSTREAMS --AND THE ID IS THE SAME AS IN ALLDOCS SELECT @ImageData = (SELECT TOP 1 CONVERT(varbinary(MAX), Content, 1) FROM [dbo].[AllDocStreams] WHERE Id = @ImageID ORDER BY InternalVersion ASC); --GET THE LOCATION OF THE DIRECTORY THE FILES WAS SAVED IN AND CHANGE REPLACE THE / WITH \ TO BE USED IN FILESYSTEM DECLARE @DIRPATH NVARCHAR(MAX); SET @DIRPATH = REPLACE((SELECT DirName FROM [dbo].[AllDocs] WHERE Id = @ImageID),'/','\'); --SET THE PATH DECLARE @Path nvarchar(1024); SELECT @Path = 'C:\Export\' + @DIRPATH + '\'; --CREATE THE DIRECTORIES EXEC master.dbo.xp_create_subdir @Path; --GET THE FILE NAME OF THE FILE FROM LEAFNAME DECLARE @Filename NVARCHAR(1024); SELECT @Filename = (SELECT LeafName FROM [dbo].[AllDocs] WHERE id = @ImageID); --SET THE FULL PATH FOR WHERE THE FILES WILL BE STORED DECLARE @FullPathToOutputFile NVARCHAR(2048); SELECT @FullPathToOutputFile = @Path + '\' + @Filename; --SAVE THE FILE TO THE FILE SYSTEM DECLARE @ObjectToken INT EXEC sp_OACreate 'ADODB.Stream', @ObjectToken OUTPUT; EXEC sp_OASetProperty @ObjectToken, 'TYPE', 1; EXEC sp_OAMethod @ObjectToken, 'OPEN'; EXEC sp_OAMethod @ObjectToken, 'WRITE', NULL, @ImageData; EXEC sp_OAMethod @ObjectToken, 'SaveToFile', NULL, @FullPathToOutputFile, 2; EXEC sp_OAMethod @ObjectToken, 'Close'; EXEC sp_OADestroy @ObjectToken; --LOOP TO THE NEXT ENTRY IN THE CURSOR FETCH NEXT FROM CURSOR_Images INTO @ImageID END CLOSE CURSOR_Images DEALLOCATE CURSOR_Images |
After running this script it could take some time depending on the size of the SharePoint Content Database. You will see some errors that you can ignore. When done have a look in the folder you extracted the files and you will find all your files in the directory. If you have any questions don’t hesitate to ask. I will be glad to help where I can. You can also find me on Skype: Corvitech
Does this also work with a SharePoint 2007 database?
Thanks,
Jason.
Hi Jason
Sorry for the late reply. I have not tested this on 2007. I think you will be able to get this to work but with a little tweaking.
Regards
This is a thing of beauty. This script is exactly what we needed to clean out the DB of nearly a million documents! Thank you for taking the time to post this little gem.
One issue I has was the backslash was doubled when setting the @FullPathToOutputFile variable.
Is there anything like this for SP 2013?
Hi Curtis,
I haven’t had this problem in SharePoint 2013 yet. But I will most definitely add a new post the moment I have to do this in 2013.
Have a wonderful day.
Kind Regards
Hello.
I’m having an issue. I cannot find the table “AllDocStreams” nut I’ve got the tables DocsToStreams, DocStreams and AllDocVersions.
Can you please help?
Thank you in advance.
Hi Mevin,
If you are using SharePoint 2013, they have changed AllDocStreams to DocStreams.
Kind Regards
Hello, doing this on SP 2013, and there are a couple of tidbits that I hope I did correctly…
Line 18: Changed “AllDocStreams” to “DocStreams”
Line 18: Changed “WHERE Id = @ImageID ORDER BY InternalVersion ASC” to “WHERE DocId = @ImageID ORDER BY RbsId ASC”
Line 35: Removed: + “” ( to prevent doubled backslashes.
It’s running now, and populating the export directory. I’m seeing the files, and opened several and they appear to be intact.
My main question is – now that I have these things, is there a way to get them back into SP, or is that a manual process?
Thanks,
Jolly
Hi Jolly,
At the moment I don’t have the means of getting the data back into SharePoint. As far as I know, it will probably be a custom app as well.
Kind Regards
Hi your script works great on SharePoint 2010. But I was wondering if you could get all the versions as well. To extract the documents with the different versions is what I am looking for, can you help. thanks.
Hi SR,
I will see what I can do. At the moment I have some deadlines approaching some will have to get them out of the way first.
Kind Regards
Thank you. I have been looking to get the versions from SharePoint and could not find anything else close. Your script works great for the published files, so I will wait for your reply.
I’m wondering, if I could run a Nintex workflow that can execute a SQL stored proc.. so it could run on a particular document in Sharepoint, and then copy the Binary data to a separate DB table. Would this be possible ? Rather than exporting the files, I simply want to transfer the binary to another system.
i tryed with extracting document from my content database but everytime when i execute your script i received error
Msg 22048, Level 16, State 1, Line 0
xp_create_subdir() returned error 183, ‘Cannot create a file when that file already exists.’
the script create folder structure with files but size of files is 0kb in folder c:export
plz help me
thanks
I am trying to find the mysites data but it only shows me the folder names of my users as files with a 0KB file size. Does that mean the data isn’t there?
Hi.
I’ve done the changes jolly suggested as we have SP2013, but the script only generates empty files (.file) with 0kb of data – one file for each Site Collection. I also see errors saying it cannot create subfolders as the folders already exist. Can someone please assist. I’m having an emergency where docs are missing (but are still in database) and this is my best option.
Jonas
I am having this same issue. I look in the DocStreams and i see rows for my ID but the “content” is null? Anybody have a thought?
Could this be due to using blob storage for the files?
Hello, i Need you help
Anyone know how to extract PDF files from alldocs table?
I have one database under sharepoint.
Regards.
Hi ,
I need to get the sharepoint images from the database through ssrs. is it possible to get the images?
Hi Jason,
Thanks for the suggestion. Unfortunately I cannot make this work in SharePoint 2013.I believe this is related to “shredded storage” somehow. I’ve made the changes mentioned above – no luck. I’ve tried this program too: http://stackoverflow.com/questions/5719866/extracting-files-from-a-sharepoint-database Understand it works in SharePoint 2007+2010 – but not 2013 with Shredded Storage as DocToStreams is a new table where chunk of the file is stored.
Suggestions on how to fix this would be very much appreciated.
Best regards
Morten
Thank you Johan,
I got a call from a very nervous admin after incorrectly implementing a SharePoint 2010 update, seems as if the db has grown over the 10GB size limit and unable to perform the upgrade, the admin then went and hacked at it until the site was unrecoverable.
With your scripting I could export all site contents and rescue the site content.
Thank you so much for documenting, you are a true legend.
Kind regards
Daniel Mostert
Johan,
Do you think this could work on a SP2000 database? Thanks.
Hello
Thanks for your amazing script ! As I’m not very familiar with SQL I have a question (I hope not too stupid) : is it possible to have the script extract the files from a specific path ? So that it doesn’t extract all the files when only a few are needed ?
Thanks again
Cheers
Thanks JOHAN
You really saved our bacon with this one.
As a little bit of give back here is some commands I used to work out if the files exist in the store or not:
SELECT * FROM [dbo].[AllDocStreams] INNER JOIN [dbo].[AllDocs] ON
[dbo].[AllDocStreams].[Id] = [dbo].[AllDocs].[Id]
WHERE [dbo].[AllDocs].[DirName] like ‘%{Directory you are looking for}%’
I also found that InternalVersion was not a field available in my AllDocStreams table. That could be a 2007/2010/2013 thing.
I also intended adding this to the last post.
A WHERE clause can be added to help filter the restore content you need. Sometimes the whole lot is not needed only one directory.
-DECLARING THE CURSOR, THIS IS THE ITEMS YOU WANT TO RUN THE EXTRACTING ON
DECLARE CURSOR_Images CURSOR FOR (SELECT Id FROM [dbo].[AllDocs]WHERE [DirName] like ‘%{Directory You are looking for}%’)
In my use (against WSS_Content from SP 2010) I made two changes in order to get the correct doc output — 1. Add DISTINCT to this line: DECLARE CURSOR_Images CURSOR FOR (SELECT DISTINCT Id FROM [dbo].[AllDocs]) and 2. Change sort order from ASC to DESC in this line: ORDER BY InternalVersion DESC. Without these changes, the file that I got was not the latest version but only the first version that was returned in the cursor. Not the same thing. You should test this for yourself.
Hello
I tried this on SharePoint 2013, CU August 2015. The files are extracted from the custom directory but they are damaged and could not be read (docx, pdf, jpg, png). Sure, I have to work on Shredded Storage.
Moreover, the parameter “Order By RbsId ASC (thanks Jolly) ” extract the files with different size in Kb.
Thanks again
This is a great solution!
I’ve had to modify the script (for a Sharepoint 2003 database) and I had to migrate a copy of the SQL Server 2000 (!) database to SQL Server 2005, because the OLE automation procedures are not available on the engine of SQL Server 2000. And now it works! I need to export more than a 130k docs, so yeah!
o/ Cheers for sharing, Johan!
Best regards,
Jon (SQL Server dba)
Having the same issue as some people before:
Msg 22048, Level 16, State 1, Line 0
xp_create_subdir() returned error 183, ‘Cannot create a file when that file already exists.’
already given Eveyone FULL CONTROL to the local directory.
Any suggestions?
You should add IF statement between comments –SET THE PATH and –CREATE THE DIRECTORIES
–SET THE PATH
DECLARE @Path nvarchar(1024);
SELECT @Path = ‘C:Export’ + @DIRPATH + ”;
DECLARE @fileExt varchar(50);
SELECT @fileExt = (SELECT TOP 1 ExtensionForFile FROM [dbo].[AllDocs] WHERE id = @ImageID ORDER BY ContentVersion)
IF @fileExt ”
BEGIN
–CREATE THE DIRECTORIES
…
END
I just used this on a sharepoint database from an SBS 2003 machine. I have no idea what version of sharepoint that might of been, but there were records for the directories themselves which resulted in empty files being created and preventing the directories from being crated. I had to add ‘where Type = 0’ to the cursor statement to filter them out.
Many thanks for sharing!
Running the script with some of the modifications suggested. Some of the extracted files are readable, some are not (corrupt??). The only difference I can see is that the file size is incorrect for the unreadable files (mostly smaller, but sometimes, bigger, than the correct file size). I can have files in the same SP directory – some will be OK, some not. Any ideas? Maybe it’s not getting the correct version?? Thanks.
I ran into the same problem. In my cursor definition I added: WHERE size IS NOT NULL AND size > 0
It could also be versioning turned on for a given library. In that case you could select the max value for [contentversion] or even the max value for [TimeLastModified].
Alternatively, you could add an extra IF to see if the file has already been copied to the folder, and if so, append something to the filename to differentiate multiple files then you can decide which ones are the ligitimate ones later.
running this export script creates 0kb files?
Please help!