Thursday, April 29, 2010

Icons and Indexing for PDF files on SharePoint

Part of our SharePoint project is making sure that users can find what they are looking for (including within PDF documents) and that the icons next to the documents accurately reflect the file type. There are a variety of blog posts on this from several years ago, but I'd like to summarize them for those of you who might be doing what I've done: installing WSS 3.0 on Server 2008.

Out of the box, WSS 3.0 only indexes standard Windows file types, which are Office files and basic text files. It will also only show the proper icons for Office files. All other files get the default "blank paper" icon.

To allow for searching of PDF files, you'll need the proper iFilter installed on your server. We are using version 9 of the Adobe Acrobat product line, so installing Acrobat Reader 9 on the server put all the necessary files in place. There was no need to download any other iFilter components separately.

Then I followed the steps outlined in KB 927675. This article was last reviewed in May 2007, but the steps haven't changed for Server 2008. However, the data I found in Step 3 for the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf registry setting was different from the value given in the article. It was {4C904448-74A9-11D0-AF6E-00C04FD8DC02} on my installation.
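
If you'd rather check that registry value from a script than from regedit, something like the following Python sketch should do it. It assumes the GUID is stored in the key's default value and that Python runs with the same bitness as the OS, so the lookup isn't redirected. The helper names are my own and not part of any SharePoint tooling.

```python
import sys

# Key path quoted from the post, relative to HKEY_LOCAL_MACHINE.
PDF_FILTER_KEY = (
    r"SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0"
    r"\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf"
)
# GUID found on the author's installation; yours may differ.
AUTHOR_GUID = "{4C904448-74A9-11D0-AF6E-00C04FD8DC02}"


def same_guid(a, b):
    """Compare two GUID strings, ignoring case, braces and outer whitespace."""
    def norm(s):
        return s.strip().strip("{}").upper()
    return norm(a) == norm(b)


def read_pdf_filter_guid():
    """Return the default value of the .pdf filter key (Windows only)."""
    import winreg  # standard library, but only available on Windows
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, PDF_FILTER_KEY) as key:
        # Assumption: the GUID lives in the key's (Default) value.
        value, _value_type = winreg.QueryValueEx(key, "")
    return value


if __name__ == "__main__" and sys.platform == "win32":
    found = read_pdf_filter_guid()
    note = "matches the post" if same_guid(found, AUTHOR_GUID) else "differs from the post"
    print(found, "(" + note + ")")
```

A different GUID isn't necessarily wrong, since mine didn't match the KB article either. The script only tells you what is there.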

Next you'll need the icon files. You'll want the 17 x 17 pixel one from the Adobe website. If you have icons for any other specialty file types you want added to your server, you might as well gather them all up and make sure they are 17 x 17 as well. Copy them all to \Program Files\Common Files\microsoft shared\Web Server extensions\12\TEMPLATE\IMAGES.
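
If you have a pile of GIF icons to check, a small Python sketch like this one can read the size straight out of the file header. It reports the GIF's logical screen size, which for a simple icon is normally the image size. The function names and the file name in the usage note are made up for illustration.

```python
import struct


def gif_size(data):
    """Return (width, height) read from the header of a GIF file's bytes."""
    if data[:6] not in (b"GIF87a", b"GIF89a"):
        raise ValueError("not a GIF file")
    # Logical screen width and height: two little-endian 16-bit integers
    # stored right after the 6-byte signature.
    return struct.unpack("<HH", data[6:10])


def is_icon_sized(data, expected=(17, 17)):
    """True if the GIF header reports the expected icon dimensions."""
    return gif_size(data) == expected
```

To use it, read a file in binary mode and pass the bytes in, for example is_icon_sized(open("myicon.gif", "rb").read()), where myicon.gif is a placeholder name.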

Then open the XML file that WSS uses to reference which file types display which icons. This is at \Program Files\Common Files\microsoft shared\Web Server extensions\12\TEMPLATE\XML\DOCICON.XML. (Saving a backup copy is always a good idea at this point.)

Add a mapping key for each of the file types at the bottom of the file, above the </ByExtension> closing tag. XML is case sensitive, so make sure you use the same case and spacing as the existing entries. The key will be <Mapping Key="ext" Value="iconfile.gif" OpenControl=""/>. Replace "ext" with "pdf" or whatever file extension you are adding an icon for, and adjust the "iconfile.gif" name to match the image file you added.
Then save the XML file.
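
If you have several extensions to add, the edit can be scripted. Here is a minimal Python sketch, assuming DOCICON.XML has a ByExtension element directly under its root and that you have already made a backup. The function name and sample file names are my own. Note that ElementTree drops XML comments when it parses, so this isn't a byte-for-byte edit of the original file.

```python
import xml.etree.ElementTree as ET


def add_icon_mapping(xml_text, ext, icon_file):
    """Return xml_text with a Mapping for ext added (or updated) in ByExtension."""
    root = ET.fromstring(xml_text)
    by_ext = root.find("ByExtension")
    if by_ext is None:
        raise ValueError("no <ByExtension> section found")
    for mapping in by_ext.findall("Mapping"):
        # Update an existing entry rather than adding a duplicate key.
        if mapping.get("Key", "").lower() == ext.lower():
            mapping.set("Value", icon_file)
            break
    else:
        # No entry yet: append one at the bottom of the section.
        ET.SubElement(
            by_ext,
            "Mapping",
            {"Key": ext, "Value": icon_file, "OpenControl": ""},
        )
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<DocIcons><ByExtension>"
        '<Mapping Key="doc" Value="icdoc.gif" />'
        "</ByExtension></DocIcons>"
    )
    print(add_icon_mapping(sample, "pdf", "pdficon.gif"))
```

For the real file you would read DOCICON.XML into a string, pass it through the function once per extension, and write the result back.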

To ensure a full new crawl of all the PDF files for indexing, you should either restart the server or stop and restart the spsearch service, then force a full crawl with stsadm -o spsearch -action fullcrawlstart. I just restarted the server, as we are using a virtual server for SharePoint and VMs restart pretty quickly.
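
If you'd rather not reboot, the service restart and full crawl can be wrapped in a short script. This is a rough Python sketch. It assumes the service is named spsearch, that the stsadm operation is spelled spsearch, and that stsadm.exe sits in the 12\BIN folder on the C: drive, which may differ on your server. It defaults to a dry run that only prints the commands.

```python
import subprocess

# Assumed location of stsadm.exe for WSS 3.0; adjust the drive if needed.
STSADM = (
    r"C:\Program Files\Common Files\microsoft shared"
    r"\Web Server Extensions\12\BIN\stsadm.exe"
)


def recrawl_commands(stsadm=STSADM):
    """Return the commands that restart search and start a full crawl."""
    return [
        ["net", "stop", "spsearch"],
        ["net", "start", "spsearch"],
        [stsadm, "-o", "spsearch", "-action", "fullcrawlstart"],
    ]


def run_recrawl(dry_run=True):
    """Print each command, and execute it unless dry_run is True."""
    for cmd in recrawl_commands():
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first command that fails.
            subprocess.run(cmd, check=True)
```

Call run_recrawl(dry_run=False) from an elevated prompt on the SharePoint server to actually run the commands.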


Finally, if you'd like to check out the older blog posts I used as references, check out Configure PDF IFilter in WWS 3.0 and Searching PDFs with WSS 3.0 SP1.

1 comment:

  1. Adding those icons in my case was a must. We use SharePoint WSS 3.0 with Knowledgelake Software to index all our corporate and branches documents. Not too many PDF but I can see benefits of adding the icons especially with the growing third party integration software with SharePoint. Adding the icons was simple and straight forward from the article.
