Migrating Documents to Content in Salesforce.com
Here are the basic instructions to move SF Attachments to SF Content in a Salesforce Org that has legacy documents stored as Attachments and wishes to move them to Content. It assumes the user has comfort with using the Data Loader and data in general. This document focuses on Attachments, but a similar process can be used to move Documents to Content within Salesforce.
Note: there is a 5,000 file per day limit on Content submissions.
Convert the Org to Use Content
At a minimum, Enable Content and add all users as Content users
Replace Attachments related lists throughout system and add related lists for Content (done by adding lookup to the objects as custom fields on Content Type layouts)
Now, nobody should be able to add new attachments during the migration (might have to do this after-hours to limit disruption)
Attachments & Document Files : Start by doing a full Export including all files from within SF. This will get you all the attachments, documents, etc. named by their record ID in SF.
Setup->Data Management->Data Export->Export Now
Be sure to check “Include in Export” at top and “Include all Data” at bottom
Note: If an export has been done in the last 48 hours, you will not have the option to “Export Now”. You will have to wait.
The Export will complete in 5 minutes to 2 hours depending on its size and SF system availability and an email will be sent to the app-xprod email address. Alternatively, you can periodically refresh the page
One or more .zip files should be on the Data Export screen after it has completed
Download all of them…they will only be available for 48 hours
Place the .zip files somewhere on your local machine
Extract all of them within a folder:
Will result in a bunch of .csv files of the data. These can be deleted or filed elsewhere
Will result in Attachments folder with all attachments. You will need these. Make sure they are all in the same folder
Will result in Documents folder with all documents. If migrating these, you will also need these
Attachments Data.csv: Export all current attachments in Data Loader. Exclude the column “Body” from the export.
Keep the original Attachments export .csv in a safe place. You’ll probably need to refer to it.
Verify the number of Attachments in this export matches the number you have in the Attachments folder from the Export. Ditto if you are also migrating Documents.
ContentVersion.csv: Export the current CONTENT (Content Version) table in Data Loader. Exclude the column “VERSIONDATA” from the Export.
Keep original file as you’ll probably need it later.
Build your Content Version .csv by first exporting a test record from the Content Version table in data loader. Exclude the “VERSIONDATA” column from the export. Include the following columns or delete the rest after export.
ID: Unique ID of Current version of Document that is displayed when you click on a document; Will be blank in import file
CONTENTDOCUMENTID: Unique ID of a Content Document. Different than above; Will be blank in import file
TITLE: This is what the document will be called; Title of the document from the Attachments table or other source
VERSIONDATA: Full file path to the document being inserted without file type extension (i.e. C:\Clients\Attachments\00PA000002gpI3Mai). See Helpful Hints below; Must be populated and without file extension
PATHONCLIENT: Full file path to the document being inserted with file type extension (i.e. C:\Clients\Attachments\00PA000002gpI3Mai.pdf). This tells Content what the file type is so that the document can be previewed in SF. If extension not assigned correctly, the import is pretty much worthless. See Helpful Hints below; Must be populated and with file extension
OWNERID: In the case of attachments related to other records, this is the owner id (18 char) of the other record, not the attachment (depending on how attachments were loaded, you may be able to simply take the owner of the attachment). This is because some attachments have the owner as the Admin if they were loaded via an API call. You want the actual Owner as any security in the org is driven off of that (Requires additional vLookup to export of the related records tables); must be populated
FIRSTPUBLISHLOCATIONID: This is the workspace ID (18 char) that you are inserting each document into; Must be populated
RECORDTYPEID: This is the ContentType ID (18 char) of the Content type you want to assign to this document; Must be populated if more than one Content Type defined in the org. Otherwise, it defaults to General and you don’t need the column
ALL APPROPRIATE LOOKUPS to other objects (i.e. Contact__C, Account__C); Optional
ALL APPROPRIATE FILTERS (i.e. Document_Type__C, Year__C. Needs to be populated manually if you want filter information; Optional
ALL APPROPRIATE LEGACY IDS: I like to add the id of the document from where it came from (may be internal or external IDs); Optional, but highly suggested
Build your VERSION DATA: Add the static path in one cell in your .csv (i.e. G2 has “C:\Clients\Attachments\” in it). Add the appropriate attachment ID for this document (i.e. H2 has “00PA000002gpI3Mai” in it). Build the file path by using concatenate function below; G2&H2; Should result in C:\Clients\Attachments\00PA000002gpI3Mai
Determine File Type: Use formula below to extract the characters after the period at the end of the title of the document to grab the file extension. A2 is the cell reference containing the Title of the document that has an extension (i.e. Sample.pdf); =RIGHT(A2,LEN(A2)-FIND("^^",SUBSTITUTE(A2,".","^^",LEN(A2)-LEN(SUBSTITUTE(A2,".",""))))); Result should be “pdf”; Eyeball all the results for sort by the results as many times you get “.pd” or “.xl”. Fix the title of these documents so you get a legit file extension
Build your PATHONCLIENT: Concatenate the contents of VERSIONDATA that you already built (C:\Clients\Attachments\00PA000002gpI3Mai) with the result of the file type you have determined (i.e. “.pdf”); Use concatenate function to give you the following result (i.e. =G2&”.”&H2...note the addition of period in quotes); C:\Clients\Attachments\00PA000002gpI3Mai.pdf
Data Loader 21.0 does not report on “File not Found” Errors. It gives you no error or success file record for this situation. Suggest using Data Loader 20.0 for loading of Content Version table
CSV’s are finicky. You cannot have more than 1 worksheet in a given .csv. Well, you can, but when you save, it will delete all but the first
VLOOKUP: Excellent function. You will need to use it.
FIXID: If you happen to have 15 char ids, you can use Excel function =FIXID(A2) where A2 contains the 15 char id