Migrating Salesforce rich text fields is a simple task UNTIL you have to migrate a rich text field that contains images. This article explores the secrets of migrating any rich text field, with or without images.

You may need a cold beer while learning this topic! Rich text fields with images are not exactly a well-documented feature.
Salesforce Rich Text Trivia You Need to Know
Salesforce rich text fields without images are stored in straightforward HTML. For example, the next two rich text snippets from Salesforce show a field without an image, followed by nearly the same field with an image.
In the second example, notice that the image is represented by a hyperlink location that may be unfamiliar to you.
- https://capstorm--CStormTest--c.documentforce.com/servlet/rtaImage?eid=a0Z1F000001Hakr&feoid=00N1F000009Qvjr&refid=0EM1F0000009QqI
The location, capstorm--CStormTest--c.documentforce.com, is a reference to a storage subsystem where Salesforce stores rich text images.
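To make the shape of that link concrete, here is a small Python sketch that splits it into its host and query parameters. The function name is made up for illustration. My reading that eid identifies the record, feoid the field, and refid the stored image is an assumption based on how the values look, not something Salesforce documents.

```python
from urllib.parse import urlparse, parse_qs


def parse_rta_image_url(url: str) -> dict:
    """Split a rich text image URL into its host, path, and query parameters."""
    parts = urlparse(url)
    query = parse_qs(parts.query)
    # parse_qs returns a list per key; each parameter appears once in these URLs.
    return {
        "host": parts.netloc,
        "path": parts.path,
        "eid": query["eid"][0],      # assumed: the record holding the field
        "feoid": query["feoid"][0],  # assumed: the rich text field itself
        "refid": query["refid"][0],  # assumed: the stored image
    }


info = parse_rta_image_url(
    "https://capstorm--CStormTest--c.documentforce.com/servlet/rtaImage"
    "?eid=a0Z1F000001Hakr&feoid=00N1F000009Qvjr&refid=0EM1F0000009QqI"
)
print(info["host"], info["refid"])
```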
How does this work? When a user pastes an image into a rich text field in the Salesforce GUI, the image is sent to Salesforce and then:
- Salesforce stores the image in a special storage subsystem.
- Salesforce rewrites the image tag in the HTML to point to Salesforce’s storage subsystem.
There is one last bit of trivia that is helpful to know before exploring rich text migration issues:
- How are rich text images initially sent to Salesforce from the GUI (or a Salesforce API)?
Do you remember how to embed images in an HTML page using base64 encoding? If not, then you need to brush up on this technique, because this is what is needed for the Salesforce APIs. Here is an example of the HTML required to upload an image to a Salesforce rich text field:
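As a refresher on the technique, this Python sketch builds an image tag whose src is a base64 data URI. The function name and the attribute order are my own choices, and the exact markup Salesforce expects may differ, so treat it as an illustration of base64 embedding rather than a confirmed upload format.

```python
import base64
from html import escape


def image_to_img_tag(image_bytes: bytes, mime_type: str = "image/png", alt: str = "") -> str:
    """Return an <img> tag that carries the image inline as a base64 data URI."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    safe_alt = escape(alt, quote=True)  # keep quotes in alt from breaking the tag
    return f'<img alt="{safe_alt}" src="data:{mime_type};base64,{encoded}">'


# Tiny stand-in payload; a real call would pass the bytes of a PNG or JPEG file.
print(image_to_img_tag(b"abc", "image/png", "logo"))
```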
Step 1: Completely Read a Salesforce Rich Text Field
If you read the first part of this post, you now know:
- Reading a Salesforce rich text field is easy. Any Salesforce API will work.
- Reading the embedded images is harder. You have to talk to the Salesforce document server.
The basic logic for each rich text field, however, is pretty simple.
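One way to picture that logic: read the field's HTML through any API, find every image reference that points at the storage subsystem, then download each one. The Python sketch below covers the middle part, finding the references. The class and function names are hypothetical, and matching on "/servlet/rtaImage" is an assumption taken from the sample link earlier in this post.

```python
from html.parser import HTMLParser


class ImageSrcCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in a chunk of HTML."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            for name, value in attrs:
                if name == "src" and value:
                    # HTMLParser has already turned &amp; back into & here.
                    self.sources.append(value)


def find_rich_text_image_urls(field_html: str) -> list:
    """Return the image URLs in a field that point at the rtaImage servlet."""
    collector = ImageSrcCollector()
    collector.feed(field_html)
    collector.close()
    return [src for src in collector.sources if "/servlet/rtaImage" in src]
```

Each URL returned still has to be downloaded from the document server, which is where the tricky parts come in.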
There are several tricky parts to reading raw image bytes from the Salesforce server:
- Constructing an HttpClient with the necessary security headers is a bit difficult.
- The Salesforce image server sometimes fails with an HTTP 404, and the query needs to be attempted more than once. This is uncommon but does happen.
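The occasional 404 can be handled with a small retry wrapper. This Python sketch assumes a fetch callable of your own that raises urllib's HTTPError on failure. The number of attempts and the pause between them are arbitrary choices of mine, not values recommended by Salesforce.

```python
import time
from urllib.error import HTTPError


def fetch_with_retry(fetch, url, attempts=3, delay_seconds=1.0):
    """Call fetch(url), retrying only when the server answers with HTTP 404."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except HTTPError as err:
            # Give up at once on other errors, or when the retries are used up.
            if err.code != 404 or attempt == attempts:
                raise
            time.sleep(delay_seconds)
```

Passing the fetch function in as a parameter keeps the retry policy separate from the HTTP code, so the wrapper can be exercised without touching the network.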
Let’s look at the first step — setting up the HttpClient. If your first notion is to “look up the documentation” then I wish you good luck (I had no luck and had to resort to Chrome’s tracing tools). Here is what seems to work.
- Run a fake login URL to Salesforce using an active session id.
- Grab the headers returned and construct a cookie required to grab images from the Salesforce storage subsystem.
The following bit of code is an example of how to build the required cookie:
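A rough Python sketch of the cookie-building half of that recipe follows. It assumes you already have the Set-Cookie header values returned by the login request. The function name is hypothetical, and the cookie names in the usage line are placeholders, not a confirmed list of what the document server requires.

```python
def cookie_header_from_set_cookie(set_cookie_values, wanted_names=None):
    """Turn a list of Set-Cookie header values into one Cookie header string."""
    pairs = []
    for raw in set_cookie_values:
        # The first "name=value" segment is the cookie; the rest are attributes
        # such as Path or Secure, which are not sent back to the server.
        first = raw.split(";", 1)[0].strip()
        name, sep, value = first.partition("=")
        if not sep or not name:
            continue  # skip malformed entries
        if wanted_names is not None and name not in wanted_names:
            continue
        pairs.append(f"{name}={value}")
    return "; ".join(pairs)


# Placeholder header values for illustration only.
print(cookie_header_from_set_cookie(["sid=ABC123; Path=/; Secure", "oid=00D1; Path=/"]))
```

With Python's urllib, `response.headers.get_all("Set-Cookie")` returns a list of this shape.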
Are you with me so far? Once the cookie required by the Salesforce document system has been determined, reading an individual image is simple — it is basic HttpClient code which must include the cookie in the header.
Here is an example you should be able to follow to do this successfully yourself. (In practice there is a bit more to the problem because communication problems can happen; a full code example, however, is a bit too much code to drop into a blog post.)
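For readers who want something to start from, here is a minimal sketch in Python's urllib, not the HttpClient code this post refers to. Both function names are made up, and the cookie string is assumed to be whatever the previous step produced.

```python
from urllib.request import Request, urlopen


def build_image_request(image_url: str, cookie_header: str) -> Request:
    """Build a GET request for one image that carries the session cookie."""
    return Request(image_url, headers={"Cookie": cookie_header})


def read_image_bytes(image_url: str, cookie_header: str, timeout: float = 30.0) -> bytes:
    """Download the raw bytes of one rich text image."""
    request = build_image_request(image_url, cookie_header)
    with urlopen(request, timeout=timeout) as response:
        return response.read()
```

Keeping the request construction in its own function makes it easy to check the headers without going to the network.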
Step 2: Write a Rich Text Field to Salesforce
If you have been paying attention then this will be the easiest step for you. The basic algorithm is:
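A plausible outline, inferred from the earlier sections and not taken from the author's code: take the HTML read in Step 1, swap each stored image reference for a base64 data URI, then write the result through whichever Salesforce API you use. The Python sketch below covers only the swap. The function name and the shape of the images mapping are assumptions.

```python
import base64


def inline_images(field_html: str, images: dict) -> str:
    """Replace stored image URLs with base64 data URIs.

    images maps each original src URL to a (mime_type, raw_bytes) pair.
    """
    result = field_html
    for url, (mime_type, data) in images.items():
        encoded = base64.b64encode(data).decode("ascii")
        data_uri = f"data:{mime_type};base64,{encoded}"
        # The stored HTML may hold the URL with & escaped as &amp;, so try both.
        for variant in (url.replace("&", "&amp;"), url):
            result = result.replace(variant, data_uri)
    return result
```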
Anyone who has read this far will not need sample code for the rich text field writing step. Go forth and write your own code!
The beach is beautiful today!
