Generate ROI From Salesforce and Amazon/AWS… Today!
Sales came to me with an interesting analytics problem early this week. The solution required a combination and analysis of data from two different places:
- Our Salesforce production instance
- A back-end database from a website hosted on AWS.
The basic problem:
We are spending a couple of hours each week manually searching for companies who are trying out CopyStorm with Salesforce production instances. We do not care about licenses for current customers, but the process to filter customers from prospects takes forever because we have to cross-reference Salesforce with the license portal. These are good opportunities for us but are taking too much time to find. Please help us!
Here are a few technical details that will help you understand the problem:
- New CopyStorm license are automatically generated by calling out to a license web service hosted on AWS.
- The AWS license web service uses a Amazon/RDS MySQL database to store:
- The Salesforce Organization Id
- The name of the Salesforce Organization
- Location data that can found with whois (based on IP address of the license request). Technically we use a service from www.maxmind.com.
- When a company has purchased CopyStorm the sales team enters an Asset record in Salesforce and records the associated Salesforce Organization Id.
So…the problem sales posed was deceptively simple:
- Find active CopyStorm licenses in Amazon/RDS MySQL instance which do not have a corresponding Salesforce Asset record. Tell the sales team the results every morning in an email.
How would you solve this problem?
- In the end, I wrote a single SQL statement which generated the data wanted by Sales (it took me 5 minutes to write the SQL).
If you want to know the solution and how you can apply it to answer other difficult questions, read the rest of this post.
Step 1: Create a Copy of Salesforce Data On Amazon/RDS
The question sales asked required combining Salesforce data with data we store on Amazon/RDS. Since I am a bit lazy, I wanted a solution I could finish in an afternoon.
Yes, I know I could use Salesforce Connect and write my final query in Salesforce (assuming SOQL limitations did not bite me) or migrate our work to PostgreSQL with Heroku Connect but both of these approaches required a lot of work. I wanted to finish up today, preferably by 3. (Friday = weekly developer stand up at the pub.)
What I did was:
- Create a new empty schema in our Amazon/RDS MySQL (CapstormSalesforce).
- Use CopyStorm to copy just the data I needed from Salesforce to the CapstormSalesforce database.
- Scheduled the CopyStorm job created in step #2 to run a 3am local time every day. For this case, a once a day update from Salesforce was good enough (I could have done every 5 minutes to have really fresh data but did not need it for this problem).
The Result: I had a current copy of the Salesforce tables I needed in the same database as my customer license data. Now I could write a query!
Setting up CopyStorm and creating the database for the first time took under 2 minutes (I had to look up AWS credentials).

Once the Amazon/RDS MySQL database was created for the first time, I scheduled a script to keep it up to date (using CRON in my case).
Step 2: Write the SQL and Deliver the Answer to Sales
Now that my license database (CapstormLicense) and a copy of Salesforce (CapstormSalesforce) were in the same Amazon/RDS MySQL database getting the results sales wanted to see was pretty easy.
The following SQL finds active software licenses on production Salesforce instances where there is no corresponding active Asset record in Salesforce.
- CapstormLicense is the MySQL schema which holds our customer license key information.
- CapstormSalesforce is the MySQL schema which has a copy of our Salesforce production instance
The key point is combining these two data sources in a single database engine made this key operational questions easy to answer (and ask).
Step 3: Package It Up for Delivery
If you need help packaging a SQL statement into a report/email/spreadsheet/etc. that can be delivered automatically, then you are looking in the wrong place. There are dozens of ways to take raw SQL and deliver data — pick a tool where you have familiarity.
For prototyping I simply pasted a generated CSV into a spreadsheet and emailed it to sales (they were thrilled). Given our environment, our next step will likely be to package a small Python script and schedule it to run nightly.
Total Time Invested: 30 minutes
Amount of Time Sales Says this Will Save Them: 1-2 hours a week.
Would you call this a success? I would.
Yes, We Really Work and Live Here.
(Want to join our team?)
