Managing Email Campaign Lists with Excel Macros
At TrueJob we've spent a significant amount of time building up an email list to use for marketing and outreach. Inevitably, some processes have gotten a bit messy over time - we store contacts in our CRM, but we've been sending out email campaigns via Sendgrid. Sendgrid ends up with important information not reflected in our CRM (bounced emails, blocks, spam reports, etc).
Maintaining good email list 'hygiene' is essential for continuing to get your messages delivered (and it's the law), so we're careful about removing any email addreses from our list that have delivery problems.
Bad Old Days
The workflow when creating a list for an email campaign used to look something like this:
- Download list of relevant contacts from the CRM
- Download a bunch of 'suppression' reports from Sendgrid (separate reports for bounces, blocks, spam, invalid addresses, etc.)
- Copy the CRM contacts list into a new spreadsheet
- Copy all the 'suppressions' into another column in that spreadsheet
- Use some excel formulas (e.g. VLOOKUP) to filter out all the suppressed emails from the list of contacts
- If there were any specific organizations we didn't want to email, search the resulting list for any with that domain
- Copy the final filtered list into a CSV ready for upload
This didn't take a huge amount of time (say 10-15 minutes) but it was annoying and repetitive.
Good New Days
The new process looks like this:
- Download contacts from the CRM and all the reports from Sendgrid into one folder
- Open a spreadsheet and run a macro that does all the necessary filtering and writes out the final file for upload
Why use Excel macros and not a real programming language like Python? Sometimes it is nice to skip the command line and open a spreadsheet with a big button that says 'GO'. More importantly spreadsheets make it easy to provide configuration visually (by updating cell values), and when working with CSV files it's also nice to be able to view and search entries when debugging.
Step 1: Import all the CSVs
The first step was to write a macro to import all the data we need into one spreadsheet. This isn't strictly necessary - but it looks cool, makes things easier later, and leaves all the data in one place if there is any debugging to do.
This macro opens every CSV file in a directory one at a time and copies the contents into a new worksheet of the main workbook.
Sub import_all_csvs()
Dim s As String
Dim totalSheets As Long
Dim thisBook As String
Dim baseDir As String
baseDir = Range("input_directory").Value
thisBook = Application.ActiveWorkbook.Name
s = Dir(baseDir + "*.csv")
Do Until s = ""
If s <> Range("out_file_name").Value Then
Workbooks.Open (baseDir + s)
totalSheets = Workbooks(thisBook).Worksheets.Count
Workbooks(s).Worksheets(1).Copy after:=Workbooks(thisBook).Worksheets(totalSheets)
Workbooks(s).Close
End If
s = Dir
Loop
End Sub
Note the Dir
function - call once in a target directory to get the first matching file, then subsequent calls without arguments return the next matching file (see generators).
The result when you see this run (a rapid opening and closing of spreadsheets) will make you look like an Excel wizard.
Step 2: Read the CRM contacts
In our case, a user's email address might appear in one of several columns that start with email (e.g., 'Email (Personal)' or 'Email (Work)'). The function below handles this very typical data cleaning problem by storing all potential email columns in a Dictionary, then iterating through each potential email column for each row.
Function get_starting_list() As Dictionary
Dim cur As Worksheet
Dim c As Long
Dim r As Long
Dim key As Variant
Dim emailColumns As New Dictionary
Dim emails As New Dictionary
For Each cur In Worksheets
'In our case this identifies which sheet has the list of contacts
If Left(cur.Name, 7) = "truejob" Then
'First find the columns that might hold email addresses
c = 1
Do Until cur.Cells(1, c) = ""
If LCase(Left(cur.Cells(1, c), 5)) = "email" Then
emailColumns.Add c, 1
End If
c = c + 1
Loop
'Now go through each row and find an email address'
r = 2
Do Until cur.Cells(r, 1) = ""
For Each key In emailColumns.Keys()
If cur.Cells(r, key) <> "" Then
emails.Add LCase(cur.Cells(r, key).Value), 1
Exit For
End If
Next
r = r + 1
Loop
End If
Next
Set get_starting_list = emails
End Function
Step 3: Read the 'Suppressions'
Now, we loop through each worksheet - skipping the sheet we read the CRM contacts from - and build up a list of email addresses to suppress. These are the ones we don't want to appear in the final output. This is straightforward so I'm leaving this code out: on each sheet we find the column with a heading that starts with 'email' and just read down the list.
Of course in our case there is some extra complexity which comes from another common data cleaning challenge: some of the CSVs we download come in a slightly different format with a single column of email addresses and no header. We use a simple and reliable test: check for an @ sign in the contents of the first cell (alternatively, you could use a regex).
Step 4: Write the output file
Finally, we put it all together. One main function (do_all) calls each of the steps in turn, and another writes the output file. It's particular easy to write CSV files that only have a single column (maybe version 2.0 will include additional information like first and last names for more personalized campaigns). Note I've included a convenient ability to filter out particular domains.
Sub do_all()
Call import_all_csvs
Dim startingList As Dictionary
Dim suppressionsList As Dictionary
Set startingList = get_starting_list()
Set suppressionsList = get_suppressions_list()
Call write_output_file(startingList, suppressionsList)
End Sub
Sub write_output_file(start As Dictionary, suppress As Dictionary)
Dim domainsSuppress As Dictionary
Dim out As String
Dim email As Variant
Dim fn As Long
Dim i As Long
Dim domain As String
'get_excluded_domains reads values from a named range on the spreadsheet'
Set domainsSuppress = get_excluded_domains()
out = Range("input_directory").Value + Range("out_file_name").Value
fn = FreeFile
i = 0
Open out For Append As #fn
Print #fn, "email"
For Each email In start.Keys()
domain = Split(email, "@")(1)
If Not suppress.Exists(email) And Not domainsSuppress.Exists(domain) Then
Print #fn, email
i = i + 1
End If
Next
Close #fn
MsgBox Str(start.Count) & " initial emails" & Chr(13) & Str(i) & " written to file"
End Sub
Fill in the relevant fields, click GO, and:
Success!
You can find the spreadsheet code on Github. To use it, you'll first need to add it to a spreadsheet with the appropriate named ranges.
The ideal world might see us with a better system to keep our data in sync, but sometimes simple is better and I'm guessing this spreadsheet will keep things working for some time.