Email to Data (with an LLM)

by Tom Dekan
Updated: Sun 03 November 2024

Problem: I needed to file my taxes.

I had around 330 email receipts in .eml format. I needed to convert them to PDFs before sharing them with my accountant through a link to a cloud folder.

Issues:

  • My email provider (FastMail) only allows me to download the emails in .eml format.
  • The receipts don't follow a standard format (so writing regular expressions that match all of them may be difficult).

My simple repo does the following:

  1. Converts all email receipts (.eml files) to PDFs (the PDFs are largely plain text).
  2. Extracts key financial data from those PDFs into a CSV file (which I can import neatly into Google Sheets).
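The first step starts with reading each .eml file. Here is a minimal sketch of that part using only Python's standard library. It is not the repo's actual code: the function names `eml_to_text` and `convert_folder` are hypothetical, and it writes plain-text files rather than PDFs, since rendering a PDF would need a third-party library that this post doesn't name.

```python
from email import message_from_bytes, policy
from pathlib import Path


def eml_to_text(raw: bytes) -> str:
    """Return a plain-text rendering (key headers + body) of one .eml message."""
    # policy.default gives an EmailMessage, which provides get_body()/get_content().
    msg = message_from_bytes(raw, policy=policy.default)
    # Prefer the plain-text part; fall back to HTML if that is all there is.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""
    header = f"From: {msg['From']}\nDate: {msg['Date']}\nSubject: {msg['Subject']}\n"
    return header + "\n" + body.strip() + "\n"


def convert_folder(src: Path, dst: Path) -> int:
    """Write one .txt file per .eml file found in src; return how many were converted."""
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for eml_path in sorted(src.glob("*.eml")):
        text = eml_to_text(eml_path.read_bytes())
        (dst / f"{eml_path.stem}.txt").write_text(text, encoding="utf-8")
        count += 1
    return count
```

Calling `convert_folder(Path("receipts"), Path("converted"))` would process every .eml file in a (hypothetical) `receipts` folder. The resulting text is what you would then render to PDF or pass to the LLM.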


The process

Example Output

The PDF output looks like this (I manually redacted the actual data):

PDF Example

The CSV output looks like this:

File Name,Total Amount ($),Currency,Transaction Date,Descriptive Details
Receipt-2566-5568.pdf,47.42,USD,2050-06-01,"Render - Servers, PostgresDB, Redis usage for May 2050"
Receipt-2952-5288.pdf,9.52,EUR,2050-03-03,Twitter International ULC - Twitter Blue subscription
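To illustrate how rows like these can be written, here is a small sketch using Python's `csv` module. It assumes the extraction step has already produced one dictionary per receipt; the names `FIELDS`, `example_rows`, and `rows_to_csv` are hypothetical and not taken from the repo. The `csv` module quotes the description that contains commas, which is why Google Sheets imports it as a single cell.

```python
import csv
import io

# Column names copied from the CSV output shown above.
FIELDS = [
    "File Name",
    "Total Amount ($)",
    "Currency",
    "Transaction Date",
    "Descriptive Details",
]

# Example rows, using the sample values from the output above.
example_rows = [
    {
        "File Name": "Receipt-2566-5568.pdf",
        "Total Amount ($)": "47.42",
        "Currency": "USD",
        "Transaction Date": "2050-06-01",
        "Descriptive Details": "Render - Servers, PostgresDB, Redis usage for May 2050",
    },
    {
        "File Name": "Receipt-2952-5288.pdf",
        "Total Amount ($)": "9.52",
        "Currency": "EUR",
        "Transaction Date": "2050-03-03",
        "Descriptive Details": "Twitter International ULC - Twitter Blue subscription",
    },
]


def rows_to_csv(rows):
    """Serialize receipt dictionaries to CSV text, quoting fields only when needed."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(example_rows), end="")
```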

And the imported Google Sheet looks like this:

Google Sheets Example

Use Cases

  • General-purpose conversion of .eml files to PDFs
  • Tax preparation
  • Expense tracking
  • Accounting reconciliation
  • Digital receipt organization
  • Audit preparation

Full repo here

