Parse inbound email with AI, not regex

Stop wrestling with MIME parsers, regex, and brittle email formatting. Mailhooks delivers pre-structured JSON so your LLM gets clean data, not a parsing headache.

The Problem

  • Parsing inbound email is universally painful — MIME boundaries, quoted-printable encoding, multipart messages, and attachment handling are error-prone
  • Regex-based extraction breaks the moment email formatting changes or an unexpected header appears
  • Different email clients produce wildly different HTML — Gmail, Outlook, Apple Mail each generate unique markup
  • Structured data hidden in email bodies (dates, amounts, order numbers) requires brittle extraction logic
  • Processing attachments means handling file uploads, virus scanning, and storage before your app even sees the content

Why Existing Solutions Fall Short

Writing a custom MIME parser is a full project: the message format in RFC 5322 (which obsoletes RFC 2822) is full of edge cases, and that's before the MIME RFCs (2045-2049) add multipart/alternative and inline images

IMAP-based parsing means polling delays (often 30-60 seconds, depending on your poll interval), OAuth management, and stale data

Mailgun/SendGrid inbound webhooks still require you to handle multipart form data or URL-encoded payloads

Regex extraction breaks with format changes — a line break, extra space, or different date format kills your parser

Pre-processing email for LLMs means stripping HTML, decoding base64 attachments, and constructing clean prompts — all before the AI even runs
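The regex brittleness described above is easy to reproduce. The sketch below is illustrative only: the pattern, the function name `extractDueDate`, and the sample strings are invented for this example and are not part of Mailhooks.

```javascript
// Hypothetical extraction regex: it only matches the exact phrase
// "Payment Due <Month> <day>" with single spaces and this capitalisation.
const dueDatePattern = /Payment Due (\w+) (\d{1,2})/;

function extractDueDate(subject) {
  const match = dueDatePattern.exec(subject);
  return match ? { month: match[1], day: Number(match[2]) } : null;
}

extractDueDate('Payment Due April 30');   // { month: 'April', day: 30 }
extractDueDate('Payment due 30/04/2024'); // null: different case and date format
extractDueDate('Payment Due  April 30');  // null: one extra space
```

Each new variation needs another pattern or another branch, which is the maintenance cost an LLM-based extraction step is meant to avoid.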

You shouldn't have to build this yourself.

How Mailhooks Solves This

Pre-parsed JSON

Sender, subject, plain text, HTML body, headers, and thread IDs — all delivered as structured JSON. No MIME parsing on your end.
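Because the payload is plain JSON, a handler can check its shape before doing any work. This is a minimal sketch that assumes the field names shown in the sample payload on this page; the full schema is not documented here, and `isInboundEmail` is a hypothetical helper.

```javascript
// Field names follow the sample payload on this page; the exact schema is an assumption.
function isInboundEmail(payload) {
  if (payload === null || typeof payload !== 'object') return false;

  const stringFields = ['id', 'from', 'subject', 'text'];
  const hasStrings = stringFields.every((key) => typeof payload[key] === 'string');

  const hasRecipients =
    Array.isArray(payload.to) && payload.to.every((addr) => typeof addr === 'string');

  // Attachments are treated as optional; when present they must be an array.
  const attachmentsOk =
    payload.attachments === undefined || Array.isArray(payload.attachments);

  return hasStrings && hasRecipients && attachmentsOk;
}
```

Rejecting malformed bodies early keeps unexpected input away from the LLM call and the database.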

Attachment URLs

Attachments are extracted, stored, and served as URLs with content types and sizes. Your LLM gets a link, not a base64 blob.
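With content types and sizes in the metadata, you can decide which files are worth downloading before fetching anything. The sketch assumes each attachment has the `filename`, `contentType`, `size`, and `url` fields from the sample payload; `pickAttachments` and the URLs are placeholders.

```javascript
// Keep only attachments whose type is allowed and whose size is within budget.
function pickAttachments(attachments, { allowedTypes, maxBytes }) {
  return (attachments ?? []).filter(
    (file) => allowedTypes.includes(file.contentType) && file.size <= maxBytes
  );
}

const pdfs = pickAttachments(
  [
    { filename: 'INV-2024-0847.pdf', contentType: 'application/pdf', size: 145678, url: 'https://files.example.test/a' },
    { filename: 'logo.png', contentType: 'image/png', size: 2048, url: 'https://files.example.test/b' },
  ],
  { allowedTypes: ['application/pdf'], maxBytes: 10 * 1024 * 1024 }
);
// pdfs holds only the PDF entry
```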

Clean text for LLMs

Plain text body is automatically extracted and decoded. Pass it straight to your LLM without HTML stripping or encoding cleanup.
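One way to turn the decoded text into chat messages is sketched below. The `maxChars` budget is an arbitrary illustrative value, not a Mailhooks or model limit, and `buildExtractionMessages` is a hypothetical helper.

```javascript
// Build chat messages from an inbound email, truncating very long bodies.
function buildExtractionMessages(email, maxChars = 8000) {
  const body =
    email.text.length > maxChars
      ? `${email.text.slice(0, maxChars)}\n[truncated]`
      : email.text;

  return [
    {
      role: 'system',
      content: 'Extract invoice number, amount, due date, and vendor name. Reply with JSON.',
    },
    {
      role: 'user',
      content: `From: ${email.from}\nSubject: ${email.subject}\n\n${body}`,
    },
  ];
}
```

Including the sender and subject alongside the body gives the model context that is often missing from the body text alone.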

Instant delivery

Email arrives, webhook fires within seconds. No IMAP polling, no OAuth tokens to refresh, no delays.

Sender filtering

Only forward emails from domains or addresses you care about. Cut noise before it reaches your parser or LLM.
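If you also want a second check inside your own handler, an allow-list match is a few lines. This is an application-side sketch only: the shape of `rules` is invented here, and Mailhooks' own filtering rule syntax is not shown on this page.

```javascript
// Returns true when the sender matches an allowed address or an allowed domain.
function isAllowedSender(from, rules) {
  const address = from.trim().toLowerCase();
  const domain = address.slice(address.lastIndexOf('@') + 1);
  return rules.addresses.includes(address) || rules.domains.includes(domain);
}

const rules = {
  addresses: ['billing@vendor.example'],
  domains: ['partner.example'],
};

isAllowedSender('Billing@Vendor.example', rules);  // true: address match, case-insensitive
isAllowedSender('someone@partner.example', rules); // true: domain match
isAllowedSender('promo@spam.example', rules);      // false
```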

Thread context

Thread IDs group related replies. Your AI agent sees the full conversation, not isolated messages.
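Grouping stored messages by thread before prompting might look like the following. It assumes every message carries the `threadId` field from the sample payload; `groupByThread` is a hypothetical helper.

```javascript
// Collect messages into a Map keyed by threadId, preserving arrival order.
function groupByThread(messages) {
  const threads = new Map();
  for (const message of messages) {
    const thread = threads.get(message.threadId) ?? [];
    thread.push(message);
    threads.set(message.threadId, thread);
  }
  return threads;
}
```

An agent can then join the `text` of every message in a thread into one prompt instead of answering each reply in isolation.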

How It Works

1

Create a mailhook

Set up a dedicated email address for the type of email you want to parse (e.g. [email protected]).

2

Point to your LLM endpoint

Configure the mailhook to POST structured JSON to your AI processing endpoint — or use SSE for firewalled environments.

3

Email arrives

An email hits your address. Mailhooks parses the MIME, extracts text/HTML, decodes attachments, and delivers clean JSON.

4

LLM processes

Your LLM receives pre-structured data. Extract dates, amounts, entities, or classify intent — no custom parsing required.

5

Store or route

Route the extracted data to your database, CRM, Notion, or trigger downstream workflows.
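Steps 4 and 5 can be sketched as a validate-then-route function. LLM output is not guaranteed to match the shape you asked for, so it is worth checking before storing it. The field names mirror the example comment in the handler on this page; `validateInvoice`, `routeInvoice`, and the `store` and `review` destinations are hypothetical.

```javascript
// Check the fields an extraction step is expected to return.
function validateInvoice(raw) {
  if (raw === null || typeof raw !== 'object') {
    return { ok: false, errors: ['not_an_object'] };
  }
  const errors = [];
  if (typeof raw.invoice_number !== 'string' || raw.invoice_number === '') errors.push('invoice_number');
  if (typeof raw.amount !== 'number' || !Number.isFinite(raw.amount) || raw.amount < 0) errors.push('amount');
  if (typeof raw.due_date !== 'string' || !/^\d{4}-\d{2}-\d{2}$/.test(raw.due_date)) errors.push('due_date');
  if (typeof raw.vendor !== 'string' || raw.vendor === '') errors.push('vendor');
  return { ok: errors.length === 0, errors };
}

// Valid records go to storage; anything else goes to a human review queue.
function routeInvoice(raw, destinations) {
  const result = validateInvoice(raw);
  const target = result.ok ? destinations.store : destinations.review;
  return target(raw, result.errors);
}
```

In practice `store` might write to your database or CRM, and `review` might post to a queue that a person checks.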

Code Example

Mailhooks delivers this JSON to your endpoint — no MIME parsing, no encoding, no attachment handling required.

Webhook Payload

{
  "id": "msg_parse_abc123",
  "from": "[email protected]",
  "to": ["[email protected]"],
  "subject": "Invoice #INV-2024-0847 — Payment Due April 30",
  "text": "Please find attached invoice #INV-2024-0847 for £2,340.00...",
  "html": "<p>Please find attached invoice...</p>",
  "threadId": "thread_inv_789",
  "attachments": [
    {
      "filename": "INV-2024-0847.pdf",
      "contentType": "application/pdf",
      "size": 145678,
      "url": "https://files.mailhooks.dev/..."
    }
  ]
}

Handler Code

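// Forward to your LLM for extraction
import express from 'express';
import OpenAI from 'openai';

const app = express();
app.use(express.json()); // parse the JSON body that Mailhooks POSTs
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

app.post('/webhook/mailhooks', async (req, res) => {
  const { from, text } = req.body;

  try {
    // Clean text is already extracted, so pass it straight to the LLM
    const extraction = await openai.chat.completions.create({
      model: 'gpt-4o',
      // JSON mode, so the reply can be parsed with JSON.parse below
      response_format: { type: 'json_object' },
      messages: [{
        role: 'system',
        content: 'Extract as JSON with keys: invoice_number, amount, due_date, vendor.'
      }, {
        role: 'user',
        content: text
      }],
    });

    const structured = JSON.parse(extraction.choices[0].message.content);
    // { invoice_number: "INV-2024-0847", amount: 2340.00, due_date: "2024-04-30", vendor: "Vendor Ltd" }

    // `db` stands in for your own data layer
    await db.invoices.create({ ...structured, source_email: from });
    res.sendStatus(200);
  } catch (err) {
    console.error('Invoice extraction failed', err);
    res.sendStatus(500); // signal failure so the delivery can be retried
  }
});

app.listen(3000);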

Frequently Asked Questions

Every inbox can be configured with sender filtering rules. You can whitelist specific domains or email addresses, or use our webhook to implement your own spam filtering logic. Emails that don't match your rules are automatically rejected.

Webhooks are typically delivered within 100-500ms of email receipt. We process emails in real-time with no polling delays. For high-availability applications, we also offer webhook retries with exponential backoff.

Mailhooks is built specifically for inbound email. We offer simpler setup (no DNS changes required for testing), better attachment handling with direct download URLs, and a developer-first API for fetching emails programmatically—perfect for E2E testing.

Yes! You can connect your own domain with simple DNS configuration. We also provide free subdomains on inbox.mailhooks.dev for testing and development.

We automatically retry failed webhooks with exponential backoff for up to 24 hours. You can also use our API to fetch any missed emails. All emails are stored and accessible via the dashboard.
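Because failed deliveries are retried, an endpoint may see the same message more than once, for example when the handler finished but its response was lost. A common guard is to deduplicate on the message ID. The sketch assumes the `id` field from the sample payload stays the same across retries, which this page does not state; `createDeduplicator` is a hypothetical helper.

```javascript
// In-memory sketch; a real deployment would more likely use a database or a cache with expiry.
function createDeduplicator() {
  const seen = new Set();
  return function isFirstDelivery(messageId) {
    if (seen.has(messageId)) return false;
    seen.add(messageId);
    return true;
  };
}

const isFirstDelivery = createDeduplicator();
isFirstDelivery('msg_parse_abc123'); // true: process it
isFirstDelivery('msg_parse_abc123'); // false: already handled, just acknowledge
```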

Get Started in 3 Steps

Takes ~2 minutes — no email infrastructure required.

1

Create a Mailhooks account

Sign up for free in seconds.

2

Create an inbox

Get a unique email address for your use case.

3

Add your webhook URL

Point to your endpoint and start receiving emails.

Stop parsing email. Start using it.