Aug 21, 2021

Accuracy of MessageBird's Permissioned Panel Data


One of the questions we regularly receive about our Permissioned Email Panel is how accurate it is in terms of forecasting inbox rates. Historically, this has been a difficult question to answer with any authority as there was no source of ground truth to measure against, and so opinions (and general faith in sample statistics) ruled the discussion.

Now, though, with a major mailbox provider licensing inbox placement data for their platform, it’s possible to do a real analysis. We ran one across some 20,000 distinct sending domains, belonging to senders large and small, both on our sending platform and on other providers.

The results are exciting. The permissioned panel is highly accurate, even with relatively low signal, and becomes extremely accurate as the number of distinct panelists seen in a send increases. Using common statistical methods, we look at the root mean square error (RMSE, an analogue of the standard deviation) between the inboxing rate as seen at the major provider and the inboxing rate our panel sees.
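For readers who want to see what the RMSE measures, here is a minimal sketch in Python. It is not the code used in our analysis; the function name `rmse` and the sample inbox rates are made up for illustration, and rates are assumed to be fractions between 0 and 1.

```python
import math

def rmse(panel_rates, true_rates):
    """Root mean square error between two equal-length lists of inbox rates."""
    if len(panel_rates) != len(true_rates):
        raise ValueError("both lists must have the same length")
    if not panel_rates:
        raise ValueError("at least one pair of rates is required")
    # Square each difference, average the squares, then take the square root.
    squared = [(p - t) ** 2 for p, t in zip(panel_rates, true_rates)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical per-domain inbox rates, for illustration only.
panel = [0.90, 0.80, 0.95]   # what the panel sees
truth = [0.92, 0.78, 0.95]   # what the mailbox provider reports
error = rmse(panel, truth)
print(f"RMSE: {error:.4f}")
```

Because the differences are squared before averaging, a few domains with large misses raise the RMSE more than many domains with small ones.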

In our analysis, we noted that senders who send their mail through top email service providers (ESPs) see a materially better correlation between the panel inbox rate and the true inbox rate. The mechanism for this isn’t known, but we postulate that the compliance standards that large service providers hold their customers to generally result in more consistent inboxing rates across their audience, and so are less prone to skew. We can see this if we restrict our plot to only senders on top ESPs, which also reduces the RMSE by about 30%.

Even when a small number (10) of daily panelists see the mail stream, we see a very strong correlation between the inbox rate as seen by the panel and the ground truth.

If we consider only streams where 50 or more panelists are seen daily, the correlation becomes even tighter.

If we look at how this error rate varies with the number of panelists, we see a few things:

  • Even at extremely small numbers of unique panelists receiving the mail, the error rate is under 10%.

  • It quickly drops to 4% as the number of panelists increases.

  • It eventually approaches 2% – showing that panel data is 98% accurate. 
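The relationship between panelist count and error can be explored with a sketch like the following. It is only an illustration under stated assumptions: the function `rmse_by_min_panelists`, the record layout, and the sample numbers are all hypothetical and are not taken from our data.

```python
import math

def rmse_by_min_panelists(records, thresholds):
    """RMSE over records that have at least `threshold` daily panelists.

    records: list of (panelists, panel_rate, true_rate) tuples.
    Returns a dict of threshold -> RMSE, or None when no record qualifies.
    """
    result = {}
    for threshold in thresholds:
        squared = [(p - t) ** 2 for n, p, t in records if n >= threshold]
        result[threshold] = math.sqrt(sum(squared) / len(squared)) if squared else None
    return result

# Hypothetical mail streams, for illustration only.
records = [
    (10, 0.80, 0.90),    # few panelists, larger miss
    (60, 0.88, 0.90),
    (200, 0.91, 0.90),   # many panelists, small miss
]
by_threshold = rmse_by_min_panelists(records, [10, 50, 500])
for threshold, value in by_threshold.items():
    print(threshold, value)
```

In this made-up sample, restricting to streams with 50 or more panelists lowers the RMSE, which is the same pattern described in the list above.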

For the purposes of identifying mail stream deliverability issues, this accuracy is fantastic.

So you might ask: with a major provider offering ground truth numbers, what’s the utility of having panel data as well, even if it is highly correlated? The majority of mailbox providers – including titans like Google and Microsoft – do not offer inbox placement data, and so for messages delivered there, you still need a source like panel data to understand inboxing rates. 

And now we can all be confident in its accuracy for those cases.