Wednesday, March 25, 2009

The great Mailstore migration


Today's guest blogger is Bryan Allen, our Operations head. Mailstore accounts have been undergoing a makeover for a long time now, as the number of accounts has grown, and Bryan has been working to make Mailstore a top-notch service. Tonight's hardware roll-out is one of the final steps in his plan, so he's taken a few minutes to chronicle the whole journey. Take it away, Bryan!

Pobox doesn't often talk about the technology we use to ensure the spice continues flowing, and technical posts that don't deal with a specific (typically obnoxious to the person who solved it) problem tend to be pretty dry, but hopefully you'll find this somewhat informative.

Tonight's outage is the next step in what I've come to refer to as The Great Mailstore Migration.

Previously on Pobox...

Two years ago, it became obvious that our infrastructure needed a major change. We needed to consolidate hardware, and we needed to move from the x86 Linux whiteboxes we were using to something beefier and better built. After a fair amount of testing, we decided to move to Solaris 10 on Sun hardware.

Changing platforms gave us a lot: all the awesomeness of ZFS (checksumming, cheap snapshots, etc.), and gobs of introspection thanks to DTrace. It wasn't a trivial process, but in the end it was absolutely worth it. Perhaps most importantly to how we provision services now, we got Solaris Zones. Pretty much any Pobox service you use lives in a Solaris container. Encapsulating services in this way lets us fine-tune resource controls, quickly migrate a zone to bigger hardware, and get a really simple view of how many resources a given service is consuming (as it's all running in a zone, there's no hunting for ancillary processes amongst, perhaps, thousands).

It's also very useful conceptually when provisioning, knowing that you have services which require x CPU, y RAM, z disk I/O. You can think of the services (which themselves may consist of several services with different requirements) as boxes (or, in Sun or IBM parlance, containers) and assign them to hosts with the appropriate available resources. The same could be done for services in a single flat topology, but I've found it a very useful mental tool, if nothing else.
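To make the "boxes on hosts" idea concrete, here is a minimal sketch of first-fit placement. It is only an illustration, not the tooling we actually use: the class names, the host and service names, and all the resource numbers are made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Resources:
    cpu: float        # CPU cores
    ram_gb: float     # RAM in gigabytes
    disk_iops: float  # disk I/O operations per second

    def fits_in(self, free: "Resources") -> bool:
        # A box fits only if every dimension fits.
        return (self.cpu <= free.cpu
                and self.ram_gb <= free.ram_gb
                and self.disk_iops <= free.disk_iops)

    def minus(self, used: "Resources") -> "Resources":
        return Resources(self.cpu - used.cpu,
                         self.ram_gb - used.ram_gb,
                         self.disk_iops - used.disk_iops)


@dataclass
class Host:
    name: str
    free: Resources
    zones: list = field(default_factory=list)


def place(services: dict, hosts: list) -> dict:
    """Assign each service to the first host with enough free resources."""
    placement = {}
    for name, need in services.items():
        for host in hosts:
            if need.fits_in(host.free):
                host.free = host.free.minus(need)
                host.zones.append(name)
                placement[name] = host.name
                break
        else:
            raise ValueError(f"no host has room for {name}")
    return placement


# Hypothetical hosts and services, purely for illustration.
hosts = [
    Host("small-host", Resources(cpu=2, ram_gb=8, disk_iops=500)),
    Host("big-host", Resources(cpu=8, ram_gb=32, disk_iops=5000)),
]
services = {
    "mailstore": Resources(cpu=4, ram_gb=16, disk_iops=3000),
    "webmail": Resources(cpu=1, ram_gb=4, disk_iops=200),
}
placement = place(services, hosts)
```

First-fit is the simplest possible strategy; the point is only that once each service is a box with known dimensions, placement becomes a small, checkable problem.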

The Great Mailstore Migration

The first step of the Great Migration, undertaken last year, moved us:

  • from generic x86 servers to Sun X4100 M2s
  • from Linux to Solaris 10
  • from ReiserFS to ZFS
  • from SATA to SCSI

There were a few hiccups here, mainly relating to the ZFS Adaptive Replacement Cache. Essentially, ZFS does a lot of really smart things when it comes to prefetching data, and caching it in memory. Depending on the size of your dataset, the ZFS ARC will want lots and lots of RAM. In Mailstore's case, the pretty point is pinned at 6GB min, but tends to hover between 7 and 8GB.

Next, we switched mail backends, migrating from Courier to Cyrus. This move was greatly simplified thanks to Gilles Lamiral's imapsync tool. We also deployed nginx in front of Mailstore, in IMAP proxy mode. These are both very cool, very useful pieces of technology, and we're very happy to have them in our toolkit.

The primary reason for moving to Cyrus was its binary indexes. These are compacted databases which greatly speed access to metadata about the messages in your Mailstore folders. We saw major performance increases here, especially relating to Webmail. We also got push notifications for free here, whereas with Courier we had to utilize FAM at such a performance cost it became untenable.

(Note: this type of push doesn't work with iPhones. As pretty much everyone in the office has one, we really wish it did.)

In Tonight's Thrilling Episode...

This outage is a two-part upgrade. We are deploying Sun J4200 SATA arrays to replace the SCSI arrays Mailstore data currently lives on. We're also upgrading the Mailstore servers to the most recent revision of Solaris 10. The latter gets us on ZFS root pools, greatly reducing the amount of time it can take to upgrade a system. We're also getting a newer version of ZFS, in which, if it becomes necessary, we can build Hybrid Storage Pools on the J4200s.

Moving to the J4200s and SATA also increases our disk capacity by quite a lot. To the customer, this means we can start storing snapshots for longer. Snapshots are what we use to quickly restore user data when requested. At the moment, we store about a week's worth of snapshots. With the new storage systems in place, storing a month or more becomes reasonable. It has happened, though somewhat rarely, that a customer will ask us if we can restore a specific piece of mail they deleted a few days ago. More rarely still, they want to restore something from more than a week past.

Well, now we'll be able to with a minimum of fuss. This increased snapshot capacity may also save some of our POP users who suffer local hard drive crashes (though we highly recommend moving to IMAP!)
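For the curious, the retention policy itself is simple. Here is a rough sketch of deciding which snapshots have aged out, assuming (hypothetically) that each snapshot is known by a name and the time it was taken. The names and dates are invented for illustration, and this is not our actual tooling.

```python
from datetime import datetime, timedelta


def snapshots_to_prune(snapshots: dict, now: datetime, keep_days: int) -> list:
    """Return the names of snapshots taken more than keep_days before now."""
    cutoff = now - timedelta(days=keep_days)
    return sorted(name for name, taken in snapshots.items() if taken < cutoff)


# Hypothetical snapshot names and dates.
snapshots = {
    "mail@2009-02-10": datetime(2009, 2, 10),
    "mail@2009-03-15": datetime(2009, 3, 15),
    "mail@2009-03-24": datetime(2009, 3, 24),
}
now = datetime(2009, 3, 25)

week_policy = snapshots_to_prune(snapshots, now, keep_days=7)
month_policy = snapshots_to_prune(snapshots, now, keep_days=30)
```

With a one-week policy, the snapshot from ten days ago is already gone; with a 30-day policy it is still there to restore from.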

Something I've been thinking about for a while is wrapping a web interface around the snapshots and letting customers restore their own mail. This feature may have to wait for ZFS to get a "diff" ability, but email if you think it's an interesting idea.

Coming Up...

In the final planned Mailstore upgrade task, we'll be moving from the legacy version of Cyrus to the latest version. This move will allow us to incrementally build mailbox databases more easily (for faster searching, primarily in Webmail), easier replication, and another major performance boost due to a database backend change.

So there it is: The reason for the planned outages in the last year, and where the Mailstore backend is going.

As always, if you have any questions, please email

Thanks, Bryan!


Tax time is coming up. Beware phishing attempts! The IRS does not request information via email.

Wednesday, March 18, 2009

How do you use your Inbox?

The Inbox, in many ways, is the junk drawer of people's lives. Everyone has one. You think everything in there is important. But, sooner or later, you realize you can't find anything you need in it, your finger just got snapped in a mousetrap you shoved in there 3 years ago, the crazy glue is completely dried up, and you just want to find a battery so you can play one more round of Wii Sports, and you swear you're going to clean it out first thing tomorrow.

Gmail has tried pushing the idea that your Inbox is your endless repository, "delete" is a thing of the past and searching is the way to get through it. But, a couple years ago, I realized that I was using my email Inbox as my virtual to-do list. And, if you do that, but you follow the junk drawer model, you've (more than once) ended up totally forgetting about something important, because it's scrolled off your visible messages, and, out of sight, out of mind, right?

Last year, I began a campaign to bring my messages down to zero. And I'm not the only one. The suggestions basically boil down to:
  • Don't make your system more complicated than it needs to be
  • Delete, delete, delete! It's much faster to delete immediately, than to keep coming back to it, then guiltily deleting it a month from now.
  • Move things to your calendar or your actual to-do list. Only leave messages that need a reply.
  • If the reply can be written in under 2 minutes, do it now.
  • Here's my own personal addition: always separate your work and personal email into two different boxes.
That's basically it! Now, my personal Inbox looks more like an old-fashioned stack of correspondence -- actual honest-to-goodness letters from friends that are deserving of a lengthy reply. My work Inbox I should be stricter about (there are many more messages that will take more than 2 minutes, but less than half an hour, and those tend to linger), but I have fewer than 15 messages there, and I could probably knock it down to 7 in about 30 minutes. Then, at the end of the week, I tackle all the ones that have been hanging around. I never *quite* get down to zero, but I've stopped losing emails in the depths of my Inbox!

For all you mega-Inboxers out there, how do you keep track of the things that still need your attention? Let me know in the comments!


ZDnet's security blog reiterates my earlier advice about protecting your passwords, in light of this week's revelation that several thousand Comcast customer username/password combos were posted to the Internet. Check out their recommendations for password security.

Tuesday, March 3, 2009

Use geography to slash your spam!

In addition to our spam recommendations, Pobox also offers the ability to block mail by location! And now, you can block more than ever before.

We've recently upgraded our country list to include virtually every country and geographic region. This doesn't block spam, this blocks all mail from that region. Here's the idea.

You don't know anyone who lives in Angola. Someone adds your email address to a list, for a legitimate provider or service in Angola. But, just because it's a legitimate provider doesn't mean you want the mail, and you don't speak Portuguese, so you can't figure out how to unsubscribe. By blocking all mail from geographic regions where you don't have any correspondents, you can cut down on all types of unwanted messages.

Because it blocks all mail from that region, you should only turn on country blacklists for countries (or continents) where you don't know anyone. In the above example, if you had a penpal in Angola, that mail would also be blocked, even though your penpal is not a spammer. Also, Blackberry is headquartered in Canada, so blocking mail from Canada can cause you to lose mail from Blackberry users. So, this is a tool to be used carefully, but that can really take a bite out of spam. (Yum!)
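If it helps to see why this is a blunt instrument, here is a tiny sketch of the decision. It assumes the country of origin has already been worked out and is available as a two-letter code; how that lookup happens is not covered here, and the function and field names are made up for illustration.

```python
def should_block(message: dict, blocked_countries: set) -> bool:
    """Block purely on origin country; whether the mail is spam is never consulted."""
    return message["origin_country"].upper() in blocked_countries


blocked_countries = {"AO"}  # you have blocked Angola

# Hypothetical messages.
newsletter = {"origin_country": "AO", "is_spam": False}
penpal = {"origin_country": "ao", "is_spam": False}
friend = {"origin_country": "US", "is_spam": False}

results = [should_block(m, blocked_countries) for m in (newsletter, penpal, friend)]
```

The penpal's letter is blocked along with the unwanted newsletter, because the only thing the rule looks at is where the mail came from.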

Try it out now from the beta version of the new index page!

Monday, March 2, 2009

De-cluttering your Home!

We've been doing some substantial site revisions in the last few months, and since putting the new Home page up, we've gotten some questions like, "Why did you take X away?" The revision to the index page (and, indeed, the site in general) has been driven by a series of surveys and phone interviews we did over the last year and a half. We tried to solicit feedback from a broad range of customers using a variety of services and service levels. From there, we tried to generate a site design that addressed their comments and complaints, and some guidelines to direct our design process now and in the future.

One of the biggest issues that we kept coming back to was that people frequently couldn't find what they were looking for, or they didn't realize that the features they most wanted from us were things we already did. As such, we realized that what you leave off a page is sometimes as important as what you put on. The Home page in particular was suffering from information overload.

So, in the redesign, some things were purposely removed from this page. They are either infrequently changed, infrequently accessed from this page, or used by a small percentage of customers. Here's where people are making changes:

Adding, Changing and Deleting Addresses (including on Personal Domains): 60%
Spam Settings: 14%
Spam Report Settings: 11%
Filters: 4.4%
Upgrades and Account Type Changes: 3.3%
Vacation Mail: 3%
Adding Trusted Senders: 0.7%
Adding Personal Domain: 0.65%
URL Redirection: 0.49%
Password: 0.2%

(In case you are interested, in terms of pages viewed, far and away, the pages that get the most hits are the spam pages. Everything else is dwarfed in comparison.)

After looking at those statistics, we focused our revisions on making it easier to work with your addresses and your spam settings, right from the front page. Some other items either had their information removed (to de-clutter the page) or the links removed (if most people got to that page from somewhere other than the homepage). Both those changes gave us more space to keep the most important information front and center.

The site design guidelines we came up with, in order of importance, are:
  • Don't let customers hurt themselves
  • Make using email a better experience
  • The easiest way to make changes should be the customer-facing way
  • Make it easy for customers to find answers to their questions
  • Present each user with information appropriate to their account
We hope you like the new changes to the Home page. Let us know how you think we did!