Thursday, April 16, 2009

Tweet, tweet! Why Pobox is using Twitter

Since the new Home page has gone live, you've probably noticed the Twitter link at the top of the page. You may be wondering, what is Twitter? And why is Pobox using it?

Twitter is a "micro-blogging" service. If you use Facebook, you're probably familiar with status updates. Twitter is a status-updates-only service. If you use IRC, you can think of Twitter as your global channel.

If you don't use any of these services, Twitter allows you to post messages up to 140 characters long (that's 20 fewer than a text message!) from the web, your cell phone or instant messenger. People who are "following" you can see your message on the Twitter website, in one of the many third-party programs out there, in an RSS reader that follows your feed, or (as you can see) on your own website when you integrate it there.

So, why is Pobox using it? I am the first person to admit that we have frequently been remiss in posting notices about problems and downtimes. In a lot of cases, it's because someone identifies the problem, thinks, "Well, I could go write a post for the News page, or just fix it," and chooses to just fix it. With a 140-character limit, there's always a minute to jot a quick note about a problem.

Twitter makes it easier for you to get those notices, too. The old News page had an RSS feed, or you could come to the site to see it. Both of those would be unavailable if the notice was, "The Pobox website is down right now"! Using Twitter means you can choose to get notified through a larger number of tools, or still just visit our site if you prefer. Because the feed is displayed automatically on our site when the site is up, and is still reachable by other methods when our site is down, we should (hopefully) always have a way to communicate with you, no matter how dire the problem.

Finally, there are a lot of great companies out there. By using services like Twitter for our announcements, Liquid Planner for our project planning, and GitHub for our version control, we're reducing the number of support tools we maintain, so we can focus on what we do best -- giving you the best tools to manage your email.

----

Pobox is hiring! If you're a Perl programmer who loves email as much as we do, check out our job posting.

Wednesday, April 1, 2009

All About Spam: The Case of the Productless Spam

All About Spam is a series of blog posts about common spammer techniques. Have a question about a type of spam that you'd like to see in a future blog post? Leave a comment, or send an email to pobox@pobox.com!

The classic spam is a smoking gun, easy to spot. Viagra. University diplomas. My new favorite, the acai berry. But some messages have a twist; they don't appear to be selling anything at all! I received the following email today:
From: hfkunm@winartproje.com
Subject: NYC judge denounces woman's self-styled sting

Militants Attack NATO Terminal In Pakistan
hfkunm and I are not best buds. That's the whole message; it doesn't even have a link in it. Aren't spammers supposed to be selling me something? So, why did a spammer bother sending me this message?

Elementary, my dear readers! The first reason is simple: they could be probing for valid email addresses.

The second reason: they're trying to beat the system. In 2002, Paul Graham popularized Bayesian filtering, a plan to filter spam by using all your spam and all your ham (legitimate mail) to generate a giant word list. Each word would be given a score, based on how frequently it appeared in spam vs. ham. The idea had two key points:
  • it would learn about new spam words as they were introduced
  • "good" words could offset "bad" words
Good words are words that appear, proportionally, way less often in spam. For example, spammers rarely talk about themselves in the first person, so "I" or "I'm" has a negative spam score. Spammers do want you to click on links, so the word "click" has a positive spam score.
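
To make the scoring idea concrete, here is a minimal sketch in Python. It is not Paul Graham's actual formula or anything Pobox runs; it assumes a simple smoothed log-ratio of how often a word shows up in spam vs. ham, and the function name `word_scores` and the tiny sample mailboxes are made up for illustration.

```python
import math
from collections import Counter


def word_scores(spam_msgs, ham_msgs):
    """Score each word: positive = more common in spam, negative = more common in ham.

    Assumption: a simplified log-ratio with add-one smoothing, not Graham's formula.
    """
    spam_counts = Counter(w for m in spam_msgs for w in m.lower().split())
    ham_counts = Counter(w for m in ham_msgs for w in m.lower().split())
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    vocab = set(spam_counts) | set(ham_counts)

    scores = {}
    for word in vocab:
        # Add-one smoothing keeps words seen in only one pile from dividing by zero.
        p_spam = (spam_counts[word] + 1) / (spam_total + len(vocab))
        p_ham = (ham_counts[word] + 1) / (ham_total + len(vocab))
        scores[word] = math.log(p_spam / p_ham)
    return scores


# Hypothetical training mail, invented for this example.
spam = ["click here for cheap viagra", "click now to buy viagra"]
ham = ["i think i will be late", "i hear joe is well"]

scores = word_scores(spam, ham)
for word in ("click", "viagra", "i"):
    print(word, round(scores[word], 2))
```

With this toy data, "click" and "viagra" come out positive and "i" comes out negative. Retraining on fresh mail is what lets the list pick up new spam words as they appear.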

So, how does it all work? Well, let's take that most popular of all spam words, Viagra. Your gossipy friend sends you a message all about herself, and it happens to include "I hear Joe started taking viagra!" A keyword-based spam filter will block any message that contains "viagra", so out it goes. A Bayesian filter would say, all these "I"s outweigh the one "viagra", and let it through.
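
The gossipy-friend case can be sketched in a few lines of Python. This is an illustration only: the scores in `SCORES` are hand-picked rather than learned, the threshold is arbitrary, and adding up per-word scores is a simplification of how a real Bayesian filter combines probabilities. The message text and function names are hypothetical.

```python
import re

# Hypothetical, hand-picked scores: negative = "good" word, positive = "bad" word.
SCORES = {"i": -1.0, "i'm": -1.0, "viagra": 3.0, "click": 2.0}


def tokenize(text):
    """Lowercase the text and pull out runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def keyword_filter(text, banned=("viagra",)):
    """Flag the message as spam if any banned word appears at all."""
    return any(word in banned for word in tokenize(text))


def score_filter(text, threshold=0.0):
    """Flag the message as spam only if the summed word scores exceed the threshold."""
    total = sum(SCORES.get(word, 0.0) for word in tokenize(text))
    return total > threshold


gossip = "I went out and I saw Sue, and I think I hear Joe started taking viagra!"
pitch = "Click here for viagra"

print(keyword_filter(gossip))  # keyword filter flags the friend's message
print(score_filter(gossip))    # four "I"s (-4) outweigh one "viagra" (+3)
print(score_filter(pitch))     # "click" (+2) plus "viagra" (+3) is over the threshold
```

The keyword filter blocks the friend's message, while the score-based filter lets it through and still catches the sales pitch.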

For a short while, Bayesian filters were all the rage, and very effective, because they were trained per user. Spammers never let a good plan get them down, though, and came up with a simple, ingenious solution: start sending random content. In the early days, it was snippets from great books (read David Copperfield one paragraph at a time!). They've since moved on to simple randomized phrases, and headlines like today's. All these red herrings have certainly degraded the accuracy of Bayesian filters, but like a good detective, spam filters try all the tools in their arsenal, hoping to find the one that closes the case.

------
Do you love sending email so much it hurts? See some simple stretches to relieve carpal tunnel syndrome pain.