July 4, 2008

MySQL problems

Since a lot of readers seem to want more war stories, here’s one that happened tonight. About three hours ago, any person that tried to go to MyListo got a MySQL database error: “SQLSTATE[08004] [1040] Too many connections”.

The first thing we tried to do was restart our production web and database servers. EC2 provides tools that allow you to do this through the command line.

After restarting, we got a different MySQL error: “SQLSTATE[HY000] [2013] Lost connection to MySQL server at ‘reading initial communication packet’”. What the heck does that mean? Perhaps MySQL isn’t running on the db machine?

So then we log in to the database machine and try to restart MySQL. Sadly, another error: “/etc/init.d/mysql: ERROR: The partition with /var/lib/mysql is too full!”.

At least we’re getting somewhere now. The results of running “df -h” confirm that one of our partitions was completely full:

Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1             9.9G  9.4G     0 100% /
varrun                851M  164K  850M   1% /var/run
varlock               851M     0  851M   0% /var/lock
udev                  851M   16K  851M   1% /dev
devshm                851M     0  851M   0% /dev/shm
/dev/sda2             147G  188M  140G   1% /mnt
overflow              1.0M     0  1.0M   0% /tmp

Everything returns to normal after deleting some unnecessary files to free up space.

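You’re probably wondering why our server only had 10 GB of space. No, we’re not running some old school Pentium II. We’re using Amazon EC2, and it turns out that the small instances only have a 10 GB root partition. As the df output above shows, the much larger 147 GB volume is mounted separately at /mnt, and MySQL’s data directory wasn’t on it.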
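A check like this is easy to automate. Purely as an illustration, here’s a minimal Python sketch that flags partitions that are nearly full. The function name, the 90% threshold, and the `usage_fn` parameter are hypothetical choices made for this example; `shutil.disk_usage` comes from the Python standard library.

```python
import shutil


def partitions_over_threshold(mount_points, threshold=0.90, usage_fn=shutil.disk_usage):
    """Return (mount_point, fraction_unavailable) for mounts at or above threshold.

    Fullness is measured as the share of the partition that ordinary users
    can no longer write to, so a partition with 0 bytes available counts as
    100% full even if some blocks are reserved for root.
    """
    nearly_full = []
    for mount in mount_points:
        usage = usage_fn(mount)  # has .total, .used and .free, all in bytes
        if usage.total == 0:
            continue  # skip pseudo-filesystems that report no capacity
        fraction_unavailable = 1 - usage.free / usage.total
        if fraction_unavailable >= threshold:
            nearly_full.append((mount, fraction_unavailable))
    return nearly_full
```

Running `partitions_over_threshold(["/", "/mnt"])` from a scheduled job and emailing whatever comes back would be one way to wire it up, assuming those are the mount points you care about.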


I’m not sure if I should post this because of the F-bomb, but I’m going to do it anyways because it’s pretty funny.

Here’s a note that my co-founders left on my desk. Let’s just say I’m notorious for keeping very strange hours. I hope no one is offended by the language. We blurred it out a little. I hope this doesn’t end up on Valleywag.


July 3, 2008

MyListo Facebook Application: Swipe It

We launched a Facebook Application a couple of weeks ago called Swipe It as a way to quickly and cheaply acquire users. For those of you who aren’t familiar with social networking platforms, the primary advantage of building on Facebook is that you can leverage all of their communication channels between users, such as notifications, newsfeeds, and requests, to efficiently propagate your application.

The basic concepts behind Swipe It are very simple. Users have a virtual credit card that they can use to send each other gifts such as electronics, cars, jewelry, bags, shoes, etc. However, you can only send gifts that cost less than your credit limit. For example, if your credit limit is $50, you won’t be able to send an iPod because it costs $200. Users increase their credit limit by sending cheaper gifts or sending requests to friends. The application grows virally because users naturally like to send gifts to their friends and we encourage people to invite their friends to level up their credit limit.

Over time we’re going to start adding more high value features to the application that already exist on our main website such as the Own List, Want List, and product pages.

For those of you who are interested in building Facebook apps, here are some things we learned that should help you:

  • It sounds obvious, but keep track of every install and when it occurred in your database. First, we noticed that the numbers reported by Facebook aren’t always accurate. Second, you’ll want to track your daily installs and make sure they’re trending up.
  • The lowest install days for us are on Friday and Saturday. The highest install days are Sunday and Monday. Presumably people go out on Friday and Saturday and check their emails on Sunday and Monday when they go back to work. We’re not sure if this is consistent with what other app developers are seeing.
  • Keep track of the number of notifications and requests you’re sending by day so that you can understand the health of your viral loops. For us, the number of installs we got on a particular day was directly proportional to the number of notifications and requests that were sent in the previous days.
  • The nature of the platform is such that the most successful apps are the ones that generate a lot of daily activity on a per user basis. More activity translates to more notifications and requests, which translates to more installs. Think of ways in which you can get your users to generate activity on a continuous basis. For example, Swipe It reminds users to send gifts to their friends with upcoming birthdays. The average Facebook user will almost always have a friend who has an upcoming birthday.
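
To make the tracking tips a bit more concrete, here’s a rough Python sketch of the bookkeeping. It assumes you log one timestamp per install and keep a per-day count of notifications plus requests sent. The function names and the one-day lag are hypothetical choices for illustration, not a description of our actual reporting code.

```python
from collections import Counter
from datetime import date, datetime, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def installs_per_day(install_times):
    """Count installs per calendar day from a list of datetimes."""
    return Counter(ts.date() for ts in install_times)


def installs_per_weekday(install_times):
    """Count installs per day of the week to spot weekly patterns."""
    return Counter(WEEKDAYS[ts.weekday()] for ts in install_times)


def installs_per_message(daily_installs, daily_messages, lag_days=1):
    """Installs on each day divided by messages sent lag_days earlier.

    daily_messages maps a date to notifications + requests sent that day.
    Days with no recorded messages lag_days earlier are left out.
    """
    ratios = {}
    for day, installs in daily_installs.items():
        sent = daily_messages.get(day - timedelta(days=lag_days), 0)
        if sent:
            ratios[day] = installs / sent
    return ratios
```

If the ratio from `installs_per_message` stays roughly flat while message volume grows, installs are scaling with messages sent, which is the kind of proportionality described above. A falling ratio would suggest the viral loop is getting less efficient.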


What are the most owned products on MyListo?

A couple of us at MyListo are data junkies so we love to run random queries on our database late at night instead of sleeping to find interesting statistics. Here are the most owned products on the site:

  1. Nintendo Wii - owned by 82 people
  2. Apple iPhone - owned by 59 people
  3. Apple iPod Nano (2nd generation) - owned by 54 people
  4. Apple iPod Nano (3rd generation) - owned by 45 people
  5. Apple iPod Shuffle - owned by 44 people
  6. Apple MacBook - owned by 43 people
  7. Microsoft Xbox 360 - owned by 43 people
  8. Apple iPod Classic (6th generation) - owned by 41 people
  9. Apple MacBook Pro - owned by 40 people
  10. Nintendo DS Lite - owned by 36 people

We’ll run some interesting queries every few posts and share them with our readers. Let us know if you want us to run a specific query!
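
For the curious, a “most owned” query is basically a GROUP BY with a count. Here’s a minimal sketch using Python’s built-in `sqlite3` module and a made-up table, `owned_items(user_id, product)`. Our real schema isn’t shown in this post, so treat the table, column, and function names as hypothetical.

```python
import sqlite3


def most_owned(conn, limit=10):
    """Return (product, owner_count) rows, most owned first.

    Assumes a hypothetical table owned_items(user_id, product).
    Ties are broken alphabetically so the ordering is stable.
    """
    return conn.execute(
        "SELECT product, COUNT(DISTINCT user_id) AS owners "
        "FROM owned_items "
        "GROUP BY product "
        "ORDER BY owners DESC, product ASC "
        "LIMIT ?",
        (limit,),
    ).fetchall()
```

Counting distinct users rather than rows keeps a product from being inflated by someone who lists the same item twice.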


May 30, 2008

May 25, 2008

Why We Were Down.

Last weekend was pretty interesting for us. Our production webserver died when we were deploying our new Facebook application (circa 3:20am). After some frantic live debugging and desperate attempts to revive the server, we had to “call it”. Time of death, May 17th, 6:15am.

After a few short hours of restless sleep, we tried again to resuscitate the server. Those attempts were equally futile. Ok, so let’s look at the evidence in more detail:

  • CPU spinning
  • Memory thrashing
  • MySQL DB connections maxed out
  • SSH shell intermittently responsive
  • HTTP requests fail
  • No Apache access being logged
  • etc…

We had a few lurking suspicions for what might have caused this calamity. We isolated those pieces on our dev server and hammered on them. Lo and behold, the dev server started to show the symptoms we saw in our production server. Great! Right? Well, sort of…

We patched up the culprit code, brought up a backed up version of our deceased server, and pushed out our new code. Hmmm, so far so good… but wait, why doesn’t our site work yet? Unfortunately for us, we had accidentally released the IP of our production webserver when we brought up our new server. In other words, www.mylisto.com doesn’t know about the new server we just brought up… so our site is still down. This should be a simple fix, just remap the mylisto.com DNS to point to our new IP and we should be good to go, right? Right, but our previous DNS entry was set to stay alive in cache for 1 week. That means it might take a week for some computers/users to realize that mylisto.com is hosted on a new server. So, our site continues to be down.

Alright alright, so we can’t just leave our users hanging for a week… we gotta come up with a solution! Our solution was rather ingenious: instead of using mylisto.com (which currently doesn’t map to our new IP), we just use the new IP. So a page that used to look like www.mylisto.com/nintendo_wii would now be http://75.101.157.86/nintendo_wii. Yay! Now our site is clickable!

We were celebrating our great solution until we found out that our CAPTCHA solution (provided by reCAPTCHA) authenticates image requests based on the referrer url. So when a user went to http://75.101.157.86/rgs/register to register for a new account or went to http://75.101.157.86/itm/add to add a new item, the CAPTCHA image wouldn’t load. (reCAPTCHA probably thinks a hacker is trying to spoof our site…) So, even though our existing users could click around the site, no new users could join and no new items could be added to the site…

Needless to say, we reverted the site back to using mylisto.com…. :P

Thankfully, this happened to us when our site’s still young and most of our users are our friends. We definitely learned a couple of valuable lessons. For one, don’t set your DNS TTL (time to live) to 1 week when your production environment might be in flux. And also, don’t use cron to run a job every minute when the job itself may take longer than a minute to execute: the runs overlap, pile up, and eventually eat the box. :P
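
On that last lesson: one common way to keep a frequent cron job from piling up on itself is to have each run grab a non-blocking file lock and bail out if the previous run still holds it. Below is a minimal Python sketch of that idea. It is Unix-only because it relies on `fcntl.flock`, the function names and lock path are hypothetical, and it is not necessarily the fix we shipped.

```python
import fcntl


def try_acquire_lock(lock_path):
    """Return an open file holding an exclusive lock, or None if it is taken."""
    lock_file = open(lock_path, "w")
    try:
        # LOCK_NB makes flock fail immediately instead of waiting.
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except BlockingIOError:
        lock_file.close()
        return None
    return lock_file


def run_exclusively(lock_path, job):
    """Run job() unless another run holds the lock.

    Returns True if the job ran, False if this run was skipped.
    """
    lock_file = try_acquire_lock(lock_path)
    if lock_file is None:
        return False  # previous run still in progress, skip this tick
    try:
        job()
        return True
    finally:
        fcntl.flock(lock_file, fcntl.LOCK_UN)
        lock_file.close()
```

A cron entry would then call something like `run_exclusively("/tmp/myjob.lock", do_work)`, where both the path and `do_work` are placeholders. If a run takes three minutes, the next two ticks simply return False instead of stacking up more processes.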

May 19, 2008

Downtime

As some of you have noticed, we had some server issues over the weekend and MyListo was down for a couple of hours. The site is up now and should be fairly stable. There are still a few problems we’re in the middle of fixing so things might be rough for a little while longer.

April 28, 2008

Free Cornbread From Boston Market

  • I needed some food at night before going into the office so I went to the Boston Market near my house. I was wearing my MySpace hat that says "MySpace or Yours".
  • Guy who works at Boston Market: I like your hat.
  • Me: Thanks...
  • Guy who works at Boston Market: Do you like cornbread?
  • Me: Yea, it's pretty good...
  • Guy who works at Boston Market: I'm going to hook you up!
  • Guy proceeds to fill a bag with 5 cornbreads
  • Me: Cool thanks...
  • Guy who works at Boston Market: Just remember who hooked you up...
  • Guy lifts his hand up in a fist to give me the fist bump thing that guys do
  • I fist bump him back

April 27, 2008

Stack it up

We got a couple of curious folk asking about the MyListo tech stack, so we figured we ought to give y’all a quick overview:

  • Amazon EC2 -  We’re built on the backs of Amazon’s virtualized machines.  Every single one of our boxes is just an EC2 instance.  This has turned out to be a fantastic choice.  Not only is it cheap for us poor-startup-folk, but it’s really easy to maintain.  When we want to do something adventurously-crazy to our box, like upgrade to a pre-release operating system or install a massive software package, we just back up our box into a new AMI (Amazon Machine Image) and do our server upgrade.  If the install goes awry, we just kill the broken instance, start a new instance from the backup AMI, and we’re gtg.
  • Ubuntu - So… if you know your Linux distros, you know that Ubuntu is hella awesome!  No contest right?? :P  Well, I’ve personally installed a dozen or so different Linux distros and I’ve been a follower of Debian for the past five years.  Ubuntu is basically Debian with some extra fun frills on the side.  In the end, the choice between a straight-up Debian box versus an Ubuntu Server box was purely for maintenance simplicity.
  • PHP - No Rails here.  We wanted an open-source language with a solid webdev history, OOP, pretty-insane customizability, and phenomenal community support.  .NET requires a M$ box and hella $$ for licensing.  Rails is impossible to read….  Perl/C++ are too ol’ skool.  So, for us, the choice was really between PHP and Python.  Let it be known, both are very sexy.  PHP 5 has all of the programming features we want, a solid community supporting it, and a TON of plugins/add-ons that we can leverage.  Python was just starting to become a more popular webdev language (at the time when we had to make this choice), but it obviously has all the programming features we were looking for and solid community support for the language itself.  You probably can’t go wrong either way you go, but we chose PHP.  (On a side note, Python does now have the added advantage of easy portability into Google’s App Engine…)
  • Smarty - We wanted a clean separation between our true php (page-logic) code and our front-end (presentation-logic) code.  So we chose Smarty.  The clean separation between the php files that process the “data to display” and the templates that contain the “how to display” allowed us to cleanly organize all of our code and files for a higher degree of re-usability and parallel development.  Furthermore, the Smarty engine allowed us to build out a suite of our own site-specific functional plugins and pre-/post- rendering hooks.  So basically… what would’ve been an interesting blend of page-logic and presentation-logic was condensed into straight-forward php logic that pushed data into modularized and shared templates, all the while leveraging a uniform flow of how data and templates are called, rendered, and outputted.
  • MySQL - There isn’t much to say about MySQL itself, but we built out a federated database schema to be able to leverage the true scaling power of EC2.  Theoretically… and yes, this is somewhat untested :P … we can scale our database with nothing more than small config tweaks and bringing up new db instances on EC2.
  • Memcached - Yup we use it too!  How did anyone survive without memcache?  We use memcache heavily for caching objects, lists of objects, and other pre-computed lists for objects.  We tend to invalidate most of our cache entries real time, but for some extended data, we invalidate in semi-realtime.
  • APC - This is one of those, install it and forget it type things.  It’s constantly running, and helping us serve pages faster, but we haven’t had to play with it much … yet.
  • Nginx - We use nginx as a reverse proxy to serve static content faster and help load balance web requests.
  • Lucene - For our search.
  • Subversion - The best source control in the world … that’s free. :P
  • Bugzilla - Gotta track those bugs and assign them… do it early and do it often. Otherwise, things will just be forgotten and bugs will compound on each other. 
  • Amazon S3 - This is our massive cloud-based storage solution.  Most of our images are served from S3.  We also use S3 for our EC2 instance backup images, database backups, subversion backups, etc.
  • Amazon SQS - The backend messaging engine between various service-based components of our site so we can have semi-realtime, or “eventually consistent”, data and data processing.
  • Google Analytics - ‘nuff said.

I suppose our stack is pretty generic, but if you have any specific questions, just post in the comments and we’ll try to answer them asap.
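
Since the Memcached bullet mentions two invalidation styles, here’s a tiny Python sketch of the difference. It uses an in-memory stand-in rather than a real memcached client, and the class, function names, and keys are all hypothetical. Our actual code is PHP and isn’t shown here.

```python
import time


class TinyCache:
    """In-memory stand-in for a cache client with get/set/delete and TTLs."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (value, expires_at or None)

    def set(self, key, value, ttl=None):
        expires_at = None if ttl is None else self._clock() + ttl
        self._entries[key] = (value, expires_at)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and self._clock() >= expires_at:
            del self._entries[key]  # entry has aged out
            return None
        return value

    def delete(self, key):
        self._entries.pop(key, None)


def cached_fetch(cache, key, load, ttl=None):
    """Return the cached value, calling load() and caching it on a miss."""
    value = cache.get(key)
    if value is None:
        value = load()
        cache.set(key, value, ttl)
    return value
```

Real-time invalidation means calling `cache.delete(key)` right after the underlying row changes, so the next `cached_fetch` reloads it. The semi-realtime style skips the delete and passes a `ttl`, so readers may see slightly stale data until the entry expires.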
