ETech – Bloglines

Mark Fletcher, on his experience starting up companies
Garage philosophy. Started working on Bloglines on his own while he was still working at other companies. You need:

  • Passion – because it will consume your life
  • Cheap technologies – great time to start an Internet service. Hardware/software is getting cheaper
  • Keep it simple – keep it simple both for users and technology
  • Release early and often – it’s really important to get things out there and incrementally improve. Your users will have better ideas about your service than you will
  • Moonlighting limits risks – Worked nights and weekends; friends/family are the first people you should look to for funds because they want to see you succeed; free services == less pressure (it’s not the end of the world if your service is down for a few hours)
  • Hire a lawyer
  • Web services APIs are a good thing
  • Find good help (especially sys admin)
  • Outsource to eLance.com (you can outsource all kinds of stuff. Have contractors bid for your work)

Architecture 101: Front-end (web, mail servers); Backend (user dbs, other dbs, storage)

Software choices

  • DJB software (http://cr.yp.to): qmail, djbdns, daemontools
  • ClearSilver (web templating package)
  • Berkeley DBs
  • Linux/Apache
  • C/C++/bash/python
  • Skip list (a probabilistic data structure that keeps keys in sorted order)
  • Avoid NFS (has a tendency to lock up systems without explanation)
  • Avoid table-level locking in MySQL (doesn’t scale)
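
The talk named the skip list without showing one. As a rough illustration only, here is a minimal Python sketch of the general technique. It is not Bloglines code: the class names, the MAX_LEVEL cap and the promotion probability P are all assumptions made for this example.

```python
import random

MAX_LEVEL = 16  # assumed cap on tower height (illustrative)
P = 0.5         # assumed chance of promoting a node one level up


class Node:
    def __init__(self, key, value, level):
        self.key = key
        self.value = value
        # forward[i] is the next node in the level-i linked list
        self.forward = [None] * level


class SkipList:
    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._head = Node(None, None, MAX_LEVEL)  # sentinel, holds no data
        self._level = 1  # number of levels currently in use

    def _random_level(self):
        # Flip coins: each success adds one more level to the new node
        level = 1
        while level < MAX_LEVEL and self._rng.random() < P:
            level += 1
        return level

    def _find_predecessors(self, key):
        # update[i] ends up as the last node at level i with key < target
        update = [self._head] * MAX_LEVEL
        node = self._head
        for i in range(self._level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        return update

    def insert(self, key, value):
        update = self._find_predecessors(key)
        candidate = update[0].forward[0]
        if candidate is not None and candidate.key == key:
            candidate.value = value  # key already present: overwrite
            return
        level = self._random_level()
        self._level = max(self._level, level)
        node = Node(key, value, level)
        # Splice the new node into each level's list
        for i in range(level):
            node.forward[i] = update[i].forward[i]
            update[i].forward[i] = node

    def get(self, key, default=None):
        candidate = self._find_predecessors(key)[0].forward[0]
        if candidate is not None and candidate.key == key:
            return candidate.value
        return default

    def keys(self):
        # Level 0 links every node in sorted key order
        node = self._head.forward[0]
        while node is not None:
            yield node.key
            node = node.forward[0]
```

Lookups skip along the sparse upper levels before dropping down, which gives expected logarithmic search time without the rebalancing logic a tree would need.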

Hardware choices

  • Two choices: dedicated servers vs. buying/hosting. They went the dedicated server route. Cost less to get going
  • Design for cheap hardware – Google is the shining example of this
  • eBay – you can get hardware on the cheap
  • APC PDUs for remote power cycling (power strips you can log into and cycle if a machine has crashed on you)
  • HP ProCurve switches (work great)
  • Avoid Seagate Ultra-SCSI drives
  • Good phone for SSH – likes a Treo so you can log into your machines from anywhere

Architecture choices

  • Copying files vs. client/server (they end up copying files around, such as Bloglines RSS feeds)
  • Calculate on the fly vs. cache (subscriber counts at Bloglines are produced by a once-a-day process)
  • Memory vs. disk
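
To make the "calculate on the fly vs. cache" trade-off concrete, here is a small Python sketch of serving subscriber counts from a snapshot that is recomputed at most once a day. The talk described a once-a-day batch process; this in-process version swaps in a lazy refresh for brevity. The names (CachedSubscriberCounts, count_subscribers) and the data layout are hypothetical.

```python
import time

ONE_DAY = 24 * 60 * 60  # refresh interval in seconds


def count_subscribers(subscriptions):
    """Full scan: map each feed to the number of users subscribed to it.

    `subscriptions` is an assumed layout: {user: [feed, feed, ...]}.
    """
    counts = {}
    for feeds in subscriptions.values():
        for feed in set(feeds):  # ignore duplicate entries per user
            counts[feed] = counts.get(feed, 0) + 1
    return counts


class CachedSubscriberCounts:
    """Serve counts from a snapshot instead of rescanning on every request."""

    def __init__(self, compute_counts, max_age_seconds=ONE_DAY, clock=time.time):
        self._compute_counts = compute_counts
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so the refresh logic can be tested
        self._counts = {}
        self._computed_at = None

    def refresh(self):
        # The expensive step; a batch job could call this directly
        self._counts = dict(self._compute_counts())
        self._computed_at = self._clock()

    def get(self, feed):
        stale = (
            self._computed_at is None
            or self._clock() - self._computed_at >= self._max_age
        )
        if stale:
            self.refresh()
        return self._counts.get(feed, 0)
```

The cost is that counts can be up to a day out of date, which is usually acceptable for a number like this.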

Storage choices

  • Relational DBs vs. Flat Files (all blog articles are stored as flat files–all 3M articles)
  • RAID vs. Redundant (they ensure blog articles are replicated across all machines–why? if a box goes down, you don’t lose availability of an article)
  • Linux software RAID 1 – rock solid
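
The flat-file and replication choices above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: local directories stand in for separate machines, and the hashed path layout and function names are invented for the example, not taken from Bloglines.

```python
import hashlib
import os


def article_path(root, article_id):
    # Hash the id and fan out into subdirectories so that no single
    # directory ends up holding millions of files (assumed layout)
    digest = hashlib.sha1(article_id.encode("utf-8")).hexdigest()
    return os.path.join(root, digest[:2], digest + ".txt")


def store_article(replica_roots, article_id, text):
    """Write the same article under every replica root."""
    for root in replica_roots:
        path = article_path(root, article_id)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        tmp = path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            f.write(text)
        os.replace(tmp, path)  # rename so readers never see a partial file


def load_article(replica_roots, article_id):
    """Read from the first replica that still has the article."""
    for root in replica_roots:
        try:
            with open(article_path(root, article_id), encoding="utf-8") as f:
                return f.read()
        except OSError:
            continue  # this replica is missing or down, try the next one
    raise KeyError(article_id)
```

Because every replica holds a full copy, a read succeeds as long as any one of them is reachable, which is the availability argument made in the talk.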

Sysadmin choices

  • DNS round robin for web servers – don’t have to worry about setting up a load balancer
  • Hot back-ups for off-line processing – backup every hour
  • Worry about cooling in the co-lo (if you start to have hard drive failures, that’s a good indicator that you might be having cooling problems)
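
For readers unfamiliar with DNS round robin: one hostname is published with several A records, and the DNS server typically rotates their order between queries, so clients spread across the web servers. The Python sketch below only simulates that rotation and shows how a client could list every address for a name. The IP addresses and function names are made up for illustration.

```python
import socket


def resolve_all(hostname, port=80):
    """Return the distinct addresses DNS currently lists for a hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        if sockaddr[0] not in addresses:
            addresses.append(sockaddr[0])
    return addresses


def round_robin_answers(addresses, queries):
    """Simulate a DNS server rotating the answer order on each query."""
    n = len(addresses)
    return [addresses[i % n:] + addresses[:i % n] for i in range(queries)]


# Hypothetical pool of three web servers
pool = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
first_choices = [answer[0] for answer in round_robin_answers(pool, 6)]
# Most clients connect to the first address, so load spreads evenly
```

Plain round robin does no health checking: a dead server stays in rotation until its record is removed.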

Tidbit from Q&A. The size of the company was under ten at the Ask Jeeves acquisition.
