Reasons Unbeknownst

August 3, 2008

PHP MySQL Benchmark Tool (PMBT v. 0.2)

Filed under: Open Source, Technology - Kirk

The old saying “Need is the father of innovation” (or something like that) held true this weekend. I was looking for an easy way to benchmark MySQL for some RAM drive InnoDB experimentation but couldn’t find anything cross platform, user friendly, and created after 2005. So I built an early version of what I was looking for.

This is a very synthetic benchmark for now. In some instances InnoDB is much faster than MyISAM (simultaneous reads/writes) but that doesn’t come across in these results. I’m planning on beefing up the benchmark options in later versions. This tool is currently useful in benchmarking hard drive / RAID performance when using InnoDB. It’s also good for basic my.cnf tweaking.

So here is the PHP MySQL Benchmark Tool (PMBT) version 0.2 (download). You’ll need InnoDB (optional) and MySQLi (might change this) support for now. Tested with MySQL 5.0.51b-community.

Current Features:

  • Easy to use: No compiling, complicated config files, or OS requirements. Web based.
  • Specify your table type (MyISAM, InnoDB, MEMORY)
  • Insert, Select, and Update testing (CRUd)
  • Specify number of records to test
  • Results displayed in operations per second (instead of total seconds)
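
For the curious, the "operations per second" number is just a count divided by elapsed time. Here's a rough sketch of the idea in Python (PMBT itself is PHP, and this is not its actual code). The `ops_per_second` helper and the `fake_insert` workload are made-up names for illustration; a real run would fire an INSERT, SELECT, or UPDATE at the database instead.

```python
import time

def ops_per_second(operation, count):
    """Run `operation` `count` times and return operations per second."""
    start = time.perf_counter()
    for i in range(count):
        operation(i)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on a very coarse clock.
    return count / elapsed if elapsed > 0 else float("inf")

# Stand-in workload (hypothetical); a real benchmark would run a query here.
rows = {}

def fake_insert(i):
    rows[i] = "record %d" % i

rate = ops_per_second(fake_insert, 10000)
print("%.0f inserts per second" % rate)
```

Reporting a rate instead of total seconds means runs with different record counts can be compared directly.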

Future Features:

  • Custom benchmarks on existing tables
  • Aesthetic Rejuvenation
  • Support for non mysqli connections (do a find/replace on the code for now)
  • Full text benchmarking for MyISAM
  • Less Syntheticish Testing (simultaneous reads/writes (AJAX?))

Instructions:

  • Unzip to a web accessible folder.
  • Change the DB settings in dbconnect.inc.php to match your setup.

Let me know in the comments if you find this useful.

And in case you’re curious InnoDB inserts went from 100 per second to 9,955 per second on the RAM drive. MyISAM tables saw virtually no change in performance. The InnoDB log was also on the RAM drive for that benchmark. Full ACID compliance at half the speed of a Memory/Heap table ain’t bad. I’m writing up a post on how I made it happen.

In other news, I added the LastFM widget over to the right, it plays songs based on my preferences from my music library.

March 24, 2008

Solid State Drive RAID0 Vista PC for under $1000?

Filed under: Technology - Kirk

Solid state drives (SSDs) are notoriously expensive beasts. Considering the recent price drops and the growing realization that hard drives are now the last remaining bottleneck in everyday (non-gaming) PC performance, I set out to see if I could put together a Vista-ready PC for under $1000 that will blow the doors off higher-cost (non-gaming) systems. I’m looking for something that I can give to a parent who isn’t interested in Crysis timedemos, a parent who is more likely to complain about computer slowness during virus scans and web browsing than Adobe Premiere inadequacy. In short: 95% of the PC owning public.

Complaints about Solid State Drives mainly focus on the price. That’s still true but if you assume that the everyday user only needs 32GB things start to get interesting. Ignoring the disk subsystem, you can build a hugely capable system, even Vista ready, for a surprisingly low price.
The parts:
The prices below link to NewEgg.

  • Motherboard: $80 – AMD 780G based, includes Vista capable 3D and RAID support.
  • Processor: $33.99 – AMD Sempron 64 2800+ Palermo
  • Case: $55 – InWin with power supply. SSDs don’t put out much heat and the mobo has onboard video and a low watt CPU so cooling isn’t a huge concern.
  • RAM: $36 – 2 Gigabytes of no frills Kingston
  • SSDs: $395 each – 2x16GB Mtron MSD 6000 Solid State Drives.

So you can build a Vista ready machine, with no drives, for $205. Using two MTron 16GB SSDs in RAID 0 (see motherboard specs) to make one 32GB drive we arrive at a final price of $995. If you can avoid tax and get it delivered by a friendly mutant pigeon you’ll have a sub $1k SSD RAID box (software sold separately). You can get a cheaper case and add a DVD burner if you need it to keep it under $1000.
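
If you want to check my math, here's a quick sketch that adds up the parts list. The prices are the ones listed above; the dictionary keys are just labels I made up for the example.

```python
# Prices as listed in the parts list above (US dollars).
base_parts = {
    "motherboard": 80.00,
    "processor": 33.99,
    "case_with_psu": 55.00,
    "ram_2gb": 36.00,
}
ssd_price_each = 395.00
ssd_count = 2

base_total = sum(base_parts.values())
ssd_total = ssd_price_each * ssd_count
grand_total = base_total + ssd_total

print("Driveless system: $%.2f" % base_total)
print("Two SSDs:         $%.2f" % ssd_total)
print("Total:            $%.2f" % grand_total)
```

The driveless system comes to $204.99 and the full build to $994.99, which I rounded to $205 and $995. The SSDs are roughly 80% of the total cost.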

This machine will play basic 3D games (thanks to the 780G chipset) and should even handle BluRay decoding (again, thanks to the 780G) but the big difference in performance is the result of a ridiculous 200 megabytes per second flowing from the SSD RAID array. That compares to 81 megabytes per second from a WD Raptor and less for a typical desktop drive. The latency will also drop from 8.0 ms to 0.1 ms, which contributes to the unusual speed jump when moving to SSDs. In one test Vista boot time dropped from 23.6 seconds with a WD Raptor to 10.1 with a single Mtron SSD.

Reliability:

RAID 0 scares people, and rightly so. If you lose any of your drives you lose all of your data. I have a server running a RAID 0 array in a garage in Pasadena because I needed performance on a budget. And though it’s backed up I get nervous on hot days, wondering if I’m going to have to put in a few hundred miles to get things running again. Solid state drives run cool and they don’t have moving parts so they tend to be more reliable than old style drives. I say tend because they haven’t been around long enough to determine long term reliability.
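
To put a rough number on that fear: if any one drive dying kills the array, the odds of losing the array grow with every drive you add. The sketch below assumes drives fail independently, and the 3% yearly failure chance is a made-up figure for illustration, not a measured one for any real drive.

```python
def raid0_failure_probability(drive_failure_probability, drive_count):
    """Chance that at least one drive fails, assuming independent failures."""
    survive_all = (1.0 - drive_failure_probability) ** drive_count
    return 1.0 - survive_all

# Hypothetical 3% chance of a single drive failing in a year (made-up figure).
p_single = 0.03
for n in (1, 2, 4):
    chance = raid0_failure_probability(p_single, n)
    print("%d drive(s): %.2f%% chance of losing the array" % (n, chance * 100))
```

With two drives the risk is a bit less than double that of a single drive, which is why backups matter more than usual on RAID 0.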

Overall Performance:

I’ve come to believe that it is the duty of wealthy nerds to benchmark systems for the good of the rest of us. Progressive testation if you will. For that reason I’m going to hold off on building this thing for now. My thesis though is that people are putting far too much emphasis on processor speed considering the bottlenecks created by years of stagnant progress in the hard drive market. I want a disk subsystem so fast that the CPU is pegged at 100% most of the time because it’s not waiting for that noisy relic of a storage device to rotate around to the right location.

Conclusion:

Pros:

  • Blazing disk performance due to SSDs in RAID 0 which is important for bootup, virus scanning, web browsing, and overall system snappiness.
  • Low power use thanks to SSDs and less beefy processor
  • Noise free drives
  • No need to defrag – performance not affected by fragmentation due to lack of moving parts

Cons:

  • 32 gigabytes is a bit of a stretch for Vista Ultimate.
  • Lower end components used to keep it sub $1000
  • Not a great media/gaming PC though it’ll do basic games and Blu-Ray decoding if you spring for a drive

Image by gek_at2000 – click here for info and yes, I’m aware that it’s not a SSD.

January 29, 2008

WordPress Blog Migration Thoughts

Filed under: Efficiency, Open Source, Technology - Kirk

If my blog seems slower lately there is a good reason. I migrated from a dedicated quad-core server to a free WordPress.com account and finally to BlueHost.com ($7/month). The following is a quick look at the pros and cons for people in the same boat.

WordPress.com

Pros:

  • Free (myblog.wordpress.com).
  • Allows use of a custom domain (optional – $10 a year).
  • Allows post import from existing blogs using the import/export tool (see caveats below).

Cons:

  • Very limited use of custom templates and plugins. If you have a real WordPress install you’ll need to make some sacrifices.
  • Import tool is slow and confirmation is delayed.
  • Slow at times, no tech support.

Hosted Shared Servers (BlueHost, DreamHost, etc.)

Pros:

  • Allows a full installation of WordPress and other blogs / content management systems.
  • Much better control of the server, typically with something like CPanel or Webmin.
  • Ability to create multiple databases.
  • Use of custom domain (myblog.com).
  • Mail account support (me@myblog.com)

Cons:

  • Cost $7-$10 a month. That’s well worth it in my opinion given the limitations of free solutions.
  • With BlueHost I experience occasional slow downs because I’m using a shared server.
  • Might be confusing if you’re not comfortable working with databases and WordPress config files

Migration Tips:
Do NOT rely on the export tool that comes with the default WordPress install. I backed up using the export tool as well as the WordPress Backup Plugin. It’s a good thing too because the export didn’t work right and I would have lost most of my blog if I didn’t have a backup of the backup.

BlueHost limits your ability to name databases. I ran into a problem where my wp-config.php file needed a small change so WordPress could find my restored DB under its new, BlueHost-compatible name.

I used the WP Backup Plugin tool but the import didn’t work. It dumps a big SQL file which you can import directly into a new DB using the tools provided in CPanel. You’ll then need to install WordPress after you’ve created a new DB and tweak the config to connect to the DB. At that point you’ll be up and running but you’ll need your WordPress API Key to get some of the plugins up and running again.

December 26, 2007

Hard Drive Progress and the Viability of Vista

The slowest, most unreliable, and out-dated hunk of machinery in any computer, Mac, PC, laptop, whatever, is the hard drive. Computers today are like incredibly hi-tech cars driving around on wooden buggy wheels. Millions of dollars are poured into developing faster engines and fancy suspension systems but much of it goes to waste because the wooden wheels can only handle speeds under 30 miles per hour.

Hard drives are basically just shrunken down record players. They are no doubt marvels of engineering but they suffer from an unavoidable need to physically move an arm around, reading and writing data onto a spinning hunk of metal (see video). That’s all changing thanks to the emergence of solid state drives. These are a lot like the USB thumb drives stuck to countless key chains except they’re bigger, faster, and you can install your operating system on them.

Next Level Hardware just reviewed the latest generation solid state drive from MTron and it gives a really interesting glimpse into software performance on computers that are no longer bottlenecked by wooden wheels. I won’t get into the geeky details but the thing basically boots Vista more than twice as fast as the fastest mechanical hard drive. I had Vista on a pretty capable laptop with 2Gigs of RAM and it would take minutes of watching the hard drive light blink before I could actually use the thing. But what if Vista wasn’t such a slug? Would it be worth another look?

These drives are horribly expensive but prices are dropping fast. The NLH review is interesting because it’s now possible to imagine a world where there are no more computer bottlenecks and reliability is no longer a concern because of the durability of the new crop of drives.

If performance and reliability are no longer issues some obvious questions come to mind: Why would anybody ever buy a new computer? Will Vista finally find acceptance once the performance issues are solved by better hardware? If nobody needs to upgrade because even cheap PCs are just good enough will the hardware industry collapse?

I predict that people will continue to buy new computers because the cost will be much lower as we move towards system on chip processors from Intel, etc. near the end of the decade. Got a virus? Just buy a new computer for $40. Storage will be remote if Google keeps its promises so even a computer failure will cease to be a big deal.

I’m still skeptical about Vista. By the time these solid state drives are cheap enough for mainstream use Vista will have been declared dead. There are rumors that MS is working on a really solid new OS that could replace Vista but Linux distributions may be good enough by then that it won’t make sense to pay an extra hundred dollars for a DRM shackled OS. Apple is interesting because standards are making the OS of choice nearly irrelevant but they’re also opening up Apple to competition from Linux. Virtual machines are probably the most interesting development because they will make applications OS agnostic. Want to run Office 07 in Linux or Mac? No problem, just boot XP in a virtual machine and do what you need to do.

Soon computers will be nearly free, performance will be more than anybody can use, at least for today’s apps, and the prospect of spending $150 for an operating system running on a $100 PC seems unlikely. Technically, economically, philosophically, Linux would seem destined for world domination.

December 12, 2007

OpenID + WordPress = Finally Working

Filed under: Culture, Efficiency, Open Source, Technology - Kirk

OpenID is basically an attempt at single sign on. In theory you’ll be able to go to any site and log in or comment without the need to remember 31 passwords and user names.

So far it’s working great. I have it set up so that I just have to type in my blog’s URL, unbeknownst.net, it remembers my password, and I can log in and comment at a growing number of sites. My next goal was to add my blog to said growing cadre of OpenID compatible sites.

The best option currently looks like the WP-OpenID plugin. It was a pretty straightforward install but for some reason it was giving me blank pages when I submitted a comment at first. Seems to be humming along now. Feel free to try it out with your OpenID if you dare. I’ll leave the first comment using my OpenID account.

I allow anonymous comments on this blog so it’s not a big deal but you do get to bypass the spam filter if you’re using OpenID (at least for now).

* Update: Delegation doesn’t work. On other sites I can use my blog URL to log in but this plugin won’t let me do that for some reason. I’m sure they’re going to fix this in an update one of these days.

December 3, 2007

Chomsky’s Bias – Capitalism, Media, and Democracy

I can sum up the point of this whole post in one sentence. If real democracy cannot exist without a free press then real capitalism cannot exist without the Internet.

A few years ago I paid $10 for access to an online discussion board frequented by Noam Chomsky and I asked him a question. “How has the Internet affected your understanding of the media?” His response was “I’m an innocent as far as the internet is concerned. I don’t even know what blogger.com is. Better raise the question with others.”

I haven’t tried to understand his work in linguistics but I think his analysis of the media is brilliant. He is a mesmerizing speaker but I was always disturbed by his notion that capitalism is fundamentally flawed. In “Manufacturing Consent” he talks about a requirement of rational central planning as an alternative to the evils of capitalism.

Here’s what I think was going through his head. He buried himself in the study of media and realized the disastrous consequences of communication monopolies when applied to democracy. Those media monopolies were allowed by capitalism which may explain his fear and loathing. And he was absolutely right at that time. His mistake was failing to predict how fast the Internet would break down the system as it existed when he wrote “Manufacturing Consent”. You get the sense that he assumes alternative media will always be crushed due to the inefficiencies and corruption associated with capitalism. Howard Dean and Ron Paul are living proof that the Internet is starting to allow the spread of previously taboo political beliefs.

Chomsky gets communication. He doesn’t get technology (as evidenced by his reply to my question) and based on his insistence upon “rational central planning” as a way forward it looks like he also doesn’t really understand economics or at least the power of emergent order, which is a bit strange considering his love of democracy. My plan is to bump into him some day so I can ask these questions.

October 23, 2007

IPCOP with Multiple Static IPs (1 to 1 NAT)

Filed under: Open Source, Technology - Kirk

This is going to be another quick and dirty guide for setting up static NAT aka 1:1 NAT aka 1 to 1 NAT on IPCOP.

Commercial grade routers are expensive beasts. A run of the mill Sonicwall Pro Firewall with DMZ support will set you back well over a thousand dollars. They’re nice but you also have to pay an extra monthly fee if you want fancy features enabled.

So you have an old PC lying around and a couple of spare network cards, or maybe you get a Soekris 4801-60 ($260, see photo). And you’re thinking to yourself “Well, IPCOP is a free download, I just saved $1200!”

Not so fast buddy. IPCOP can do most of what my fancy Sonicwall did but you’re going to have to get your hands dirty.

Good Cop:

  • Free
  • Very powerful on a good computer
  • Intrusion detection using Snort (not intrusion prevention)
  • DNS caching, transparent proxy, lots of graphs, basic QOS
  • Basic VPN support
  • Open Source

Bad Cop:

  • Not very intuitive, can be frustrating to install
  • Advanced features require add ons and text file editing
  • Intrusion detection only detects, you need Guardian to beat up the intruders
  • DMZ setup is quirky, no transparent DMZ mode
  • Tech support limited to forums, I used to know the Sonicwall guys on a first name basis

These are the basic steps involved in getting a firewall running including a DMZ for your servers with multiple real IP addresses which will be separate from your green LAN zone. So this is a Red, Orange(DMZ), Green(LAN) setup.
* Set up aliases – This is pretty straightforward, read the manual. You’re basically telling the router about the static IPs associated with your internet connection here (at least those you’re planning on using). Our T1 came with 16 or so.
* Set up port forwarding – Again, just read the manual. Forward port 80 from your virtual IP addresses to your DMZ boxes.

Now you can visit your sites from the outside but if your server connects to an outside site it will appear to be coming from the RED interface on your firewall instead of the alias IP you set up. That’s a problem for a lot of reasons. Email, firewall rules from other machines, etc., start to freak out.

Here’s the fix:
* Turn on ssh (system, SSH status, enable)
* Use putty or whatever, to log into port 222 (not 22). (or just use the router’s keyboard/monitor)
* Edit your /etc/rc.d/rc.firewall.local file. It’s very important to put this in the right place. This should go beneath the line containing start) and above the line containing ;;

/sbin/iptables -t nat -I POSTROUTING -o eth1 -s 192.168.1.2 -j SNAT --to-source 23.23.23.23
Make sure you change eth1 to eth0 or whatever your red NIC is named (go to status, network status and look for the red font). Also, change 23.23.23.23 to whatever external IP you want to use (same IP as used in the alias step above).
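
If you have more than one DMZ box you'll need one of those lines per server. Here's a small Python sketch that prints them from a mapping so you can paste the output into rc.firewall.local. The second address pair and the `snat_rule` helper are made up for illustration; only the iptables line itself comes from the steps above.

```python
import ipaddress

def snat_rule(red_interface, dmz_ip, public_ip):
    """Build the iptables SNAT line for one DMZ host."""
    # ip_address() raises ValueError on a malformed address, catching typos early.
    ipaddress.ip_address(dmz_ip)
    ipaddress.ip_address(public_ip)
    return (
        "/sbin/iptables -t nat -I POSTROUTING -o %s -s %s -j SNAT --to-source %s"
        % (red_interface, dmz_ip, public_ip)
    )

# Hypothetical DMZ address -> alias IP pairs; replace with your own.
one_to_one = {
    "192.168.1.2": "23.23.23.23",
    "192.168.1.3": "23.23.23.24",
}

for dmz_ip, public_ip in sorted(one_to_one.items()):
    print(snat_rule("eth1", dmz_ip, public_ip))
```

Generating the lines this way mostly saves you from copy/paste typos, which are painful to debug on a firewall.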

Reboot. Test it. On your DMZ server install lynx, the text based browser. For Ubuntu that’s sudo apt-get install lynx

At the command line type lynx -dump whatismyip.com
That should spit out something like Your IP is 23.23.23.23
If your DMZ server is running Windows just visit whatismyip.com in a web browser. It should return the IP address of the alias IP you configured earlier and NOT the IP address of your router’s red interface.

*note IPCOP will not let you simply set your DMZ servers up with real IP addresses and use transparent DMZ mode like you can with the Sonicwalls. You have to put them on a subnet separate from the red interface and port forward.

*note2 – Because IPCOP can’t support transparent DMZ mode you have to set the gateway for your DMZ boxes to the DMZ IP address of your IPCOP and not the one provided by your ISP.

*If you want to turn on intrusion prevention read the following words I found at Snort.org

Just to clarify, guardian is not an addon to ipcop, it’s just a program that read snort files and modifies the linux firewall using iptables. Once you get a good sample of your network traffic viewing snort logs, you should get a general idea of what to enable/disable in the SNORT rules. To test it, just run a port scan to the device, and then try to go into the internet from the same device. To make extra sure the blocking is done, you can vi the iptables file in /etc. You should see the ip that’s blocked. It’s not hard to set it up. What’s harder is to configure rules in snort and the ignore list of guardian.

October 17, 2007

Houndwire Private Launch

Filed under: Media, Technology - Kirk

You can get a sneak peek at the site I’ve been working on by clicking on the logo and entering woofwoof as the password. We’re going to launch publicly in the not too distant future.

October 10, 2007

Houndwire Private Beta

Filed under: Economics, Gamut, Law, Random Thoughts, Technology - Kirk

There’s a saying about startups. That you have to work 12 hour days. Well it’s all a lie. It’s more like 14 hour days. I’m really, really in the zone right now though. My brain’s hanging in there but my eyes are nearing their limit. Visine is keeping me in the hunt.

I’m presenting the site to a bunch of investors on Monday. I’m going to stand up in front of executives and talk for 20 minutes. Normally I don’t like talking in public but I can ramble on about journalism and technology for hours at a stretch.

Artist is working on a logo for tomorrow. Looking for security holes. Fixing bugs.

Right now HoundWire.com is password protected but if you’re interested just leave me a message at
kirk@YOURPANTSabinventio.com Just be sure to remove YOURPANTS first (that’s a creepy anti spam technique FYI).

Looks like I might have a patent app in the works for an advertising idea I had this afternoon. Some lawyers are looking into it.

New RadioHead album is definitely worth the price, whatever it is. Also listening to M.I.A. Imagine Bjork in a jungle hunting Gwen Stefani down with an AK-47 and you’ll get the gist of it.

I can say, without hesitation, that I’ve learned more in the last few months than during any other point in my life.

Here’s some marketing stuff I’m working on for HoundWire.

We aim to provide a replacement for the newspaper and an outlet for all local writers and journalists. Owning a printing press shouldn’t be a requirement for communicating with other citizens. The front page of any given community consists of news voted on by the users. You can also vote on comments so if someone is consistently insightful it will be apparent.

Finally, a brilliant quote from Tim O’Reilly

“Alas, I find the Web 3.0 arguments as clear evidence that the proponents don’t understand Web 2.0 at all. Web 2.0 is not about front end technologies. It’s precisely about back-end, and it’s about meaning and intelligence in the back end.”

Click the photo for credits.

October 7, 2007

Paul Graham Was Right, Here’s how to get started on that server.

Filed under: Economics, Efficiency, Predictions, Technology - Kirk

As Paul Graham pointed out, the cost of starting a web based business is approaching zero. As someone who’s starting up a .com I can say that the cost has definitely not reached zero yet but Mr. G’s point remains correct. The new barrier to entry is knowledge. Servers may be cheap but a lack of knowledge means you’ll need to bring in expensive consultants to set up your server and network. My goal here is to help the person with an idea and some basic coding skills to get a simple but fast server up and running.

I’m going to cut to the chase for those short on time (it’s a 2800 word post). In order of effectiveness, here are the tweaks I made to my setup to increase performance:

  • Index tables – easy to do, massive speedup
  • Try switching your tables from MyISAM to InnoDB. This is not a sure thing and there are tradeoffs but it really sped things up for me
  • Turn on Apache’s mod_deflate. My pages dropped in size from 100KB to roughly 17KB. Faster page loads, less bandwidth.
  • Tune your my.cnf file. Out of the box MySQL assumes you have a very slow server.
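
Here's a tiny illustration of why the index tip sits at the top of the list. To keep it runnable anywhere I'm using Python's built-in SQLite instead of MySQL, so treat it as a sketch of the idea, not of my actual setup. The `comments` table and its columns are invented for the example. On MySQL the equivalent check would be running EXPLAIN on your slow query before and after CREATE INDEX.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, story_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments (story_id, body) VALUES (?, ?)",
    [(i % 500, "comment %d" % i) for i in range(20000)],
)

def query_plan(connection):
    """Return SQLite's plan text for a lookup by story_id."""
    rows = connection.execute(
        "EXPLAIN QUERY PLAN SELECT body FROM comments WHERE story_id = ?", (42,)
    ).fetchall()
    # The human-readable plan is the last column of each row.
    return " ".join(row[-1] for row in rows)

plan_before = query_plan(conn)
conn.execute("CREATE INDEX idx_comments_story ON comments (story_id)")
plan_after = query_plan(conn)

print("before:", plan_before)
print("after: ", plan_after)
```

Before the index the plan is a full table scan; afterwards the database looks the rows up through the index instead of reading all 20,000 of them.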

All said and done, without changing the software, I dropped page load times from .4 seconds to about .04 seconds. Making the thing run 10x faster using more hardware would have been rather moronic.
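
To get a feel for why mod_deflate helps so much, here's a sketch that runs the same kind of compression (zlib's DEFLATE) over some made-up, repetitive HTML. The page content is invented for the example, so the exact ratio won't match my 100KB to 17KB result, but markup-heavy pages tend to shrink a lot.

```python
import zlib

# Made-up, repetitive HTML standing in for a real page.
row = (
    '<div class="story"><a href="/story/%d">Story number %d</a>'
    '<span class="votes">%d votes</span></div>\n'
)
page = (
    "<html><body>\n"
    + "".join(row % (i, i, i * 3) for i in range(500))
    + "</body></html>\n"
)
raw = page.encode("utf-8")

# Level 6 is a middle-of-the-road speed/size tradeoff.
compressed = zlib.compress(raw, 6)

ratio = len(compressed) / len(raw)
print("original:   %d bytes" % len(raw))
print("compressed: %d bytes" % len(compressed))
print("compressed size is %.0f%% of the original" % (ratio * 100))
```

The tradeoff is a little CPU time per request in exchange for fewer bytes on the wire, which is usually a good deal when the CPU is sitting idle waiting on disks anyway.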

If Web 2.0 is referred to as the Read/Write Web then Web 1.0 should be remembered as the Read Web. The implications for what type of server you’ll need are huge. You could get away with slow drives back then because your rarely changing data would probably be cached in RAM. If you look at a site like Digg it’s another story. Hundreds of people are writing comments, submitting stories, voting on stories, and engaging in various other activities that can be logged to improve the site.

(Updated thoughts: If the CPU again becomes the bottleneck then optimization will become arguably more important. You’ll just optimize for the CPU instead of IO. My application has a quality dial. I can scale back the quality of the results depending on the load on the server.)
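
A quality dial like that can be sketched in a few lines. This is not my application's code; the function name, thresholds, and limits below are all made up to show the shape of the idea: return fewer results as load per CPU climbs.

```python
def results_limit(load_average, cpu_count, full_limit=100, minimum_limit=10):
    """Shrink the number of results as load per CPU rises from 0.5 to 2.0."""
    load_per_cpu = load_average / cpu_count
    if load_per_cpu <= 0.5:
        return full_limit
    if load_per_cpu >= 2.0:
        return minimum_limit
    # Linear slide between the two thresholds.
    fraction = (2.0 - load_per_cpu) / 1.5
    return int(minimum_limit + fraction * (full_limit - minimum_limit))

for load in (0.5, 5.0, 12.0):
    print("load %.1f on 4 CPUs -> up to %d results" % (load, results_limit(load, 4)))
```

On Linux the load figure could come from something like `os.getloadavg()`, and the returned number could feed a LIMIT clause.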

There are lots of good bits of knowledge about the various steps involved in building a LAMP server scattered throughout the web but nothing I’ve found really compiles the information into a usable guide. This post is going to focus on server hardware and networking. If this post gets a decent response I’ll take my book of notes about software and turn it into a software HOWTO.

Just a little background. I’m not the best software engineer nor a hardware expert. The site I’m launching, HoundWire.com, is really about journalism and geography (that’s ‘hyperlocal content aggregation’ if you’re a hipster). The fact that I was able to build a prototype without bringing in expensive consultants and expensive hardware probably didn’t hurt my cause when it came to getting funded.
I’m going to assume that you’re building a database driven web site using Linux-Apache-MySQL-PHP. LAMP is a good place to start if you’re not a computer scientist and just want a working prototype.

Hardware:

8 Gigs of RAM now costs in the neighborhood of $300. Dual core processors are fast, cheap, and getting cheaper. Run of the mill PCs are equipped with stupendously fast server grade PCI-Express slots. For under a thousand dollars you can put together a seriously fast system with one major shortcoming. The storage system.

From chapter 6 of the High Performance MySQL book:

http://dev.mysql.com/books/hpmysql-excerpts/ch06.html

The fundamental battle in a database server is usually between the CPU(s) and available disk I/O performance; we’ll discuss memory momentarily. The CPU in an average server is orders of magnitude faster than the hard disks. If you can’t get data to the CPU fast enough, it must sit idle while the disks locate the data and transfer it to main memory…

This all means that the first bottleneck you’re likely to encounter is disk I/O. The disks are clearly the slowest part of the system. Like the CPU’s caches, MySQL’s various buffers and caches use main memory as a cache for data that’s sitting on disk. If your MySQL server has sufficient disk I/O capacity, and MySQL has been configured to use the available memory efficiently, you can better use the CPU’s power.

IOPS are a good measure of disk performance on databases. The average consumer grade drive can handle 100-150 IOPS. One 15,000 RPM Seagate Savvio is in the 300s. My Raptor RAID array can probably handle 400+. Now consider that a good CPU can handle nearly 100,000 IOPS. Super expensive RAM based drives are useful because they eliminate the drive bottleneck.
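
If you're wondering where numbers like that come from, a common back-of-envelope estimate for a single spinning drive is one operation per (average seek time + half a rotation). The sketch below uses that formula. The seek times are illustrative assumptions, not vendor specs, and real drives can beat the estimate thanks to command queuing and caching, so treat the output as a ballpark only.

```python
def estimated_iops(rpm, average_seek_ms):
    """Rough random IOPS for one spinning drive: 1 / (seek + half a rotation)."""
    rotational_latency_ms = 0.5 * 60000.0 / rpm
    service_time_ms = average_seek_ms + rotational_latency_ms
    return 1000.0 / service_time_ms

# Seek times below are illustrative assumptions, not vendor specifications.
for label, rpm, seek_ms in (
    ("7,200 RPM desktop drive", 7200, 8.5),
    ("10,000 RPM drive", 10000, 4.5),
    ("15,000 RPM drive", 15000, 3.0),
):
    print("%s: about %d IOPS" % (label, estimated_iops(rpm, seek_ms)))
```

The takeaway is that a mechanical drive is stuck in the low hundreds no matter how you slice it, because the platter can only spin so fast.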

If your database rarely changes then disk IO is much less of an issue because most of your data will be cached in system memory anyway. But newish websites like Digg or Reddit have constantly updating discussions in their comments sections. If you want to harness the brain power of the masses you’re going to need a setup that can write that information to a disk at some point.

Desktop PCs were rarely used as servers in the past because they were limited by the PCI bus. Gigabit network cards and a RAID array could easily swamp the meager bus. Now we’re blessed with PCI-Express and the difference between a desktop and server has more to do with reliability than performance. Reliability is nice but you can save a ton of money if you don’t need it.

The bottleneck on your fledgling system probably isn’t going to be your quad core CPU. If you’re squeezing Apache and MySQL into the same box to keep costs low you’ll need to have a good storage setup. So that’s where I’ll begin.

Drives:

http://www.texmemsys.com/files/f000164.pdf Understanding IOPS

Hard drives are like sports cars. If you take a 4000 horsepower funny car to the Nurburgring it’ll probably lap slower than a Mazda Miata. A drive that boots Vista in 9 seconds may not be very good at randomly writing to a database, which is what you’re probably going to be doing. Drives that specialize in high load database activity, like the much heralded Seagate Savvio, can be had for around $350. That will only get you 36 Gigabytes but you have to ask yourself: is your web app really going to need a terabyte of storage? You can argue that more RAM will solve any problem but at some point user input has to be written to the database.

Most consumer grade PCs do not have SAS connectors but you can get a PCI-Express SAS adapter for under $200. So for under $600 you can turn your funky desktop PC with a PCI Express slot into a pretty darned powerful database server. You can add a drive and turn it into a RAID 0 array (72GB) for around a grand. SAS drives are designed for enterprise use and so are much more reliable than re-purposed desktop drives under heavy load. In other words your RAID 0 array will last longer.

I went the cheap route and put two WD Raptors (SATA instead of SAS) in a RAID 0 array using a 3Ware RAID card. I’ll know within a few months whether or not my idea is going to take off so I’m more concerned with speed and cost than reliability. Whatever you do, don’t plug a SAS drive into a SATA port. I tried once with an adapter cable and fried the southbridge. You can plug a SATA drive into a SAS port though.

It could be argued that your RAID card should be the most expensive component in your server.

(Random anecdote) I once set up two 15K RPM SAS drives in RAID 0 for a mass ghosting operation and we easily saturated the gigabit switch. The photo of that very drive (3.5”) is still used in the Wikipedia SAS article.

Back up frequently and if a drive goes you can just reinstall on the 2nd drive. Doubling the number of drives to make it RAID 10 may be a bit pricey especially considering you won’t see a performance increase. Compared to RAID 0, RAID 5 will slow you down and cost more but you lose the single point of failure.
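
Here's a quick sketch of what you get for your money at each RAID level, using 36GB drives like the Savvio mentioned above. The `raid_summary` helper is a made-up name, and it only covers the textbook definitions of RAID 0, 5, and 10 (usable space, and how many drive failures the array is guaranteed to survive).

```python
def raid_summary(level, drive_count, drive_size_gb):
    """Return (usable_gb, guaranteed_drive_failures_survived)."""
    if level == 0:
        if drive_count < 2:
            raise ValueError("RAID 0 needs at least 2 drives")
        return drive_count * drive_size_gb, 0
    if level == 5:
        if drive_count < 3:
            raise ValueError("RAID 5 needs at least 3 drives")
        # One drive's worth of space goes to parity.
        return (drive_count - 1) * drive_size_gb, 1
    if level == 10:
        if drive_count < 4 or drive_count % 2:
            raise ValueError("RAID 10 needs an even number of drives, 4 or more")
        # Every drive is mirrored, so half the raw space is usable.
        return (drive_count // 2) * drive_size_gb, 1
    raise ValueError("unsupported RAID level: %r" % level)

for level, drives in ((0, 2), (5, 3), (10, 4)):
    usable, survives = raid_summary(level, drives, 36)
    print("RAID %d with %d drives: %dGB usable, survives %d failure(s)"
          % (level, drives, usable, survives))
```

All three layouts land on the same 72GB here, so the real question is how many extra drives you're willing to buy for the safety net.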

Motherboard/Case

I bought an ASUS based barebones kit. I’m happy that it can boot from the PCI-Express slot, not happy that it doesn’t support all of my RAM. The ASUS P3-P5G33 might be a better bet if you want support for more than 3Gigabytes of RAM. I’m not saying this is the best option but it works and it’s fairly inexpensive.

Un-interruptible Power Supply

Get a UPS, you’ll sleep better. Power surges aren’t the real issue. We have a big AC unit that kicks in and dims the lights. Mini brownouts ala Sim City. You don’t want that stress on your server. Also, get a Kill-A-Watt or something similar so you can see the power draw and don’t accidentally exceed the rated battery capacity of the UPS.

Cost – On a budget

Barebones PC – $200
4Gigs of RAM – $170
CPU – $230
UPS – $100
------------
Subtotal: $700

Storage System:
SAS/SATA RAID card for the PCI-Express Slot – $300
Two Raptors or one Seagate Savvio – $350
------------
Subtotal: $650

=====================
Total: $1,350

Cost – Expensive, bang for buck

SAS/SATA RAID card for the PCI-Express Slot – $300
2 x Hyperdrive4s – $8,800 (32GB and 77,000? IOPS)
------------
Total: $9,100
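
The "bang for buck" label makes more sense if you divide cost by IOPS. This sketch uses the prices above plus two rough figures from this post: my guess of 400+ IOPS for the Raptor array and the 77,000 IOPS number for the Hyperdrives, which I flagged with a question mark. Both are estimates, so the result is only a rough comparison.

```python
def dollars_per_iops(cost, iops):
    """Storage cost divided by the random I/O operations it can deliver."""
    return cost / iops

# Budget storage: RAID card + two Raptors, with the post's "400+" IOPS guess.
budget_cost = 300 + 350
budget_iops = 400

# RAM drive storage: RAID card + two Hyperdrive4s, with the post's "77,000?" figure.
ram_drive_cost = 300 + 8800
ram_drive_iops = 77000

budget = dollars_per_iops(budget_cost, budget_iops)
ram_drive = dollars_per_iops(ram_drive_cost, ram_drive_iops)

print("Raptor RAID 0: $%.2f per IOPS" % budget)
print("Hyperdrive4s:  $%.2f per IOPS" % ram_drive)
```

If those estimates are anywhere close, the expensive option costs more than ten times less per IOPS, even though the sticker price is fourteen times higher.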

Some Predictions:
* RAM based SSD drives are going to get more popular as people realize that IOPS are more important than drive capacity or peak read performance in web servers.
*Someone will eventually release a RAM Based storage system with SATA-2 support that doesn’t require ECC memory. At that point, for under $2,000, you’ll be able to build a 32GB RAM drive with something like 80,000 IOPS and transfer rates of more than 220 MB/s, without the need for a RAID card.
* People will stop complaining about how slow Vista is. Maybe Microsoft should release this hypothetical device so bloatware truly no longer matters.
* Once that happens hard drives will find a niche as mass storage devices.
* Flash based SSDs will remain hugely popular in mobile devices and laptops due to low power consumption.
* Power supplies will start coming with built in batteries which can power your volatile memory if the power shuts off.
* As performance becomes nearly free reliability will become a more important factor when buying a server.
* LAMP setups will become more proactive in tuning themselves. This will remain application dependent but a my.cnf generator based on your system specs will probably emerge. The bottleneck will become application design.
* Caching systems will vanish due to their complexity as drives cease to be the bottleneck (except maybe in very large systems).
* As performance ceases to be an issue we’ll see all sorts of interesting new applications. Web 2.0 is/was about harnessing the knowledge of the masses, and that’s hard to do if your database server chokes when it’s trying to write to the DB.
* Database performance consulting will become unnecessary because even sloppy code will run quickly. If your page loads in .004 seconds instead of .02 seconds nobody will care.
* I get the feeling Canonical (the Ubuntu people) will eventually sell prebuilt LAMP servers. Fonality used to give away Trixbox as a Linux distribution but they recently started selling a pre-built phone “appliance” in addition to support.
* Future hard drives will be large PCI-Express cards populated with bunches of 4GB memory modules running at the speed of the PCI-Express bus. Current RAID cards have upgradeable memory modules for caching purposes. What if you had a virtual hard drive the same size as the RAM cache on the card? (ed. I was close on this one. Check out the video)

Some Random Thoughts:
* The Gigabyte RAM disk was pretty popular and insanely fast but no follow-up was ever released in spite of consumer demand. Possibly because you could build an insanely fast storage system with a few of them in RAID, which would completely devastate the enterprise hard drive market in a few short months. Yeah, that sounds like a conspiracy theory, but whenever I write about the HyperDrive4 I get a bunch of blog visitors from web mail accounts.

If Houndwire.com gets some traffic I’m going to push for some help adding new features and maybe a HyperDrive.

Networking:

Connectivity – T1s cost about $400 a month but you get 1.5 megabits upstream. Downstream isn’t great, but you want to send out web pages, not download mp3s, right? Use someone else to host your images if you have a lot of them. No sense clogging up your T1 with big jpegs. You can host from a residential internet connection while you’re prototyping. Just be prepared to update DNS and configure port forwarding (and violate TOS). I used DD-WRT on a spare Buffalo to create a wireless bridge and hosted my little LAMP laptop at home for a while, unbeknownst to the guy whose name is on the bill.
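To get a feel for why big jpegs hurt, here is a rough back-of-the-envelope calculation in Python. The 1.5 megabit figure is from above; the page sizes (50 KB for a text-heavy page, 500 KB for one loaded with images) are assumptions picked for illustration, and the math ignores protocol overhead, so treat the results as a best case.

```python
def pages_per_second(upstream_mbps, page_kb):
    """Best-case pages served per second over a link.

    upstream_mbps: link speed in megabits per second
    page_kb: total page weight in kilobytes (1 KB = 1000 bytes here)
    """
    bytes_per_second = upstream_mbps * 1_000_000 / 8
    return bytes_per_second / (page_kb * 1000)


T1_UPSTREAM_MBPS = 1.5   # from the text above

# Hypothetical page weights, for illustration only.
light_page = pages_per_second(T1_UPSTREAM_MBPS, 50)    # mostly HTML
heavy_page = pages_per_second(T1_UPSTREAM_MBPS, 500)   # lots of jpegs

print(f"50 KB pages:  {light_page:.2f} per second")
print(f"500 KB pages: {heavy_page:.3f} per second")
```

Under those assumptions, pushing the images to another host buys roughly ten times as many page views out of the same line.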


Use solid 10/100 switches instead of fast but potentially flaky gigabit switches, especially for DMZ switches where you’re not going to exceed a few Mb/s anyway. Don’t make your own cables unless you’re a masochist. You can get a multitude of colors cheap from NewEgg.com or TigerDirect to stay organized. Have a couple of crossover cables handy for computer to computer connections (IPCop -> Server).

Firewall – I’m using IPCop because we had a spare PC lying around. IPCop is like DD-WRT on crack without the wireless settings. Smoothwall is a cousin of IPCop and a little easier to get going for the beginner, but I prefer IPCop (you don’t have to register to read the manual). If you’ve had a big fancy corporate firewall before, IPCop will disappoint you. My problem was putting the web server on the DMZ with a real IP address. IPCop’s ORANGE zone can NOT handle using an IP on the same subnet as the RED. There are ways to hack it by putting the DMZ IPs in the RED zone and port forwarding to ORANGE, but even then you’re only addressing the one IP problem. I’m going to look into Devil-Linux, which is apparently better suited to non-residential configurations with multiple real IPs. The alternative is paying $700 for a 3-port SonicWall. DIY may not be worth your time, especially if you don’t have a spare PC to cannibalize.

Ideally you’d find an old laptop and put a cheap solid state drive in it for reliability. Old ToughBooks are great if you can find one. I once bought a flash to IDE converter and installed IPCop on a camera-sized memory chip. No moving parts means it’s cooler and less likely to fail. It’s still running as far as I know.

Software

For the sake of simplicity get Ubuntu Server Edition. It has a LAMP option when you’re installing, which saves a lot of time. Get the 64-bit version unless your CPU is too old to support it.

Don’t be afraid to break stuff in the short run. If you have to reinstall your LAMP stack a few times it’s just practice for down the line when your overheating Raptor RAID 0 array bites you.

MySQL Performance Tuning
Add the code for benchmarking.

Stuff I’m running:
Apache 2
PHP 5
MySQL 5
64-bit Linux kernel v. 2.6

SSH
Samba
Webmin
Webalizer
phpMyAdmin

Here’s a video of an overcaffeinated guy with an equally hype dog explaining the basics.

The Linux top command is your friend. You can watch MySQL gobble up less and less CPU time as you optimize things.

If Ubuntu Server Edition ever comes with all of that stuff installed in addition to a functional LAMP stack they’ll have a winner. They could sell support to fledgling startups using Red Hat’s model.

Config Files and performance tuning:
Databases:
You can spend days tweaking my.cnf but the big gains early on come from making sure you’re indexing tables properly. If you’re running a query like “SELECT names FROM people WHERE age > 40” make sure you have an index on the age column.
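Here is a small, self-contained way to see what the index does. To keep it runnable without a database server, this sketch uses Python’s built-in sqlite3 module instead of MySQL, so the query plan wording is SQLite’s, not MySQL’s. The table contents and the index name idx_people_age are made up for the demo.

```python
import sqlite3


def query_plan(conn, sql):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the fourth column of each plan row.
    return " | ".join(row[3] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (names TEXT, age INTEGER)")

# Made-up sample data: 1000 people with ages 0 through 89.
conn.executemany(
    "INSERT INTO people (names, age) VALUES (?, ?)",
    [(f"person{i}", i % 90) for i in range(1000)],
)

query = "SELECT names FROM people WHERE age > 40"

# Without an index the database has to scan the whole table.
plan_before = query_plan(conn, query)

# Index name is hypothetical; call it whatever you like.
conn.execute("CREATE INDEX idx_people_age ON people (age)")

# With the index it can jump straight to the matching rows.
plan_after = query_plan(conn, query)

print("Before index:", plan_before)
print("After index: ", plan_after)
```

On MySQL the same idea applies: CREATE INDEX works with the same syntax, and putting EXPLAIN in front of the SELECT shows whether the index is being used.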
