04 January 2007

The Grid in My Basement, part 1

The name of the game is parallelism. In short: take apart a problem, break it up into independent pieces, and run as many of those pieces as possible at once on separate computers (well, at least on separate CPUs). This is nothing new...parallel computing has been around since Cro-Magnon Geek solved problems by dropping boxes of punch cards bearing almighty FORTRAN into card-reading machines and then lurking impatiently in the line printer room for pages of greenbar while multi-gazillion-dollar CPUs the size of refrigerators cogitated about his fast Fourier transforms and whatnot. What is compelling in THIS century is that you can do it cost effectively. OK, not even cost effectively - downright cheap.
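
As a toy illustration of "break it up into independent pieces and run them at once", here is a minimal Python sketch. The function names (`split`, `sum_of_squares`, `run_parallel`) and the sum-of-squares workload are made up for this example. It uses a thread pool only to keep the sketch short; for CPU-bound work you would more likely pass in `concurrent.futures.ProcessPoolExecutor`, or hand the pieces to separate machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_pieces):
    """Break a list into at most n_pieces chunks of near-equal size."""
    size = max(1, -(-len(data) // n_pieces))  # ceiling division, never zero
    return [data[i:i + size] for i in range(0, len(data), size)]


def sum_of_squares(chunk):
    """The work done on one piece; it needs nothing from any other piece."""
    return sum(x * x for x in chunk)


def run_parallel(data, n_workers, executor_cls=ThreadPoolExecutor):
    """Split the problem, run the pieces concurrently, then combine the results."""
    pieces = split(data, n_workers)
    with executor_cls(max_workers=n_workers) as pool:
        partials = list(pool.map(sum_of_squares, pieces))
    return sum(partials)
```

Calling `run_parallel(list(range(10)), 4)` splits the ten numbers into four chunks, squares and sums each chunk independently, and adds up the partial results. The important property is that no chunk depends on any other, which is what lets the pieces run on separate CPUs or separate boxes.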

In its cheapest form, parallel computing has become nearly free. Witness the Elastic Compute Cloud over at Amazon (buzz kill: EC2 is in beta and they aren't accepting new users at the moment; double plus bad: the largest number of machines you can have is something like 20), where you can rent time on virtual machines for cents per hour. In its more expensive forms parallel computing is, well, still pricey. If you're a high-energy physicist or a financial type with a big grid and you are running name-brand hardware, you are putting up some mighty big dollars.

So I find myself - as an employed practitioner wanting to test my employer's new software, and as an entrepreneur designing and trying new concepts in search of the Next Big Geeky Thing - wanting to have access to my very own grid. Thus is born the idea of the Grid in my Basement, or as I prefer to call it: The Data Basement.

My plan is to build and deploy a basic grid capable of doing real work in a cost-effective manner. I want to accomplish this with some fairly real-world parameters, so I need real computing horsepower. So after hours and hours of combing through CPU specs, motherboard specs, CPU cooler specs (naw, I would NEVER overclock...), power supply specs and the like, I now have boxes and boxes of cool stuff en route from my good friends at Newegg (what self-respecting techno-weenie doesn't love Newegg?). Over the next few weeks I'll write about the Data Basement as it gets built out and evolves into something useful. With any luck I'll also publish a tutorial with photos about rolling your own rackables.

Budgetary Issues and Physical Plant

I need to build my Data Basement on a budget. After all, I still need to pay the mortgage, feed the family, buy Hoosiers for the racecar, and pay for all the electricity my spiffy new grid will need. I arbitrarily chose a budget in the range of $3,000 - $3,500. The grid will live in the basement, and I want to save space there, so I will use a single standard rack that I picked up a while back. The rack will sit on a wooden platform I'll build from scrap wood (cheap protection in case of minor flooding). Since my basement has reasonably high humidity, I'll put in a dehumidifier plumbed to a waste line to keep humidity levels under control. The grid will pull around 1.4 kW (about 13 amps @ 110 VAC), so there will not be a need for any special electrical work. I want UPS support on the grid eventually, but initially I'll just use some spike filters on each AC line. I already have broadband via DSL with a wired/wireless router, but I'll need a gig switch with several ports so that the grid nodes can communicate with each other at speed.
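
The amps figure above is just watts divided by volts. A quick Python sketch of that arithmetic follows; the function names are made up for this example, and the 15 amp circuit rating is an assumption for illustration only, not something measured in my basement.

```python
def current_draw_amps(watts, volts):
    """Current drawn by a load: I = P / V."""
    return watts / volts


def fits_on_circuit(watts, volts, circuit_amps):
    """True if the load's current draw is within the circuit's rated amps."""
    return current_draw_amps(watts, volts) <= circuit_amps


GRID_WATTS = 1400      # estimated grid draw from the post (about 1.4 kW)
LINE_VOLTS = 110       # household line voltage used in the post
ASSUMED_CIRCUIT = 15   # ASSUMPTION: a 15 amp circuit, for illustration only

draw = current_draw_amps(GRID_WATTS, LINE_VOLTS)
print(f"Estimated draw: {draw:.1f} A")
print("Fits on assumed circuit:", fits_on_circuit(GRID_WATTS, LINE_VOLTS, ASSUMED_CIRCUIT))
```

1400 W at 110 V works out to roughly 12.7 A, which is where the "about 13 amps" comes from. Swap in your own wattage estimate and circuit rating to see how much headroom you have.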

The next posting will discuss parts selection for the individual compute nodes.
