17 January 2012
CENSORED
10 February 2010
Hadoop HDFS: Deceived by Xciever
It's undocumented. It's misspelled. And without it your (insert euphemism for "really big" here) Hadoop grid won't be able to crunch those multi-hundred gigabyte input files. It's the "xciever" setting for HDFS DataNode daemons, and the absence of it in your configuration is waiting to bite you in the butt.
raised java.io.EOFException, with a traceback flowing down into DFSClient.
<property>
  <name>dfs.datanode.max.xcievers</name>
  <value>4096</value>
</property>
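To confirm the setting is actually present on a DataNode, a small script can parse the configuration and report the value. This is a minimal sketch, assuming the property sits in an hdfs-site.xml style file wrapped in a configuration element. The helper names (get_property, check_xcievers), the sample text, and the 4096 threshold (borrowed from the value above) are illustrative and not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Sample configuration text; in practice this would be read from the
# DataNode's config file (the exact path depends on your install).
SAMPLE_CONF = """<configuration>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>4096</value>
  </property>
</configuration>"""


def get_property(conf_xml, name):
    """Return the value of the named property as a string, or None if absent."""
    root = ET.fromstring(conf_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


def check_xcievers(conf_xml, minimum=4096):
    """Classify the xciever setting as 'missing', 'low', or 'ok'."""
    value = get_property(conf_xml, "dfs.datanode.max.xcievers")
    if value is None:
        return "missing"
    return "ok" if int(value) >= minimum else "low"


if __name__ == "__main__":
    print(check_xcievers(SAMPLE_CONF))
```

Running a check like this across every DataNode's configuration is a cheap way to catch the missing setting before a large job does.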
Finally, why is the default value for this setting so low? Why not have it default to a more reasonable value like 2048 or 4096 out of the box? Memory and CPU are cheap; chasing infrastructure issues is expensive.
31 December 2008
Down Time == Productive Time
Back in my college days - long enough ago that mainframe computing was still the rage - I discovered the pleasure of The Holiday Break. Classes were over, everybody went home, the university was largely empty, and as a result you could grab all the computing time you wanted, and you could work for hours undisturbed. The Christmas holiday was always the best.
03 June 2008
Visible Measures wins 2008 MITX Technology Award!
Visible Measures (my current gig) won an award tonight at the Massachusetts Innovation and Technology Exchange (MITX) 2008 What's Next Forum and Technology Awards. We were recognized in the Analytics and Business Intelligence category - the same category that Compete (my old company) was entered in last year.
A lot of great companies were finalists in our category, including Progress Software, salary.com, SiteSpect, and Lexalytics. This was tough competition, which made winning this award all the more sweet. A big shout out to Version 2 Communications as well - we were their guests at the awards.
Visible Measures is an awesome company, with an extremely hard-working, highly motivated team. I am extremely proud and humbled to be part of this company.
The MITX event was very nice. There were plenty of opportunities to network with a lot of interesting people doing a lot of cool stuff. It was great to listen to Larry Weber (Chairman of the Board for MITX and founder of W2 Group) host the awards and dispense free advice ("...with 37 offices worldwide - that's too much overhead..."). MITX honored Amar Bose, who gave a very interesting talk. Bose is legendary - at least in the New England high tech community and particularly within MIT, so hearing him speak live is a privilege.
The only downside to the evening was the fire alarm going off mid-way through the ceremony. This led to a rather awkward pause in the action while the fire department made sure nothing was wrong.
22 April 2008
Hadoop Summit Slides
A few weeks ago I went to California for the Hadoop Summit. I posted a bunch of notes in real time during the summit until the network connection became too flaky to continue.
The Yahoo folks have come to the rescue, however. The slides from the presentations, which are tons better than my notes, are freely available online here. There are also slides from the Data Intensive Computing Symposium, which was held the next day.
I wish I had known about the Data Intensive symposium, as it looks really, really interesting (not to mention an excuse to stay in California one more day...).
10 April 2008
Infrant/NetGear ReadyNAS NV+
Just picked up one of these last week. The plan is to use the box as a shared storage resource to back up family data (pictures, etc.), and to back up other systems, and the grid machines in the rack.
I was originally going to build a box to handle the task, but a friend of mine recommended the ReadyNAS server as a cost-effective (and less labor-intensive) alternative. This box is basically plug-and-play...the operating system is delivered in firmware, and you configure and operate the box via a web interface and with a program called RAIDar. The box speaks a variety of protocols and can talk to Windows, Linux, Macs, and streaming media players, so it should get along well with all the servers, workstations, etc.
I bought a diskless version and populated it with 2x500G Western Digital drives. Initially nothing worked, and for a brief time I thought the server was DOA. After a bunch of trial and error I concluded that one of the WD drives was DOA. I brought the box up on 1 drive, configured things, and it just worked. NewEgg RMA'd the bad drive (and even gave me a free shipping label to send the bad device back...good stuff).
I've got 2 more 500G drives arriving tomorrow - the box is hot-pluggable so in theory installation is simple. It should be interesting to get the box up to 2T with X-RAID and do some performance testing.
Product reviews of the ReadyNAS have been widely varied, but so far all things look positive. I'll post more about the box once I get my bad drive issues sorted out...