
Hadoop Management: How to Recover From Hadoop Cluster Failures

  
  
  

There are three major reasons Hadoop is so popular in commercial IT: it can handle big data, it's open source, and it's very robust. But anything built by people will eventually fail, however infrequently, and Hadoop clusters are no exception. Now what?

To understand how to recover a failed cluster, it's important to understand why it fails. Hadoop may be very robust, but it has one very big, well-documented single point of failure: the NameNode. The NameNode stores the metadata, the data about the data, which includes file owners, permissions and the mapping of files to blocks. If the NameNode crashes, the data still exists, but there's no map for finding it, so it can't be used. Hadoop 1.0 doesn't support automatic recovery if the NameNode fails. The Hadoop management staff at Yahoo!, which runs one of the largest implementations of Hadoop, has learned that NameNodes typically fail for three reasons: misconfiguration, network issues and misbehaving clients. Hardware failure is lower on the list.

But in the event of failure, all is not lost. NameNode metadata is persisted in two files, the FsImage and the edit log, and the Hadoop management staff keeps two copies of each on separate hard disks. So, when the worst happens, the IT staff has several options for recovering a bad cluster.

Better End-of-File Validation

The big problem with edit log files is that they include extra space, or padding, at the end. This padding makes it difficult to tell where the real data stops, because a corrupted region can look just like the end of the file. One approach is to look for the OP_INVALID marker, but that marker is a single byte, so random corruption can easily produce one and be mistaken for the end of the log. A more reliable approach is to locate the last transaction ID written to the file. Once that last transaction is found, the end of the file can be verified.
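The idea can be illustrated with a minimal sketch. It does not use the real HDFS edit log format: the record layout (a one-byte opcode, an eight-byte transaction ID and a four-byte payload length), the padding byte 0xFF and every name below are assumptions made for the example. The scan checks that transaction IDs are consecutive and that everything after the last record really is padding.

```python
import struct

# Assumptions for this toy format only, not the real HDFS edit log layout.
OP_INVALID = 0xFF                  # byte used for the end marker and padding
HEADER = struct.Struct(">BQI")     # opcode, transaction ID, payload length


class CorruptLogError(Exception):
    """Raised when the log does not end cleanly."""


def encode_record(opcode, txid, payload=b""):
    """Serialize one record in the toy format."""
    return HEADER.pack(opcode, txid, len(payload)) + payload


def find_log_end(data):
    """Return (end_offset, last_txid) for a padded log, or raise CorruptLogError."""
    offset = 0
    last_txid = None
    # Walk records until the first padding byte or the end of the buffer.
    while offset < len(data) and data[offset] != OP_INVALID:
        if offset + HEADER.size > len(data):
            raise CorruptLogError(f"truncated header at offset {offset}")
        _opcode, txid, length = HEADER.unpack_from(data, offset)
        if last_txid is not None and txid != last_txid + 1:
            raise CorruptLogError(
                f"txid jumped from {last_txid} to {txid} at offset {offset}")
        end = offset + HEADER.size + length
        if end > len(data):
            raise CorruptLogError(f"truncated payload at offset {offset}")
        last_txid = txid
        offset = end
    # A single marker byte is not trusted: the whole tail must be padding.
    if any(byte != OP_INVALID for byte in data[offset:]):
        raise CorruptLogError(f"non-padding bytes after offset {offset}")
    return offset, last_txid
```

With this check, a stray 0xFF byte in the middle of the log is reported as corruption instead of being accepted as the end of the file. The caller can also compare the returned transaction ID with the last one it expects.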

HDFS Fsck

The traditional fsck utility examines an offline disk, checks the file system structure and repairs what it can. HDFS fsck is different: it runs against a live cluster and reports files with missing, corrupt or under-replicated blocks. Its limitation is that it checks only file data, not metadata, so it can't help when the NameNode's own metadata is damaged.
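A check like this is easy to script. The sketch below builds the command line and, optionally, runs it. The helper names are hypothetical. The `hadoop` launcher is assumed to be on the PATH, which matches Hadoop 1.x, while later releases typically use `hdfs fsck` instead.

```python
import subprocess


def build_fsck_command(path="/", launcher="hadoop", show_files=True, show_blocks=True):
    """Build the fsck argument list; helper name and defaults are illustrative."""
    cmd = [launcher, "fsck", path]
    if show_files:
        cmd.append("-files")    # list each file that is checked
    if show_blocks:
        cmd.append("-blocks")   # include the block report for each file
    return cmd


def run_fsck(path="/", launcher="hadoop"):
    """Run fsck and return its text report; needs a configured Hadoop client."""
    result = subprocess.run(
        build_fsck_command(path, launcher),
        capture_output=True,
        text=True,
        check=False,   # leave interpretation of the exit code to the caller
    )
    return result.stdout
```

Keeping the command construction separate from its execution means the arguments can be reviewed or logged before anything touches the cluster.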


Manual NameNode Metadata Recovery

If the Hadoop management staff configures HDFS correctly, it will protect the metadata better than a local file system does, because the NameNode can write multiple copies of its metadata to separate storage directories. When one copy is damaged, a manual recovery can start from a copy that is still intact.
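Before a manual recovery, an operator might first check whether the redundant copies of a metadata file still agree. The following sketch groups copies by SHA-256 digest. The function names and the example paths are hypothetical, and the sketch only compares file contents; it does not validate the FsImage format itself.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    sha = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            sha.update(chunk)
    return sha.hexdigest()


def group_copies_by_digest(paths):
    """Group copies by digest; missing or unreadable copies are listed under None."""
    groups = defaultdict(list)
    for path in map(Path, paths):
        try:
            digest = file_digest(path)
        except OSError:
            digest = None
        groups[digest].append(path)
    return dict(groups)
```

For example, calling `group_copies_by_digest(["/disk1/name/fsimage", "/disk2/name/fsimage"])` on two hypothetical copies returns a single group if they are identical. More than one group, or an entry under `None`, means the copies have diverged or one is missing, and someone has to decide which copy to trust.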

Another point of Hadoop cluster failure is MapReduce, whose re-execution mechanism is supposed to improve reliability. For each map task, it allocates a few backups that run only if that task fails. This sounds good, but the hitch is that the backups must be configured by hand based on the reliability of the system. IT has to make a bet that balances the risk of allocating too little against the cost of allocating too much.

All the methods described above require time and skilled staff. As for time, financial applications can't afford to go offline: no processing means no business. As for Hadoop management staff, time spent hand-tooling the fixes described above is time that could have been spent on application development, for example. The alternative is to use a commercial Hadoop management package. Whatever this package costs, the overall Hadoop implementation will still be dramatically less expensive than an end-to-end commercial system, both in dollars and staff.

Comments

Wow! This is another home run. Thanks so much for the post and keep them coming. This is very valuable information and I'd love to read more!
Posted @ Sunday, August 05, 2012 4:55 PM by Brian Johnson