Badcaps.net Forum
Old 07-03-2019, 01:56 PM   #1
CapLeaker
Leaking Member
 
Join Date: Dec 2014
City & State: Atlantic Canada
My Country: Canada
Line Voltage: Ground, 0Hz
I'm a: Hobbyist Tech
Posts: 4,173
Default RAID5 failure: 2 bad HDD's at the same time

Well, I guess I ran out of luck and shit hit the fan all right at home, ugh!
I am running a WD PR4100 16TB NAS. All was good until I noticed occasional slow file transfers. I ran an HDD test and drive 2 failed. O.K., no problem: order a new drive, replace the drive and rebuild the RAID5. Easy, right? Well, not so fast. Wouldn't you know it, drive 3 failed about halfway through rebuilding the RAID5 array onto drive 2.

Question is: How do I recover all the files or recover the RAID?
Old 07-03-2019, 03:03 PM   #2
Curious.George
Badcaps Veteran
 
Join Date: Nov 2011
Posts: 1,590
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by CapLeaker View Post
Well, I guess I ran out of luck and shit hit the fan all right at home, ugh!
I am running a WD PR4100 16TB NAS. All was good until I noticed occasional slow file transfers. I ran an HDD test and drive 2 failed. O.K., no problem: order a new drive, replace the drive and rebuild the RAID5. Easy, right? Well, not so fast. Wouldn't you know it, drive 3 failed about halfway through rebuilding the RAID5 array onto drive 2.
This is a known problem with RAID -- especially with larger drives! The time it takes to rebuild the array represents a sizeable window in which a second failure can eat your lunch...

Of course, the "cost" (window of vulnerability) of rebuilding the failed drive will vary (e.g., RAID5 being more expensive than RAID1, since a RAID5 rebuild has to read every surviving drive while a RAID1 rebuild only reads the mirror).

Last edited by Per Hansson; 07-10-2019 at 10:14 AM.. Reason: fixed quote
Old 07-03-2019, 03:55 PM   #3
eccerr0r
Solder Sloth
 
Join Date: Nov 2012
City & State: CO
My Country: USA
Line Voltage: 120VAC 60Hz
I'm a: Hobbyist Tech
Posts: 4,412
Default Re: RAID5 failure: 2 bad HDD's at the same time

As always, RAID is not a backup.
The question is, how bad are the drives? If you pull them up on their own on a PC (DON'T WRITE TO THEM!), can you at least read a few bytes? Can you get the SMART information?

If you have two drives completely dead, you're probably SOL. If just one is dead and one has a few bad sectors, depending on your NAS firmware you may be able to recover something...unfortunately I don't have any experience with WD's RAID, just Linux mdraid...
Old 07-03-2019, 05:04 PM   #4
ChaosLegionnaire
HC Overclocker
 
Join Date: Jul 2012
City & State: Singapore
My Country: Singapore
Line Voltage: 240VAC 50Hz
I'm a: Hobbyist Tech
Posts: 1,618
Default Re: RAID5 failure: 2 bad HDD's at the same time

that's why i don't buy nas boxes off the shelf. they are typically composed of a homogeneous set of drives, which means the drives have a tendency to all fail at the same time! talk about very convenient planned obsolescence there! i'm sure the companies that make these nas boxes couldn't care less either if it means more buying and more money!

therefore, i prefer to diy my own nas and pick drives with different platter density technologies, different numbers of heads etc. so they fail at different times instead.

do what eccerr0r said. for me, i fire up linux, pull the smart data, see how many pending, uncorrectable and reallocated sectors there are, and run gnu ddrescue to pull as much data off the bad drive as possible if it's still accessible. if the drive is bricked, it is totally inaccessible and detected by neither the bios nor the os.

if the drive is bricked and the data is critical, send it to a data recovery company. the recovery fee could run to thousands of dollars.
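
As a rough illustration of the "pull the smart data" step, here is a minimal Python sketch that picks the pending, uncorrectable and reallocated sector counts out of an attribute table of the kind `smartctl -A /dev/sdX` prints. The sample table and its values are made up for the example, and the attribute names are the ones smartctl commonly reports; some drives label them differently, so treat this as a sketch rather than a general parser.

```python
import re

# Attributes singled out in the post above; the names are the labels
# smartctl commonly uses (an assumption, some vendors differ).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "Offline_Uncorrectable")

def parse_smart_attributes(text):
    """Return {attribute_name: raw_value} for the watched attributes."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            # RAW_VALUE is the 10th column; keep only its leading integer.
            match = re.match(r"\d+", fields[9])
            if match:
                counts[fields[1]] = int(match.group())
    return counts

# Hypothetical sample output, not taken from any drive in this thread.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       12
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       3
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
"""

print(parse_smart_attributes(SAMPLE))
```

Non-zero pending or reallocated counts are the usual hint to image the drive before doing anything else with it.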
Old 07-03-2019, 05:59 PM   #5
eccerr0r
Default Re: RAID5 failure: 2 bad HDD's at the same time

I've been able to successfully reassemble my Linux md-RAID5 arrays that were destroyed by two disk failures, but there's no guarantee that the data I pull off is accurate. However I was able to get a good portion of the data off after the failure.

Which reminds me, I need to back up my array again soon...
Old 07-03-2019, 06:41 PM   #6
CapLeaker
Default Re: RAID5 failure: 2 bad HDD's at the same time

Well, drive 2 is FUBAR. Won't read from it, period. Not sure why. Drive 3 has some bad sectors, but I was able to get the important stuff off the RAID5, so that is good. However, I am not able to recover the whole RAID array. But that is O.K., I kept too much junk anyway.
Old 07-03-2019, 10:37 PM   #7
Curious.George
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by ChaosLegionnaire View Post
therefore, i prefer to diy my own nas and thus pick drives with different platter density technologies and different number of heads etc. so they would fail at different times instead.
They still have many things in common: the hardware/software that's implementing the array, power supply, thermal experience, software that is accessing the array, etc.

I prefer to trade robustness for convenience -- I only spin up a drive when I'm accessing its contents. If that content is munged, then I have to consider how much of the other content may be at risk. Or, if the box that I'm using to access that drive may, instead, be the culprit.

[Software/firmware/clients/apps/PEBKAC have been known to be buggy]

As I don't expect to encounter problems, when/if I do, it gives me a moment to think about what's happening before I propagate a failure (to other copies of the data).
Old 07-04-2019, 12:03 AM   #8
Uranium-235
Muffins
 
Join Date: Aug 2007
City & State: tehas
My Country: US
Line Voltage: 120VAC 60Hz
I'm a: Professional Tech
Posts: 3,246
Default Re: RAID5 failure: 2 bad HDD's at the same time

this is why, for large arrays, raid 6 is a better idea: it survives two failed drives instead of one
__________________
Cap Datasheet Depot: http://www.paullinebarger.net/DS/
^If you have datasheets not listed PM me
Old 07-04-2019, 12:24 AM   #9
diif
Badcaps Veteran
 
Join Date: Feb 2014
City & State: Midlands
My Country: England
I'm a: Professional Tech
Posts: 4,276
Default Re: RAID5 failure: 2 bad HDD's at the same time

This is why RAID IS NOT BACKUP.
Old 07-04-2019, 12:53 PM   #10
Stefan Payne
Badcaps Veteran
 
Join Date: Dec 2009
City & State: Northern Germany
My Country: Germany
Line Voltage: 230VAC/50Hz or 400VAC/3P/50Hz
I'm a: Knowledge Seeker
Posts: 1,217
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by CapLeaker View Post
Well, I guess I ran out of luck and shit hit the fan all right at home, ugh!
I am running a WD PR4100 16TB NAS. All was good until I noticed occasional slow file transfers. I ran an HDD test and drive 2 failed. O.K., no problem: order a new drive, replace the drive and rebuild the RAID5. Easy, right? Well, not so fast. Wouldn't you know it, drive 3 failed about halfway through rebuilding the RAID5 array onto drive 2.
You _NEVER EVER EVER_ do that!
If a drive in a RAID array fails, you build a new array and copy the contents from the old one to the new one while the old one still works. Start with the most important things.

Also, RAID is NOT a replacement for a BACKUP!

So all you can do right now is clone the drives and hope you have everything you need, then rebuild the RAID with the new drives...
Old 07-05-2019, 07:36 PM   #11
CapLeaker
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by Stefan Payne View Post
You _NEVER EVER EVER_ do that!
If a drive in a RAID array fails, you build a new array and copy the contents from the old one to the new one while the old one still works. Start with the most important things.

Also, RAID is NOT a replacement for a BACKUP!

So all you can do right now is clone the drives and hope you have everything you need, then rebuild the RAID with the new drives...
Interesting... So you are saying to clone the bad HDDs in the RAID 5 array with Clonezilla to a new drive and put it back into the array? I thought the array knows the HDD by serial number or something, so it would detect it as a "new" drive?

No, I've lost nothing important, and that is a good thing. I do have a few offline HDDs. Some of the stuff on the RAID array was so old that this gives me a chance to clean up my file storage. Rather than copying everything and deleting the stuff I no longer want, I just reversed it by copying only the stuff I want. This gives me more space.
Old 07-05-2019, 07:40 PM   #12
CapLeaker
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by Uranium-235 View Post
this is why for large arrays, raid 6 is a better idea
that is what I am aiming for, something where 2 drives can fail. Anyone tried the SHR2 from Synology?
Old 07-05-2019, 10:42 PM   #13
Curious.George
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by CapLeaker View Post
that is what I am aiming for, something where 2 drives can fail. Anyone tried the SHR2 from Synology?
Note that you don't need a second "disk failure" -- a URE (unrecoverable read error) during the rebuild will effectively render a RAID5 with a failed disk "broken". Make sure your NAS is doing patrol reads of the entire array, lest you discover that URE when you can least afford it!
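
To see why a single URE is enough, here is a small Python sketch of single-parity (RAID5-style) reconstruction. It is a toy model with one stripe, four "drives" and made-up 4-byte blocks, not how any particular NAS lays out its data. The missing block is the XOR of all the others, so if any one of the others cannot be read either, there is nothing left to rebuild from.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild_missing(stripe):
    """Rebuild the one unavailable block (marked None) in a stripe.

    Raises ValueError unless exactly one block is unavailable, which is
    the only case single parity can repair.
    """
    missing = [i for i, block in enumerate(stripe) if block is None]
    if len(missing) != 1:
        raise ValueError(f"{len(missing)} blocks unavailable, single parity repairs exactly 1")
    return xor_blocks([block for block in stripe if block is not None])

# Toy stripe: three data blocks plus their parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Drive 2 is gone: its block comes back from the survivors.
print(rebuild_missing([data[0], None, data[2], parity]))

# Drive 2 is gone AND drive 3 returns a read error for this stripe.
try:
    rebuild_missing([data[0], None, None, parity])
except ValueError as err:
    print("cannot rebuild:", err)
```

A patrol read amounts to attempting every one of those reads while all drives are still present, so a bad sector can be repaired from parity before a drive failure removes that option.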
Old 07-07-2019, 02:44 AM   #14
Stefan Payne
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by CapLeaker View Post
Interesting... So you are saying to clone the bad HDD's in the RAID 5 array with clonezilla to a new drive and put it back into the array?
It's worth a try.
You might want to clone the other HDDs as well or move them immediately over to a new RAID Array.

Quote:
Originally Posted by CapLeaker View Post
I thought the array knows the HDD by serial number or something, so it would detect it as a "new" drive?
No, that should be written on the drive itself, in the MBR or wherever the NAS stores it.

Anyway, rule of thumb:
If one drive in a RAID array dies, do not rebuild it. Back up your data and move it over to another array!

Because when all the drives are the same make/model, other drives failing is highly likely.
Old 07-08-2019, 09:06 AM   #15
CapLeaker
Default Re: RAID5 failure: 2 bad HDD's at the same time

Cloning the HDD with Clonezilla didn't work for me.
Old 07-09-2019, 07:03 AM   #16
Curious.George
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by CapLeaker View Post
Cloning the HDD with Clonezilla didn't work for me.
Without knowing how (and WHERE!) the particular NAS stores the array configuration data on the drive, there's no way of knowing if CZ will even SEE it as "data". CZ cheats by only copying the portions of the drive that it KNOWS to contain data (i.e., by understanding file systems and other common disk structures). This lets it skip over the parts of the medium that it thinks are "empty" -- otherwise CZ would take as long as a bytewise copy operation.

(Watch CZ in action and you will see how the throughput changes over the course of the operation)

You may have to resort to a bytewise copy to be sure you are preserving all of the "stuff that matters" -- to your NAS!

And, you're still stuck with the highly likely URE interfering with that operation -- the U stands for "unrecoverable" -- without the benefit of the redundant drives to compensate for it.

16TB = 128,000,000,000,000 bits = 1.28 x 10^14 bits. Assume a URE rate of 1 in 10^14 bits read, and a full pass over 16TB expects about 1.28 UREs...
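
The arithmetic above can be carried one step further with a short Python sketch. It assumes every bit fails independently at the quoted rate, which is a simplification (real read errors tend to cluster), so the result is a ballpark figure and not a prediction for any particular drive.

```python
import math

def p_at_least_one_ure(bits_read, ure_rate=1e-14):
    """Probability of hitting one or more UREs while reading bits_read bits.

    Computes 1 - (1 - ure_rate) ** bits_read using log1p/expm1 so the
    tiny per-bit rate is not lost to floating-point rounding.
    """
    return -math.expm1(bits_read * math.log1p(-ure_rate))

bits = 16e12 * 8                  # 16 TB (decimal) expressed in bits
expected_ures = bits * 1e-14      # expected number of UREs in one full pass
probability = p_at_least_one_ure(bits)

print(f"bits read:         {bits:.3g}")
print(f"expected UREs:     {expected_ures:.2f}")
print(f"P(at least 1 URE): {probability:.2f}")   # roughly 0.72
```

Under those assumptions a complete read of 16TB hits at least one URE nearly three times out of four. A drive specified at 1 in 10^15 brings the same calculation down to roughly 12%.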
Old 07-12-2019, 08:12 PM   #17
CapLeaker
Default Re: RAID5 failure: 2 bad HDD's at the same time

That's why I thought it's not possible. I have to wait for some drives. Prime Day is coming and I need a shitload of HDDs and a new NAS.
Old 07-13-2019, 01:34 PM   #18
Curious.George
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by CapLeaker View Post
That's why I thought it's not possible. I have to wait for some drives. Prime Day is coming and I need a shitload of HDDs and a new NAS.
dd(1) should clone the drive completely (there may be some issues with portions of the MBR under some OS's).

Of course, now you're faced with the time it takes to read the entire medium.

And, the real possibility that dd(1) will encounter a URE somewhere along the way (you'll have to sort out what "value" should be substituted for the "unknown" value, in that case).
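
One common answer to the "what value should be substituted" question is zeros, which is what `dd conv=noerror,sync` does for blocks it cannot read. The Python sketch below shows the idea on an in-memory stand-in for a failing drive. `FlakySource` and the 4-byte block size are invented for the demonstration; a real job is better left to dd or GNU ddrescue.

```python
import io

def clone_with_zero_fill(src, dst, total_size, block_size=4096):
    """Copy src to dst block by block, zero-filling blocks that fail to read.

    Returns the list of offsets that had to be zero-filled, so the damaged
    regions can be identified afterwards.
    """
    bad_offsets = []
    offset = 0
    while offset < total_size:
        want = min(block_size, total_size - offset)
        try:
            src.seek(offset)
            chunk = src.read(want)
        except OSError:
            chunk = b""
            bad_offsets.append(offset)
        # Pad failed or short reads so later data keeps its original offset.
        dst.write(chunk.ljust(want, b"\0"))
        offset += want
    return bad_offsets

class FlakySource(io.BytesIO):
    """Hypothetical failing drive: reads starting at a bad offset raise OSError."""

    def __init__(self, data, bad_offsets):
        super().__init__(data)
        self.bad_offsets = set(bad_offsets)

    def read(self, size=-1):
        if self.tell() in self.bad_offsets:
            raise OSError("simulated unreadable sector")
        return super().read(size)

src = FlakySource(b"aaaa" + b"bbbb" + b"cccc", bad_offsets={4})
dst = io.BytesIO()
bad = clone_with_zero_fill(src, dst, total_size=12, block_size=4)
print(bad, dst.getvalue())
```

Keeping the list of zero-filled offsets matters as much as the copy itself, since it is the only record of which parts of the clone are not real data.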

ISTR CZ has an option to just fall into dd(1) mode (instead of trying to understand the filesystem's structure)...?
Old 07-14-2019, 09:25 AM   #19
CapLeaker
Default Re: RAID5 failure: 2 bad HDD's at the same time

I can clone it with dd or Clonezilla no problem, but my NAS sees it as a new HDD.
Old 07-14-2019, 11:43 AM   #20
Curious.George
Default Re: RAID5 failure: 2 bad HDD's at the same time

Quote:
Originally Posted by CapLeaker View Post
I can clone it with dd or Clonezilla no problem, but my NAS sees it as a new HDD.
If it is truly cloning the entire media surface, then the NAS must have some NVRAM in which it stores data from drive inquiry commands. E.g., I track drives in my "disk sanitizer" by storing the serial number, model number, etc. from the drive inquiry in a large database. So, when I next encounter the drive (e.g., when I install an OS image), I know its history.

Usually, the drive is used to store this stuff (in a special partition or in the "unused" area right after the MBR).
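
As one concrete example of on-disk array metadata: Linux md (mentioned earlier in the thread) with version 1.2 metadata keeps its superblock 4 KiB into each member device, beginning with the magic number 0xa92b4efc stored little-endian. Whether the WD firmware uses md at all is not established anywhere in this thread, so the Python sketch below is only a way to test that guess against a dd image of a member partition. The image path is whatever file you pass in.

```python
import struct

MD_MAGIC = 0xA92B4EFC     # Linux md superblock magic number
V1_2_OFFSET = 4096        # md v1.2 metadata starts 4 KiB into the member device

def has_md_v1_2_superblock(path):
    """Return True if the file at path carries the md magic at the v1.2 offset.

    The offset is relative to the array member (often a partition), not
    necessarily to the start of the whole disk.
    """
    with open(path, "rb") as image:
        image.seek(V1_2_OFFSET)
        raw = image.read(4)
    return len(raw) == 4 and struct.unpack("<I", raw)[0] == MD_MAGIC
```

A negative result proves little: older md metadata versions sit elsewhere (version 1.0 lives near the end of the device), and a vendor format would not match at all.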

Regardless, this is one of the ways RAID f*cks you; had that been a "regular" disk, you could have thrown it in another machine and accessed its contents like normal (losing whatever part of the disk that may be afflicted with UREs).

If you've already written off the data (as lost), you could try to recover the contents using one of the Windows/Linux tools that claim to be able to do so. At the very least, it will be a learning experience (and COULD yield positive results).

Google "raid recovery" (and, please, report on any results!)

Last edited by Curious.George; 07-14-2019 at 11:45 AM..
All times are GMT -6.