[Cialug] Replacing failed RAID 1 drive

David Champion dchamp1337 at gmail.com
Thu Oct 2 14:14:27 CDT 2014


That'll buff right out.

-dc

On Wed, Oct 1, 2014 at 5:46 PM, Rob Cook <rdjcook at gmail.com> wrote:

> That's a failed drive right there... when I got home tonight it was making
> an unholy screaming racket. The last pic looks to be where the head had been
> machining a groove in the platter by the spindle. There are chunks on the
> platter and the entire drive is dirty.
>
> Popped in the new drive and it booted up fine. I copied the partition scheme
> over from the existing drive, rejoined the disk to the RAID via mdadm, and
> the array should be rebuilt sometime in the next 4-6 hours. Next month I'll
> get another new drive (I went from 1.5TB to 3TB on the replacement), fail
> out the remaining old drive, and replace it with another 3TB. Then expand
> the LVM and *magic* twice the space... I hope ;)
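>
> The expansion is the part I haven't done before; from what I've read it
> should go roughly like this once both members are full-size partitions on
> the 3TB drives (names are just examples, and the last step assumes an ext4
> filesystem):
>
>         mdadm --grow /dev/md0 --size=max    # let the array use the larger members
>         pvresize /dev/md0                   # tell LVM the physical volume grew
>         lvextend -l +100%FREE /dev/vg1/lv1  # grow the logical volume into the new space
>         resize2fs /dev/vg1/lv1              # grow the filesystem to match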
>
> On Wed, Oct 1, 2014 at 10:38 AM, Rob Cook <rdjcook at gmail.com> wrote:
>
> > Well, after today I may be able to give that demo. Let's hope that it all
> > works properly.
> >
> > On Wed, Oct 1, 2014 at 8:39 AM, Matthew Nuzum <newz at bearfruit.org> wrote:
> >
> >> This would make a great topic for a LUG meeting sometime. Bring a box in,
> >> set it up with RAID, then fail and swap a drive.
> >>
> >> It's one of those things we all know we should do, and many of us do it,
> >> but the actual hands-on experience of dealing with a failed drive is not
> >> nearly as common. (I've never done it with software RAID, only a
> >> cold-swap with a hardware RAID controller.)
> >>
> >> On Tue, Sep 30, 2014 at 5:35 PM, Rob Cook <rdjcook at gmail.com> wrote:
> >>
> >> > "When you say “LVM on top of RAID” I assume you mean something like
> >> this:
> >> >         /dev/sd* (physical block devices) => md0 (mdadm array) => pv1
> >> (LVM
> >> > “physical” volume) => vg1 (LVM volume group) => lv1 (LVM logical
> >> volume) =>
> >> > /mnt/foo (filesystem)"
> >> >
> >> > Yes, like that exactly.
> >> >
> >> > Ok, off to buy a new drive.
> >> >
> >> > On Tue, Sep 30, 2014 at 5:25 PM, Zachary Kotlarek <zach at kotlarek.com>
> >> > wrote:
> >> >
> >> > >
> >> > > On Sep 30, 2014, at 3:07 PM, Rob Cook <rdjcook at gmail.com> wrote:
> >> > >
> >> > > > I have a CentOS 6.5 box with two 1.5TB drives in a RAID 1 with LVM
> >> > > > partitions on top of that. One of the drives, /dev/sdb, has failed.
> >> > > >
> >> > > > I've been googling quite a bit and I think that I should be OK
> >> > > > following this guide:
> >> > > >
> >> > > > http://www.howtoforge.com/replacing_hard_disks_in_a_raid1_array
> >> > > >
> >> > > > Fail and then remove the drive from the array, replace it with a
> >> > > > similar or larger one, then recreate. The one question I have is
> >> > > > what to do with the LVM partitions? Naively they should recreate
> >> > > > given this is a RAID 1, so it's the same data on both drives and I
> >> > > > shouldn't have to worry. Or is that too simplistic a view?
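> >> > > >
> >> > > > If I'm reading the guide right, the mdadm side boils down to
> >> > > > something like this (device and array names are just examples, not
> >> > > > necessarily my exact layout):
> >> > > >
> >> > > >         mdadm --manage /dev/md0 --fail /dev/sdb1
> >> > > >         mdadm --manage /dev/md0 --remove /dev/sdb1
> >> > > >         # swap in the new disk, copy the partition table from sda
> >> > > >         sfdisk -d /dev/sda | sfdisk /dev/sdb
> >> > > >         mdadm --manage /dev/md0 --add /dev/sdb1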
> >> > >
> >> > >
> >> > > When you say “LVM on top of RAID” I assume you mean something like this:
> >> > >         /dev/sd* (physical block devices) => md0 (mdadm array) =>
> >> > >         pv1 (LVM “physical” volume) => vg1 (LVM volume group) =>
> >> > >         lv1 (LVM logical volume) => /mnt/foo (filesystem)
> >> > >
> >> > > If that’s the case then the LVM physical volume and everything higher
> >> > > in the stack has no idea that you’re swapping disks and doesn’t need
> >> > > to be told anything.
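> >> > >
> >> > > If you want reassurance once the rebuild finishes, checking each layer
> >> > > should show everything unchanged (the names below are just the examples
> >> > > from the diagram above; substitute your own devices):
> >> > >
> >> > >         cat /proc/mdstat     # md0 active, both members present, [UU]
> >> > >         pvs                  # physical volume still on /dev/md0
> >> > >         vgs                  # vg1 unchanged
> >> > >         lvs                  # lv1 unchanged
> >> > >         df -h /mnt/foo       # filesystem still mounted and the same size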
> >> > >
> >> > > —
> >> > >
> >> > > On a related note, sometimes mdadm commands that reference physical
> >> > > devices, like this:
> >> > >         mdadm --manage /dev/md1 --fail /dev/sdb2
> >> > > will fail with an error like:
> >> > >         No such device: /dev/sdb2
> >> > > because the file /dev/sdb2 no longer exists (because the disk is dead
> >> > > or pulled).
> >> > >
> >> > > But you still need to tell mdadm about it so it can update the array.
> >> > > Instead you should use the short name:
> >> > >         mdadm --manage /dev/md1 --fail sdb2
> >> > > or whatever other device name shows up when you ask mdadm about the
> >> > > array or look at /proc/mdstat. That bypasses any device-file lookup
> >> > > and uses the references that mdadm tracks internally.
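> >> > >
> >> > > For example (md1 and sdb2 are just placeholders; use whatever members
> >> > > your array actually reports):
> >> > >
> >> > >         cat /proc/mdstat                       # failed members show up like sdb2[1](F)
> >> > >         mdadm --detail /dev/md1                # lists each member and its state
> >> > >         mdadm --manage /dev/md1 --fail sdb2    # mark it failed by short name
> >> > >         mdadm --manage /dev/md1 --remove sdb2  # then drop it from the array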
> >> > >
> >> > >         Zach
> >> > >
> >> > >
> >>
> >>
> >>
> >> --
> >> Matthew Nuzum
> >> newz2000 on freenode, skype, linkedin and twitter
> >>
> >> ♫ You're never fully dressed without a smile! ♫
> >>
> >
> >
>
>
>

