Recently I set up a test server: the system is installed on two disks in a RAID1 mirror, and the data lives on a RAID5 array built from the remaining three disks. The concern was whether the system would still boot properly, or boot at all, after a disk failure.
I found the following write-up online.
Purpose of the experiment: test failure recovery with the 3.0.5 root partition installed on a software RAID1 partition, as a workaround for the original server host's lack of hardware RAID support.
Test environment: a VMware guest OS with two 2048 MB disks; the 3.0.5 system is split into three areas (/boot, /, swap), each installed on a software RAID1 partition, laid out as follows:
Disk /dev/sda: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 fd Linux raid autodetect
/dev/sda2 14 78 522112+ fd Linux raid autodetect
/dev/sda3 79 261 1469947+ fd Linux raid autodetect
Disk /dev/sdb: 2147 MB, 2147483648 bytes
255 heads, 63 sectors/track, 261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sdb1 * 1 13 104391 fd Linux raid autodetect
/dev/sdb2 14 78 522112+ fd Linux raid autodetect
/dev/sdb3 79 261 1469947+ fd Linux raid autodetect
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
522048 blocks [2/2] [UU]
md2 : active raid1 sdb3[1] sda3[0]
1469824 blocks [2/2] [UU]
md0 : active raid1 sdb1[1] sda1[0]
104320 blocks [2/2] [UU]
Filesystem Size Used Avail Use% Mounted on
/dev/md2 1.4G 775M 651M 55% /
/dev/md0 99M 7.3M 87M 8% /boot
/dev/md1 is used as the swap partition.
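For reference (the original transcript does not show the commands themselves), the array and filesystem status above is what these commands report on the freshly installed system:
root@localhost ~# cat /proc/mdstat
root@localhost ~# df -h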
Simulating the failure: power off the virtual machine, then delete the disk from the virtual machine configuration so that the original scsi0:0 device is gone.
Recovering from the failure:
At this point the disk being booted from is the original system's /dev/sdb, but a default 3.0.5 installation only writes the GRUB boot code to the MBR of the first disk, so the system will not boot. A production server would not hit this problem right away, of course, since a single failed disk does not bring down a running system.
To avoid that downtime, remember to install GRUB onto the second disk as soon as the system installation is finished:
root@localhost ~# grub
grub> install (hd0,0)/grub/stage1 d (hd1) (hd0,0)/grub/stage2 p (hd0,0)/grub/grub.conf
After rebooting, the system comes up normally.
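If remembering the raw install syntax is awkward, an equivalent approach in the GRUB legacy shell (a sketch, not taken from the original write-up) is to map the second disk to (hd0) for the session and let setup locate the stage files itself:
root@localhost ~# grub
grub> device (hd0) /dev/sdb
grub> root (hd0,0)
grub> setup (hd0)
grub> quit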
Partition the newly added disk to match the existing layout of /dev/sda.
First check the existing partition layout:
root@localhost ~# fdisk -l /dev/sda
Disk /dev/sda: 2147 MB, 2147483648 bytes
255 heads,63 sectors/track,261 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Device Boot Start End Blocks Id System
/dev/sda1 * 1 13 104391 fd Linux raid autodetect
/dev/sda2 14 78 522112+ fd Linux raid autodetect
/dev/sda3 79 261 1469947+ fd Linux raid autodetect
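As a shortcut (not part of the original steps), the partition table of the surviving disk can also be cloned in one command with sfdisk, assuming the replacement disk really is /dev/sdb and is at least as large as /dev/sda:
root@localhost ~# sfdisk -d /dev/sda | sfdisk /dev/sdb
The interactive fdisk session below achieves the same result by hand.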
Partition the newly added disk and set the new partitions' type to Linux raid autodetect:
root@localhost ~# fdisk /dev/sdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel. Changes will remain in memory only,
until you decide to write them. After that, of course, the previous
content won't be recoverable.
Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-261, default 1):
Using default value 1
Last cylinder or +size or +sizeM or +sizeK (1-261, default 261): 13
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 2
First cylinder (14-261, default 14):
Using default value 14
Last cylinder or +size or +sizeM or +sizeK (14-261, default 261): 78
Command (m for help): n
Command action
e extended
p primary partition (1-4)
p
Partition number (1-4): 3
First cylinder (79-261, default 79):
Using default value 79
Last cylinder or +size or +sizeM or +sizeK (79-261, default 261):
Using default value 261
Command (m for help): t
Partition number (1-4): 1
Hex code (type L to list codes): fd
Changed system type of partition 1 to fd (Linux raid autodetect)
Command (m for help): t
Partition number (1-4): 2
Hex code (type L to list codes): fd
Changed system type of partition 2 to fd (Linux raid autodetect)
Command (m for help): t
Partition number (1-4): 3
Hex code (type L to list codes): fd
Changed system type of partition 3 to fd (Linux raid autodetect)
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
Syncing disks.
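Before adding the new partitions to the arrays, it does no harm to confirm that the layout on /dev/sdb now matches /dev/sda:
root@localhost ~# fdisk -l /dev/sdb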
Add the newly created partitions to the RAID arrays:
root@localhost ~# mdadm /dev/md0 -a /dev/sdb1
mdadm: added /dev/sdb1
root@localhost ~# mdadm /dev/md1 -a /dev/sdb2
mdadm: added /dev/sdb2
root@localhost ~# mdadm /dev/md2 -a /dev/sdb3
mdadm: added /dev/sdb3
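Besides /proc/mdstat, each array's state, member devices and rebuild progress can also be inspected with mdadm itself, for example:
root@localhost ~# mdadm --detail /dev/md2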
Check that the arrays just restored are rebuilding:
root@localhost ~# more /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
522048 blocks [2/2] [UU]
md2 : active raid1 sdb3[2] sda3[0]
1469824 blocks [2/1] [U_]
[===========>.........] recovery = 55.8% (821632/1469824) finish=0.1min speed=63202K/sec
md0 : active raid1 sdb1[1] sda1[0]
104320 blocks [2/2] [UU]
unused devices: <none>
Check that the RAID has returned to normal ([2/2] [UU] means both mirror members are active again):
root@localhost ~# more /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
522048 blocks [2/2] [UU]
md2 : active raid1 sdb3[1] sda3[0]
1469824 blocks [2/2] [UU]
md0 : active raid1 sdb1[1] sda1[0]
104320 blocks [2/2] [UU]
unused devices: <none>
Finally, remember to save the new RAID configuration to /etc/mdadm.conf; otherwise the RAID configuration cannot be restored after the system reboots:
root@localhost ~# mdadm -Ds >/etc/mdadm.conf
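The file this produces contains one ARRAY line per md device, roughly of the following shape (the UUIDs below are made-up placeholders; the real values come from mdadm -Ds on your own system, and depending on the mdadm version you may also want a DEVICE partitions line at the top):
ARRAY /dev/md0 level=raid1 num-devices=2 UUID=00000000:00000000:00000000:00000000
ARRAY /dev/md1 level=raid1 num-devices=2 UUID=00000000:00000000:00000000:00000000
ARRAY /dev/md2 level=raid1 num-devices=2 UUID=00000000:00000000:00000000:00000000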
Conclusion: since the performance impact of software RAID1 on the system has not yet been measured, the only firm conclusion is that deploying the whole Linux system on software RAID1 is entirely feasible when data safety takes priority over raw performance.