Bug#623611: Lot of IO errors during boot

June 23rd, 2011 - 05:50 am ET by Laurent Bigonville
Hi,


> Any particular reason for re-opening this bug?



I was busy typing the mail :)

It's just to add a comment (tell me if it is worth reopening the bug).

I've discovered that the more LUNs are mapped to a machine, the
longer the machine takes to boot (due to the IO errors), and in
some situations this is not acceptable.

I've found on a CentOS ML[0] that loading the scsi_dh_rdac kernel
module really early (in the initrd) could mitigate this issue. I've
run some tests and it seems to work if the module is loaded in init-top
(and before udev). I'm not too sure this is possible to do, as it seems
that the first script to load (SCSI) modules is the udev one.

The only suspicious message I get during boot is
"ldm_validate_partition_table(): Disk read failed." (twice). I
guess this is normal, as I was testing with 1 LUN (2 active paths
and 2 inactive); the boot seems faster and no IO errors can be seen.

Cheers

Laurent Bigonville

[0] http://comments.gmane.org/gmane.lin...rhel5/7264



To UNSUBSCRIBE, email to debian-bugs-dist-REQUEST@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmaster@lists.debian.org

Replies

#1 Laurent Bigonville
June 23rd, 2011 - 05:50 am ET
On Thu, 23 Jun 2011 14:44:31 +0530,
Ritesh Raj Sarraf wrote:

> On 06/23/2011 02:31 PM, Laurent Bigonville wrote:
>> It's just to add a comment (tell me if it is worth reopening the bug).
>
> There's not much to do here. So let's leave it closed.

OK


>> I've discovered that the more LUNs are mapped to a machine, the
>> longer the machine takes to boot (due to the IO errors), and in
>> some situations this is not acceptable.
>
> That ought to happen given the design. Ideally, there should be a
> mechanism to ignore the ghost devices. Have you discussed this on
> dm-devel?

Well, multipath is not even running at that time, so without any help
(scsi_dh_rdac) the kernel knows nothing about the sdX devices.

I should probably bring this up on the ML. I've seen other weirdness
with the target I'm using, but I'm a bit ENOTIME at the moment.

> BTW, how many devices (LUNs x paths) are we talking about here?

Well, I was testing with 5 LUNs (2 active paths and 2 inactive).
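
To put a number on the "LUNs x paths" question, here is a rough sketch in Python. The function name is made up for illustration, and it assumes that the path counts are per LUN and that every path to every LUN shows up as its own sdX device:

```python
def count_scsi_devices(luns: int, active_paths: int, inactive_paths: int) -> dict:
    """Estimate how many sdX devices a multipathed setup exposes.

    Assumption (not confirmed in this thread): each path to each LUN
    appears as a separate sdX device, and path counts are per LUN.
    """
    total = luns * (active_paths + inactive_paths)
    inactive = luns * inactive_paths  # devices behind inactive paths
    return {"total": total, "inactive": inactive}


# Test setup described above: 5 LUNs, 2 active and 2 inactive paths.
print(count_scsi_devices(5, 2, 2))  # {'total': 20, 'inactive': 10}
```

Under these assumptions, half of the 20 devices sit behind inactive paths, which is presumably where the failing reads at boot come from.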


>> I've found on a CentOS ML[0] that loading the scsi_dh_rdac kernel
>> module really early (in the initrd) could mitigate this issue. I've
>> run some tests and it seems to work if the module is loaded in
>> init-top (and before udev). I'm not too sure this is possible to
>> do, as it seems that the first script to load (SCSI) modules is the
>> udev one.
>
> The only reason to put something into the initrd is for early boot.
> In the case of storage, that is when your root LUN is on a SAN.
> Otherwise, I don't see a need. But if you feel that putting it into
> the initrd is helping, go with it. Does the rdac module have any
> initialization delay?

It looks like, even with the multipath-tools-boot package installed,
this module is not loaded before udev is started either.

Initialization delay? I'm not too sure what you mean; the module seems
to load instantaneously.
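
For anyone who wants to check whether the module is loaded at a given point, a minimal sketch in Python follows. The helper name is hypothetical; it relies only on the fact that each line of /proc/modules starts with the module name. Note that it reports the state at the time it runs, so on a fully booted system it cannot tell whether the module was loaded before or after udev started:

```python
def module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if `name` is listed in the given /proc/modules contents."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field is the module name.
        if fields and fields[0] == name:
            return True
    return False


# Usage on a running Linux system (reads the live module list):
# with open("/proc/modules") as f:
#     print(module_loaded("scsi_dh_rdac", f.read()))
```

The match is on the exact module name, so a module such as scsi_dh does not count as scsi_dh_rdac or the other way round.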

Cheers

Laurent Bigonville


