dm core patches

public inbox for linux-kernel@vger.kernel.org

* dm core patches
@ 2004-02-10 16:35 Joe Thornber
  2004-02-11 10:16 ` Lars Marowsky-Bree
  0 siblings, 1 reply; 19+ messages in thread
From: Joe Thornber @ 2004-02-10 16:35 UTC (permalink / raw)
  To: Linux Mailing List, Andrew Morton; +Cc: thornber

Hi,

Here's the latest set of patches to core dm.  Please apply.

Thanks,

- Joe


* Re: dm core patches
  2004-02-10 16:35 Joe Thornber
@ 2004-02-11 10:16 ` Lars Marowsky-Bree
  2004-02-11 10:35   ` Joe Thornber
  0 siblings, 1 reply; 19+ messages in thread
From: Lars Marowsky-Bree @ 2004-02-11 10:16 UTC (permalink / raw)
  To: Joe Thornber, Linux Mailing List

On 2004-02-10T16:35:48,
   Joe Thornber <thornber@redhat.com> said:

> Hi,
> 
> Here's the latest set of patches to core dm.  Please apply.

Hi Joe,

when will you be submitting the DM multipath personality?


Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering	      \ ever tried. ever failed. no matter.
SUSE Labs			      | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ 	-- Samuel Beckett



* Re: dm core patches
  2004-02-11 10:16 ` Lars Marowsky-Bree
@ 2004-02-11 10:35   ` Joe Thornber
  2004-02-12 18:51     ` Lars Marowsky-Bree
  0 siblings, 1 reply; 19+ messages in thread
From: Joe Thornber @ 2004-02-11 10:35 UTC (permalink / raw)
  To: Lars Marowsky-Bree; +Cc: Joe Thornber, Linux Mailing List

On Wed, Feb 11, 2004 at 11:16:59AM +0100, Lars Marowsky-Bree wrote:
> when will you be submitting the DM multipath personality?

Not for a bit, it's still changing too much as I find more out about
the hardware (see the dm-devel@redhat.com list).

- Joe


* Re: dm core patches
  2004-02-11 10:35   ` Joe Thornber
@ 2004-02-12 18:51     ` Lars Marowsky-Bree
  2004-02-12 20:13       ` Joe Thornber
  0 siblings, 1 reply; 19+ messages in thread
From: Lars Marowsky-Bree @ 2004-02-12 18:51 UTC (permalink / raw)
  To: Joe Thornber; +Cc: Linux Mailing List

On 2004-02-11T10:35:41,
   Joe Thornber <thornber@redhat.com> said:

> > when will you be submitting the DM multipath personality?
> Not for a bit, it's still changing too much as I find more out about
> the hardware (see the dm-devel@redhat.com list).

I checked the archives, but I couldn't find anything really 'in flux'.
Your priority based approach seems just fine to me.

What is still missing? This is really a killer feature for 2.6. Any help
I can offer?


Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering	      \ ever tried. ever failed. no matter.
SUSE Labs			      | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ 	-- Samuel Beckett



* Re: dm core patches
  2004-02-12 18:51     ` Lars Marowsky-Bree
@ 2004-02-12 20:13       ` Joe Thornber
  2004-02-13 15:12         ` Lars Marowsky-Bree
  0 siblings, 1 reply; 19+ messages in thread
From: Joe Thornber @ 2004-02-12 20:13 UTC (permalink / raw)
  To: Lars Marowsky-Bree; +Cc: Joe Thornber, Linux Mailing List

On Thu, Feb 12, 2004 at 07:51:45PM +0100, Lars Marowsky-Bree wrote:
> I checked the archives, but I couldn't find anything really 'in flux'.
> Your priority based approach seems just fine to me.
> 
> What is still missing? This is really a killer feature for 2.6. Any help
> I can offer?

I think the main concern now is over the testing of paths.  Sending an
io down an inactive path can be very expensive for some hardware
configurations.  So I'm considering changing a couple of things:

- Only ever send io to 1 priority group at a time (even test ios).
  To test the lower priority groups we'd have to periodically switch to
  them and use them for a bit for both test io and proper io.

- For some hardware there are better ways of testing the path than
  sending the test io.  Should the drivers expose a test function ?
  In the absence of this we'd fallback to the test io method.

The other thing we need is to try and get the drivers to differentiate
between a media error and a path error, so that media errors get
reported up quickly and don't cause false path failures.  This is
possibly an area that you could help with ?

- Joe
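
The second point, a driver-supplied test with a fallback to test io, could be sketched as below. This is only an illustration under stated assumptions: `struct mp_path`, `mp_test_path()` and the two example hooks are hypothetical names, not part of any posted dm code.

```c
#include <stddef.h>

struct mp_path {
	/* Optional driver-provided test; NULL when the driver has none. */
	int (*driver_test)(struct mp_path *path);
	/* Generic fallback that sends a test io down the path. */
	int (*test_io)(struct mp_path *path);
	/* Number of test ios issued so far (illustration only). */
	unsigned int test_ios_sent;
};

/* Prefer the driver's own test; fall back to test io. 0 means path ok. */
int mp_test_path(struct mp_path *path)
{
	if (path->driver_test)
		return path->driver_test(path);
	return path->test_io(path);
}

/* Example hooks standing in for real driver and test-io code. */
int example_driver_test(struct mp_path *path)
{
	(void)path;
	return 0;
}

int example_test_io(struct mp_path *path)
{
	path->test_ios_sent++;
	return 0;
}
```

With this shape, hardware for which a test io is expensive never sees one as long as its driver fills in `driver_test`.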


* Re: dm core patches
  2004-02-12 20:13       ` Joe Thornber
@ 2004-02-13 15:12         ` Lars Marowsky-Bree
  2004-02-13 15:39           ` Joe Thornber
  2004-02-13 16:03           ` Jens Axboe
  0 siblings, 2 replies; 19+ messages in thread
From: Lars Marowsky-Bree @ 2004-02-13 15:12 UTC (permalink / raw)
  To: Joe Thornber; +Cc: Linux Mailing List, axboe

On 2004-02-12T20:13:40,
   Joe Thornber <thornber@redhat.com> said:

> I think the main concern now is over the testing of paths.  Sending an
> io down an inactive path can be very expensive for some hardware
> configurations.  So I'm considering changing a couple of things:
> 
> - Only ever send io to 1 priority group at a time (even test ios).
>   To test the lower priority groups we'd have to periodically switch to
>   them and use them for a bit for both test io and proper io.

You are missing the obvious answer:

- Periodically checking paths is a user-space issue and doesn't belong
  into the kernel. User-space gets to handle this policy.

> - For some hardware there are better ways of testing the path than
>   sending the test io.  Should the drivers expose a test function ?
>   In the absence of this we'd fallback to the test io method.

Again, with user-space taking care of this, it doesn't really matter.

Though exposing a test function does sound nice, even for user-space.

Moving it into kernel land is something which can always be done later,
if there is a really pressing problem.

> The other thing we need is to try and get the drivers to differentiate
> between a media error and a path error, so that media errors get
> reported up quickly and don't cause false path failures.  This is
> possibly an area that you could help with ?

I thought the IO stack in 2.6 provided us with such sense keys already,
which you'd then need to handle in the DM personality. Of course,
drivers need to make sure they pass up appropriate sense-keys, but
that's a hardware vendor issue and not something which should delay the
DM personality...

Jens, do you have the pointer on this handy?



Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering	      \ ever tried. ever failed. no matter.
SUSE Labs			      | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ 	-- Samuel Beckett



* Re: dm core patches
  2004-02-13 15:12         ` Lars Marowsky-Bree
@ 2004-02-13 15:39           ` Joe Thornber
  2004-02-13 16:08             ` Arjan van de Ven
                               ` (2 more replies)
  2004-02-13 16:03           ` Jens Axboe
  1 sibling, 3 replies; 19+ messages in thread
From: Joe Thornber @ 2004-02-13 15:39 UTC (permalink / raw)
  To: Lars Marowsky-Bree; +Cc: Joe Thornber, Linux Mailing List, axboe

On Fri, Feb 13, 2004 at 04:12:14PM +0100, Lars Marowsky-Bree wrote:
> On 2004-02-12T20:13:40,
>    Joe Thornber <thornber@redhat.com> said:
> 
> > I think the main concern now is over the testing of paths.  Sending an
> > io down an inactive path can be very expensive for some hardware
> > configurations.  So I'm considering changing a couple of things:
> > 
> > - Only ever send io to 1 priority group at a time (even test ios).
> >   To test the lower priority groups we'd have to periodically switch to
> >   them and use them for a bit for both test io and proper io.
> 
> You are missing the obvious answer:
> 
> - Periodically checking paths is a user-space issue and doesn't belong
>   into the kernel. User-space gets to handle this policy.

Yes, that is obvious, I had wanted to do failback automatically.  But
pushing it to userland does allow people to write hardware specific
tests.  I'll try it and see what people think.

Thanks,

- Joe


* Re: dm core patches
  2004-02-13 15:12         ` Lars Marowsky-Bree
  2004-02-13 15:39           ` Joe Thornber
@ 2004-02-13 16:03           ` Jens Axboe
  1 sibling, 0 replies; 19+ messages in thread
From: Jens Axboe @ 2004-02-13 16:03 UTC (permalink / raw)
  To: Lars Marowsky-Bree; +Cc: Joe Thornber, Linux Mailing List

On Fri, Feb 13 2004, Lars Marowsky-Bree wrote:
> > The other thing we need is to try and get the drivers to differentiate
> > between a media error and a path error, so that media errors get
> > reported up quickly and don't cause false path failures.  This is
> > possibly an area that you could help with ?
> 
> I thought the IO stack in 2.6 provided us with such sense keys already,
> which you'd then need to handle in the DM personality. Of course,
> drivers need to make sure they pass up appropriate sense-keys, but
> that's a hardware vendor issue and not something which should delay the
> DM personality...
> 
> Jens, do you have the pointer on this handy?

The mechanism is in place, but the SCSI stack still needs a few changes
to pass down the correct errors. The easiest would be to pass down
pseudo-sense keys (I'd rather just call them something else as not to
confuse things, io error hints or something) to
end_that_request_first(), changing uptodate from a bool to a hint.

I can help get this done, it's not something that should hold up dm-mp
by any stretch.

-- 
Jens Axboe



* Re: dm core patches
  2004-02-13 15:39           ` Joe Thornber
@ 2004-02-13 16:08             ` Arjan van de Ven
  2004-02-16  8:19               ` Lars Marowsky-Bree
  2004-02-13 23:46             ` Mike Anderson
  2004-02-16 12:17             ` Heinz Mauelshagen
  2 siblings, 1 reply; 19+ messages in thread
From: Arjan van de Ven @ 2004-02-13 16:08 UTC (permalink / raw)
  To: Joe Thornber; +Cc: Lars Marowsky-Bree, Linux Mailing List, axboe

[-- Attachment #1: Type: text/plain, Size: 435 bytes --]


> Yes, that is obvious, I had wanted to do failback automatically.  But
> pushing it to userland does allow people to write hardware specific
> tests.  I'll try it and see what people think.

One thing you can do is provide a way for drivers to wake the userspace
tester early. Say by default it polls every minute, but if the Fibre
Channel driver gets a LIP UP event it (via a central API) wakes the
userspace daemon *now*.
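
On the userspace side, "poll every minute unless woken early" could be sketched with poll(2). This is a rough illustration only: `wait_for_retest()` and `wake_fd` are hypothetical, and how the kernel would signal that descriptor (the "central API") is not defined anywhere in this thread.

```c
#include <poll.h>

/*
 * Wait until either the regular polling interval expires or a wake-up
 * arrives on wake_fd. Returns 1 if woken early, 0 on timeout and -1 on
 * error. The caller retests its paths in both non-error cases.
 */
int wait_for_retest(int wake_fd, int timeout_ms)
{
	struct pollfd pfd = { .fd = wake_fd, .events = POLLIN };
	int ret = poll(&pfd, 1, timeout_ms);

	if (ret < 0)
		return -1;
	return (ret > 0 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

A daemon would call this with a timeout of 60000 ms in its main loop and drain `wake_fd` after an early wake-up.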



* Re: dm core patches
@ 2004-02-13 16:44 James Bottomley
  2004-02-16  8:22 ` Lars Marowsky-Bree
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: James Bottomley @ 2004-02-13 16:44 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel

> The mechanism is in place, but the SCSI stack still needs a few changes
> to pass down the correct errors. The easiest would be to pass down
> pseudo-sense keys (I'd rather just call them something else as not to
> confuse things, io error hints or something) to
> end_that_request_first(), changing uptodate from a bool to a hint.

Yes, I'm ready to do this in SCSI.  I think the uptodate field should
include at least two (and possibly three) failure type indications:

- fatal: error cannot be retried
- retryable: error may be retried

and possibly

- informational: This is dangerous, since it's giving information about
a transaction that actually succeeded (i.e. we'd need to fix drivers to
recognise it as being uptodate but with info, like sector remapped)

Then, we also have an error origin indication:

- device: The device is actually reporting the problem
- transport: the error is a transport error
- driver: the error comes from the device driver.

So dm would know that fatal transport or driver errors could be
repathed, but fatal device errors probably couldn't.

Any that I've missed?

James
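
The two axes above (failure type and error origin) could be sketched in C roughly as follows. All names here are hypothetical and chosen for illustration; nothing like this has been posted as a patch, and the informational type is left out.

```c
#include <stdbool.h>

/* Hypothetical names: illustrates the two-axis classification only. */
enum io_err_severity {
	IO_ERR_FATAL,      /* error cannot be retried */
	IO_ERR_RETRYABLE,  /* error may be retried */
};

enum io_err_origin {
	IO_ERR_DEVICE,     /* the device itself reports the problem */
	IO_ERR_TRANSPORT,  /* the error is a transport error */
	IO_ERR_DRIVER,     /* the error comes from the device driver */
};

struct io_err_hint {
	enum io_err_severity severity;
	enum io_err_origin origin;
};

/*
 * A multipath target could try another path for fatal transport or
 * driver errors; a fatal device error would probably fail on every path.
 */
bool io_err_should_repath(struct io_err_hint hint)
{
	return hint.severity == IO_ERR_FATAL &&
	       hint.origin != IO_ERR_DEVICE;
}

/* Retryable errors may simply be reissued. */
bool io_err_should_retry(struct io_err_hint hint)
{
	return hint.severity == IO_ERR_RETRYABLE;
}
```

Keeping severity and origin as separate fields means dm only has to look at the combination it cares about, while other consumers can ignore the origin entirely.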



* Re: dm core patches
  2004-02-13 15:39           ` Joe Thornber
  2004-02-13 16:08             ` Arjan van de Ven
@ 2004-02-13 23:46             ` Mike Anderson
  2004-02-16 12:17             ` Heinz Mauelshagen
  2 siblings, 0 replies; 19+ messages in thread
From: Mike Anderson @ 2004-02-13 23:46 UTC (permalink / raw)
  To: Joe Thornber; +Cc: Lars Marowsky-Bree, Linux Mailing List, axboe

Joe Thornber [thornber@redhat.com] wrote:
> > You are missing the obvious answer:
> > 
> > - Periodically checking paths is a user-space issue and doesn't belong
> >   into the kernel. User-space gets to handle this policy.
> 
> Yes, that is obvious, I had wanted to do failback automatically.  But
> pushing it to userland does allow people to write hardware specific
> tests.  I'll try it and see what people think.

Be careful here. Your failback test packet cannot be a media access type
as this could cause volume transition thrashing in some types of
storage units so most likely you will use a test unit ready type packet.
These small size tests are not very good checks on their own for optical
based networks as the laser power needed to send them is really low
(newer vertical cavity lasers have reduced these types of failures, but
they still happen). Auto failback with heuristics and a credit based
model allows the path to be failed back in with a quick ejection and an
increasing time interval to start the whole cycle again. This keeps the
systems from heading into a failover / failback storm.

-andmike
--
Michael Anderson
andmike@us.ibm.com
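
One possible reading of the credit based model, sketched in C. The structure, the tunables and the doubling policy are all assumptions made for illustration; the message above does not specify them.

```c
#include <stdbool.h>

/* Hypothetical tunables, for illustration only. */
#define FAILBACK_MIN_INTERVAL 1u   /* seconds before the first retest */
#define FAILBACK_MAX_INTERVAL 64u  /* upper bound on the retest interval */
#define FAILBACK_FULL_CREDIT  3u   /* clean checks needed to trust a path */

struct path_state {
	unsigned int credit;    /* consecutive successful checks */
	unsigned int interval;  /* seconds until the next failback attempt */
	bool active;
};

void path_init(struct path_state *p)
{
	p->credit = 0;
	p->interval = FAILBACK_MIN_INTERVAL;
	p->active = false;
}

/* The path failed: eject it at once and back off before retrying. */
void path_failed(struct path_state *p)
{
	p->active = false;
	p->credit = 0;
	if (p->interval < FAILBACK_MAX_INTERVAL) {
		p->interval *= 2;
		if (p->interval > FAILBACK_MAX_INTERVAL)
			p->interval = FAILBACK_MAX_INTERVAL;
	}
}

/* A check succeeded: fail the path back in and let it earn credit. */
void path_check_ok(struct path_state *p)
{
	p->active = true;
	if (p->credit < FAILBACK_FULL_CREDIT)
		p->credit++;
	/* Only a path with full credit gets its short interval back. */
	if (p->credit == FAILBACK_FULL_CREDIT)
		p->interval = FAILBACK_MIN_INTERVAL;
}
```

A flapping path therefore gets retried less and less often, which is what keeps the system out of a failover / failback storm.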


* Re: dm core patches
  2004-02-13 16:08             ` Arjan van de Ven
@ 2004-02-16  8:19               ` Lars Marowsky-Bree
  2004-02-16  9:35                 ` Arjan van de Ven
  0 siblings, 1 reply; 19+ messages in thread
From: Lars Marowsky-Bree @ 2004-02-16  8:19 UTC (permalink / raw)
  To: Arjan van de Ven, Joe Thornber; +Cc: Linux Mailing List, axboe

On 2004-02-13T17:08:59,
   Arjan van de Ven <arjanv@redhat.com> said:

> One thing you can do is provide a way for drivers to wake the userspace
> tester early. Say by default it polls every minute, but if the Fibre
> Channel driver gets a LIP UP event it (via a central API) wakes the
> userspace daemon *now*.

I may be missing something obvious, but a LIP UP should be accompanied
with a round of 'device detections' on that link, which already should
trigger a few hotplug events, no?

So this seems pretty much solved.


Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering	      \ ever tried. ever failed. no matter.
SUSE Labs			      | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ 	-- Samuel Beckett



* Re: dm core patches
  2004-02-13 16:44 dm core patches James Bottomley
@ 2004-02-16  8:22 ` Lars Marowsky-Bree
  2004-02-16 16:57 ` Jens Axboe
  2004-02-19  0:26 ` Mike Christie
  2 siblings, 0 replies; 19+ messages in thread
From: Lars Marowsky-Bree @ 2004-02-16  8:22 UTC (permalink / raw)
  To: James Bottomley, Jens Axboe; +Cc: Linux Kernel

On 2004-02-13T11:44:41,
   James Bottomley <James.Bottomley@steeleye.com> said:

> - fatal: error cannot be retried
> - retryable: error may be retried
> 
> and possibly
> 
> - informational: This is dangerous, since it's giving information about
> a transaction that actually succeeded (i.e. we'd need to fix drivers to
> recognise it as being uptodate but with info, like sector remapped)

I don't think we need informational errors. The meaning of this seems
pretty difficult to define, and it's bound to have annoying semantics. I
also can't come up with a case where you would want to use that ;-)

> Then, we also have an error origin indication:
> 
> - device: The device is actually reporting the problem
> - transport: the error is a transport error
> - driver: the error comes from the device driver.
> 
> So dm would know that fatal transport or driver errors could be
> repathed, but fatal device errors probably couldn't.
> 
> Any that I've missed?

No, I think those were the ones which we were discussing at KS2003 too.


Sincerely,
    Lars Marowsky-Brée <lmb@suse.de>

-- 
High Availability & Clustering	      \ ever tried. ever failed. no matter.
SUSE Labs			      | try again. fail again. fail better.
Research & Development, SUSE LINUX AG \ 	-- Samuel Beckett



* Re: dm core patches
  2004-02-16  8:19               ` Lars Marowsky-Bree
@ 2004-02-16  9:35                 ` Arjan van de Ven
  0 siblings, 0 replies; 19+ messages in thread
From: Arjan van de Ven @ 2004-02-16  9:35 UTC (permalink / raw)
  To: Lars Marowsky-Bree; +Cc: Joe Thornber, Linux Mailing List, axboe

[-- Attachment #1: Type: text/plain, Size: 1069 bytes --]

On Mon, Feb 16, 2004 at 09:19:45AM +0100, Lars Marowsky-Bree wrote:
> On 2004-02-13T17:08:59,
>    Arjan van de Ven <arjanv@redhat.com> said:
> 
> > One thing you can do is provide a way for drivers to wake the userspace
> > tester early. Say by default it polls every minute, but if the Fibre
> > Channel driver gets a LIP UP event it (via a central API) wakes the
> > userspace daemon *now*.
> 
> I may be missing something obvious, but a LIP UP should be accompanied
> with a round of 'device detections' on that link, which already should
> trigger a few hotplug events, no?
> 
> So this seems pretty much solved.

Not normally; there are several reasons the loop can bounce briefly, and right
now the Fibre Channel drivers don't notify Linux of that every time. Maybe
that's for the better .... if it's a frequent thing that is short-timed then it
would be obscene to yank the disks from under the user (and force-umount his
fs) every few hours.

while in multipath you do want to at least stop using the current path if
there is another path that is not in negotiation...



* Re: dm core patches
  2004-02-13 15:39           ` Joe Thornber
  2004-02-13 16:08             ` Arjan van de Ven
  2004-02-13 23:46             ` Mike Anderson
@ 2004-02-16 12:17             ` Heinz Mauelshagen
  2 siblings, 0 replies; 19+ messages in thread
From: Heinz Mauelshagen @ 2004-02-16 12:17 UTC (permalink / raw)
  To: Joe Thornber; +Cc: Lars Marowsky-Bree, Linux Mailing List, axboe

On Fri, Feb 13, 2004 at 03:39:36PM +0000, Joe Thornber wrote:
> On Fri, Feb 13, 2004 at 04:12:14PM +0100, Lars Marowsky-Bree wrote:
> > On 2004-02-12T20:13:40,
> >    Joe Thornber <thornber@redhat.com> said:
> > 
> > > I think the main concern now is over the testing of paths.  Sending an
> > > io down an inactive path can be very expensive for some hardware
> > > configurations.  So I'm considering changing a couple of things:
> > > 
> > > - Only ever send io to 1 priority group at a time (even test ios).
> > >   To test the lower priority groups we'd have to periodically switch to
> > >   them and use them for a bit for both test io and proper io.
> > 
> > You are missing the obvious answer:
> > 
> > - Periodically checking paths is a user-space issue and doesn't belong
> >   into the kernel. User-space gets to handle this policy.
> 
> Yes, that is obvious, I had wanted to do failback automatically.  But
> pushing it to userland does allow people to write hardware specific
> tests.  I'll try it and see what people think.

Right, such policy belongs in userspace, it seems.

The reason I put it into the multipath target is to cover the case
where all paths are inoperable, the system is OOM _and_ the only
chance to recover is to unfail a path in order to relieve memory
pressure.

'Sorry, userspace test handler can't run, your enterprise server
is a pile of sh..' is not acceptable in case there's a path we
could unfail IMO.

Regards,
Heinz    -- The LVM Guy --


> 
> Thanks,
> 
> - Joe

*** Software bugs are stupid.
    Nevertheless it needs not so stupid people to solve them ***

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Heinz Mauelshagen                                 Red Hat, Inc.
Consulting Development Engineer                   Am Sonnenhang 11
                                                  56242 Marienrachdorf
                                                  Germany
Mauelshagen@RedHat.com                            +49 2626 141200
                                                       FAX 924446
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-


* Re: dm core patches
  2004-02-13 16:44 dm core patches James Bottomley
  2004-02-16  8:22 ` Lars Marowsky-Bree
@ 2004-02-16 16:57 ` Jens Axboe
  2004-02-16 17:04   ` James Bottomley
  2004-02-19  0:26 ` Mike Christie
  2 siblings, 1 reply; 19+ messages in thread
From: Jens Axboe @ 2004-02-16 16:57 UTC (permalink / raw)
  To: James Bottomley; +Cc: Linux Kernel

On Fri, Feb 13 2004, James Bottomley wrote:
> > The mechanism is in place, but the SCSI stack still needs a few changes
> > to pass down the correct errors. The easiest would be to pass down
> > pseudo-sense keys (I'd rather just call them something else as not to
> > confuse things, io error hints or something) to
> > end_that_request_first(), changing uptodate from a bool to a hint.
> 
> Yes, I'm ready to do this in SCSI.  I think the uptodate field should
> include at least two (and possibly three) failure type indications:
> 
> - fatal: error cannot be retried
> - retryable: error may be retried
> 
> and possibly
> 
> - informational: This is dangerous, since it's giving information about
> a transaction that actually succeeded (i.e. we'd need to fix drivers to
> recognise it as being uptodate but with info, like sector remapped)
> 
> Then, we also have an error origin indication:
> 
> - device: The device is actually reporting the problem
> - transport: the error is a transport error
> - driver: the error comes from the device driver.
> 
> So dm would know that fatal transport or driver errors could be
> repathed, but fatal device errors probably couldn't.
> 
> Any that I've missed?

Nope, this looks pretty spot-on to me. I have to agree with Lars and
rather keep it simple and straight forward, than introduce shady
informational bits.

-- 
Jens Axboe



* Re: dm core patches
  2004-02-16 16:57 ` Jens Axboe
@ 2004-02-16 17:04   ` James Bottomley
  0 siblings, 0 replies; 19+ messages in thread
From: James Bottomley @ 2004-02-16 17:04 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Linux Kernel

On Mon, 2004-02-16 at 11:57, Jens Axboe wrote:
> Nope, this looks pretty spot-on to me. I have to agree with Lars and
> rather keep it simple and straight forward, than introduce shady
> informational bits.

OK, I pretty much agree, that's why I labelled the informational piece
as "possibly".

About the only use I can see for it is predictive failure, which was all
the rage a while ago, but seems to have quieted down somewhat.  I agree
certainly that predictive failure is far more useful to RAID than
multi-path.

James


* Re: dm core patches
  2004-02-13 16:44 dm core patches James Bottomley
  2004-02-16  8:22 ` Lars Marowsky-Bree
  2004-02-16 16:57 ` Jens Axboe
@ 2004-02-19  0:26 ` Mike Christie
  2004-02-19  3:40   ` Jeff Garzik
  2 siblings, 1 reply; 19+ messages in thread
From: Mike Christie @ 2004-02-19  0:26 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jens Axboe, Linux Kernel

[-- Attachment #1: Type: text/plain, Size: 2161 bytes --]

James Bottomley wrote:
>>The mechanism is in place, but the SCSI stack still needs a few changes
>>to pass down the correct errors. The easiest would be to pass down
>>pseudo-sense keys (I'd rather just call them something else as not to
>>confuse things, io error hints or something) to
>>end_that_request_first(), changing uptodate from a bool to a hint.
> 
> 
> Yes, I'm ready to do this in SCSI.  I think the uptodate field should
> include at least two (and possibly three) failure type indications:
> 
> - fatal: error cannot be retried
> - retryable: error may be retried
> 
> and possibly
> 
> - informational: This is dangerous, since it's giving information about
> a transaction that actually succeeded (i.e. we'd need to fix drivers to
> recognise it as being uptodate but with info, like sector remapped)
> 
> Then, we also have an error origin indication:
> 
> - device: The device is actually reporting the problem
> - transport: the error is a transport error
> - driver: the error comes from the device driver.
> 
> So dm would know that fatal transport or driver errors could be
> repathed, but fatal device errors probably couldn't.
> 

I apologize for not starting a new thread, but I just wanted some 
feedback as to whether or not the attached patch is headed in the right 
direction or even acceptable. block-err.patch adds new errnos to
include/linux/errno.h (it does not touch the asm values), so useful IO
error info can be passed from callers of end_that_request_first to
bio_endio and eventually to the DM/MD endio functions.

I have an alternative patch that defines BLK_ERR_xxx values instead of
touching errno.h, but because the error values get passed through the
request code, bio code and DM/MD code the callers of bio_endio that are 
already using -Exxx values could present a problem. It would be nice to 
change them to the BLK_ERR_xxx, so the bio layer could have a single 
error value namespace. It's a more invasive change as there are several 
callers passing at least -EIO, -EWOULDBLOCK and -EPERM, so I am not sure 
if that is going to be OK since we are already in 2.6.3?

Thanks,

Mike Christie
mikenc@us.ibm.com

[-- Attachment #2: block-err.patch --]
[-- Type: text/plain, Size: 2267 bytes --]

diff -aurp linux-2.6.3-orig/drivers/block/ll_rw_blk.c linux-2.6.3-ec/drivers/block/ll_rw_blk.c
--- linux-2.6.3-orig/drivers/block/ll_rw_blk.c	2004-02-17 19:57:16.000000000 -0800
+++ linux-2.6.3-ec/drivers/block/ll_rw_blk.c	2004-02-18 12:33:50.000000000 -0800
@@ -2456,8 +2456,13 @@ static int __end_that_request_first(stru
 	if (!blk_pc_request(req))
 		req->errors = 0;
 
-	if (!uptodate) {
-		error = -EIO;
+	/*
+	 * Most drivers set uptodate to 0 for error and 1 for success.
+	 * MD/DM ready drivers will set 1 for success and a -Exxx
+	 * value to indicate a specific error.
+	 */
+	if (uptodate < 1) {
+		error = (uptodate == 0 ? -EIO : uptodate);
 		if (blk_fs_request(req) && !(req->flags & REQ_QUIET))
 			printk("end_request: I/O error, dev %s, sector %llu\n",
 				req->rq_disk ? req->rq_disk->disk_name : "?",
@@ -2540,7 +2545,7 @@ static int __end_that_request_first(stru
 /**
  * end_that_request_first - end I/O on a request
  * @req:      the request being processed
- * @uptodate: 0 for I/O error
+ * @uptodate: <= 0 to indicate an I/O error.
  * @nr_sectors: number of sectors to end I/O on
  *
  * Description:
@@ -2561,7 +2566,7 @@ EXPORT_SYMBOL(end_that_request_first);
 /**
  * end_that_request_chunk - end I/O on a request
  * @req:      the request being processed
- * @uptodate: 0 for I/O error
+ * @uptodate: <= 0 to indicate an I/O error.
  * @nr_bytes: number of bytes to complete
  *
  * Description:
diff -aurp linux-2.6.3-orig/include/linux/errno.h linux-2.6.3-ec/include/linux/errno.h
--- linux-2.6.3-orig/include/linux/errno.h	2004-02-17 19:59:12.000000000 -0800
+++ linux-2.6.3-ec/include/linux/errno.h	2004-02-18 12:45:42.000000000 -0800
@@ -23,6 +23,14 @@
 #define EJUKEBOX	528	/* Request initiated, but will not complete before timeout */
 #define EIOCBQUEUED	529	/* iocb queued, will get completion event */
 
+/* Block device error codes */
+#define EFATALDEV	540	/* Fatal device error */
+#define EFATALTRNSPT	541	/* Fatal transport error */
+#define EFATALDRV	542	/* Fatal driver error */
+#define ERETRYDEV	543	/* Device error occurred, I/O may be retried */
+#define ERETRYTRNSPT	544	/* Transport error occurred, I/O may be retried */
+#define ERETRYDRV	545	/* Driver error occurred, I/O may be retried */
+
 #endif
 
 #endif
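
The uptodate convention used in the ll_rw_blk.c hunk can be shown in isolation as a small userspace function. This is a sketch for illustration: `uptodate_to_error()` is a hypothetical helper and not part of the patch, and it assumes the kernel function's error value starts out as 0.

```c
#include <errno.h>

/*
 * Mirrors the convention in the patch: 1 means success, 0 means a
 * generic I/O error, and a negative value is passed through unchanged
 * as a specific error code.
 */
int uptodate_to_error(int uptodate)
{
	if (uptodate >= 1)
		return 0;
	return uptodate == 0 ? -EIO : uptodate;
}
```

Existing drivers that only ever pass 0 or 1 keep their current behaviour, which is the point of encoding the new errors as negative values.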


* Re: dm core patches
  2004-02-19  0:26 ` Mike Christie
@ 2004-02-19  3:40   ` Jeff Garzik
  0 siblings, 0 replies; 19+ messages in thread
From: Jeff Garzik @ 2004-02-19  3:40 UTC (permalink / raw)
  To: Mike Christie; +Cc: James Bottomley, Jens Axboe, Linux Kernel

Mike Christie wrote:
> diff -aurp linux-2.6.3-orig/include/linux/errno.h linux-2.6.3-ec/include/linux/errno.h
> --- linux-2.6.3-orig/include/linux/errno.h	2004-02-17 19:59:12.000000000 -0800
> +++ linux-2.6.3-ec/include/linux/errno.h	2004-02-18 12:45:42.000000000 -0800
> @@ -23,6 +23,14 @@
>  #define EJUKEBOX	528	/* Request initiated, but will not complete before timeout */
>  #define EIOCBQUEUED	529	/* iocb queued, will get completion event */
>  
> +/* Block device error codes */
> +#define EFATALDEV	540	/* Fatal device error */
> +#define EFATALTRNSPT	541	/* Fatal transport error */
> +#define EFATALDRV	542	/* Fatal driver error */
> +#define ERETRYDEV	543	/* Device error occurred, I/O may be retried */
> +#define ERETRYTRNSPT	544	/* Transport error occurred, I/O may be retried */
> +#define ERETRYDRV	545	/* Driver error occurred, I/O may be retried */


I'm not sure errno is the best place...   I would rather define them in 
blkdev.h and prefix them such that it's obvious they are specific to 
block devices.

Also, WRT the I/O error printk, you probably want to print out a string 
representing the error value returned...  that info is available now, 
might as well tell the user about it.

	Jeff
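
Both suggestions might look roughly like this. It is a sketch only: the `BLK_ERR_*` names and `blk_err_str()` are hypothetical, the values are copied from the quoted patch, and whether they would really live in blkdev.h is an open question.

```c
/* Hypothetical block-specific names; values mirror the quoted patch. */
enum blk_err {
	BLK_ERR_FATAL_DEV       = 540, /* fatal device error */
	BLK_ERR_FATAL_TRANSPORT = 541, /* fatal transport error */
	BLK_ERR_FATAL_DRV       = 542, /* fatal driver error */
	BLK_ERR_RETRY_DEV       = 543, /* device error, may be retried */
	BLK_ERR_RETRY_TRANSPORT = 544, /* transport error, may be retried */
	BLK_ERR_RETRY_DRV       = 545, /* driver error, may be retried */
};

/* Map an error value (positive or negated) to a printable string. */
const char *blk_err_str(int err)
{
	switch (err < 0 ? -err : err) {
	case BLK_ERR_FATAL_DEV:
		return "fatal device error";
	case BLK_ERR_FATAL_TRANSPORT:
		return "fatal transport error";
	case BLK_ERR_FATAL_DRV:
		return "fatal driver error";
	case BLK_ERR_RETRY_DEV:
		return "retryable device error";
	case BLK_ERR_RETRY_TRANSPORT:
		return "retryable transport error";
	case BLK_ERR_RETRY_DRV:
		return "retryable driver error";
	default:
		return "I/O error";
	}
}
```

The end_request printk could then include `blk_err_str(error)` alongside the device name and sector.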




