* [PATCH] cciss: Ignore stale commands after reboot
@ 2009-07-02 9:36 Hannes Reinecke
2009-07-02 19:00 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Hannes Reinecke @ 2009-07-02 9:36 UTC (permalink / raw)
To: Jens Axboe; +Cc: scameron, linux-kernel
When doing an unexpected shutdown like kexec the cciss
firmware might still have some commands in flight, which
it is trying to complete.
The driver is doing its best on resetting the HBA,
but sadly there's a firmware issue causing the firmware
_not_ to abort or drop old commands.
So the firmware will send us commands which we haven't
accounted for, causing the driver to panic.
With this patch we're just ignoring these commands as
there is nothing we could be doing with them anyway.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/block/cciss.c | 15 +++++++++++++--
drivers/block/cciss_cmd.h | 1 +
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index c7a527c..65a0655 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -226,8 +226,18 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
static inline void removeQ(CommandList_struct *c)
{
- if (WARN_ON(hlist_unhashed(&c->list)))
+ /*
+ * After kexec/dump some commands might still
+ * be in flight, which the firmware will try
+ * to complete. Resetting the firmware doesn't work
+ * with old fw revisions, so we have to mark
+ * them off as 'stale' to prevent the driver from
+ * falling over.
+ */
+ if (WARN_ON(hlist_unhashed(&c->list))) {
+ c->cmd_type = CMD_MSG_STALE;
return;
+ }
hlist_del_init(&c->list);
}
@@ -4246,7 +4256,8 @@ static void fail_all_cmds(unsigned long ctlr)
while (!hlist_empty(&h->cmpQ)) {
c = hlist_entry(h->cmpQ.first, CommandList_struct, list);
removeQ(c);
- c->err_info->CommandStatus = CMD_HARDWARE_ERR;
+ if (c->cmd_type != CMD_MSG_STALE)
+ c->err_info->CommandStatus = CMD_HARDWARE_ERR;
if (c->cmd_type == CMD_RWREQ) {
complete_command(h, c, 0);
} else if (c->cmd_type == CMD_IOCTL_PEND)
diff --git a/drivers/block/cciss_cmd.h b/drivers/block/cciss_cmd.h
index cd665b0..dbaed1e 100644
--- a/drivers/block/cciss_cmd.h
+++ b/drivers/block/cciss_cmd.h
@@ -274,6 +274,7 @@ typedef struct _ErrorInfo_struct {
#define CMD_SCSI 0x03
#define CMD_MSG_DONE 0x04
#define CMD_MSG_TIMEOUT 0x05
+#define CMD_MSG_STALE 0xff
/* This structure needs to be divisible by 8 for new
* indexing method.
--
1.5.3.2
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 9:36 [PATCH] cciss: Ignore stale commands after reboot Hannes Reinecke
@ 2009-07-02 19:00 ` Jens Axboe
0 siblings, 0 replies; 11+ messages in thread
From: Jens Axboe @ 2009-07-02 19:00 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: scameron, linux-kernel, mikem
On Thu, Jul 02 2009, Hannes Reinecke wrote:
>
> When doing an unexpected shutdown like kexec the cciss
> firmware might still have some commands in flight, which
> it is trying to complete.
> The driver is doing its best on resetting the HBA,
> but sadly there's a firmware issue causing the firmware
> _not_ to abort or drop old commands.
> So the firmware will send us commands which we haven't
> accounted for, causing the driver to panic.
>
> With this patch we're just ignoring these commands as
> there is nothing we could be doing with them anyway.
Looks good to me. Mike, Stephen?
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/block/cciss.c | 15 +++++++++++++--
> drivers/block/cciss_cmd.h | 1 +
> 2 files changed, 14 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index c7a527c..65a0655 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -226,8 +226,18 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
>
> static inline void removeQ(CommandList_struct *c)
> {
> - if (WARN_ON(hlist_unhashed(&c->list)))
> + /*
> + * After kexec/dump some commands might still
> + * be in flight, which the firmware will try
> + * to complete. Resetting the firmware doesn't work
> + * with old fw revisions, so we have to mark
> + * them off as 'stale' to prevent the driver from
> + * falling over.
> + */
> + if (WARN_ON(hlist_unhashed(&c->list))) {
> + c->cmd_type = CMD_MSG_STALE;
> return;
> + }
>
> hlist_del_init(&c->list);
> }
> @@ -4246,7 +4256,8 @@ static void fail_all_cmds(unsigned long ctlr)
> while (!hlist_empty(&h->cmpQ)) {
> c = hlist_entry(h->cmpQ.first, CommandList_struct, list);
> removeQ(c);
> - c->err_info->CommandStatus = CMD_HARDWARE_ERR;
> + if (c->cmd_type != CMD_MSG_STALE)
> + c->err_info->CommandStatus = CMD_HARDWARE_ERR;
> if (c->cmd_type == CMD_RWREQ) {
> complete_command(h, c, 0);
> } else if (c->cmd_type == CMD_IOCTL_PEND)
> diff --git a/drivers/block/cciss_cmd.h b/drivers/block/cciss_cmd.h
> index cd665b0..dbaed1e 100644
> --- a/drivers/block/cciss_cmd.h
> +++ b/drivers/block/cciss_cmd.h
> @@ -274,6 +274,7 @@ typedef struct _ErrorInfo_struct {
> #define CMD_SCSI 0x03
> #define CMD_MSG_DONE 0x04
> #define CMD_MSG_TIMEOUT 0x05
> +#define CMD_MSG_STALE 0xff
>
> /* This structure needs to be divisible by 8 for new
> * indexing method.
> --
> 1.5.3.2
>
--
Jens Axboe
* [PATCH] cciss: Ignore stale commands after reboot
@ 2009-07-02 8:23 Hannes Reinecke
2009-07-02 8:28 ` Jens Axboe
2009-07-06 20:33 ` Alan D. Brunelle
0 siblings, 2 replies; 11+ messages in thread
From: Hannes Reinecke @ 2009-07-02 8:23 UTC (permalink / raw)
To: Jens Axboe; +Cc: scameron, linux-kernel
When doing an unexpected shutdown like kexec the cciss
firmware might still have some commands in flight, which
it is trying to complete.
The driver is doing its best on resetting the HBA,
but sadly there's a firmware issue causing the firmware
_not_ to abort or drop old commands.
So the firmware will send us commands which we haven't
accounted for, causing the driver to panic.
With this patch we're just ignoring these commands as
there is nothing we could be doing with them anyway.
Signed-off-by: Hannes Reinecke <hare@suse.de>
---
drivers/block/cciss.c | 14 ++++++++++++--
drivers/block/cciss_cmd.h | 1 +
2 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
index c7a527c..8dd4c0d 100644
--- a/drivers/block/cciss.c
+++ b/drivers/block/cciss.c
@@ -226,7 +226,16 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
static inline void removeQ(CommandList_struct *c)
{
- if (WARN_ON(hlist_unhashed(&c->list)))
+ /*
+ * After kexec/dump some commands might still
+ * be in flight, which the firmware will try
+ * to complete. Resetting the firmware doesn't work
+ * with old fw revisions, so we have to mark
+ * them off as 'stale' to prevent the driver from
+ * falling over.
+ */
+ if (unlikely(hlist_unhashed(&c->list))) {
+ c->cmd_type = CMD_MSG_STALE;
return;
hlist_del_init(&c->list);
@@ -4246,7 +4255,8 @@ static void fail_all_cmds(unsigned long ctlr)
while (!hlist_empty(&h->cmpQ)) {
c = hlist_entry(h->cmpQ.first, CommandList_struct, list);
removeQ(c);
- c->err_info->CommandStatus = CMD_HARDWARE_ERR;
+ if (c->cmd_type != CMD_MSG_STALE)
+ c->err_info->CommandStatus = CMD_HARDWARE_ERR;
if (c->cmd_type == CMD_RWREQ) {
complete_command(h, c, 0);
} else if (c->cmd_type == CMD_IOCTL_PEND)
diff --git a/drivers/block/cciss_cmd.h b/drivers/block/cciss_cmd.h
index cd665b0..dbaed1e 100644
--- a/drivers/block/cciss_cmd.h
+++ b/drivers/block/cciss_cmd.h
@@ -274,6 +274,7 @@ typedef struct _ErrorInfo_struct {
#define CMD_SCSI 0x03
#define CMD_MSG_DONE 0x04
#define CMD_MSG_TIMEOUT 0x05
+#define CMD_MSG_STALE 0xff
/* This structure needs to be divisible by 8 for new
* indexing method.
--
1.5.3.2
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 8:23 Hannes Reinecke
@ 2009-07-02 8:28 ` Jens Axboe
2009-07-02 8:44 ` Hannes Reinecke
2009-07-06 20:33 ` Alan D. Brunelle
1 sibling, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2009-07-02 8:28 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: scameron, linux-kernel
On Thu, Jul 02 2009, Hannes Reinecke wrote:
>
> When doing an unexpected shutdown like kexec the cciss
> firmware might still have some commands in flight, which
> it is trying to complete.
> The driver is doing its best on resetting the HBA,
> but sadly there's a firmware issue causing the firmware
> _not_ to abort or drop old commands.
> So the firmware will send us commands which we haven't
> accounted for, causing the driver to panic.
>
> With this patch we're just ignoring these commands as
> there is nothing we could be doing with them anyway.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
> drivers/block/cciss.c | 14 ++++++++++++--
> drivers/block/cciss_cmd.h | 1 +
> 2 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> index c7a527c..8dd4c0d 100644
> --- a/drivers/block/cciss.c
> +++ b/drivers/block/cciss.c
> @@ -226,7 +226,16 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
>
> static inline void removeQ(CommandList_struct *c)
> {
> - if (WARN_ON(hlist_unhashed(&c->list)))
> + /*
> + * After kexec/dump some commands might still
> + * be in flight, which the firmware will try
> + * to complete. Resetting the firmware doesn't work
> + * with old fw revisions, so we have to mark
> + * them off as 'stale' to prevent the driver from
> + * falling over.
> + */
> + if (unlikely(hlist_unhashed(&c->list))) {
> + c->cmd_type = CMD_MSG_STALE;
> return;
>
> hlist_del_init(&c->list);
Ehm, that looks rather dangerous. What's the level of testing this patch
received?
--
Jens Axboe
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 8:28 ` Jens Axboe
@ 2009-07-02 8:44 ` Hannes Reinecke
2009-07-02 9:18 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Hannes Reinecke @ 2009-07-02 8:44 UTC (permalink / raw)
To: Jens Axboe; +Cc: scameron, linux-kernel
Jens Axboe wrote:
> On Thu, Jul 02 2009, Hannes Reinecke wrote:
>> When doing an unexpected shutdown like kexec the cciss
>> firmware might still have some commands in flight, which
>> it is trying to complete.
>> The driver is doing its best on resetting the HBA,
>> but sadly there's a firmware issue causing the firmware
>> _not_ to abort or drop old commands.
>> So the firmware will send us commands which we haven't
>> accounted for, causing the driver to panic.
>>
>> With this patch we're just ignoring these commands as
>> there is nothing we could be doing with them anyway.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>> drivers/block/cciss.c | 14 ++++++++++++--
>> drivers/block/cciss_cmd.h | 1 +
>> 2 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
>> index c7a527c..8dd4c0d 100644
>> --- a/drivers/block/cciss.c
>> +++ b/drivers/block/cciss.c
>> @@ -226,7 +226,16 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
>>
>> static inline void removeQ(CommandList_struct *c)
>> {
>> - if (WARN_ON(hlist_unhashed(&c->list)))
>> + /*
>> + * After kexec/dump some commands might still
>> + * be in flight, which the firmware will try
>> + * to complete. Resetting the firmware doesn't work
>> + * with old fw revisions, so we have to mark
>> + * them off as 'stale' to prevent the driver from
>> + * falling over.
>> + */
>> + if (unlikely(hlist_unhashed(&c->list))) {
>> + c->cmd_type = CMD_MSG_STALE;
>> return;
>>
>> hlist_del_init(&c->list);
>
> Ehm, that looks rather dangerous. What's the level of testing this patch
> received?
>
Where is the danger here?
With the original code we would issue a warning
and return.
But then we hit this codepath:
while (!hlist_empty(&h->cmpQ)) {
c = hlist_entry(h->cmpQ.first, CommandList_struct, list);
removeQ(c);
c->err_info->CommandStatus = CMD_HARDWARE_ERR;
and the driver goes boom as c->err_info is not initialized.
This frequently happens if you're trying to do a kdump
while the system is doing I/O.
If you object to the removed WARN() I can easily put it
back in, but without the fix there is a good chance that
kdump fails on cciss machines.
And note we can't do anything with the stale commands anyway,
as the context having sent the commands originally is long gone.
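To make the failure mode concrete, here is a minimal user-space sketch
of the pattern. It is not the driver code: struct cmd,
remove_from_queue() and fail_cmd() are hypothetical stand-ins for
CommandList_struct, removeQ() and the loop body in fail_all_cmds(), the
'queued' flag replaces the hlist_unhashed() check, and the SKETCH_*
values are made up for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; the real driver defines its own constants. */
enum { SKETCH_CMD_RWREQ = 0x00, SKETCH_CMD_STALE = 0xff };
enum { SKETCH_STATUS_NONE = 0, SKETCH_STATUS_HW_ERR = 1 };

struct err_info {
    int status;
};

struct cmd {
    bool queued;               /* stands in for !hlist_unhashed(&c->list) */
    int cmd_type;
    struct err_info *err_info; /* may be NULL for a command we never issued */
};

/* Mirrors the patched removeQ(): a command that is not on any queue
 * is marked stale instead of being unlinked. */
static void remove_from_queue(struct cmd *c)
{
    if (!c->queued) {
        c->cmd_type = SKETCH_CMD_STALE;
        return;
    }
    c->queued = false;
}

/* Mirrors the patched loop body: err_info is only touched for commands
 * that were not marked stale. Returns true if a status was written. */
static bool fail_cmd(struct cmd *c)
{
    remove_from_queue(c);
    if (c->cmd_type == SKETCH_CMD_STALE)
        return false;
    c->err_info->status = SKETCH_STATUS_HW_ERR;
    return true;
}
```

Without the stale check, fail_cmd() would dereference a NULL err_info
for the unqueued command, which corresponds to the crash described above.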
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 8:44 ` Hannes Reinecke
@ 2009-07-02 9:18 ` Jens Axboe
2009-07-02 9:36 ` Hannes Reinecke
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2009-07-02 9:18 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: scameron, linux-kernel
On Thu, Jul 02 2009, Hannes Reinecke wrote:
> Jens Axboe wrote:
> > On Thu, Jul 02 2009, Hannes Reinecke wrote:
> >> When doing an unexpected shutdown like kexec the cciss
> >> firmware might still have some commands in flight, which
> >> it is trying to complete.
> >> The driver is doing its best on resetting the HBA,
> >> but sadly there's a firmware issue causing the firmware
> >> _not_ to abort or drop old commands.
> >> So the firmware will send us commands which we haven't
> >> accounted for, causing the driver to panic.
> >>
> >> With this patch we're just ignoring these commands as
> >> there is nothing we could be doing with them anyway.
> >>
> >> Signed-off-by: Hannes Reinecke <hare@suse.de>
> >> ---
> >> drivers/block/cciss.c | 14 ++++++++++++--
> >> drivers/block/cciss_cmd.h | 1 +
> >> 2 files changed, 13 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> >> index c7a527c..8dd4c0d 100644
> >> --- a/drivers/block/cciss.c
> >> +++ b/drivers/block/cciss.c
> >> @@ -226,7 +226,16 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
> >>
> >> static inline void removeQ(CommandList_struct *c)
> >> {
> >> - if (WARN_ON(hlist_unhashed(&c->list)))
> >> + /*
> >> + * After kexec/dump some commands might still
> >> + * be in flight, which the firmware will try
> >> + * to complete. Resetting the firmware doesn't work
> >> + * with old fw revisions, so we have to mark
> >> + * them off as 'stale' to prevent the driver from
> >> + * falling over.
> >> + */
> >> + if (unlikely(hlist_unhashed(&c->list))) {
> >> + c->cmd_type = CMD_MSG_STALE;
> >> return;
> >>
> >> hlist_del_init(&c->list);
> >
> > Ehm, that looks rather dangerous. What's the level of testing this patch
> > received?
> >
> Where is the danger here?
The danger is that the patch doesn't even compile :-)
At least it had the { at the end of the if, otherwise it would have been
insta-hang.
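One way to read the insta-hang remark, as a user-space sketch rather
than the driver code: without the braces only the assignment is
conditional, the return always runs, and nothing is ever unlinked, so a
"while the queue is not empty" loop never finishes. All names below
(struct node, struct queue, remove_braced(), remove_unbraced(), drain())
are hypothetical, and drain() caps the iterations so the effect can be
observed without actually hanging.

```c
#include <stdbool.h>
#include <stddef.h>

struct node {
    struct node *next;
    bool on_list;
    bool stale;
};

struct queue {
    struct node *first;
};

/* Unlink the head node; only valid when n is the current head. */
static void unlink_head(struct queue *q, struct node *n)
{
    q->first = n->next;
    n->next = NULL;
    n->on_list = false;
}

/* Braces present: the early return only applies to unqueued nodes. */
static void remove_braced(struct queue *q, struct node *n)
{
    if (!n->on_list) {
        n->stale = true;
        return;
    }
    unlink_head(q, n);
}

/* Braces missing: only the assignment is conditional, the return is
 * not, so the unlink below is unreachable. */
static void remove_unbraced(struct queue *q, struct node *n)
{
    if (!n->on_list)
        n->stale = true;
    return;
    unlink_head(q, n);
}

/* Drain the queue with the given remove function, giving up after
 * max_iter passes. Returns true if the queue became empty. */
static bool drain(struct queue *q,
                  void (*remove_fn)(struct queue *, struct node *),
                  int max_iter)
{
    while (q->first != NULL && max_iter-- > 0)
        remove_fn(q, q->first);
    return q->first == NULL;
}
```

With remove_braced() the queue empties; with remove_unbraced() the head
never changes and drain() only stops because of the iteration cap.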
>
> With the original code we would be issuing a warning
> and return.
> But then we hit this codepath:
>
> while (!hlist_empty(&h->cmpQ)) {
> c = hlist_entry(h->cmpQ.first, CommandList_struct, list);
> removeQ(c);
> c->err_info->CommandStatus = CMD_HARDWARE_ERR;
>
> and the driver goes boom as c->err_info is not initialized.
>
> This frequently happens if you're trying to do a kdump
> while the system is doing I/O.
> If you object to the removed WARN() I can easily put this
> in, but without the fix there is a good chance that
> kdump fails on cciss machines.
>
> And note we can't do anything with the stale commands anyway,
> as the context having sent the commands originally is long gone.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke zSeries & Storage
> hare@suse.de +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)
--
Jens Axboe
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 9:18 ` Jens Axboe
@ 2009-07-02 9:36 ` Hannes Reinecke
2009-07-02 10:26 ` Jens Axboe
0 siblings, 1 reply; 11+ messages in thread
From: Hannes Reinecke @ 2009-07-02 9:36 UTC (permalink / raw)
To: Jens Axboe; +Cc: scameron, linux-kernel
Jens Axboe wrote:
> On Thu, Jul 02 2009, Hannes Reinecke wrote:
>> Jens Axboe wrote:
>>> On Thu, Jul 02 2009, Hannes Reinecke wrote:
>>>> When doing an unexpected shutdown like kexec the cciss
>>>> firmware might still have some commands in flight, which
>>>> it is trying to complete.
>>>> The driver is doing its best on resetting the HBA,
>>>> but sadly there's a firmware issue causing the firmware
>>>> _not_ to abort or drop old commands.
>>>> So the firmware will send us commands which we haven't
>>>> accounted for, causing the driver to panic.
>>>>
>>>> With this patch we're just ignoring these commands as
>>>> there is nothing we could be doing with them anyway.
>>>>
>>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>>>> ---
>>>> drivers/block/cciss.c | 14 ++++++++++++--
>>>> drivers/block/cciss_cmd.h | 1 +
>>>> 2 files changed, 13 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
>>>> index c7a527c..8dd4c0d 100644
>>>> --- a/drivers/block/cciss.c
>>>> +++ b/drivers/block/cciss.c
>>>> @@ -226,7 +226,16 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
>>>>
>>>> static inline void removeQ(CommandList_struct *c)
>>>> {
>>>> - if (WARN_ON(hlist_unhashed(&c->list)))
>>>> + /*
>>>> + * After kexec/dump some commands might still
>>>> + * be in flight, which the firmware will try
>>>> + * to complete. Resetting the firmware doesn't work
>>>> + * with old fw revisions, so we have to mark
>>>> + * them off as 'stale' to prevent the driver from
>>>> + * falling over.
>>>> + */
>>>> + if (unlikely(hlist_unhashed(&c->list))) {
>>>> + c->cmd_type = CMD_MSG_STALE;
>>>> return;
>>>>
>>>> hlist_del_init(&c->list);
>>> Ehm, that looks rather dangerous. What's the level of testing this patch
>>> received?
>>>
>> Where is the danger here?
>
> The danger is that the patch doesn't even compile :-)
> At least it had the { at the end of the if, otherwise it would have been
> insta-hang.
>
Bah. Should've said so.
That's what you get when you cut-n-paste from a different version.
OK, resending.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 9:36 ` Hannes Reinecke
@ 2009-07-02 10:26 ` Jens Axboe
2009-07-02 10:28 ` Hannes Reinecke
0 siblings, 1 reply; 11+ messages in thread
From: Jens Axboe @ 2009-07-02 10:26 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: scameron, linux-kernel
On Thu, Jul 02 2009, Hannes Reinecke wrote:
> Jens Axboe wrote:
> > On Thu, Jul 02 2009, Hannes Reinecke wrote:
> >> Jens Axboe wrote:
> >>> On Thu, Jul 02 2009, Hannes Reinecke wrote:
> >>>> When doing an unexpected shutdown like kexec the cciss
> >>>> firmware might still have some commands in flight, which
> >>>> it is trying to complete.
> >>>> The driver is doing its best on resetting the HBA,
> >>>> but sadly there's a firmware issue causing the firmware
> >>>> _not_ to abort or drop old commands.
> >>>> So the firmware will send us commands which we haven't
> >>>> accounted for, causing the driver to panic.
> >>>>
> >>>> With this patch we're just ignoring these commands as
> >>>> there is nothing we could be doing with them anyway.
> >>>>
> >>>> Signed-off-by: Hannes Reinecke <hare@suse.de>
> >>>> ---
> >>>> drivers/block/cciss.c | 14 ++++++++++++--
> >>>> drivers/block/cciss_cmd.h | 1 +
> >>>> 2 files changed, 13 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/drivers/block/cciss.c b/drivers/block/cciss.c
> >>>> index c7a527c..8dd4c0d 100644
> >>>> --- a/drivers/block/cciss.c
> >>>> +++ b/drivers/block/cciss.c
> >>>> @@ -226,7 +226,16 @@ static inline void addQ(struct hlist_head *list, CommandList_struct *c)
> >>>>
> >>>> static inline void removeQ(CommandList_struct *c)
> >>>> {
> >>>> - if (WARN_ON(hlist_unhashed(&c->list)))
> >>>> + /*
> >>>> + * After kexec/dump some commands might still
> >>>> + * be in flight, which the firmware will try
> >>>> + * to complete. Resetting the firmware doesn't work
> >>>> + * with old fw revisions, so we have to mark
> >>>> + * them off as 'stale' to prevent the driver from
> >>>> + * falling over.
> >>>> + */
> >>>> + if (unlikely(hlist_unhashed(&c->list))) {
> >>>> + c->cmd_type = CMD_MSG_STALE;
> >>>> return;
> >>>>
> >>>> hlist_del_init(&c->list);
> >>> Ehm, that looks rather dangerous. What's the level of testing this patch
> >>> received?
> >>>
> >> Where is the danger here?
> >
> > The danger is that the patch doesn't even compile :-)
> > At least it had the { at the end of the if, otherwise it would have been
> > insta-hang.
> >
> Bah. Should've said so.
Sorry, just annoys me when people send out patches for inclusion that
don't even compile. It usually means that some other form of the patch
was tested and that this one hasn't even been run (obviously, since it
doesn't compile).
--
Jens Axboe
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 10:26 ` Jens Axboe
@ 2009-07-02 10:28 ` Hannes Reinecke
0 siblings, 0 replies; 11+ messages in thread
From: Hannes Reinecke @ 2009-07-02 10:28 UTC (permalink / raw)
To: Jens Axboe; +Cc: scameron, linux-kernel
Jens Axboe wrote:
[ ... ]
>
> Sorry, just annoys me when people send out patches for inclusion that
> don't even compile. It usually means that some other form of the patch
> was tested and that this one hasn't even been run (obviously, since it
> doesn't compile).
>
Accepted.
Apologies.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-02 8:23 Hannes Reinecke
2009-07-02 8:28 ` Jens Axboe
@ 2009-07-06 20:33 ` Alan D. Brunelle
2009-07-07 7:34 ` Hannes Reinecke
1 sibling, 1 reply; 11+ messages in thread
From: Alan D. Brunelle @ 2009-07-06 20:33 UTC (permalink / raw)
To: Hannes Reinecke; +Cc: Jens Axboe, scameron, linux-kernel
Hannes Reinecke wrote:
> When doing an unexpected shutdown like kexec the cciss
> firmware might still have some commands in flight, which
> it is trying to complete.
> The driver is doing its best on resetting the HBA,
> but sadly there's a firmware issue causing the firmware
> _not_ to abort or drop old commands.
> So the firmware will send us commands which we haven't
> accounted for, causing the driver to panic.
>
> With this patch we're just ignoring these commands as
> there is nothing we could be doing with them anyway.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
Pardon my ignorance here, but don't you have a bigger problem: if the
reset is not dropping or aborting old commands, doesn't this also mean
that these old commands can still be _executing_? In which case any
(old) reads being executed could be scribbling over memory? (Memory that
may be being used for other purposes?)
Alan D. Brunelle
* Re: [PATCH] cciss: Ignore stale commands after reboot
2009-07-06 20:33 ` Alan D. Brunelle
@ 2009-07-07 7:34 ` Hannes Reinecke
0 siblings, 0 replies; 11+ messages in thread
From: Hannes Reinecke @ 2009-07-07 7:34 UTC (permalink / raw)
To: Alan.Brunelle; +Cc: Jens Axboe, linux-kernel
Hi Alan,
Alan D. Brunelle wrote:
> Hannes Reinecke wrote:
>> When doing an unexpected shutdown like kexec the cciss
>> firmware might still have some commands in flight, which
>> it is trying to complete.
>> The driver is doing its best on resetting the HBA,
>> but sadly there's a firmware issue causing the firmware
>> _not_ to abort or drop old commands.
>> So the firmware will send us commands which we haven't
>> accounted for, causing the driver to panic.
>>
>> With this patch we're just ignoring these commands as
>> there is nothing we could be doing with them anyway.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>
> Pardon my ignorance here, but don't you have a bigger problem: if the
> reset is not dropping or aborting old commands, doesn't this also mean
> that these old commands can still be _executing_? In which case any
> (old) reads being executed could be scribbling over memory? (Memory that
> may be being used for other purposes?)
>
Yes and no.
This scenario is observed while doing a kexec/kdump reboot,
i.e. a new kernel is started directly from the context of an
already running kernel, so there is a fair chance that I/O is
still in flight.
In flight here means the kernel/driver has sent the commands to the
firmware but has not yet received a reply/completion for them.
So the kdump kernel boots and initializes the driver.
The driver itself tries to initialize the firmware, but due to the
above-mentioned bug this initialization does _not_ clear out old
commands, so once the driver is up and running it receives
command completions.
But these completions are not associated with any commands the
driver has sent, so we may as well drop them on the floor.
Which is what this patch is all about.
So yes, there is some sort of overwrite in the sense that the 'old'
I/O is being committed to disk by the time the new kernel starts.
But no, it doesn't really matter to us, as we only start
any operations of our own _after_ we have received these stale
I/O completions.
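As a rough illustration of "completions not associated with any
command", here is a user-space sketch. It does not reflect how cciss
actually matches completions; struct tracker, submit() and complete()
are hypothetical, and the tag table is only an assumption made to keep
the example small.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_OUTSTANDING 8

/* Commands this (new) driver instance has issued and not yet completed. */
struct tracker {
    uint32_t tags[MAX_OUTSTANDING];
    bool in_use[MAX_OUTSTANDING];
    unsigned int stale_dropped; /* completions we had no record of */
};

/* Record a command as in flight. Returns false if the table is full. */
static bool submit(struct tracker *t, uint32_t tag)
{
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (!t->in_use[i]) {
            t->tags[i] = tag;
            t->in_use[i] = true;
            return true;
        }
    }
    return false;
}

/* Handle a completion from the firmware. A tag we never submitted
 * belongs to the previous kernel: count it and drop it. Returns true
 * only for completions of our own commands. */
static bool complete(struct tracker *t, uint32_t tag)
{
    for (size_t i = 0; i < MAX_OUTSTANDING; i++) {
        if (t->in_use[i] && t->tags[i] == tag) {
            t->in_use[i] = false;
            return true;
        }
    }
    t->stale_dropped++;
    return false;
}
```

A completion for a tag left over from the old kernel just bumps
stale_dropped; nobody is waiting for it, so there is nothing else to do.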
HTH.
Cheers,
Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@suse.de +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Markus Rex, HRB 16746 (AG Nürnberg)