All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] scsi_lib: rate-limit the error message from failing commands
@ 2012-03-05 20:42 Yi Zou
  2012-03-06 15:40 ` Tomas Henzl
  0 siblings, 1 reply; 5+ messages in thread
From: Yi Zou @ 2012-03-05 20:42 UTC (permalink / raw)
  To: linux-scsi; +Cc: devel

When performing a cable pull test w/ active stress I/O using fio over
a dual port Intel 82599 FCoE CNA, w/ 256LUNs on one port and about 32LUNs
on the other, it is observed that the system becomes not usable due to
scsi-ml being busy printing the error messages for all the failing commands.
I don't believe this problem is specific to FCoE and these commands are
anyway failing due to link being down (DID_NO_CONNECT), just rate-limit
the messages here to solve this issue.

Signed-off-by: Yi Zou <yi.zou@intel.com>
Cc: www.Open-FCoE.org <devel@open-fcoe.org>
---

 drivers/scsi/scsi_lib.c |    3 ++-
 1 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index b2c95db..e2128d9 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -934,7 +934,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
 	case ACTION_FAIL:
 		/* Give up and fail the remainder of the request */
 		scsi_release_buffers(cmd);
-		if (!(req->cmd_flags & REQ_QUIET)) {
+		if (!(req->cmd_flags & REQ_QUIET) &&
+		    printk_ratelimit()) {
 			if (description)
 				scmd_printk(KERN_INFO, cmd, "%s\n",
 					    description);


^ permalink raw reply related	[flat|nested] 5+ messages in thread

* Re: [PATCH] scsi_lib: rate-limit the error message from failing commands
  2012-03-05 20:42 [PATCH] scsi_lib: rate-limit the error message from failing commands Yi Zou
@ 2012-03-06 15:40 ` Tomas Henzl
  2012-03-06 17:10   ` Zou, Yi
  0 siblings, 1 reply; 5+ messages in thread
From: Tomas Henzl @ 2012-03-06 15:40 UTC (permalink / raw)
  To: Yi Zou; +Cc: linux-scsi, devel

On 03/05/2012 09:42 PM, Yi Zou wrote:
> When performing a cable pull test w/ active stress I/O using fio over
> a dual port Intel 82599 FCoE CNA, w/ 256LUNs on one port and about 32LUNs
> on the other, it is observed that the system becomes not usable due to
> scsi-ml being busy printing the error messages for all the failing commands.
> I don't believe this problem is specific to FCoE and these commands are
> anyway failing due to link being down (DID_NO_CONNECT), just rate-limit
> the messages here to solve this issue.
>
> Signed-off-by: Yi Zou <yi.zou@intel.com>
> Cc: www.Open-FCoE.org <devel@open-fcoe.org>
Hi Zou,
you shouldn't use printk_ratelimit .
checkpatch says:  WARNING: Prefer printk_ratelimited or pr_<level>_ratelimited to printk_ratelimit
Tomas

> ---
>
>  drivers/scsi/scsi_lib.c |    3 ++-
>  1 files changed, 2 insertions(+), 1 deletions(-)
>
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index b2c95db..e2128d9 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -934,7 +934,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
>  	case ACTION_FAIL:
>  		/* Give up and fail the remainder of the request */
>  		scsi_release_buffers(cmd);
> -		if (!(req->cmd_flags & REQ_QUIET)) {
> +		if (!(req->cmd_flags & REQ_QUIET) &&
> +		    printk_ratelimit()) {
>  			if (description)
>  				scmd_printk(KERN_INFO, cmd, "%s\n",
>  					    description);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH] scsi_lib: rate-limit the error message from failing commands
  2012-03-06 15:40 ` Tomas Henzl
@ 2012-03-06 17:10   ` Zou, Yi
  2012-03-07 13:17     ` Tomas Henzl
  0 siblings, 1 reply; 5+ messages in thread
From: Zou, Yi @ 2012-03-06 17:10 UTC (permalink / raw)
  To: Tomas Henzl; +Cc: linux-scsi@vger.kernel.org, devel@open-fcoe.org

> 
> On 03/05/2012 09:42 PM, Yi Zou wrote:
> > When performing a cable pull test w/ active stress I/O using fio over
> > a dual port Intel 82599 FCoE CNA, w/ 256LUNs on one port and about
> 32LUNs
> > on the other, it is observed that the system becomes not usable due to
> > scsi-ml being busy printing the error messages for all the failing
> commands.
> > I don't believe this problem is specific to FCoE and these commands are
> > anyway failing due to link being down (DID_NO_CONNECT), just rate-limit
> > the messages here to solve this issue.
> >
> > Signed-off-by: Yi Zou <yi.zou@intel.com>
> > Cc: www.Open-FCoE.org <devel@open-fcoe.org>
> Hi Zou,
> you shouldn't use printk_ratelimit .
> checkpatch says:  WARNING: Prefer printk_ratelimited or
> pr_<level>_ratelimited to printk_ratelimit
> Tomas

Yes, I saw that when I did the checkpatch, but pratek_ratelimit() actually
serves the purpose better, as it guards the block of the several scsi related
prints that follow, and this works well particularly for this bug. For the worst
case, the shared ratelimit surpresses everything here, blk i/o error report
would still be able pick up printing something.

If it is desired, which I doubt, I could change all scmd_printk, scsi_print_xxxx
calls to use printk_ratelimited(), seems an overkill for this to me. I am open
for a better solution.

yi

> 
> > ---
> >
> >  drivers/scsi/scsi_lib.c |    3 ++-
> >  1 files changed, 2 insertions(+), 1 deletions(-)
> >
> > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > index b2c95db..e2128d9 100644
> > --- a/drivers/scsi/scsi_lib.c
> > +++ b/drivers/scsi/scsi_lib.c
> > @@ -934,7 +934,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> unsigned int good_bytes)
> >  	case ACTION_FAIL:
> >  		/* Give up and fail the remainder of the request */
> >  		scsi_release_buffers(cmd);
> > -		if (!(req->cmd_flags & REQ_QUIET)) {
> > +		if (!(req->cmd_flags & REQ_QUIET) &&
> > +		    printk_ratelimit()) {
> >  			if (description)
> >  				scmd_printk(KERN_INFO, cmd, "%s\n",
> >  					    description);
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: [PATCH] scsi_lib: rate-limit the error message from failing commands
  2012-03-06 17:10   ` Zou, Yi
@ 2012-03-07 13:17     ` Tomas Henzl
  2012-03-07 14:55       ` Zou, Yi
  0 siblings, 1 reply; 5+ messages in thread
From: Tomas Henzl @ 2012-03-07 13:17 UTC (permalink / raw)
  To: Zou, Yi; +Cc: linux-scsi@vger.kernel.org, devel@open-fcoe.org

On 03/06/2012 06:10 PM, Zou, Yi wrote:
>> On 03/05/2012 09:42 PM, Yi Zou wrote:
>>> When performing a cable pull test w/ active stress I/O using fio over
>>> a dual port Intel 82599 FCoE CNA, w/ 256LUNs on one port and about
>> 32LUNs
>>> on the other, it is observed that the system becomes not usable due to
>>> scsi-ml being busy printing the error messages for all the failing
>> commands.
>>> I don't believe this problem is specific to FCoE and these commands are
>>> anyway failing due to link being down (DID_NO_CONNECT), just rate-limit
>>> the messages here to solve this issue.
>>>
>>> Signed-off-by: Yi Zou <yi.zou@intel.com>
>>> Cc: www.Open-FCoE.org <devel@open-fcoe.org>
>> Hi Zou,
>> you shouldn't use printk_ratelimit .
>> checkpatch says:  WARNING: Prefer printk_ratelimited or
>> pr_<level>_ratelimited to printk_ratelimit
>> Tomas
> Yes, I saw that when I did the checkpatch, but pratek_ratelimit() actually
> serves the purpose better, as it guards the block of the several scsi related
> prints that follow, and this works well particularly for this bug. For the worst
> case, the shared ratelimit surpresses everything here, blk i/o error report
> would still be able pick up printing something.
>
> If it is desired, which I doubt, I could change all scmd_printk, scsi_print_xxxx
> calls to use printk_ratelimited(), seems an overkill for this to me. I am open
> for a better solution.
For a whole block of commands you can use __ratelimit, so something like this
were possible:

@@ -20,6 +20,7 @@
 #include <linux/delay.h>
 #include <linux/hardirq.h>
 #include <linux/scatterlist.h>
+#include <linux/ratelimit.h>
 
 #include <scsi/scsi.h>
 #include <scsi/scsi_cmnd.h>
@@ -745,6 +746,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
        enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
              ACTION_DELAYED_RETRY} action;
        char *description = NULL;
+       static DEFINE_RATELIMIT_STATE(rs,  DEFAULT_RATELIMIT_INTERVAL,
+                                               DEFAULT_RATELIMIT_BURST);
 
        if (result) {
                sense_valid = scsi_command_normalize_sense(cmd, &sshdr);
@@ -935,7 +938,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd, unsigned int good_bytes)
        case ACTION_FAIL:
                /* Give up and fail the remainder of the request */
                scsi_release_buffers(cmd);
-               if (!(req->cmd_flags & REQ_QUIET)) {
+               if (!(req->cmd_flags & REQ_QUIET) &&  __ratelimit(&rs)) {
                        if (description)
                                scmd_printk(KERN_INFO, cmd, "%s\n",
                                            description);yi


>>> ---
>>>
>>>  drivers/scsi/scsi_lib.c |    3 ++-
>>>  1 files changed, 2 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>> index b2c95db..e2128d9 100644
>>> --- a/drivers/scsi/scsi_lib.c
>>> +++ b/drivers/scsi/scsi_lib.c
>>> @@ -934,7 +934,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
>> unsigned int good_bytes)
>>>  	case ACTION_FAIL:
>>>  		/* Give up and fail the remainder of the request */
>>>  		scsi_release_buffers(cmd);
>>> -		if (!(req->cmd_flags & REQ_QUIET)) {
>>> +		if (!(req->cmd_flags & REQ_QUIET) &&
>>> +		    printk_ratelimit()) {
>>>  			if (description)
>>>  				scmd_printk(KERN_INFO, cmd, "%s\n",
>>>  					    description);
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
>> in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 5+ messages in thread

* RE: [PATCH] scsi_lib: rate-limit the error message from failing commands
  2012-03-07 13:17     ` Tomas Henzl
@ 2012-03-07 14:55       ` Zou, Yi
  0 siblings, 0 replies; 5+ messages in thread
From: Zou, Yi @ 2012-03-07 14:55 UTC (permalink / raw)
  To: Tomas Henzl; +Cc: linux-scsi@vger.kernel.org, devel@open-fcoe.org

> 
> On 03/06/2012 06:10 PM, Zou, Yi wrote:
> >> On 03/05/2012 09:42 PM, Yi Zou wrote:
> >>> When performing a cable pull test w/ active stress I/O using fio over
> >>> a dual port Intel 82599 FCoE CNA, w/ 256LUNs on one port and about
> >> 32LUNs
> >>> on the other, it is observed that the system becomes not usable due
> to
> >>> scsi-ml being busy printing the error messages for all the failing
> >> commands.
> >>> I don't believe this problem is specific to FCoE and these commands
> are
> >>> anyway failing due to link being down (DID_NO_CONNECT), just rate-
> limit
> >>> the messages here to solve this issue.
> >>>
> >>> Signed-off-by: Yi Zou <yi.zou@intel.com>
> >>> Cc: www.Open-FCoE.org <devel@open-fcoe.org>
> >> Hi Zou,
> >> you shouldn't use printk_ratelimit .
> >> checkpatch says:  WARNING: Prefer printk_ratelimited or
> >> pr_<level>_ratelimited to printk_ratelimit
> >> Tomas
> > Yes, I saw that when I did the checkpatch, but pratek_ratelimit()
> actually
> > serves the purpose better, as it guards the block of the several scsi
> related
> > prints that follow, and this works well particularly for this bug. For
> the worst
> > case, the shared ratelimit surpresses everything here, blk i/o error
> report
> > would still be able pick up printing something.
> >
> > If it is desired, which I doubt, I could change all scmd_printk,
> scsi_print_xxxx
> > calls to use printk_ratelimited(), seems an overkill for this to me. I
> am open
> > for a better solution.
> For a whole block of commands you can use __ratelimit, so something like
> this
> were possible:

> 
> @@ -20,6 +20,7 @@
>  #include <linux/delay.h>
>  #include <linux/hardirq.h>
>  #include <linux/scatterlist.h>
> +#include <linux/ratelimit.h>
> 
>  #include <scsi/scsi.h>
>  #include <scsi/scsi_cmnd.h>
> @@ -745,6 +746,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> unsigned int good_bytes)
>         enum {ACTION_FAIL, ACTION_REPREP, ACTION_RETRY,
>               ACTION_DELAYED_RETRY} action;
>         char *description = NULL;
> +       static DEFINE_RATELIMIT_STATE(rs,  DEFAULT_RATELIMIT_INTERVAL,
> +                                               DEFAULT_RATELIMIT_BURST);
> 
>         if (result) {
>                 sense_valid = scsi_command_normalize_sense(cmd, &sshdr);
> @@ -935,7 +938,7 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> unsigned int good_bytes)
>         case ACTION_FAIL:
>                 /* Give up and fail the remainder of the request */
>                 scsi_release_buffers(cmd);
> -               if (!(req->cmd_flags & REQ_QUIET)) {
> +               if (!(req->cmd_flags & REQ_QUIET) &&  __ratelimit(&rs)) {
>                         if (description)
>                                 scmd_printk(KERN_INFO, cmd, "%s\n",
>                                             description);yi
> 
This looks better, I will resend the patch, which will be pretty much the same 
what you have here.

thanks.
yi


> 
> >>> ---
> >>>
> >>>  drivers/scsi/scsi_lib.c |    3 ++-
> >>>  1 files changed, 2 insertions(+), 1 deletions(-)
> >>>
> >>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> >>> index b2c95db..e2128d9 100644
> >>> --- a/drivers/scsi/scsi_lib.c
> >>> +++ b/drivers/scsi/scsi_lib.c
> >>> @@ -934,7 +934,8 @@ void scsi_io_completion(struct scsi_cmnd *cmd,
> >> unsigned int good_bytes)
> >>>  	case ACTION_FAIL:
> >>>  		/* Give up and fail the remainder of the request */
> >>>  		scsi_release_buffers(cmd);
> >>> -		if (!(req->cmd_flags & REQ_QUIET)) {
> >>> +		if (!(req->cmd_flags & REQ_QUIET) &&
> >>> +		    printk_ratelimit()) {
> >>>  			if (description)
> >>>  				scmd_printk(KERN_INFO, cmd, "%s\n",
> >>>  					    description);
> >>>
> >>> --
> >>> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> >> in
> >>> the body of a message to majordomo@vger.kernel.org
> >>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-scsi"
> in
> >> the body of a message to majordomo@vger.kernel.org
> >> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2012-03-07 14:55 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-03-05 20:42 [PATCH] scsi_lib: rate-limit the error message from failing commands Yi Zou
2012-03-06 15:40 ` Tomas Henzl
2012-03-06 17:10   ` Zou, Yi
2012-03-07 13:17     ` Tomas Henzl
2012-03-07 14:55       ` Zou, Yi

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.