* [PATCH] scsi_debug: illegal blocking memory allocation
@ 2007-01-03 13:49 Jens Axboe
2007-01-04 5:38 ` Douglas Gilbert
0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2007-01-03 13:49 UTC (permalink / raw)
To: dougg; +Cc: linux-scsi
Hi Doug,
resp_inquiry() does a GFP_KERNEL memory allocation, but it's not allowed
to from the queuecommand context. There's no good way to return this
error, so I used DID_ERROR which is used from similar paths. That
doesn't seem quite right though, it would be better to return an error
indicating a later retry would be more appropriate.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
index 30ee3d7..0c80ed3 100644
--- a/drivers/scsi/scsi_debug.c
+++ b/drivers/scsi/scsi_debug.c
@@ -954,7 +954,9 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target,
int alloc_len, n, ret;
alloc_len = (cmd[3] << 8) + cmd[4];
- arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL);
+ arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
+ if (!arr)
+ return DID_ERROR << 16;
if (devip->wlun)
pq_pdt = 0x1e; /* present, wlun */
else if (scsi_debug_no_lun_0 && (0 == devip->lun))
--
Jens Axboe
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-03 13:49 [PATCH] scsi_debug: illegal blocking memory allocation Jens Axboe
@ 2007-01-04 5:38 ` Douglas Gilbert
2007-01-04 11:21 ` Jens Axboe
0 siblings, 1 reply; 9+ messages in thread
From: Douglas Gilbert @ 2007-01-04 5:38 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-scsi
[-- Attachment #1: Type: text/plain, Size: 1974 bytes --]
Jens Axboe wrote:
> Hi Doug,
>
> resp_inquiry() does a GFP_KERNEL memory allocation, but it's not allowed
> to from the queuecommand context. There's no good way to return this
> error, so I used DID_ERROR which is used from similar paths. That
> doesn't seem quite right though, it would be better to return an error
> indicating a later retry would be more appropriate.
>
> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
>
> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> index 30ee3d7..0c80ed3 100644
> --- a/drivers/scsi/scsi_debug.c
> +++ b/drivers/scsi/scsi_debug.c
> @@ -954,7 +954,9 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target,
> int alloc_len, n, ret;
>
> alloc_len = (cmd[3] << 8) + cmd[4];
> - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL);
> + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
> + if (!arr)
> + return DID_ERROR << 16;
> if (devip->wlun)
> pq_pdt = 0x1e; /* present, wlun */
> else if (scsi_debug_no_lun_0 && (0 == devip->lun))
>
Jens,
I had to read that twice. I'm always happy to convert a
GFP_KERNEL to a GFP_ATOMIC (as I'm sure it started as a
GFP_ATOMIC). There are a couple more that may be in
queuecommand context.
Taking up your point about retries and seeing that the
scsi_debug driver has a SAS flavour, I'm inclined towards
a "aborted command, initiator response timeout" [Bh,4Bh,6]
CHECK CONDITION. There is a group of transport injected
error messages in SAS (see sas2r07.pdf section 10.2.3)
that pop up from time to time. Due to conjestion in
connection-switched SAS expanders these error messages
should be interpreted as "try again" depending on the
topology. The patch below adds a "aborted_command" bit
in opts that will cause every nth command to be aborted
(with an ack/nak timeout).
Note that SAS has an optional "transport layer retries"
state machine to lessen the incidence of "aborted commands".
Evidently SAS tape drives use the facility.
Doug Gilbert
[-- Attachment #2: sdebug_lk2620rc3abo --]
[-- Type: text/plain, Size: 5553 bytes --]
--- linux/drivers/scsi/scsi_debug.c 2006-11-30 10:00:01.000000000 -0500
+++ linux/drivers/scsi/scsi_debug.c2620rc3abo 2007-01-04 00:19:49.000000000 -0500
@@ -51,10 +51,10 @@
#include "scsi_logging.h"
#include "scsi_debug.h"
-#define SCSI_DEBUG_VERSION "1.80"
-static const char * scsi_debug_version_date = "20061018";
+#define SCSI_DEBUG_VERSION "1.81"
+static const char * scsi_debug_version_date = "20070104";
-/* Additional Sense Code (ASC) used */
+/* Additional Sense Code (ASC) */
#define NO_ADDITIONAL_SENSE 0x0
#define LOGICAL_UNIT_NOT_READY 0x4
#define UNRECOVERED_READ_ERR 0x11
@@ -65,9 +65,14 @@
#define INVALID_FIELD_IN_PARAM_LIST 0x26
#define POWERON_RESET 0x29
#define SAVING_PARAMS_UNSUP 0x39
+#define TRANSPORT_PROBLEM 0x4b
#define THRESHOLD_EXCEEDED 0x5d
#define LOW_POWER_COND_ON 0x5e
+/* Additional Sense Code Qualifier (ASCQ) */
+#define ACK_NAK_TO 0x3
+#define INIT_RESPONSE_TO 0x6
+
#define SDEBUG_TAGGED_QUEUING 0 /* 0 | MSG_SIMPLE_TAG | MSG_ORDERED_TAG */
/* Default values for driver parameters */
@@ -95,15 +100,20 @@
#define SCSI_DEBUG_OPT_MEDIUM_ERR 2
#define SCSI_DEBUG_OPT_TIMEOUT 4
#define SCSI_DEBUG_OPT_RECOVERED_ERR 8
+#define SCSI_DEBUG_OPT_TRANSPORT_ERR 16
/* When "every_nth" > 0 then modulo "every_nth" commands:
* - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set
* - a RECOVERED_ERROR is simulated on successful read and write
* commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set.
+ * - a TRANSPORT_ERROR is simulated on successful read and write
+ * commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set.
*
* When "every_nth" < 0 then after "- every_nth" commands:
* - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set
* - a RECOVERED_ERROR is simulated on successful read and write
* commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set.
+ * - a TRANSPORT_ERROR is simulated on successful read and write
+ * commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set.
* This will continue until some other action occurs (e.g. the user
* writing a new value (other than -1 or 1) to every_nth via sysfs).
*/
@@ -315,6 +325,7 @@
int target = SCpnt->device->id;
struct sdebug_dev_info * devip = NULL;
int inj_recovered = 0;
+ int inj_transport = 0;
int delay_override = 0;
if (done == NULL)
@@ -352,6 +363,8 @@
return 0; /* ignore command causing timeout */
else if (SCSI_DEBUG_OPT_RECOVERED_ERR & scsi_debug_opts)
inj_recovered = 1; /* to reads and writes below */
+ else if (SCSI_DEBUG_OPT_TRANSPORT_ERR & scsi_debug_opts)
+ inj_transport = 1; /* to reads and writes below */
}
if (devip->wlun) {
@@ -468,7 +481,11 @@
mk_sense_buffer(devip, RECOVERED_ERROR,
THRESHOLD_EXCEEDED, 0);
errsts = check_condition_result;
- }
+ } else if (inj_transport && (0 == errsts)) {
+ mk_sense_buffer(devip, ABORTED_COMMAND,
+ TRANSPORT_PROBLEM, ACK_NAK_TO);
+ errsts = check_condition_result;
+ }
break;
case REPORT_LUNS: /* mandatory, ignore unit attention */
delay_override = 1;
@@ -531,6 +548,9 @@
delay_override = 1;
errsts = check_readiness(SCpnt, 0, devip);
break;
+ case WRITE_BUFFER:
+ errsts = check_readiness(SCpnt, 1, devip);
+ break;
default:
if (SCSI_DEBUG_OPT_NOISE & scsi_debug_opts)
printk(KERN_INFO "scsi_debug: Opcode: 0x%x not "
@@ -954,7 +974,12 @@
int alloc_len, n, ret;
alloc_len = (cmd[3] << 8) + cmd[4];
- arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL);
+ arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
+ if (NULL == arr) {
+ mk_sense_buffer(devip, ABORTED_COMMAND, TRANSPORT_PROBLEM,
+ INIT_RESPONSE_TO);
+ return check_condition_result;
+ }
if (devip->wlun)
pq_pdt = 0x1e; /* present, wlun */
else if (scsi_debug_no_lun_0 && (0 == devip->lun))
@@ -1217,7 +1242,12 @@
alen = ((cmd[6] << 24) + (cmd[7] << 16) + (cmd[8] << 8)
+ cmd[9]);
- arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_KERNEL);
+ arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_ATOMIC);
+ if (NULL == arr) {
+ mk_sense_buffer(devip, ABORTED_COMMAND, TRANSPORT_PROBLEM,
+ INIT_RESPONSE_TO);
+ return check_condition_result;
+ }
/*
* EVPD page 0x88 states we have two ports, one
* real and a fake port with no device connected.
@@ -2044,7 +2074,7 @@
}
}
if (NULL == open_devip) { /* try and make a new one */
- open_devip = kzalloc(sizeof(*open_devip),GFP_KERNEL);
+ open_devip = kzalloc(sizeof(*open_devip),GFP_ATOMIC);
if (NULL == open_devip) {
printk(KERN_ERR "%s: out of memory at line %d\n",
__FUNCTION__, __LINE__);
@@ -2388,7 +2418,7 @@
MODULE_PARM_DESC(no_lun_0, "no LU number 0 (def=0 -> have lun 0)");
MODULE_PARM_DESC(num_parts, "number of partitions(def=0)");
MODULE_PARM_DESC(num_tgts, "number of targets per host to simulate(def=1)");
-MODULE_PARM_DESC(opts, "1->noise, 2->medium_error, 4->... (def=0)");
+MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err... (def=0)");
MODULE_PARM_DESC(ptype, "SCSI peripheral type(def=0[disk])");
MODULE_PARM_DESC(scsi_level, "SCSI level to simulate(def=5[SPC-3])");
MODULE_PARM_DESC(virtual_gb, "virtual gigabyte size (def=0 -> use dev_size_mb)");
@@ -2943,7 +2973,6 @@
struct list_head *lh, *lh_sf;
sdbg_host = kzalloc(sizeof(*sdbg_host),GFP_KERNEL);
-
if (NULL == sdbg_host) {
printk(KERN_ERR "%s: out of memory at line %d\n",
__FUNCTION__, __LINE__);
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-04 5:38 ` Douglas Gilbert
@ 2007-01-04 11:21 ` Jens Axboe
2007-01-04 15:10 ` James Bottomley
0 siblings, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2007-01-04 11:21 UTC (permalink / raw)
To: Douglas Gilbert; +Cc: linux-scsi
On Thu, Jan 04 2007, Douglas Gilbert wrote:
> Jens Axboe wrote:
> > Hi Doug,
> >
> > resp_inquiry() does a GFP_KERNEL memory allocation, but it's not allowed
> > to from the queuecommand context. There's no good way to return this
> > error, so I used DID_ERROR which is used from similar paths. That
> > doesn't seem quite right though, it would be better to return an error
> > indicating a later retry would be more appropriate.
> >
> > Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
> >
> > diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c
> > index 30ee3d7..0c80ed3 100644
> > --- a/drivers/scsi/scsi_debug.c
> > +++ b/drivers/scsi/scsi_debug.c
> > @@ -954,7 +954,9 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target,
> > int alloc_len, n, ret;
> >
> > alloc_len = (cmd[3] << 8) + cmd[4];
> > - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL);
> > + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC);
> > + if (!arr)
> > + return DID_ERROR << 16;
> > if (devip->wlun)
> > pq_pdt = 0x1e; /* present, wlun */
> > else if (scsi_debug_no_lun_0 && (0 == devip->lun))
> >
>
> Jens,
> I had to read that twice. I'm always happy to convert a
> GFP_KERNEL to a GFP_ATOMIC (as I'm sure it started as a
> GFP_ATOMIC). There are a couple more that may be in
> queuecommand context.
>
> Taking up your point about retries and seeing that the
> scsi_debug driver has a SAS flavour, I'm inclined towards
> a "aborted command, initiator response timeout" [Bh,4Bh,6]
> CHECK CONDITION. There is a group of transport injected
> error messages in SAS (see sas2r07.pdf section 10.2.3)
> that pop up from time to time. Due to conjestion in
> connection-switched SAS expanders these error messages
> should be interpreted as "try again" depending on the
> topology. The patch below adds a "aborted_command" bit
> in opts that will cause every nth command to be aborted
> (with an ack/nak timeout).
>
> Note that SAS has an optional "transport layer retries"
> state machine to lessen the incidence of "aborted commands".
> Evidently SAS tape drives use the facility.
I guess it's fully up to you how you want to solve it. The scheme seems
a little elaborate, but these error conditions are unlikely to ever been
seen in the wild, so no objections from me.
I'd suggest sending this in for 2.6.20.
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-04 11:21 ` Jens Axboe
@ 2007-01-04 15:10 ` James Bottomley
2007-01-04 15:27 ` Matthew Wilcox
2007-01-04 15:50 ` Jens Axboe
0 siblings, 2 replies; 9+ messages in thread
From: James Bottomley @ 2007-01-04 15:10 UTC (permalink / raw)
To: Jens Axboe; +Cc: Douglas Gilbert, linux-scsi
On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote:
> I guess it's fully up to you how you want to solve it. The scheme seems
> a little elaborate, but these error conditions are unlikely to ever been
> seen in the wild, so no objections from me.
Actually, there's already a DID_ code that does what you want. Instead
of DID_ERROR, which will retry immediately, there's DID_REQUEUE which
will halt the device queue and wait for a returning command to retry.
James
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-04 15:10 ` James Bottomley
@ 2007-01-04 15:27 ` Matthew Wilcox
2007-01-04 15:34 ` James Bottomley
2007-01-04 15:50 ` Jens Axboe
1 sibling, 1 reply; 9+ messages in thread
From: Matthew Wilcox @ 2007-01-04 15:27 UTC (permalink / raw)
To: James Bottomley; +Cc: Jens Axboe, Douglas Gilbert, linux-scsi
On Thu, Jan 04, 2007 at 09:10:01AM -0600, James Bottomley wrote:
> On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote:
> > I guess it's fully up to you how you want to solve it. The scheme seems
> > a little elaborate, but these error conditions are unlikely to ever been
> > seen in the wild, so no objections from me.
>
> Actually, there's already a DID_ code that does what you want. Instead
> of DID_ERROR, which will retry immediately, there's DID_REQUEUE which
> will halt the device queue and wait for a returning command to retry.
Any returning command, or a returning command from this device? We may
be out of memory from a different device, and this may be the only
command being sent to the scsi_debug driver. Wouldn't we end up
halting the device queue indefinitely then?
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-04 15:27 ` Matthew Wilcox
@ 2007-01-04 15:34 ` James Bottomley
0 siblings, 0 replies; 9+ messages in thread
From: James Bottomley @ 2007-01-04 15:34 UTC (permalink / raw)
To: Matthew Wilcox; +Cc: Jens Axboe, Douglas Gilbert, linux-scsi
On Thu, 2007-01-04 at 08:27 -0700, Matthew Wilcox wrote:
> Any returning command, or a returning command from this device?
This device
> We may
> be out of memory from a different device, and this may be the only
> command being sent to the scsi_debug driver. Wouldn't we end up
> halting the device queue indefinitely then?
it's basically the SCSI_MLQUEUE_DEVICE_BUSY condition ... and yes, it
has a mechanism for no outstanding commands.
James
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-04 15:10 ` James Bottomley
2007-01-04 15:27 ` Matthew Wilcox
@ 2007-01-04 15:50 ` Jens Axboe
2007-01-05 5:30 ` Douglas Gilbert
1 sibling, 1 reply; 9+ messages in thread
From: Jens Axboe @ 2007-01-04 15:50 UTC (permalink / raw)
To: James Bottomley; +Cc: Douglas Gilbert, linux-scsi
On Thu, Jan 04 2007, James Bottomley wrote:
> On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote:
> > I guess it's fully up to you how you want to solve it. The scheme seems
> > a little elaborate, but these error conditions are unlikely to ever been
> > seen in the wild, so no objections from me.
>
> Actually, there's already a DID_ code that does what you want. Instead
> of DID_ERROR, which will retry immediately, there's DID_REQUEUE which
> will halt the device queue and wait for a returning command to retry.
As long as it keeps firing the queue at some intervals even without any
commands pending at all, then that'll work just fine. I like that
approach a lot better than coding the error into some sense value that
is (at best) some vague approximation of what has happened (calling
memory shortage a transport error is a bit of a stretch).
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-04 15:50 ` Jens Axboe
@ 2007-01-05 5:30 ` Douglas Gilbert
2007-01-05 12:49 ` Jens Axboe
0 siblings, 1 reply; 9+ messages in thread
From: Douglas Gilbert @ 2007-01-05 5:30 UTC (permalink / raw)
To: Jens Axboe; +Cc: James Bottomley, linux-scsi
Jens Axboe wrote:
> On Thu, Jan 04 2007, James Bottomley wrote:
>> On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote:
>>> I guess it's fully up to you how you want to solve it. The scheme seems
>>> a little elaborate, but these error conditions are unlikely to ever been
>>> seen in the wild, so no objections from me.
>> Actually, there's already a DID_ code that does what you want. Instead
>> of DID_ERROR, which will retry immediately, there's DID_REQUEUE which
>> will halt the device queue and wait for a returning command to retry.
>
> As long as it keeps firing the queue at some intervals even without any
> commands pending at all, then that'll work just fine. I like that
> approach a lot better than coding the error into some sense value that
> is (at best) some vague approximation of what has happened (calling
> memory shortage a transport error is a bit of a stretch).
True, but both happen. The scsi_debug driver is a
virtual host, virtual target and a lu (ram disk).
The failure that you pointed out stopped a response
being built. In the real world that would in the
target or lu.
The reason that I mentioned aborted_command sense key
is that it is also a "out of resources" (bandwidth) error
and it broke sg_dd. I would bet money that it would
also break the block layer/sd. The block layer should
leave it alone as it is simply a matter of sd
retrying a few times. However the st driver could have
a real problem (e.g. did that state changing command
work, fail or partially work??).
Anyway, I have submitted a patch that reports DID_REQUEUE
for an allocation failures and adds the ability to inject
aborted_command errors.
Doug Gilbert
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation
2007-01-05 5:30 ` Douglas Gilbert
@ 2007-01-05 12:49 ` Jens Axboe
0 siblings, 0 replies; 9+ messages in thread
From: Jens Axboe @ 2007-01-05 12:49 UTC (permalink / raw)
To: Douglas Gilbert; +Cc: James Bottomley, linux-scsi
On Fri, Jan 05 2007, Douglas Gilbert wrote:
> Jens Axboe wrote:
> > On Thu, Jan 04 2007, James Bottomley wrote:
> >> On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote:
> >>> I guess it's fully up to you how you want to solve it. The scheme seems
> >>> a little elaborate, but these error conditions are unlikely to ever been
> >>> seen in the wild, so no objections from me.
> >> Actually, there's already a DID_ code that does what you want. Instead
> >> of DID_ERROR, which will retry immediately, there's DID_REQUEUE which
> >> will halt the device queue and wait for a returning command to retry.
> >
> > As long as it keeps firing the queue at some intervals even without any
> > commands pending at all, then that'll work just fine. I like that
> > approach a lot better than coding the error into some sense value that
> > is (at best) some vague approximation of what has happened (calling
> > memory shortage a transport error is a bit of a stretch).
>
> True, but both happen. The scsi_debug driver is a
> virtual host, virtual target and a lu (ram disk).
> The failure that you pointed out stopped a response
> being built. In the real world that would in the
> target or lu.
The point is that the condition need not be exposed, even if it could be
compared to (say) a device busy condition. Both should just results in a
requeue and later retry, when resources are available to satisfy the
request.
Your revised patch looks good to me!
--
Jens Axboe
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2007-01-05 12:48 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-01-03 13:49 [PATCH] scsi_debug: illegal blocking memory allocation Jens Axboe
2007-01-04 5:38 ` Douglas Gilbert
2007-01-04 11:21 ` Jens Axboe
2007-01-04 15:10 ` James Bottomley
2007-01-04 15:27 ` Matthew Wilcox
2007-01-04 15:34 ` James Bottomley
2007-01-04 15:50 ` Jens Axboe
2007-01-05 5:30 ` Douglas Gilbert
2007-01-05 12:49 ` Jens Axboe
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox