* [PATCH] scsi_debug: illegal blocking memory allocation @ 2007-01-03 13:49 Jens Axboe 2007-01-04 5:38 ` Douglas Gilbert 0 siblings, 1 reply; 9+ messages in thread From: Jens Axboe @ 2007-01-03 13:49 UTC (permalink / raw) To: dougg; +Cc: linux-scsi Hi Doug, resp_inquiry() does a GFP_KERNEL memory allocation, but it's not allowed to from the queuecommand context. There's no good way to return this error, so I used DID_ERROR which is used from similar paths. That doesn't seem quite right though, it would be better to return an error indicating a later retry would be more appropriate. Signed-off-by: Jens Axboe <jens.axboe@oracle.com> diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c index 30ee3d7..0c80ed3 100644 --- a/drivers/scsi/scsi_debug.c +++ b/drivers/scsi/scsi_debug.c @@ -954,7 +954,9 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target, int alloc_len, n, ret; alloc_len = (cmd[3] << 8) + cmd[4]; - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC); + if (!arr) + return DID_ERROR << 16; if (devip->wlun) pq_pdt = 0x1e; /* present, wlun */ else if (scsi_debug_no_lun_0 && (0 == devip->lun)) -- Jens Axboe ^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-03 13:49 [PATCH] scsi_debug: illegal blocking memory allocation Jens Axboe @ 2007-01-04 5:38 ` Douglas Gilbert 2007-01-04 11:21 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Douglas Gilbert @ 2007-01-04 5:38 UTC (permalink / raw) To: Jens Axboe; +Cc: linux-scsi [-- Attachment #1: Type: text/plain, Size: 1974 bytes --] Jens Axboe wrote: > Hi Doug, > > resp_inquiry() does a GFP_KERNEL memory allocation, but it's not allowed > to from the queuecommand context. There's no good way to return this > error, so I used DID_ERROR which is used from similar paths. That > doesn't seem quite right though, it would be better to return an error > indicating a later retry would be more appropriate. > > Signed-off-by: Jens Axboe <jens.axboe@oracle.com> > > diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c > index 30ee3d7..0c80ed3 100644 > --- a/drivers/scsi/scsi_debug.c > +++ b/drivers/scsi/scsi_debug.c > @@ -954,7 +954,9 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target, > int alloc_len, n, ret; > > alloc_len = (cmd[3] << 8) + cmd[4]; > - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL); > + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC); > + if (!arr) > + return DID_ERROR << 16; > if (devip->wlun) > pq_pdt = 0x1e; /* present, wlun */ > else if (scsi_debug_no_lun_0 && (0 == devip->lun)) > Jens, I had to read that twice. I'm always happy to convert a GFP_KERNEL to a GFP_ATOMIC (as I'm sure it started as a GFP_ATOMIC). There are a couple more that may be in queuecommand context. Taking up your point about retries and seeing that the scsi_debug driver has a SAS flavour, I'm inclined towards a "aborted command, initiator response timeout" [Bh,4Bh,6] CHECK CONDITION. There is a group of transport injected error messages in SAS (see sas2r07.pdf section 10.2.3) that pop up from time to time. Due to conjestion in connection-switched SAS expanders these error messages should be interpreted as "try again" depending on the topology. The patch below adds a "aborted_command" bit in opts that will cause every nth command to be aborted (with an ack/nak timeout). Note that SAS has an optional "transport layer retries" state machine to lessen the incidence of "aborted commands". Evidently SAS tape drives use the facility. Doug Gilbert [-- Attachment #2: sdebug_lk2620rc3abo --] [-- Type: text/plain, Size: 5553 bytes --] --- linux/drivers/scsi/scsi_debug.c 2006-11-30 10:00:01.000000000 -0500 +++ linux/drivers/scsi/scsi_debug.c2620rc3abo 2007-01-04 00:19:49.000000000 -0500 @@ -51,10 +51,10 @@ #include "scsi_logging.h" #include "scsi_debug.h" -#define SCSI_DEBUG_VERSION "1.80" -static const char * scsi_debug_version_date = "20061018"; +#define SCSI_DEBUG_VERSION "1.81" +static const char * scsi_debug_version_date = "20070104"; -/* Additional Sense Code (ASC) used */ +/* Additional Sense Code (ASC) */ #define NO_ADDITIONAL_SENSE 0x0 #define LOGICAL_UNIT_NOT_READY 0x4 #define UNRECOVERED_READ_ERR 0x11 @@ -65,9 +65,14 @@ #define INVALID_FIELD_IN_PARAM_LIST 0x26 #define POWERON_RESET 0x29 #define SAVING_PARAMS_UNSUP 0x39 +#define TRANSPORT_PROBLEM 0x4b #define THRESHOLD_EXCEEDED 0x5d #define LOW_POWER_COND_ON 0x5e +/* Additional Sense Code Qualifier (ASCQ) */ +#define ACK_NAK_TO 0x3 +#define INIT_RESPONSE_TO 0x6 + #define SDEBUG_TAGGED_QUEUING 0 /* 0 | MSG_SIMPLE_TAG | MSG_ORDERED_TAG */ /* Default values for driver parameters */ @@ -95,15 +100,20 @@ #define SCSI_DEBUG_OPT_MEDIUM_ERR 2 #define SCSI_DEBUG_OPT_TIMEOUT 4 #define SCSI_DEBUG_OPT_RECOVERED_ERR 8 +#define SCSI_DEBUG_OPT_TRANSPORT_ERR 16 /* When "every_nth" > 0 then modulo "every_nth" commands: * - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set * - a RECOVERED_ERROR is simulated on successful read and write * commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set. + * - a TRANSPORT_ERROR is simulated on successful read and write + * commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set. * * When "every_nth" < 0 then after "- every_nth" commands: * - a no response is simulated if SCSI_DEBUG_OPT_TIMEOUT is set * - a RECOVERED_ERROR is simulated on successful read and write * commands if SCSI_DEBUG_OPT_RECOVERED_ERR is set. + * - a TRANSPORT_ERROR is simulated on successful read and write + * commands if SCSI_DEBUG_OPT_TRANSPORT_ERR is set. * This will continue until some other action occurs (e.g. the user * writing a new value (other than -1 or 1) to every_nth via sysfs). */ @@ -315,6 +325,7 @@ int target = SCpnt->device->id; struct sdebug_dev_info * devip = NULL; int inj_recovered = 0; + int inj_transport = 0; int delay_override = 0; if (done == NULL) @@ -352,6 +363,8 @@ return 0; /* ignore command causing timeout */ else if (SCSI_DEBUG_OPT_RECOVERED_ERR & scsi_debug_opts) inj_recovered = 1; /* to reads and writes below */ + else if (SCSI_DEBUG_OPT_TRANSPORT_ERR & scsi_debug_opts) + inj_transport = 1; /* to reads and writes below */ } if (devip->wlun) { @@ -468,7 +481,11 @@ mk_sense_buffer(devip, RECOVERED_ERROR, THRESHOLD_EXCEEDED, 0); errsts = check_condition_result; - } + } else if (inj_transport && (0 == errsts)) { + mk_sense_buffer(devip, ABORTED_COMMAND, + TRANSPORT_PROBLEM, ACK_NAK_TO); + errsts = check_condition_result; + } break; case REPORT_LUNS: /* mandatory, ignore unit attention */ delay_override = 1; @@ -531,6 +548,9 @@ delay_override = 1; errsts = check_readiness(SCpnt, 0, devip); break; + case WRITE_BUFFER: + errsts = check_readiness(SCpnt, 1, devip); + break; default: if (SCSI_DEBUG_OPT_NOISE & scsi_debug_opts) printk(KERN_INFO "scsi_debug: Opcode: 0x%x not " @@ -954,7 +974,12 @@ int alloc_len, n, ret; alloc_len = (cmd[3] << 8) + cmd[4]; - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC); + if (NULL == arr) { + mk_sense_buffer(devip, ABORTED_COMMAND, TRANSPORT_PROBLEM, + INIT_RESPONSE_TO); + return check_condition_result; + } if (devip->wlun) pq_pdt = 0x1e; /* present, wlun */ else if (scsi_debug_no_lun_0 && (0 == devip->lun)) @@ -1217,7 +1242,12 @@ alen = ((cmd[6] << 24) + (cmd[7] << 16) + (cmd[8] << 8) + cmd[9]); - arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_KERNEL); + arr = kzalloc(SDEBUG_MAX_TGTPGS_ARR_SZ, GFP_ATOMIC); + if (NULL == arr) { + mk_sense_buffer(devip, ABORTED_COMMAND, TRANSPORT_PROBLEM, + INIT_RESPONSE_TO); + return check_condition_result; + } /* * EVPD page 0x88 states we have two ports, one * real and a fake port with no device connected. @@ -2044,7 +2074,7 @@ } } if (NULL == open_devip) { /* try and make a new one */ - open_devip = kzalloc(sizeof(*open_devip),GFP_KERNEL); + open_devip = kzalloc(sizeof(*open_devip),GFP_ATOMIC); if (NULL == open_devip) { printk(KERN_ERR "%s: out of memory at line %d\n", __FUNCTION__, __LINE__); @@ -2388,7 +2418,7 @@ MODULE_PARM_DESC(no_lun_0, "no LU number 0 (def=0 -> have lun 0)"); MODULE_PARM_DESC(num_parts, "number of partitions(def=0)"); MODULE_PARM_DESC(num_tgts, "number of targets per host to simulate(def=1)"); -MODULE_PARM_DESC(opts, "1->noise, 2->medium_error, 4->... (def=0)"); +MODULE_PARM_DESC(opts, "1->noise, 2->medium_err, 4->timeout, 8->recovered_err... (def=0)"); MODULE_PARM_DESC(ptype, "SCSI peripheral type(def=0[disk])"); MODULE_PARM_DESC(scsi_level, "SCSI level to simulate(def=5[SPC-3])"); MODULE_PARM_DESC(virtual_gb, "virtual gigabyte size (def=0 -> use dev_size_mb)"); @@ -2943,7 +2973,6 @@ struct list_head *lh, *lh_sf; sdbg_host = kzalloc(sizeof(*sdbg_host),GFP_KERNEL); - if (NULL == sdbg_host) { printk(KERN_ERR "%s: out of memory at line %d\n", __FUNCTION__, __LINE__); ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-04 5:38 ` Douglas Gilbert @ 2007-01-04 11:21 ` Jens Axboe 2007-01-04 15:10 ` James Bottomley 0 siblings, 1 reply; 9+ messages in thread From: Jens Axboe @ 2007-01-04 11:21 UTC (permalink / raw) To: Douglas Gilbert; +Cc: linux-scsi On Thu, Jan 04 2007, Douglas Gilbert wrote: > Jens Axboe wrote: > > Hi Doug, > > > > resp_inquiry() does a GFP_KERNEL memory allocation, but it's not allowed > > to from the queuecommand context. There's no good way to return this > > error, so I used DID_ERROR which is used from similar paths. That > > doesn't seem quite right though, it would be better to return an error > > indicating a later retry would be more appropriate. > > > > Signed-off-by: Jens Axboe <jens.axboe@oracle.com> > > > > diff --git a/drivers/scsi/scsi_debug.c b/drivers/scsi/scsi_debug.c > > index 30ee3d7..0c80ed3 100644 > > --- a/drivers/scsi/scsi_debug.c > > +++ b/drivers/scsi/scsi_debug.c > > @@ -954,7 +954,9 @@ static int resp_inquiry(struct scsi_cmnd * scp, int target, > > int alloc_len, n, ret; > > > > alloc_len = (cmd[3] << 8) + cmd[4]; > > - arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_KERNEL); > > + arr = kzalloc(SDEBUG_MAX_INQ_ARR_SZ, GFP_ATOMIC); > > + if (!arr) > > + return DID_ERROR << 16; > > if (devip->wlun) > > pq_pdt = 0x1e; /* present, wlun */ > > else if (scsi_debug_no_lun_0 && (0 == devip->lun)) > > > > Jens, > I had to read that twice. I'm always happy to convert a > GFP_KERNEL to a GFP_ATOMIC (as I'm sure it started as a > GFP_ATOMIC). There are a couple more that may be in > queuecommand context. > > Taking up your point about retries and seeing that the > scsi_debug driver has a SAS flavour, I'm inclined towards > a "aborted command, initiator response timeout" [Bh,4Bh,6] > CHECK CONDITION. There is a group of transport injected > error messages in SAS (see sas2r07.pdf section 10.2.3) > that pop up from time to time. Due to conjestion in > connection-switched SAS expanders these error messages > should be interpreted as "try again" depending on the > topology. The patch below adds a "aborted_command" bit > in opts that will cause every nth command to be aborted > (with an ack/nak timeout). > > Note that SAS has an optional "transport layer retries" > state machine to lessen the incidence of "aborted commands". > Evidently SAS tape drives use the facility. I guess it's fully up to you how you want to solve it. The scheme seems a little elaborate, but these error conditions are unlikely to ever been seen in the wild, so no objections from me. I'd suggest sending this in for 2.6.20. -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-04 11:21 ` Jens Axboe @ 2007-01-04 15:10 ` James Bottomley 2007-01-04 15:27 ` Matthew Wilcox 2007-01-04 15:50 ` Jens Axboe 0 siblings, 2 replies; 9+ messages in thread From: James Bottomley @ 2007-01-04 15:10 UTC (permalink / raw) To: Jens Axboe; +Cc: Douglas Gilbert, linux-scsi On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote: > I guess it's fully up to you how you want to solve it. The scheme seems > a little elaborate, but these error conditions are unlikely to ever been > seen in the wild, so no objections from me. Actually, there's already a DID_ code that does what you want. Instead of DID_ERROR, which will retry immediately, there's DID_REQUEUE which will halt the device queue and wait for a returning command to retry. James ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-04 15:10 ` James Bottomley @ 2007-01-04 15:27 ` Matthew Wilcox 2007-01-04 15:34 ` James Bottomley 2007-01-04 15:50 ` Jens Axboe 1 sibling, 1 reply; 9+ messages in thread From: Matthew Wilcox @ 2007-01-04 15:27 UTC (permalink / raw) To: James Bottomley; +Cc: Jens Axboe, Douglas Gilbert, linux-scsi On Thu, Jan 04, 2007 at 09:10:01AM -0600, James Bottomley wrote: > On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote: > > I guess it's fully up to you how you want to solve it. The scheme seems > > a little elaborate, but these error conditions are unlikely to ever been > > seen in the wild, so no objections from me. > > Actually, there's already a DID_ code that does what you want. Instead > of DID_ERROR, which will retry immediately, there's DID_REQUEUE which > will halt the device queue and wait for a returning command to retry. Any returning command, or a returning command from this device? We may be out of memory from a different device, and this may be the only command being sent to the scsi_debug driver. Wouldn't we end up halting the device queue indefinitely then? ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-04 15:27 ` Matthew Wilcox @ 2007-01-04 15:34 ` James Bottomley 0 siblings, 0 replies; 9+ messages in thread From: James Bottomley @ 2007-01-04 15:34 UTC (permalink / raw) To: Matthew Wilcox; +Cc: Jens Axboe, Douglas Gilbert, linux-scsi On Thu, 2007-01-04 at 08:27 -0700, Matthew Wilcox wrote: > Any returning command, or a returning command from this device? This device > We may > be out of memory from a different device, and this may be the only > command being sent to the scsi_debug driver. Wouldn't we end up > halting the device queue indefinitely then? it's basically the SCSI_MLQUEUE_DEVICE_BUSY condition ... and yes, it has a mechanism for no outstanding commands. James ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-04 15:10 ` James Bottomley 2007-01-04 15:27 ` Matthew Wilcox @ 2007-01-04 15:50 ` Jens Axboe 2007-01-05 5:30 ` Douglas Gilbert 1 sibling, 1 reply; 9+ messages in thread From: Jens Axboe @ 2007-01-04 15:50 UTC (permalink / raw) To: James Bottomley; +Cc: Douglas Gilbert, linux-scsi On Thu, Jan 04 2007, James Bottomley wrote: > On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote: > > I guess it's fully up to you how you want to solve it. The scheme seems > > a little elaborate, but these error conditions are unlikely to ever been > > seen in the wild, so no objections from me. > > Actually, there's already a DID_ code that does what you want. Instead > of DID_ERROR, which will retry immediately, there's DID_REQUEUE which > will halt the device queue and wait for a returning command to retry. As long as it keeps firing the queue at some intervals even without any commands pending at all, then that'll work just fine. I like that approach a lot better than coding the error into some sense value that is (at best) some vague approximation of what has happened (calling memory shortage a transport error is a bit of a stretch). -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-04 15:50 ` Jens Axboe @ 2007-01-05 5:30 ` Douglas Gilbert 2007-01-05 12:49 ` Jens Axboe 0 siblings, 1 reply; 9+ messages in thread From: Douglas Gilbert @ 2007-01-05 5:30 UTC (permalink / raw) To: Jens Axboe; +Cc: James Bottomley, linux-scsi Jens Axboe wrote: > On Thu, Jan 04 2007, James Bottomley wrote: >> On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote: >>> I guess it's fully up to you how you want to solve it. The scheme seems >>> a little elaborate, but these error conditions are unlikely to ever been >>> seen in the wild, so no objections from me. >> Actually, there's already a DID_ code that does what you want. Instead >> of DID_ERROR, which will retry immediately, there's DID_REQUEUE which >> will halt the device queue and wait for a returning command to retry. > > As long as it keeps firing the queue at some intervals even without any > commands pending at all, then that'll work just fine. I like that > approach a lot better than coding the error into some sense value that > is (at best) some vague approximation of what has happened (calling > memory shortage a transport error is a bit of a stretch). True, but both happen. The scsi_debug driver is a virtual host, virtual target and a lu (ram disk). The failure that you pointed out stopped a response being built. In the real world that would in the target or lu. The reason that I mentioned aborted_command sense key is that it is also a "out of resources" (bandwidth) error and it broke sg_dd. I would bet money that it would also break the block layer/sd. The block layer should leave it alone as it is simply a matter of sd retrying a few times. However the st driver could have a real problem (e.g. did that state changing command work, fail or partially work??). Anyway, I have submitted a patch that reports DID_REQUEUE for an allocation failures and adds the ability to inject aborted_command errors. Doug Gilbert ^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [PATCH] scsi_debug: illegal blocking memory allocation 2007-01-05 5:30 ` Douglas Gilbert @ 2007-01-05 12:49 ` Jens Axboe 0 siblings, 0 replies; 9+ messages in thread From: Jens Axboe @ 2007-01-05 12:49 UTC (permalink / raw) To: Douglas Gilbert; +Cc: James Bottomley, linux-scsi On Fri, Jan 05 2007, Douglas Gilbert wrote: > Jens Axboe wrote: > > On Thu, Jan 04 2007, James Bottomley wrote: > >> On Thu, 2007-01-04 at 12:21 +0100, Jens Axboe wrote: > >>> I guess it's fully up to you how you want to solve it. The scheme seems > >>> a little elaborate, but these error conditions are unlikely to ever been > >>> seen in the wild, so no objections from me. > >> Actually, there's already a DID_ code that does what you want. Instead > >> of DID_ERROR, which will retry immediately, there's DID_REQUEUE which > >> will halt the device queue and wait for a returning command to retry. > > > > As long as it keeps firing the queue at some intervals even without any > > commands pending at all, then that'll work just fine. I like that > > approach a lot better than coding the error into some sense value that > > is (at best) some vague approximation of what has happened (calling > > memory shortage a transport error is a bit of a stretch). > > True, but both happen. The scsi_debug driver is a > virtual host, virtual target and a lu (ram disk). > The failure that you pointed out stopped a response > being built. In the real world that would in the > target or lu. The point is that the condition need not be exposed, even if it could be compared to (say) a device busy condition. Both should just results in a requeue and later retry, when resources are available to satisfy the request. Your revised patch looks good to me! -- Jens Axboe ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2007-01-05 12:48 UTC | newest] Thread overview: 9+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2007-01-03 13:49 [PATCH] scsi_debug: illegal blocking memory allocation Jens Axboe 2007-01-04 5:38 ` Douglas Gilbert 2007-01-04 11:21 ` Jens Axboe 2007-01-04 15:10 ` James Bottomley 2007-01-04 15:27 ` Matthew Wilcox 2007-01-04 15:34 ` James Bottomley 2007-01-04 15:50 ` Jens Axboe 2007-01-05 5:30 ` Douglas Gilbert 2007-01-05 12:49 ` Jens Axboe
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox