* Recent removal of bsg read/write support @ 2018-09-02 11:35 Dror Levin 2018-09-02 11:44 ` Richard Weinberger 0 siblings, 1 reply; 13+ messages in thread From: Dror Levin @ 2018-09-02 11:35 UTC (permalink / raw) To: linux-kernel Note: I am not subscribed to LKML so please CC replies to this email. Hi, We have an internal tool that uses the bsg read/write interface to issue SCSI commands as part of a test suite for a storage device. After recently reading on LWN that this interface is to be removed we tried porting our code to use sg instead. However, that raises new issues - mainly getting ENOMEM over iSCSI for unknown reasons. Because of this we would like to continue using the bsg interface, even if some changes are required to meet security concerns. Is there any chance for this removal to be reverted? I saw it was already included in 4.19-rc1. Regards, Dror ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-02 11:35 Recent removal of bsg read/write support Dror Levin @ 2018-09-02 11:44 ` Richard Weinberger 2018-09-02 17:55 ` Linus Torvalds 2018-09-02 19:16 ` Douglas Gilbert 0 siblings, 2 replies; 13+ messages in thread From: Richard Weinberger @ 2018-09-02 11:44 UTC (permalink / raw) To: drorl Cc: LKML, linux-block, linux-scsi, Linus Torvalds, Christoph Hellwig, Jens Axboe CC'ing relevant people. Otherwise your mail might get lost. On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: > > Note: I am not subscribed to LKML so please CC replies to this email. > > Hi, > > We have an internal tool that uses the bsg read/write interface to > issue SCSI commands as part of a test suite for a storage device. > > After recently reading on LWN that this interface is to be removed we > tried porting our code to use sg instead. However, that raises new > issues - mainly getting ENOMEM over iSCSI for unknown reasons. > > Because of this we would like to continue using the bsg interface, > even if some changes are required to meet security concerns. > > Is there any chance for this removal to be reverted? I saw it was > already included in 4.19-rc1. > > Regards, > Dror -- Thanks, //richard ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-02 11:44 ` Richard Weinberger @ 2018-09-02 17:55 ` Linus Torvalds 2018-09-03 8:34 ` Dror Levin 2018-09-02 19:16 ` Douglas Gilbert 1 sibling, 1 reply; 13+ messages in thread From: Linus Torvalds @ 2018-09-02 17:55 UTC (permalink / raw) To: richard -rw- weinberger Cc: drorl, Linux Kernel Mailing List, linux-block, Linux SCSI List, Christoph Hellwig, Jens Axboe On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger <richard.weinberger@gmail.com> wrote: > > CC'ing relevant people. Otherwise your mail might get lost. Indeed. > On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: > > > > We have an internal tool that uses the bsg read/write interface to > > issue SCSI commands as part of a test suite for a storage device. > > > > After recently reading on LWN that this interface is to be removed we > > tried porting our code to use sg instead. However, that raises new > > issues - mainly getting ENOMEM over iSCSI for unknown reasons. Is there any chance that you can make more data available? I'd rather fix the sg interface (which while also broken garbage, we can't get rid of) than re-surrect the bsg interface. That said, the removed bsg code looks a hell of a lot prettier than the nasty sg interface code does, although it also lacks ansolutely _any_ kind of security checking. > > Because of this we would like to continue using the bsg interface, > > even if some changes are required to meet security concerns. I wonder if we could at least try to unify the bsg/sg code - possibly by making sg use the prettier bsg code (but definitely have to add all the security measures). And dammit, the SCSI people need to get their heads out of their arses. This whole "stream random commands over read/write" needs to go the f*ck away. Could we perhaps extend the SG_IO interace to have an async mode? Instead of "read/write", have "SG_IOSUBMIT" and "SG_IORECEIVE" and have the SG_IO ioctl just be a shorthand of "both". And have a model that actually checks that "yeah, this command is ok for a normal user", so that it's actually useful for things like CD burning etc, without making it a huge gaping security hole that is only useful for crazy special cases? Maybe that - otherwise reasonably pretty - block/bsg.c code could be joined with the SG_IO code that actually understands security issues? Linus ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-02 17:55 ` Linus Torvalds @ 2018-09-03 8:34 ` Dror Levin 2018-09-04 4:10 ` Douglas Gilbert ` (2 more replies) 0 siblings, 3 replies; 13+ messages in thread From: Dror Levin @ 2018-09-03 8:34 UTC (permalink / raw) To: torvalds Cc: richard.weinberger, linux-kernel, linux-block, linux-scsi, hch, axboe On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > > On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger > <richard.weinberger@gmail.com> wrote: > > > > CC'ing relevant people. Otherwise your mail might get lost. > > Indeed. Sorry for that. > > On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: > > > > > > We have an internal tool that uses the bsg read/write interface to > > > issue SCSI commands as part of a test suite for a storage device. > > > > > > After recently reading on LWN that this interface is to be removed we > > > tried porting our code to use sg instead. However, that raises new > > > issues - mainly getting ENOMEM over iSCSI for unknown reasons. > > Is there any chance that you can make more data available? Sure, I can try. We use writev() to send up to SG_MAX_QUEUE tasks at a time. Occasionally not all tasks are written at which point we wait for tasks to return before sending more, but then writev() fails with ENOMEM and we see this in the syslog: Sep 1 20:58:14 gdc-qa-io-017 kernel: sd 441:0:0:5: [sg73] sg_common_write: start_req err=-12 Failing tasks are reads of 128KiB. > I'd rather fix the sg interface (which while also broken garbage, we > can't get rid of) than re-surrect the bsg interface. > > That said, the removed bsg code looks a hell of a lot prettier than > the nasty sg interface code does, although it also lacks ansolutely > _any_ kind of security checking. For us the bsg interface also has several advantages over sg: 1. The device name is its HCTL which is nicer than an arbitrary integer. 2. write() supports writing more than one sg_io_v4 struct so we don't have to resort to writev(). 3. Queue size is the device's queue depth and not SG_MAX_QUEUE which is 16. > > > Because of this we would like to continue using the bsg interface, > > > even if some changes are required to meet security concerns. > > I wonder if we could at least try to unify the bsg/sg code - possibly > by making sg use the prettier bsg code (but definitely have to add all > the security measures). > > And dammit, the SCSI people need to get their heads out of their > arses. This whole "stream random commands over read/write" needs to go > the f*ck away. > > Could we perhaps extend the SG_IO interace to have an async mode? > Instead of "read/write", have "SG_IOSUBMIT" and "SG_IORECEIVE" and > have the SG_IO ioctl just be a shorthand of "both". Just my two cents - having an interface other than read/write won't allow users to treat this fd as a regular file with epoll() and read(). This is a major bonus for this interface - an sg/bsg device can be used just like a socket or pipe in any reactor (we use boost asio for example). Thanks, Dror ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-03 8:34 ` Dror Levin @ 2018-09-04 4:10 ` Douglas Gilbert 2018-10-04 6:58 ` Dror Levin 2019-02-01 17:44 ` Douglas Gilbert 2 siblings, 0 replies; 13+ messages in thread From: Douglas Gilbert @ 2018-09-04 4:10 UTC (permalink / raw) To: Dror Levin, torvalds Cc: richard.weinberger, linux-kernel, linux-block, linux-scsi, hch, axboe On 2018-09-03 10:34 AM, Dror Levin wrote: > On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger >> <richard.weinberger@gmail.com> wrote: >>> >>> CC'ing relevant people. Otherwise your mail might get lost. >> >> Indeed. > > Sorry for that. > >>> On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: >>>> >>>> We have an internal tool that uses the bsg read/write interface to >>>> issue SCSI commands as part of a test suite for a storage device. >>>> >>>> After recently reading on LWN that this interface is to be removed we >>>> tried porting our code to use sg instead. However, that raises new >>>> issues - mainly getting ENOMEM over iSCSI for unknown reasons. >> >> Is there any chance that you can make more data available? > > Sure, I can try. > > We use writev() to send up to SG_MAX_QUEUE tasks at a time. Occasionally not > all tasks are written at which point we wait for tasks to return before > sending more, but then writev() fails with ENOMEM and we see this in the syslog: > > Sep 1 20:58:14 gdc-qa-io-017 kernel: sd 441:0:0:5: [sg73] > sg_common_write: start_req err=-12 > > Failing tasks are reads of 128KiB. > >> I'd rather fix the sg interface (which while also broken garbage, we >> can't get rid of) than re-surrect the bsg interface. >> >> That said, the removed bsg code looks a hell of a lot prettier than >> the nasty sg interface code does, although it also lacks ansolutely >> _any_ kind of security checking. > > For us the bsg interface also has several advantages over sg: > 1. The device name is its HCTL which is nicer than an arbitrary integer. > 2. write() supports writing more than one sg_io_v4 struct so we don't have > to resort to writev(). > 3. Queue size is the device's queue depth and not SG_MAX_QUEUE which is 16. > >>>> Because of this we would like to continue using the bsg interface, >>>> even if some changes are required to meet security concerns. >> >> I wonder if we could at least try to unify the bsg/sg code - possibly >> by making sg use the prettier bsg code (but definitely have to add all >> the security measures). >> >> And dammit, the SCSI people need to get their heads out of their >> arses. This whole "stream random commands over read/write" needs to go >> the f*ck away. >> >> Could we perhaps extend the SG_IO interace to have an async mode? >> Instead of "read/write", have "SG_IOSUBMIT" and "SG_IORECEIVE" and >> have the SG_IO ioctl just be a shorthand of "both". > > Just my two cents - having an interface other than read/write won't allow > users to treat this fd as a regular file with epoll() and read(). This is > a major bonus for this interface - an sg/bsg device can be used just like > a socket or pipe in any reactor (we use boost asio for example). The advantage of having two ioctls is that they can both pass (meta-)data bidirectionally. That is hard to do with standard read() and write() calls. The command tag is the piece if meta-data that goes against the flow: returned from SG_IOSUBMIT, optionally given to SG_IORECEIVE (which might have a 'cancel command' flag). The sg v1, v2 and v3 interfaces could keep their write()/read() interfaces for backward compatibility (to Linux 1.0.0, March 1994 for sg v1). New, clean submit and receive paths could be added to the sg driver for the v3 and v4 twin ioctl interface. Previously the sg v4 interface was only supported by the bsg driver. One advantage of sg v4 over v3 is support for bidi commands. Not sure if epoll/poll works with an ioctl, if not we could add a "dummy" read() call that notionally returned SCSI status. The SG_IORECEIVE ioctl would still be needed to "clean up" the command, and optionally transfer the data-in buffer. Tony Battersby has also requested twin ioctls saying that it is extremely tedious ploughing through logs full of SG_IO calls and that clearly separating submits from receives would make things somewhat better. Doug Gilbert ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-03 8:34 ` Dror Levin 2018-09-04 4:10 ` Douglas Gilbert @ 2018-10-04 6:58 ` Dror Levin 2018-10-05 22:35 ` Greg KH 2019-02-01 17:44 ` Douglas Gilbert 2 siblings, 1 reply; 13+ messages in thread From: Dror Levin @ 2018-10-04 6:58 UTC (permalink / raw) To: torvalds Cc: Richard Weinberger, linux-kernel, linux-block, linux-scsi, hch, axboe, gregkh CC'ing Greg. On Mon, Sep 3, 2018 at 11:34 AM Dror Levin <drorl@infinidat.com> wrote: > > On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds > <torvalds@linux-foundation.org> wrote: > > > > On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger > > <richard.weinberger@gmail.com> wrote: > > > > > > CC'ing relevant people. Otherwise your mail might get lost. > > > > Indeed. > > Sorry for that. > > > > On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: > > > > > > > > We have an internal tool that uses the bsg read/write interface to > > > > issue SCSI commands as part of a test suite for a storage device. > > > > > > > > After recently reading on LWN that this interface is to be removed we > > > > tried porting our code to use sg instead. However, that raises new > > > > issues - mainly getting ENOMEM over iSCSI for unknown reasons. > > > > Is there any chance that you can make more data available? > > Sure, I can try. > > We use writev() to send up to SG_MAX_QUEUE tasks at a time. Occasionally not > all tasks are written at which point we wait for tasks to return before > sending more, but then writev() fails with ENOMEM and we see this in the syslog: > > Sep 1 20:58:14 gdc-qa-io-017 kernel: sd 441:0:0:5: [sg73] > sg_common_write: start_req err=-12 > > Failing tasks are reads of 128KiB. > > > I'd rather fix the sg interface (which while also broken garbage, we > > can't get rid of) than re-surrect the bsg interface. Discussion seems to have died down but release of 4.19 is drawing near. Is there still any chance removal of bsg can be reconsidered? Maybe postponed to the next version to allow more time to adjust? I'm especially concerned about the possibility of this being backported to stable kernels which might leave us very little time to fix our code. Thanks, Dror ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-10-04 6:58 ` Dror Levin @ 2018-10-05 22:35 ` Greg KH 2018-10-05 23:27 ` Douglas Gilbert 0 siblings, 1 reply; 13+ messages in thread From: Greg KH @ 2018-10-05 22:35 UTC (permalink / raw) To: Dror Levin Cc: torvalds, Richard Weinberger, linux-kernel, linux-block, linux-scsi, hch, axboe On Thu, Oct 04, 2018 at 09:58:37AM +0300, Dror Levin wrote: > CC'ing Greg. > > On Mon, Sep 3, 2018 at 11:34 AM Dror Levin <drorl@infinidat.com> wrote: > > > > On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds > > <torvalds@linux-foundation.org> wrote: > > > > > > On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger > > > <richard.weinberger@gmail.com> wrote: > > > > > > > > CC'ing relevant people. Otherwise your mail might get lost. > > > > > > Indeed. > > > > Sorry for that. > > > > > > On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: > > > > > > > > > > We have an internal tool that uses the bsg read/write interface to > > > > > issue SCSI commands as part of a test suite for a storage device. > > > > > > > > > > After recently reading on LWN that this interface is to be removed we > > > > > tried porting our code to use sg instead. However, that raises new > > > > > issues - mainly getting ENOMEM over iSCSI for unknown reasons. > > > > > > Is there any chance that you can make more data available? > > > > Sure, I can try. > > > > We use writev() to send up to SG_MAX_QUEUE tasks at a time. Occasionally not > > all tasks are written at which point we wait for tasks to return before > > sending more, but then writev() fails with ENOMEM and we see this in the syslog: > > > > Sep 1 20:58:14 gdc-qa-io-017 kernel: sd 441:0:0:5: [sg73] > > sg_common_write: start_req err=-12 > > > > Failing tasks are reads of 128KiB. > > > > > I'd rather fix the sg interface (which while also broken garbage, we > > > can't get rid of) than re-surrect the bsg interface. > > Discussion seems to have died down but release of 4.19 is drawing near. > > Is there still any chance removal of bsg can be reconsidered? Maybe > postponed to the > next version to allow more time to adjust? > > I'm especially concerned about the possibility of this being > backported to stable kernels > which might leave us very little time to fix our code. What is being backported to what stable kernels and why? Is there sg patches? totally confused, greg k-h ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-10-05 22:35 ` Greg KH @ 2018-10-05 23:27 ` Douglas Gilbert 0 siblings, 0 replies; 13+ messages in thread From: Douglas Gilbert @ 2018-10-05 23:27 UTC (permalink / raw) To: Greg KH, Dror Levin Cc: torvalds, Richard Weinberger, linux-kernel, linux-block, linux-scsi, hch, axboe On 2018-10-05 06:35 PM, Greg KH wrote: > On Thu, Oct 04, 2018 at 09:58:37AM +0300, Dror Levin wrote: >> CC'ing Greg. >> >> On Mon, Sep 3, 2018 at 11:34 AM Dror Levin <drorl@infinidat.com> wrote: >>> >>> On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds >>> <torvalds@linux-foundation.org> wrote: >>>> >>>> On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger >>>> <richard.weinberger@gmail.com> wrote: >>>>> >>>>> CC'ing relevant people. Otherwise your mail might get lost. >>>> >>>> Indeed. >>> >>> Sorry for that. >>> >>>>> On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: >>>>>> >>>>>> We have an internal tool that uses the bsg read/write interface to >>>>>> issue SCSI commands as part of a test suite for a storage device. >>>>>> >>>>>> After recently reading on LWN that this interface is to be removed we >>>>>> tried porting our code to use sg instead. However, that raises new >>>>>> issues - mainly getting ENOMEM over iSCSI for unknown reasons. >>>> >>>> Is there any chance that you can make more data available? >>> >>> Sure, I can try. >>> >>> We use writev() to send up to SG_MAX_QUEUE tasks at a time. Occasionally not >>> all tasks are written at which point we wait for tasks to return before >>> sending more, but then writev() fails with ENOMEM and we see this in the syslog: >>> >>> Sep 1 20:58:14 gdc-qa-io-017 kernel: sd 441:0:0:5: [sg73] >>> sg_common_write: start_req err=-12 >>> >>> Failing tasks are reads of 128KiB. >>> >>>> I'd rather fix the sg interface (which while also broken garbage, we >>>> can't get rid of) than re-surrect the bsg interface. >> >> Discussion seems to have died down but release of 4.19 is drawing near. >> >> Is there still any chance removal of bsg can be reconsidered? Maybe >> postponed to the >> next version to allow more time to adjust? >> >> I'm especially concerned about the possibility of this being >> backported to stable kernels >> which might leave us very little time to fix our code. > > What is being backported to what stable kernels and why? What has been removed from the bsg driver is the ability to use write() to issue a SCSI command (and return before that command's response has arrived) and to use read() to wait for the issued command to complete and return that command's status and optionally its sense buffer. Basically async SCSI command capability has been removed from the bsg driver. The ioctl(SG_IO) for bsg device stays in place. The reason is that the bsg driver has no active maintainer and its rather serious design flaw: when a async SCSI command is issued (by a write()) any process using any file descriptor (opened on that device) may receive its response. Actually if you tried hard with bsg, the synchronous SG_IO ioctl will also exhibit this behaviour. > Is there sg patches? Well I'm working on a patch set but it won't be ready for lk 4.19 . Amongst other things it will remove the SG_MAX_QUEUE (16) request limitation per file descriptor that Dror Levin is noting as a hardship when re-porting their application from the bsg to sg driver. Also Linus Torvalds doesn't like the write()/read() async interface used by the sg driver and mimicked by the bsg driver. It has been in place in the sg driver since 1992 (see lk 1.0) when it was the only user space interface to the sg driver. Linus proposed replacing them by two new ioctls: SG_IOSUBMIT and SG_IORECEIVE ***. I will work on a second patch set for the sg driver to do exactly that (plus SG_IOABORT to abort a request inflight). > totally confused, Hope this helps. Doug Gilbert *** See Linux Torvalds' post to this thread on 20180902 ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-03 8:34 ` Dror Levin 2018-09-04 4:10 ` Douglas Gilbert 2018-10-04 6:58 ` Dror Levin @ 2019-02-01 17:44 ` Douglas Gilbert 2 siblings, 0 replies; 13+ messages in thread From: Douglas Gilbert @ 2019-02-01 17:44 UTC (permalink / raw) To: Dror Levin, torvalds Cc: richard.weinberger, linux-kernel, linux-block, linux-scsi, hch, axboe Updated reply, see below. On 2018-09-03 4:34 a.m., Dror Levin wrote: > On Sun, Sep 2, 2018 at 8:55 PM Linus Torvalds > <torvalds@linux-foundation.org> wrote: >> >> On Sun, Sep 2, 2018 at 4:44 AM Richard Weinberger >> <richard.weinberger@gmail.com> wrote: >>> >>> CC'ing relevant people. Otherwise your mail might get lost. >> >> Indeed. > > Sorry for that. > >>> On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: >>>> >>>> We have an internal tool that uses the bsg read/write interface to >>>> issue SCSI commands as part of a test suite for a storage device. >>>> >>>> After recently reading on LWN that this interface is to be removed we >>>> tried porting our code to use sg instead. However, that raises new >>>> issues - mainly getting ENOMEM over iSCSI for unknown reasons. >> >> Is there any chance that you can make more data available? > > Sure, I can try. > > We use writev() to send up to SG_MAX_QUEUE tasks at a time. Occasionally not > all tasks are written at which point we wait for tasks to return before > sending more, but then writev() fails with ENOMEM and we see this in the syslog: > > Sep 1 20:58:14 gdc-qa-io-017 kernel: sd 441:0:0:5: [sg73] > sg_common_write: start_req err=-12 > > Failing tasks are reads of 128KiB. This is the block layer running out of resources. The sg driver is a relatively thin shim and when it gets a "no can do" from the layers below it, the driver has little option than to return said errno. >> I'd rather fix the sg interface (which while also broken garbage, we >> can't get rid of) than re-surrect the bsg interface. >> >> That said, the removed bsg code looks a hell of a lot prettier than >> the nasty sg interface code does, although it also lacks ansolutely >> _any_ kind of security checking. > > For us the bsg interface also has several advantages over sg: > 1. The device name is its HCTL which is nicer than an arbitrary integer. Not much the sg driver can do about that. The minor number the sg driver uses and HCT are all arbitrary integers (with the L coming from the storage device), but I agree the HCTL is more widely used. The ioctl(<sg_fd>, SG_GET_SCSI_ID) fills a structure which includes HCTL. In my sg v4 driver rewrite the L (LUN) has been tweaked to additionally send back the 8 byte T10 LUN representation. The lsscsi utility will show the relationship between HCTL and sg driver device name with 'lsscsi -g'. It uses sysfs datamining. > 2. write() supports writing more than one sg_io_v4 struct so we don't have > to resort to writev(). In my sg v4 rewrite the sg_io_v4 interface can only be sent through ioctl(SG_IO) [for sync usage] and ioctl(SG_IOSUBMIT) [for async usage]. So it can't be sent through write(2). SG_IOSUBMIT is new and uses the _IOWR macro which encodes the expected length into the SG_IOSUBMIT value and that is the size of sg_io_v4. So you can't send an arbitrary number of sg_io_v4 objects through that ioctl directly. If need be, that can be cured with another level of indirection (e.g. with a new flag the data-out can be interpreted as an array sg_io_v4 objects). > 3. Queue size is the device's queue depth and not SG_MAX_QUEUE which is 16. That limit is gone in the sg v4 driver rewrite. >>>> Because of this we would like to continue using the bsg interface, >>>> even if some changes are required to meet security concerns. >> >> I wonder if we could at least try to unify the bsg/sg code - possibly >> by making sg use the prettier bsg code (but definitely have to add all >> the security measures). >> >> And dammit, the SCSI people need to get their heads out of their >> arses. This whole "stream random commands over read/write" needs to go >> the f*ck away. >> >> Could we perhaps extend the SG_IO interace to have an async mode? >> Instead of "read/write", have "SG_IOSUBMIT" and "SG_IORECEIVE" and >> have the SG_IO ioctl just be a shorthand of "both". Done. > Just my two cents - having an interface other than read/write won't allow > users to treat this fd as a regular file with epoll() and read(). This is > a major bonus for this interface - an sg/bsg device can be used just like > a socket or pipe in any reactor (we use boost asio for example). Well poll() certainly works (see sg3_utils beta rev 809 testing/sgs_dd.c and testing/sgh_dd.c) and I can't see why epoll() won't work. These calls work against the file descriptor and the sg driver keeps the same context around sg device file descriptors as it has always done. [And that is the major design flaw in the bsg driver: it doesn't keep proper file descriptor context.] It is the security folks who don't like the sg inspired (there in lk 1.0.0 from 1992) write(2)/read(2) asynchronous interface. Also, ideally we need two streams: one for metadata (e.g. commands and responses (status and sense data)) and another for user data. Protection information could be a third stream, between the other two. Jamming that all into one stream is a bit ugly. References: sg v3 driver rewrite, description and downloads: http://sg.danny.cz/sg/sg_v40.html sg3_utils version 1.45 beta, rev 809, link at the top of this page: http://sg.danny.cz/sg Doug Gilbert ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-02 11:44 ` Richard Weinberger 2018-09-02 17:55 ` Linus Torvalds @ 2018-09-02 19:16 ` Douglas Gilbert 2018-09-03 12:28 ` Michal Hocko 1 sibling, 1 reply; 13+ messages in thread From: Douglas Gilbert @ 2018-09-02 19:16 UTC (permalink / raw) To: Richard Weinberger, drorl Cc: LKML, linux-block, linux-scsi, Linus Torvalds, Christoph Hellwig, Jens Axboe On 2018-09-02 01:44 PM, Richard Weinberger wrote: > CC'ing relevant people. Otherwise your mail might get lost. > > On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: >> >> Note: I am not subscribed to LKML so please CC replies to this email. >> >> Hi, >> >> We have an internal tool that uses the bsg read/write interface to >> issue SCSI commands as part of a test suite for a storage device. >> >> After recently reading on LWN that this interface is to be removed we >> tried porting our code to use sg instead. However, that raises new >> issues - mainly getting ENOMEM over iSCSI for unknown reasons. >> >> Because of this we would like to continue using the bsg interface, >> even if some changes are required to meet security concerns. >> >> Is there any chance for this removal to be reverted? I saw it was >> already included in 4.19-rc1. Hi, Both bsg and sg are relatively thin shims over the same block layer pass-through calls. And neither driver will themselves generate ENOMEM unless the CPU is running low of memory. In my experience, the main reason for unexpected ENOMEMs *** is from blk_rq_map_user_iov() in block/blk_map.c called from both drivers. That is a particular resource shortage rather than memory in general. I do notice the blk_rq_map_user_iov() is/was called with GFP_KERNEL in bsg and GFP_ATOMIC by sg. That suggests when you call write() on a sg device and get ENOMEM, then wait a little (depends on your app) and try again. Could you share a test program that illustrates the problem with us? One way of limiting user space programs that used bsg's write()/read() interface (via the sg v4 interface), from needing a significant rewrite would be to implement the sg_v4 interface in the sg driver. Currently the sg driver supports the sg v1 (sort of), sg v2 and sg v3 (typically used) interfaces. Doug Gilbert *** Prior to around February this year, that block layer resource shortage resulted in a EINVAL being returned to the sg driver. That had perplexed me for some time, as it happened only under heavy testing, typically with the same SCSI command being repeated (e.g. REQUEST SENSE). It was really a ENOMEM error that was being inadvertently overwritten. ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-02 19:16 ` Douglas Gilbert @ 2018-09-03 12:28 ` Michal Hocko 2018-09-04 3:38 ` Douglas Gilbert 0 siblings, 1 reply; 13+ messages in thread From: Michal Hocko @ 2018-09-03 12:28 UTC (permalink / raw) To: Douglas Gilbert Cc: Richard Weinberger, drorl, LKML, linux-block, linux-scsi, Linus Torvalds, Christoph Hellwig, Jens Axboe On Sun 02-09-18 21:16:10, Douglas Gilbert wrote: > On 2018-09-02 01:44 PM, Richard Weinberger wrote: > > CC'ing relevant people. Otherwise your mail might get lost. > > > > On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: > > > > > > Note: I am not subscribed to LKML so please CC replies to this email. > > > > > > Hi, > > > > > > We have an internal tool that uses the bsg read/write interface to > > > issue SCSI commands as part of a test suite for a storage device. > > > > > > After recently reading on LWN that this interface is to be removed we > > > tried porting our code to use sg instead. However, that raises new > > > issues - mainly getting ENOMEM over iSCSI for unknown reasons. > > > > > > Because of this we would like to continue using the bsg interface, > > > even if some changes are required to meet security concerns. > > > > > > Is there any chance for this removal to be reverted? I saw it was > > > already included in 4.19-rc1. > > Hi, > Both bsg and sg are relatively thin shims over the same block layer > pass-through calls. And neither driver will themselves generate ENOMEM > unless the CPU is running low of memory. > > In my experience, the main reason for unexpected ENOMEMs *** is from > blk_rq_map_user_iov() in block/blk_map.c called from both drivers. > That is a particular resource shortage rather than memory in general. > I do notice the blk_rq_map_user_iov() is/was called with GFP_KERNEL > in bsg and GFP_ATOMIC by sg. That suggests when you call write() on > a sg device and get ENOMEM, then wait a little (depends on your app) > and try again. Well, what is the reason to use GFP_ATOMIC in the first place? I am not familiar with the code so I might be easily wrong but sg_start_req which calls blk_rq_map_user_iov resp. blk_rq_map_user with GFP_ATOMIC uses mutex. It is a conditional usage so the sleeping context might depend on the caller. But I guess it would be better to double check. It looks suspicious to me. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-03 12:28 ` Michal Hocko @ 2018-09-04 3:38 ` Douglas Gilbert 2018-09-04 6:59 ` Michal Hocko 0 siblings, 1 reply; 13+ messages in thread From: Douglas Gilbert @ 2018-09-04 3:38 UTC (permalink / raw) To: Michal Hocko Cc: Richard Weinberger, drorl, LKML, linux-block, linux-scsi, Linus Torvalds, Christoph Hellwig, Jens Axboe On 2018-09-03 02:28 PM, Michal Hocko wrote: > On Sun 02-09-18 21:16:10, Douglas Gilbert wrote: >> On 2018-09-02 01:44 PM, Richard Weinberger wrote: >>> CC'ing relevant people. Otherwise your mail might get lost. >>> >>> On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: >>>> >>>> Note: I am not subscribed to LKML so please CC replies to this email. >>>> >>>> Hi, >>>> >>>> We have an internal tool that uses the bsg read/write interface to >>>> issue SCSI commands as part of a test suite for a storage device. >>>> >>>> After recently reading on LWN that this interface is to be removed we >>>> tried porting our code to use sg instead. However, that raises new >>>> issues - mainly getting ENOMEM over iSCSI for unknown reasons. >>>> >>>> Because of this we would like to continue using the bsg interface, >>>> even if some changes are required to meet security concerns. >>>> >>>> Is there any chance for this removal to be reverted? I saw it was >>>> already included in 4.19-rc1. >> >> Hi, >> Both bsg and sg are relatively thin shims over the same block layer >> pass-through calls. And neither driver will themselves generate ENOMEM >> unless the CPU is running low of memory. >> >> In my experience, the main reason for unexpected ENOMEMs *** is from >> blk_rq_map_user_iov() in block/blk_map.c called from both drivers. >> That is a particular resource shortage rather than memory in general. >> I do notice the blk_rq_map_user_iov() is/was called with GFP_KERNEL >> in bsg and GFP_ATOMIC by sg. That suggests when you call write() on >> a sg device and get ENOMEM, then wait a little (depends on your app) >> and try again. > > Well, what is the reason to use GFP_ATOMIC in the first place? I am not > familiar with the code so I might be easily wrong but sg_start_req which > calls blk_rq_map_user_iov resp. blk_rq_map_user with GFP_ATOMIC uses > mutex. It is a conditional usage so the sleeping context might depend > on the caller. But I guess it would be better to double check. It looks > suspicious to me. Of the hundreds of 'hacks' on the sg driver over the years, the most common is an expert arguing that GFP_ATOMIC should be changed to GFP_KERNEL. They usually get their way. That is followed around 6 to 9 months later by a sg user complaining about an unexpected broken app. So back it goes to GFP_ATOMIC. Welcome to the merry-go-round. Doug Gilbert ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: Recent removal of bsg read/write support 2018-09-04 3:38 ` Douglas Gilbert @ 2018-09-04 6:59 ` Michal Hocko 0 siblings, 0 replies; 13+ messages in thread From: Michal Hocko @ 2018-09-04 6:59 UTC (permalink / raw) To: Douglas Gilbert Cc: Richard Weinberger, drorl, LKML, linux-block, linux-scsi, Linus Torvalds, Christoph Hellwig, Jens Axboe On Tue 04-09-18 05:38:21, Douglas Gilbert wrote: > On 2018-09-03 02:28 PM, Michal Hocko wrote: > > On Sun 02-09-18 21:16:10, Douglas Gilbert wrote: > > > On 2018-09-02 01:44 PM, Richard Weinberger wrote: > > > > CC'ing relevant people. Otherwise your mail might get lost. > > > > > > > > On Sun, Sep 2, 2018 at 1:37 PM Dror Levin <drorl@infinidat.com> wrote: > > > > > > > > > > Note: I am not subscribed to LKML so please CC replies to this email. > > > > > > > > > > Hi, > > > > > > > > > > We have an internal tool that uses the bsg read/write interface to > > > > > issue SCSI commands as part of a test suite for a storage device. > > > > > > > > > > After recently reading on LWN that this interface is to be removed we > > > > > tried porting our code to use sg instead. However, that raises new > > > > > issues - mainly getting ENOMEM over iSCSI for unknown reasons. > > > > > > > > > > Because of this we would like to continue using the bsg interface, > > > > > even if some changes are required to meet security concerns. > > > > > > > > > > Is there any chance for this removal to be reverted? I saw it was > > > > > already included in 4.19-rc1. > > > > > > Hi, > > > Both bsg and sg are relatively thin shims over the same block layer > > > pass-through calls. And neither driver will themselves generate ENOMEM > > > unless the CPU is running low of memory. > > > > > > In my experience, the main reason for unexpected ENOMEMs *** is from > > > blk_rq_map_user_iov() in block/blk_map.c called from both drivers. > > > That is a particular resource shortage rather than memory in general. > > > I do notice the blk_rq_map_user_iov() is/was called with GFP_KERNEL > > > in bsg and GFP_ATOMIC by sg. That suggests when you call write() on > > > a sg device and get ENOMEM, then wait a little (depends on your app) > > > and try again. > > > > Well, what is the reason to use GFP_ATOMIC in the first place? I am not > > familiar with the code so I might be easily wrong but sg_start_req which > > calls blk_rq_map_user_iov resp. blk_rq_map_user with GFP_ATOMIC uses > > mutex. It is a conditional usage so the sleeping context might depend > > on the caller. But I guess it would be better to double check. It looks > > suspicious to me. > > Of the hundreds of 'hacks' on the sg driver over the years, the most > common is an expert arguing that GFP_ATOMIC should be changed to GFP_KERNEL. > They usually get their way. That is followed around 6 to 9 months later by > a sg user complaining about an unexpected broken app. So back it goes to > GFP_ATOMIC. Then I would strongly recommend to describe the actual reTquirements on the allocation context. Why is GFP_ATOMIC really needed? There are usually two reasons a) the allocation is called from an atomic context b) reclaim is not acceptable or desirable. -- Michal Hocko SUSE Labs ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2019-02-01 17:44 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2018-09-02 11:35 Recent removal of bsg read/write support Dror Levin 2018-09-02 11:44 ` Richard Weinberger 2018-09-02 17:55 ` Linus Torvalds 2018-09-03 8:34 ` Dror Levin 2018-09-04 4:10 ` Douglas Gilbert 2018-10-04 6:58 ` Dror Levin 2018-10-05 22:35 ` Greg KH 2018-10-05 23:27 ` Douglas Gilbert 2019-02-01 17:44 ` Douglas Gilbert 2018-09-02 19:16 ` Douglas Gilbert 2018-09-03 12:28 ` Michal Hocko 2018-09-04 3:38 ` Douglas Gilbert 2018-09-04 6:59 ` Michal Hocko
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox