All of lore.kernel.org
 help / color / mirror / Atom feed
* rounding dxfer_len to 512 bytes boundary in sg.c
@ 2007-11-15 15:38 Igor A. Nesterov
  2007-11-15 21:27 ` Douglas Gilbert
  0 siblings, 1 reply; 3+ messages in thread
From: Igor A. Nesterov @ 2007-11-15 15:38 UTC (permalink / raw)
  To: linux-scsi

dxfer_len value provided to SG_IO ioctl as a part of sg_io_hdr_t
structure is rounded up to the nearest 512 bytes boundary just before
scatterlist allocation in sg_build_indirect()

  /* round request up to next highest SG_SECTOR_SZ byte boundary */
        blk_size = (blk_size + SG_SECTOR_MSK) & (~SG_SECTOR_MSK);

I do not understand why is this. From the memory handling point of view
it seems meaningless because kernel memory for scatterlist is being
allocated in pages anyway. And from the I/O point of view it results
into buggy behavior if dxfer_len is bigger then def_reserved_size and is
not multiple of 512. Roughly, there are two cases handled a little bit
different in sg.c

1) dxfer_len <= def_reserved_size (32k by default)

in this case sg_start_req() uses preallocated reserved buffer, and calls
sg_link_reserve(), which truncates the length of the last scatterlist
element to match original dxfer_len value. Everything goes fine.

2) dxfer_len > def_reserved_size || reserved buffer is in use

in this case sg_start_req() allocates new scatterlist buffer with
sg_build_indirect() and does NOT truncate the last scatterlist element
leaving it rounded up to the nearest 512k boundary. Following IO
operation is screwed up. First, operation is issued to device with a
transfer length different from what meant by SG_IO originator. Sometimes
it does not matter, sometimes it does. Even if IO operation is some sort
of read (like read from a tape) and a device responds correctly,
residual value from scsi mid-layer is calculated based on rounded up IO
transfer length, and hence incorrect. When returned back to SG_IO
originator, it creates a false impression that the actual amount of data
is less then actually read. For example, when reading a 0x8002-bytes
tape block with dxfer_len = 0x8002, the actual IO operation issued to
tape drive has a transfer length 0x8200, and residual length after the
operation is completed is 510. So the SG_IO originator has a full
impression that the actual amount of data read is 0x8002 - 510.

There are different ways to fix this problem. For example, one might
amend residual length value returned to user, or the last scatterlist
element length can be fixed. But the first question is why is that
rounding being done in the first place? I have fixed the problem by
removing it, and nothing looks broken. What do maintainers think?

diff -urp a/drivers/scsi/sg.c b/drivers/scsi/sg.c
--- a/drivers/scsi/sg.c 2007-07-08 19:32:17.000000000 -0400
+++ b/drivers/scsi/sg.c 2007-11-14 14:54:53.000000000 -0500
@@ -1846,8 +1846,7 @@ sg_build_indirect(Sg_scatter_hold * schp
                return -EFAULT;
        if (0 == blk_size)
                ++blk_size;     /* don't know why */
-/* round request up to next highest SG_SECTOR_SZ byte boundary */
-       blk_size = (blk_size + SG_SECTOR_MSK) & (~SG_SECTOR_MSK);
+
        SCSI_LOG_TIMEOUT(4, printk("sg_build_indirect: buff_size=%d,
blk_size=%d\n",
                                   buff_size, blk_size));
 




^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: rounding dxfer_len to 512 bytes boundary in sg.c
  2007-11-15 15:38 rounding dxfer_len to 512 bytes boundary in sg.c Igor A. Nesterov
@ 2007-11-15 21:27 ` Douglas Gilbert
  2007-11-15 21:56   ` Igor A. Nesterov
  0 siblings, 1 reply; 3+ messages in thread
From: Douglas Gilbert @ 2007-11-15 21:27 UTC (permalink / raw)
  To: Igor A. Nesterov; +Cc: SCSI Mailing List

Igor A. Nesterov wrote:
> dxfer_len value provided to SG_IO ioctl as a part of sg_io_hdr_t
> structure is rounded up to the nearest 512 bytes boundary just before
> scatterlist allocation in sg_build_indirect()
> 
>   /* round request up to next highest SG_SECTOR_SZ byte boundary */
>         blk_size = (blk_size + SG_SECTOR_MSK) & (~SG_SECTOR_MSK);
> 
> I do not understand why is this. From the memory handling point of view
> it seems meaningless because kernel memory for scatterlist is being
> allocated in pages anyway. And from the I/O point of view it results
> into buggy behavior if dxfer_len is bigger then def_reserved_size and is
> not multiple of 512. Roughly, there are two cases handled a little bit
> different in sg.c
> 
> 1) dxfer_len <= def_reserved_size (32k by default)
> 
> in this case sg_start_req() uses preallocated reserved buffer, and calls
> sg_link_reserve(), which truncates the length of the last scatterlist
> element to match original dxfer_len value. Everything goes fine.
> 
> 2) dxfer_len > def_reserved_size || reserved buffer is in use
> 
> in this case sg_start_req() allocates new scatterlist buffer with
> sg_build_indirect() and does NOT truncate the last scatterlist element
> leaving it rounded up to the nearest 512k boundary. Following IO
> operation is screwed up. First, operation is issued to device with a
> transfer length different from what meant by SG_IO originator. Sometimes
> it does not matter, sometimes it does. Even if IO operation is some sort
> of read (like read from a tape) and a device responds correctly,
> residual value from scsi mid-layer is calculated based on rounded up IO
> transfer length, and hence incorrect. When returned back to SG_IO
> originator, it creates a false impression that the actual amount of data
> is less then actually read. For example, when reading a 0x8002-bytes
> tape block with dxfer_len = 0x8002, the actual IO operation issued to
> tape drive has a transfer length 0x8200, and residual length after the
> operation is completed is 510. So the SG_IO originator has a full
> impression that the actual amount of data read is 0x8002 - 510.
> 
> There are different ways to fix this problem. For example, one might
> amend residual length value returned to user, or the last scatterlist
> element length can be fixed. But the first question is why is that
> rounding being done in the first place? I have fixed the problem by
> removing it, and nothing looks broken. What do maintainers think?
> 
> diff -urp a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> --- a/drivers/scsi/sg.c 2007-07-08 19:32:17.000000000 -0400
> +++ b/drivers/scsi/sg.c 2007-11-14 14:54:53.000000000 -0500
> @@ -1846,8 +1846,7 @@ sg_build_indirect(Sg_scatter_hold * schp
>                 return -EFAULT;
>         if (0 == blk_size)
>                 ++blk_size;     /* don't know why */
> -/* round request up to next highest SG_SECTOR_SZ byte boundary */
> -       blk_size = (blk_size + SG_SECTOR_MSK) & (~SG_SECTOR_MSK);
> +
>         SCSI_LOG_TIMEOUT(4, printk("sg_build_indirect: buff_size=%d,
> blk_size=%d\n",
>                                    buff_size, blk_size));
>  

That code was there when I inherited the sg driver in 1998
(the original sg drivers dates from around 1993). The reason
is/was that the DMA element of some HBA needed it. I tried
to take it out once (probably in the last millenium) and it
broke somewhere so back in it went :-)

Now the SG_IO ioctl implementation in the block layer
doesn't have that code so perhaps we might assume from
that the HBA is no longer in use or its DMA logic
has been fixed. So perhaps it's time to pull out that
line again and see if anyone screams.

Doug Gilbert



^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: rounding dxfer_len to 512 bytes boundary in sg.c
  2007-11-15 21:27 ` Douglas Gilbert
@ 2007-11-15 21:56   ` Igor A. Nesterov
  0 siblings, 0 replies; 3+ messages in thread
From: Igor A. Nesterov @ 2007-11-15 21:56 UTC (permalink / raw)
  To: dougg; +Cc: SCSI Mailing List

On Thu, 2007-11-15 at 16:27 -0500, Douglas Gilbert wrote:

> That code was there when I inherited the sg driver in 1998
> (the original sg drivers dates from around 1993). The reason
> is/was that the DMA element of some HBA needed it. I tried
> to take it out once (probably in the last millenium) and it
> broke somewhere so back in it went :-)

I would say that if an HBA accepts only rounded values, it should either
round them up itself, or reject non-rounded requests with one or another
status value. Well, in 1993 people might think differently.

> 
> Now the SG_IO ioctl implementation in the block layer
> doesn't have that code so perhaps we might assume from
> that the HBA is no longer in use or its DMA logic
> has been fixed. So perhaps it's time to pull out that
> line again and see if anyone screams.

But the code in sg_link_reserve() which fixes the last scatterlist
element length (and which was most probably added at later times),
nullifies the effect of that rounding anyway. It only has an effect on
requests bigger than def_reserved_size, and there it breaks IO. So I
scream. 



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2007-11-15 21:57 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-15 15:38 rounding dxfer_len to 512 bytes boundary in sg.c Igor A. Nesterov
2007-11-15 21:27 ` Douglas Gilbert
2007-11-15 21:56   ` Igor A. Nesterov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.