SYNCHRONIZE CACHE command from sd on close

public inbox for linux-scsi@vger.kernel.org

* SYNCHRONIZE CACHE command from sd on close
@ 2010-02-15 13:03 Douglas Gilbert
  2010-02-15 13:25 ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Douglas Gilbert @ 2010-02-15 13:03 UTC (permalink / raw)
  To: SCSI development list

Recently, judging from error reports reaching me
from smartmontools, sdparm and sg_start, something
changed in the sd driver associated with the
SYNCHRONIZE CACHE command it issues when a device
is closed.

That only seems to happen when the device is opened
RW and it exposes a nasty difference between the
semantics of spinning up and down ATA disks compared
to SCSI disks.

If you send SYNCHRONIZE CACHE to a SATL then the ATA
disk behind it will be spun up if it happened to be
spun down. Send that command to a SCSI disk and you
will get an error message (sense) indicating that
you need to do START_STOP_UNIT(start) first.

One manifestation of this problem is that:
     sdparm -C stop <ata_disk_via_sd>
doesn't work. Being a SCSI utility it opens the sd
device RW reflecting that the START_STOP_UNIT to
a SCSI disk is potentially state changing (in the
sense that subsequent READs and WRITEs may fail).
But since we have an ATA disk then the SYNCHRONIZE
CACHE on close spins up the disk, defeating the
attempt to spin it down.

Now I'm playing lots of tricks in sdparm to get
around this but I think the correct solution is
for the sd driver to only send the SYNCHRONIZE
CACHE command to a device on close if something
has been written to it.
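
The proposal above can be modelled in user space as a dirty flag that only a write sets. This is a toy sketch of the idea, not code from the sd driver; the `toy_disk` structure and all `toy_*` functions are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-open state; NOT a structure from the sd driver. */
struct toy_disk {
    bool dirty;             /* set once a write has been issued */
    unsigned flushes_sent;  /* simulated SYNCHRONIZE CACHE commands */
};

static void toy_open(struct toy_disk *d)
{
    assert(d != NULL);
    d->dirty = false;
    d->flushes_sent = 0;
}

/* A WRITE may leave data in the device's volatile cache. */
static void toy_write(struct toy_disk *d)
{
    d->dirty = true;
}

/* A pass-through command such as START STOP UNIT writes no data,
 * so it leaves the dirty flag alone. */
static void toy_passthrough(struct toy_disk *d)
{
    (void)d;
}

/* On close, flush only if something was written.
 * Returns true when a (simulated) SYNCHRONIZE CACHE was sent. */
static bool toy_close(struct toy_disk *d)
{
    if (!d->dirty)
        return false;
    d->flushes_sent++;
    d->dirty = false;
    return true;
}
```

Under this model, an open / pass-through / close sequence sends no flush, so a spun-down ATA disk behind a SATL would be left alone, while an open / write / close sequence still flushes.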

Doug Gilbert


* Re: SYNCHRONIZE CACHE command from sd on close
  2010-02-15 13:03 SYNCHRONIZE CACHE command from sd on close Douglas Gilbert
@ 2010-02-15 13:25 ` Christoph Hellwig
  2010-02-15 13:51   ` Douglas Gilbert
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2010-02-15 13:25 UTC (permalink / raw)
  To: Douglas Gilbert; +Cc: SCSI development list

On Mon, Feb 15, 2010 at 02:03:06PM +0100, Douglas Gilbert wrote:
> Recently, judging from error reports reaching me
> from smartmontools, sdparm and sg_start, something
> changed in the sd driver associated with the
> SYNCHRONIZE CACHE command it issues when a device
> is closed.
>
> That only seems to happen when the device is opened
> RW and it exposes a nasty difference between the
> semantics of spinning up and down ATA disks compared
> to SCSI disks.

The sd driver itself never sends a SYNCHRONIZE CACHE in
response to access through the block device node, it is only
sent for barrier requests, when hot-unplugging a scsi device,
or when shutting down the system.

What has changed recently is that we now send down a cache
flush from the block layer when fsync is called on the block
device node.  The kernel should never call that by itself when
closing the device, but can you double check that the tools
don't call fsync/fdatasync/msync or open the block device node
using O_SYNC/O_DSYNC?
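
One way to run the open-flag part of that check is to test the flags a tool passes to open(), or the flags of an already open descriptor, for the synchronous-I/O bits. This is a minimal sketch; the helper names `flags_request_sync` and `fd_requests_sync` are made up for illustration, and it does not detect explicit fsync/fdatasync/msync calls (tracing the tool's system calls would be needed for that).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>

/* Return 1 if the open(2) flags ask for synchronous writes, else 0. */
static int flags_request_sync(int oflags)
{
    return (oflags & (O_SYNC | O_DSYNC)) != 0;
}

/* Same check for an already open descriptor.
 * Returns 1 or 0, or -1 if fcntl() fails (errno is set). */
static int fd_requests_sync(int fd)
{
    int fl;

    assert(fd >= 0);
    fl = fcntl(fd, F_GETFL);
    if (fl == -1)
        return -1;
    return flags_request_sync(fl);
}
```

For example, `flags_request_sync(O_RDWR | O_NONBLOCK)` is 0, so a tool that opens the node with only those flags is not requesting synchronous writes at open time.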


* Re: SYNCHRONIZE CACHE command from sd on close
  2010-02-15 13:25 ` Christoph Hellwig
@ 2010-02-15 13:51   ` Douglas Gilbert
  2010-02-15 22:48     ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Douglas Gilbert @ 2010-02-15 13:51 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: SCSI development list

Christoph Hellwig wrote:
> On Mon, Feb 15, 2010 at 02:03:06PM +0100, Douglas Gilbert wrote:
>> Recently, judging from error reports reaching me
>> from smartmontools, sdparm and sg_start, something
>> changed in the sd driver associated with the
>> SYNCHRONIZE CACHE command it issues when a device
>> is closed.
>>
>> That only seems to happen when the device is opened
>> RW and it exposes a nasty difference between the
>> semantics of spinning up and down ATA disks compared
>> to SCSI disks.
> 
> The sd driver itself never sends a SYNCHRONIZE CACHE in
> response to access through the block device node, it is only
> sent for barrier requests, when hot-unplugging a scsi device,
> or when shutting down the system.
> 
> What has changed recently is that we now send down a cache
> flush from the block layer when fsync is called on the block
> device node.  The kernel should never call that by itself when
> closing the device, but can you double check that the tools
> don't call fsync/fdatasync/msync or open the block device node
> using O_SYNC/O_DSYNC?

What about O_NONBLOCK (which stops a hang on open)?
The open code common to my utilities in Linux is
below.

Doug Gilbert


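#include <errno.h>
#include <fcntl.h>
#include <stdio.h>

/* Both of these are declared elsewhere in the library; they are
 * repeated here only so this excerpt stands alone. */
extern FILE * sg_warnings_strm;
int scsi_pt_open_flags(const char * device_name, int flags, int verbose);

/* Open device_name non-blocking, read-only or read-write.
 * Returns a file descriptor (>= 0) or a negated errno value. */
int
scsi_pt_open_device(const char * device_name, int read_only, int verbose)
{
     int oflags = O_NONBLOCK;

     oflags |= (read_only ? O_RDONLY : O_RDWR);
     return scsi_pt_open_flags(device_name, oflags, verbose);
}


/* Open device_name with the given open(2) flags, logging them when
 * verbose > 1. Returns a file descriptor or a negated errno value. */
int
scsi_pt_open_flags(const char * device_name, int flags, int verbose)
{
     int fd;

     if (verbose > 1) {
         if (NULL == sg_warnings_strm)
             sg_warnings_strm = stderr;
         fprintf(sg_warnings_strm, "open %s with flags=0x%x\n", device_name,
                 flags);
     }
     fd = open(device_name, flags);
     if (fd < 0)
         fd = -errno;
     return fd;
}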


* Re: SYNCHRONIZE CACHE command from sd on close
  2010-02-15 13:51   ` Douglas Gilbert
@ 2010-02-15 22:48     ` Christoph Hellwig
  2010-02-19  0:20       ` Douglas Gilbert
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2010-02-15 22:48 UTC (permalink / raw)
  To: Douglas Gilbert; +Cc: Christoph Hellwig, SCSI development list

On Mon, Feb 15, 2010 at 02:51:22PM +0100, Douglas Gilbert wrote:
> What about O_NONBLOCK (which stops a hang on open)?
> The open code common to my utilities in Linux is
> below.

No, O_NONBLOCK should have nothing to do with it and your code
snippet looks fine.  We'll need to figure out what's really going
on here.



* Re: SYNCHRONIZE CACHE command from sd on close
  2010-02-15 22:48     ` Christoph Hellwig
@ 2010-02-19  0:20       ` Douglas Gilbert
  2010-02-19  8:04         ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Douglas Gilbert @ 2010-02-19  0:20 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: SCSI development list, Dan Horák

Christoph Hellwig wrote:
> On Mon, Feb 15, 2010 at 02:51:22PM +0100, Douglas Gilbert wrote:
>> What about O_NONBLOCK (which stops a hang on open)?
>> The open code common to my utilities in Linux is
>> below.
> 
> No, O_NONBLOCK should have nothing to do with it and your code
> snippet looks fine.  We'll need to figure out what's really going
> on here.

Forget SYNCHRONIZE CACHE, the pass-through interface via
sd looks completely stupid when opened RW.

   'modprobe scsi_debug opts=1'
shows all SCSI commands sent to a device (/dev/sdb in this
case).

   # sg_start --stop --readonly /dev/sdb
does the expected:
     scsi_debug: cmd 1b 00 00 00 00 00

but remove that '--readonly' and /dev/sdb is opened RW
with this command:
   # sg_start --stop /dev/sdb
then scsi_debug reports this load (of crap):
     scsi_debug: cmd 1b 00 00 00 00 00
     scsi_debug: cmd 12 00 00 00 fe 00
     scsi_debug: cmd 12 01 00 00 fe 00
     scsi_debug: cmd 12 01 83 00 fe 00
     scsi_debug: cmd 28 00 00 00 00 00 00 01 00 00
     scsi_debug: cmd 28 00 00 00 00 00 00 00 08 00
     scsi_debug: cmd 28 00 00 00 00 00 00 00 20 00
     scsi_debug: cmd 28 00 00 00 00 00 00 00 08 00
     scsi_debug: cmd 28 00 00 00 00 00 00 00 08 00

So send a START_STOP_UNIT(stop) through the SG_IO
ioctl on a sd device opened RW and as a bonus get
three INQUIRYs (one standard, two VPD pages) and 5 READ
commands!
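
The scsi_debug trace above can be decoded from the first byte of each CDB: 0x1b is START STOP UNIT, 0x12 is INQUIRY and 0x28 is READ(10). The following is a small sketch of such a decoder; the function names are made up for illustration and only the opcodes relevant to this thread are covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Map the opcodes seen in this thread to their SCSI command names. */
static const char *scsi_opcode_name(unsigned char opcode)
{
    switch (opcode) {
    case 0x12: return "INQUIRY";
    case 0x1b: return "START STOP UNIT";
    case 0x28: return "READ(10)";
    case 0x35: return "SYNCHRONIZE CACHE(10)";
    default:   return "unknown";
    }
}

/* Count how many of the n opcodes equal 'wanted'. */
static size_t count_opcode(const unsigned char *opcodes, size_t n,
                           unsigned char wanted)
{
    size_t i, count = 0;

    assert(opcodes != NULL);
    for (i = 0; i < n; i++)
        if (opcodes[i] == wanted)
            count++;
    return count;
}

/* READ(10): the logical block address is big-endian in bytes 2..5. */
static uint32_t read10_lba(const unsigned char *cdb)
{
    return ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
           ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
}

/* READ(10): the transfer length in blocks is big-endian in bytes 7..8. */
static unsigned read10_blocks(const unsigned char *cdb)
{
    return ((unsigned)cdb[7] << 8) | (unsigned)cdb[8];
}
```

Applied to the trace, the first READ(10) (`28 00 00 00 00 00 00 01 00 00`) asks for 0x0100 = 256 blocks starting at LBA 0, and the later ones ask for 8 or 32 blocks, also from LBA 0. In the INQUIRY CDBs, byte 1 set to 01 is the EVPD bit and byte 2 is the VPD page code (0x00 and 0x83).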

If the device is SCSI (as the scsi_debug driver is
simulating) then those READs fail because the drive
is stopped. However if that is an ATA disk behind a
SAT layer, then the disk will be spun up. That defeats
the purpose of the pass-through, especially when it
is being used to spin down the disk.

My guess, from reviewing the bug reports flowing in to me, is
that this nonsense started around lk 2.6.29.

Please fix.

Doug Gilbert


* Re: SYNCHRONIZE CACHE command from sd on close
  2010-02-19  0:20       ` Douglas Gilbert
@ 2010-02-19  8:04         ` Christoph Hellwig
  2010-02-19 11:56           ` Douglas Gilbert
  0 siblings, 1 reply; 7+ messages in thread
From: Christoph Hellwig @ 2010-02-19  8:04 UTC (permalink / raw)
  To: Douglas Gilbert; +Cc: Christoph Hellwig, SCSI development list, Dan Horák

On Fri, Feb 19, 2010 at 01:20:25AM +0100, Douglas Gilbert wrote:
> So send a START_STOP_UNIT(stop) through the SG_IO
> ioctl on a sd device opened RW and as a bonus get
> three INQUIRYs (one standard, two VPD pages) and 5 READ
> commands!
>
> If the device is SCSI (as the scsi_debug driver is
> simulating) then those READs fail because the drive
> is stopped. However if that is an ATA disk behind a
> SAT layer, then the disk will be spun up. That defeats
> the purpose of the pass-through, especially when it
> is being used to spin down the disk.
>
> My guess, from reviewing the bug reports flowing in to me, is
> that this nonsense started around lk 2.6.29.

We should never send INQUIRY or READ commands from the kernel
in response to opening a device.  But the combination sounds
like something udev might be doing for its stable device
identifier and manual partition scan because I don't trust
the kernel thing.  Can you check if these commands come from
udev or one of the related tools (hal, device-kit-blah, udisks,
or whatever it is called today)?



* Re: SYNCHRONIZE CACHE command from sd on close
  2010-02-19  8:04         ` Christoph Hellwig
@ 2010-02-19 11:56           ` Douglas Gilbert
  0 siblings, 0 replies; 7+ messages in thread
From: Douglas Gilbert @ 2010-02-19 11:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: SCSI development list, Dan Horák

Christoph Hellwig wrote:
> On Fri, Feb 19, 2010 at 01:20:25AM +0100, Douglas Gilbert wrote:
>> So send a START_STOP_UNIT(stop) through the SG_IO
>> ioctl on a sd device opened RW and as a bonus get
>> three INQUIRYs (one standard, two VPD pages) and 5 READ
>> commands!
>>
>> If the device is SCSI (as the scsi_debug driver is
>> simulating) then those READs fail because the drive
>> is stopped. However if that is an ATA disk behind a
>> SAT layer, then the disk will be spun up. That defeats
>> the purpose of the pass-through, especially when it
>> is being used to spin down the disk.
>>
>> My guess, from reviewing the bug reports flowing in to me, is
>> that this nonsense started around lk 2.6.29.
> 
> We should never send INQUIRY or READ commands from the kernel
> in response to opening a device.  But the combination sounds
> like something udev might be doing for its stable device
> identifier and manual partition scan because I don't trust
> the kernel thing.  Can you check if these commands come from
> udev or one of the related tools (hal, device-kit-blah, udisks,
> or whatever it is called today)?

After adding some sleep()s to sg_start, it seems that all the
nasty stuff gets sent to the device synchronized with the close()
of the /dev/sd* file descriptor (not the open()).

The sg_start sequence is:
    a)  fd = open("/dev/sdb", O_RDWR | O_NONBLOCK)
    b)  ioctl(fd, SG_IO, <START_STOP_UNIT(stop) command>)
    c)  close(fd)

a) sends no SCSI commands to the device
b) sends the START_STOP_UNIT(stop) only
c) sends 3 INQUIRYs and 5 READs !?
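
The three-step sequence above can be reproduced outside sg_start with a short program built on the Linux SG_IO ioctl. This is a minimal sketch, not the sg_start source: the helper names `build_start_stop_cdb` and `stop_unit` are made up for illustration, the 60 second timeout is an arbitrary choice, and the SCSI status and sense data returned in the header are not inspected.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define START_STOP_OPCODE  0x1b
#define START_STOP_CDB_LEN 6

/* Fill a 6-byte START STOP UNIT CDB; start == 0 requests a stop. */
static void build_start_stop_cdb(unsigned char *cdb, int start)
{
    assert(cdb != NULL);
    memset(cdb, 0, START_STOP_CDB_LEN);
    cdb[0] = START_STOP_OPCODE;
    cdb[4] = start ? 0x01 : 0x00;   /* START is bit 0 of byte 4 */
}

/* a) open RW, b) send START STOP UNIT(stop) via SG_IO, c) close.
 * Returns 0 if all three system calls succeed, else a negated errno. */
static int stop_unit(const char *device_name)
{
    unsigned char cdb[START_STOP_CDB_LEN];
    unsigned char sense[32];
    struct sg_io_hdr hdr;
    int fd, res = 0;

    fd = open(device_name, O_RDWR | O_NONBLOCK);          /* step a */
    if (fd < 0)
        return -errno;

    build_start_stop_cdb(cdb, 0);
    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id = 'S';
    hdr.dxfer_direction = SG_DXFER_NONE;  /* no data phase */
    hdr.cmd_len = sizeof(cdb);
    hdr.cmdp = cdb;
    hdr.mx_sb_len = sizeof(sense);
    hdr.sbp = sense;
    hdr.timeout = 60000;                  /* milliseconds (assumed value) */
    if (ioctl(fd, SG_IO, &hdr) < 0)                       /* step b */
        res = -errno;

    if (close(fd) < 0 && res == 0)                        /* step c */
        res = -errno;
    return res;
}
```

With the stop variant the CDB is `1b 00 00 00 00 00`, matching the first line of the scsi_debug output earlier in the thread. Calling `stop_unit("/dev/sdb")` against a scsi_debug device with a sleep() before the close() would be one way to confirm which step triggers the extra commands.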


