linux-fsdevel.vger.kernel.org archive mirror
From: Douglas Gilbert <dougg@torque.net>
To: ric@emc.com
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Jeff Garzik <jeff@garzik.org>,
	linux-scsi <linux-scsi@vger.kernel.org>,
	linux-fsdevel@vger.kernel.org,
	Linux-ide <linux-ide@vger.kernel.org>
Subject: Re: impact of 4k sector size on the IO & FS stack
Date: Mon, 12 Mar 2007 11:21:36 -0400
Message-ID: <45F57000.8030604@torque.net>
In-Reply-To: <45F5565E.7010104@emc.com>

Ric Wheeler wrote:
> Alan Cox wrote:
>>> First generation of 4K sector drives will continue to use the same
>>> 512-byte ATA sector size you are familiar with.  A single 512-byte
>>> write will cause the drive to perform a read-modify-write cycle.
>>> This configuration is physical 4K sector, logical 512b sector.
>>
>> The problem case is "read-modify-screwup"
>>
>> At that point we've trashed the block we were writing (a well studied
>> recovery case), and we've blasted some previously sane, totally
>> unrelated sector of data out of existence. That's why we need to know
>> ideally if they are doing the write to a different physical block when
>> they do this, so that we don't lose the old data. My guess is they won't
>> as it'll be hard.
> 
> I think that the firmware would have to do this in the drive's write
> cache and would always write the modified data back to the same physical
> sector (unless a media error forces a sector remap).
> 
> If firmware modifies the seven other 512-byte sectors that it read in
> order to do the one 512-byte sector write, then we certainly would see
> what you describe happen.
> 
> In general, it would seem to be a bad idea to allocate a different
> physical sector to underpin this kind of read-modify-write since that
> would kill contiguous layout of files, etc.
> 
>>> A future configuration will change the logical ATA interface away
>>> from 512-byte sectors to 1K or 4K.  Here, it is impossible to read a
>>> quantity smaller than 1K or 4K, whatever the sector size is.
>>
>> That one I'm not worried about - other than "guess how Redmond decides
>> to make partition tables work" that one is mostly easy (it'll be fun to
>> see how many controllers simply can't cope with the command formats)
>>
> 
> This will be interesting to find out. I will be sharing a panel with
> some BIOS & MS people, so I will update all on what I hear.

Ric,
Just to add a SCSI perspective, it looks like 4 KB sectored
disks will be almost exclusively ATA devices. It is being
done to improve capacity at the expense of performance.
[SCSI/FC/SAS disks typically trade off capacity for better
performance.]

Support for disks with smaller logical block size than
physical block size has already been added to SBC-3. The
overview of this document gives a rationale:
www.t10.org/ftp/t10/document.06/06-034r5.pdf
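
To make the logical-smaller-than-physical case concrete, here is a
rough sketch of the arithmetic behind the read-modify-write concern
discussed above. It assumes 512-byte logical blocks, 4096-byte physical
blocks, and that logical block 0 starts exactly on a physical block
boundary; the function name and defaults are illustrative only and are
not taken from any standard or driver.

```python
LOGICAL = 512    # bytes per logical block (assumed)
PHYSICAL = 4096  # bytes per physical block (assumed)


def rmw_footprint(lba, count=1, logical=LOGICAL, physical=PHYSICAL):
    """Return (first_lba, last_lba, bystanders) for a write.

    first_lba..last_lba is the range of logical blocks the drive has to
    read and rewrite to cover whole physical blocks. bystanders lists
    the logical blocks in that range that the host did NOT ask to write
    but which pass through the read-modify-write cycle anyway.
    Assumes logical block 0 is aligned to a physical block boundary.
    """
    if physical % logical != 0:
        raise ValueError("physical size must be a multiple of logical size")
    ratio = physical // logical
    first = (lba // ratio) * ratio
    last = ((lba + count - 1) // ratio + 1) * ratio - 1
    bystanders = [b for b in range(first, last + 1)
                  if not lba <= b < lba + count]
    return first, last, bystanders


first, last, others = rmw_footprint(10)
print(first, last, len(others))
```

A single-block write to LBA 10 drags the other seven logical blocks of
that physical block along with it, while an aligned write of eight
blocks starting at LBA 8 has no bystanders and needs no read at all.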

SAT is now a standard and an agenda item for SAT-2 is
to wire ATA8-ACS's large sector size support to the
additions to SBC-3 mentioned above.
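
As a hedged illustration of what such wiring might involve: ATA8-ACS
reports the logical-to-physical ratio in IDENTIFY DEVICE word 106 as a
power-of-two exponent, and the SBC-3 additions carry a similar
exponent. The sketch below decodes word 106 as that layout is commonly
described (bit 15 clear and bit 14 set means the word is valid, bit 13
set means more than one logical sector per physical sector, bits 3:0
hold the exponent). The function name is made up for this example, and
how SAT-2 actually specifies the mapping should be checked against the
standards themselves.

```python
def logical_per_physical_exponent(word106):
    """Decode ATA IDENTIFY DEVICE word 106.

    Returns n such that one physical sector holds 2**n logical sectors,
    or 0 when the word is not valid or the device reports one logical
    sector per physical sector.
    """
    # The word carries valid information only if bit 15 = 0 and bit 14 = 1.
    if (word106 & 0xC000) != 0x4000:
        return 0
    # Bit 13: device has multiple logical sectors per physical sector.
    if not word106 & (1 << 13):
        return 0
    # Bits 3:0: exponent of the logical-per-physical ratio.
    return word106 & 0x000F


exp = logical_per_physical_exponent(0x6003)
print(exp, 512 << exp)
```

With word 106 equal to 0x6003 the exponent is 3, so a device with
512-byte logical sectors would be reporting 4096-byte physical sectors.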


I'm not sure how this stuff plays with end to end data
protection :-)
Most SCSI disks currently allow formatting sizes of 512
up to 528 bytes per logical block.

Doug Gilbert