From: Jeff Garzik <jeff@garzik.org>
To: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Ric Wheeler <ric@emc.com>,
linux-scsi <linux-scsi@vger.kernel.org>,
linux-fsdevel@vger.kernel.org,
Linux-ide <linux-ide@vger.kernel.org>
Subject: Re: impact of 4k sector size on the IO & FS stack
Date: Mon, 12 Mar 2007 10:41:07 -0400
Message-ID: <45F56683.9060604@garzik.org>
In-Reply-To: <Pine.LNX.4.61.0703120423360.4339@yvahk01.tjqt.qr>
Jan Engelhardt wrote:
> On Mar 11 2007 22:45, Ric Wheeler wrote:
>> Jan Engelhardt wrote:
>>> On Mar 11 2007 18:51, Ric Wheeler wrote:
>>>
>>>> During the recent IO/FS workshop, we spoke briefly about the
>>>> coming change to a 4k sector size for disks on linux. If I
>>>> recall correctly, the general feeling was that the impact was
>>>> not significant since we already do most file system IO in 4k
>>>> page sizes and should be fine as long as we partition drives
>>>> correctly and avoid non-4k aligned partitions.
>>>>
>>> Sorry about jumping right in, but what about an 'old-style'
>>> partition table that relies on 512 as a unit?
>>>
>>>
>> I think that the normal case would involve new drives which
>> would need to be partitioned in 4k aligned partitions.
>> Shouldn't that work regardless of the unit used in the
>> partition table?
>
> Assume this partition table on my current HD:
>
> Disk /dev/hdc: 251.0 GB, 251000193024 bytes
> 255 heads, 63 sectors/track, 30515 cylinders
> Units = cylinders of 16065 * 512 = 8225280 bytes
>
> Device Start End Blocks Id System
> /dev/hdc1 1 33 265041 82 Linux swap / Solaris
> /dev/hdc2 34 30515 244846665 5 Extended
>
> That is, 255 * 63 * 30515 * 512 == roughly 251 GB.
>
> Now, if this disk was copied byte per byte (/bin/dd) to a
> 4096-based disk, and Linux would start using a sector size of
> 4096, then I would suddenly have
>
> 255 * 63 * 30515 * 4096 == 2 TB
>
> Although I would not mind the 2 TB, the partition table would
> read quite differently (note the Blocks column which is
> multiplied by 4 (512x4=4096))
At this level, for RMW (read-modify-write) drives, nothing changes. The
partition software, ATA driver, and all other bits continue to think that
sector size == 512 bytes.

The partition software /hopefully/ becomes smart enough to understand
the alignment necessary, but that is not a requirement.

This is the key to understanding the difference between a physical
(==platters) sector size change without a logical (==ATA interface)
sector size change.
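
To make the alignment point concrete, here is a minimal sketch (not taken from any real partitioning tool) that checks whether a partition start, expressed in 512-byte logical sectors, lands on a 4096-byte physical boundary. The helper names are made up for illustration, and the start of hdc1 at LBA 63 is an assumption based on the usual DOS-style layout; the quoted table only gives cylinder numbers.

```python
# Sketch: does a partition start on a physical-sector boundary?
LOGICAL_SECTOR = 512     # bytes, as seen over the ATA interface
PHYSICAL_SECTOR = 4096   # bytes, as laid out on the platters
SECTORS_PER_CYLINDER = 255 * 63  # 16065, from the quoted fdisk header


def cylinder_start_lba(cylinder):
    """First logical sector of a 1-based fdisk cylinder number."""
    return (cylinder - 1) * SECTORS_PER_CYLINDER


def is_aligned(start_lba):
    """True if start_lba's byte offset is a multiple of the physical sector size."""
    return (start_lba * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0


# Assumption: hdc1 starts at LBA 63, the usual DOS-style offset. The quoted
# table only says "cylinder 1", so this value is illustrative.
swap_start = 63
extended_start = cylinder_start_lba(34)

for name, lba in (("hdc1", swap_start), ("hdc2", extended_start)):
    print(name, lba, "aligned" if is_aligned(lba) else "misaligned")
```

Under those assumptions both partitions come out misaligned, which costs RMW cycles on such a drive but leaves the partition table itself reading exactly as before.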
> Device Start End Blocks Id System
> /dev/hdc1 1 33 1060164 82 Linux swap / Solaris
> /dev/hdc2 34 30515 979386660 5 Extended
>
> Which would mean that the swap partition reaches into the real
> data partition and would corrupt it.
For RMW drives, RMW cycles would occur but not corruption.
For non-RMW drives, this just wouldn't occur.
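
As a rough illustration of what "RMW cycles" means here, the sketch below lists the physical sectors that a write only partially covers and that a 512-byte-emulating drive would therefore have to read, modify, and write back. This is a simplified model with a hypothetical function name, not a description of any particular drive's firmware.

```python
# Sketch: which 4096-byte physical sectors does a write only partly cover?
def partial_physical_sectors(start_lba, count, logical=512, physical=4096):
    """Return physical sector indices touched but not fully covered by the write.

    start_lba and count are in logical sectors; the write covers
    [start_lba, start_lba + count).
    """
    ratio = physical // logical          # 8 logical sectors per physical one
    end_lba = start_lba + count          # exclusive end of the write
    first = start_lba // ratio
    last = (end_lba - 1) // ratio
    partial = []
    for p in range(first, last + 1):
        p_start = p * ratio
        p_end = p_start + ratio
        # Partially covered if the write begins after the physical sector
        # starts or ends before the physical sector ends.
        if start_lba > p_start or end_lba < p_end:
            partial.append(p)
    return partial


# A 4 KiB write at LBA 63 straddles two physical sectors.
misaligned = partial_physical_sectors(63, 8)
# The same write at LBA 64 covers exactly one physical sector.
aligned = partial_physical_sectors(64, 8)
print(misaligned, aligned)
```

In this model a 4 KiB filesystem block written at a misaligned offset needs two read-modify-write cycles, while the aligned write needs none. The data still lands where the 512-byte view says it should, hence no corruption.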
Jeff