From: Tejun Heo <tj@kernel.org>
To: Karel Zak <kzak@redhat.com>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	"linux-ide@vger.kernel.org" <linux-ide@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>,
	Daniel Taylor <Daniel.Taylor@wdc.com>,
	Jeff Garzik <jeff@garzik.org>, Mark Lord <kernel@teksavvy.com>,
	tytso@mit.edu, "H. Peter Anvin" <hpa@zytor.com>,
	hirofumi@mail.parknet.co.jp,
	Andrew Morton <akpm@linux-foundation.org>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	irtiger@gmail.com, Matthew Wilcox <matthew@wil.cx>,
	aschnell@suse.de, knikanth@suse.de, jdelvare@suse.de,
	Jim Meyering <jim@meyering.net>
Subject: Re: ATA 4 KiB sector issues.
Date: Tue, 09 Mar 2010 11:34:04 +0900
Message-ID: <4B95B39C.70402@kernel.org>
In-Reply-To: <20100308195847.GC18077@nb.net.home>

Hello,

On 03/09/2010 04:58 AM, Karel Zak wrote:
>> Tejun> Reportedly, commonly used partitioners aren't ready to handle
>> Tejun> drives larger than 2 TiB in any configuration and alignment isn't
> 
> The limit is specific to the DOS partition table (with 512-byte logical
> sectors), but GPT, for example, uses 64-bit LBA. I believe that our
> partitioning tools don't introduce any other restriction.

Hmmm... the 'reportedly' was from Daniel Taylor or maybe I just
misinterpreted the conversation.  Daniel, can you please fill in?
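
For reference, the 2 TiB figure follows from simple arithmetic.  The
sketch below assumes that a DOS partition table stores sector addresses
as 32-bit LBAs; the function name is only illustrative.

```python
# Assumption: DOS/MBR partition entries hold 32-bit sector addresses.
MBR_LBA_BITS = 32
TIB = 1024 ** 4


def mbr_addressable_bytes(logical_sector_size: int) -> int:
    """Bytes addressable with a 32-bit LBA at the given logical sector size."""
    return (1 << MBR_LBA_BITS) * logical_sector_size


for size in (512, 4096):
    print(f"{size}-byte logical sectors: {mbr_addressable_bytes(size) // TIB} TiB")
```

With 512-byte logical sectors this gives 2 TiB, and with 4096-byte
logical sectors 16 TiB, which is why the limit depends on the logical
sector size and not on the partitioner.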

>> Tejun> done properly for drives with 4 KiB physical sectors.  4 KiB
>> Tejun> logical sector support is broken in both the kernel 
>>
>> Huh, what?  My homedir is on a 4KiB LBS/PBS drive and has been for ~2
>> years.

By default, they aren't aligned properly, are they?

>> Tejun> (need more details and probably a whole section on partitioner
>> Tejun> behaviors)
>>
>> I'm Cc:'ing Karel Zak and Jim Meyering who have been doing all the
>> alignment work for fdisk and parted respectively.  Karel, Jim: The full
>> writeup is here:
>>
>> 	http://ata.wiki.kernel.org/index.php/ATA_4_KiB_sector_issues
>>
>> It'd be great if you guys could share what you have been doing to the
>> tooling.
> 
>  small summary:
> 
>  - libblkid provides unified API to topology information, it supports:
>     - ioctls (kernel >= 2.6.32)
>     - sysfs (kernel >= 2.6.31)
>     - stripe chunk size and stripe width for DM, MD, LVM and EVMS on
>       old kernels
>  - libparted and fdisk are linked against libblkid
> 
>  - fdisk supports 4KiB logical sector size (util-linux-ng >= 2.15)
>  - fdisk supports 4KiB physical sector size (util-linux-ng >= 2.17)
>  - fdisk uses 1MiB alignment (or more if optimal I/O size is bigger)
>    and alignment_offset for all partitions in non-DOS mode
>    (util-linux-ng >= 2.17.1)

That's great.  Daniel, maybe you were testing older versions?  Or
maybe those failures came from libata mishandling 4KiB r/w
requests.
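
To make the alignment rule in the fdisk summary concrete, here is a
rough sketch of how a start sector could be chosen.  It is not fdisk's
actual code: the function name and parameters are made up, and it
assumes alignment_offset is the byte offset of the first naturally
aligned position on the device.

```python
MIB = 1024 * 1024


def aligned_start_sector(min_start_sector, logical_sector_size,
                         optimal_io_size=0, alignment_offset=0):
    """Return the first sector >= min_start_sector that sits on the grain.

    The grain is 1 MiB, or the optimal I/O size if that is bigger.
    Aligned byte offsets have the form alignment_offset + k * grain.
    """
    grain = max(MIB, optimal_io_size)
    min_byte = min_start_sector * logical_sector_size
    # Ceiling division: smallest k with alignment_offset + k * grain >= min_byte.
    k = -(-(min_byte - alignment_offset) // grain)
    offset = alignment_offset + k * grain
    # Assumption: alignment_offset is a multiple of the logical sector size.
    assert offset % logical_sector_size == 0
    return offset // logical_sector_size


print(aligned_start_sector(1, 512))                          # 512-byte logical sectors
print(aligned_start_sector(1, 4096))                         # 4 KiB logical sectors
print(aligned_start_sector(63, 512, alignment_offset=3584))  # offset-by-one drive
```

On a plain 512-byte drive the first partition lands on sector 2048
(1 MiB); a nonzero alignment_offset shifts every aligned position by
that many bytes.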

>  - parted supports 4KiB physical sector size
>  - parted uses 1MiB alignment for disks with unknown topology, disks
>    with topology information are aligned to optimal (or minimum) I/O
>    size (parted >= 2.1)

This will result in incorrect alignment for drives which lie about the
physical sector size to work around BIOS/driver issues (C-1).  It
would probably be best to align to at least 1MiB.
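
A small sketch of why the reported value can't be trusted on such
drives.  The numbers are illustrative: a hypothetical drive that
reports 512-byte physical sectors while really using 4 KiB ones.

```python
def is_aligned(start_sector, logical_sector_size, physical_sector_size):
    """True if the partition's first byte falls on a physical sector boundary."""
    return (start_sector * logical_sector_size) % physical_sector_size == 0


LOGICAL = 512
REPORTED_PHYSICAL = 512   # what the drive claims (assumed for illustration)
ACTUAL_PHYSICAL = 4096    # what the media really uses (assumed)

# Sector 63 is the traditional DOS-compatible start; sector 2048 is 1 MiB.
for start in (63, 2048):
    print(start,
          is_aligned(start, LOGICAL, REPORTED_PHYSICAL),
          is_aligned(start, LOGICAL, ACTUAL_PHYSICAL))
```

Sector 63 looks fine against the reported size but is misaligned on the
real media, while a 1 MiB start is aligned in both cases, since 1 MiB
is a multiple of any power-of-two sector size up to 1 MiB.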

>  - EFI GPT code in the kernel has been updated to work properly with
>    4KiB sectors (kernel >= 2.6.33)

libata is broken for logical 4KiB ATA devices tho.  I'll fix it up.

>  - mkfs.{ext,xfs,gfs2,ocfs2} have been updated to work properly with
>    topology information, mkfs.{ext,xfs} are linked against libblkid
>    for compatibility with old kernels (for stripe chunk size / width)
> 
>  - Fedora-13/RHEL6 installer uses libparted with 4KiB support
> 
>  - alignment_offset & 4KiB support is planned for LUKS (cryptsetup)
> 
>> Tejun> Unfortunately, the transition to 4 KiB sector size, physical only
>> Tejun> or logical too, is looking fairly ugly.  Hopefully, a reasonable
>> Tejun> solution can be reached in not too distant future but even with
>> Tejun> all the software side updated, it looks like it's gonna cause
>> Tejun> significant amount of confusion and frustration.
>>
>> With regards to XP compatibility I don't think we should go too much out
>> of our way to accommodate it.  XP has been disowned by its master and I
>> think virtualization will take care of the rest.

Yeah, good point.  I'm just a bit worried that it might generate a lot
of frustrated bug reports.  Well, maybe we should just advise users to
install Windows first and then install Linux.

>> FWIW, recent fdisk has a command line flag that will enable/disable DOS
>> compatible layout.
> 
>  yes, util-linux-ng 2.17.1, fdisk -c
>  
>  Note that non-DOS mode will be default in the next major
>  util-linux-ng release.

I'll try to merge this information into the ata-4k doc.

Thank you very much.

-- 
tejun
