linux-ide.vger.kernel.org archive mirror
From: Jeff Garzik <jeff@garzik.org>
To: Avi Kivity <avi@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Alan Cox <alan@lxorguk.ukuu.org.uk>,
	Szabolcs Szakacsits <szaka@ntfs-3g.com>,
	Grant Grundler <grundler@google.com>,
	Linux IDE mailing list <linux-ide@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Jens Axboe <jens.axboe@oracle.com>,
	Arjan van de Ven <arjan@infradead.org>
Subject: Re: Implementing NVMHCI...
Date: Tue, 14 Apr 2009 07:45:08 -0400
Message-ID: <49E47744.3040103@garzik.org>
In-Reply-To: <49E46779.1040106@redhat.com>

Avi Kivity wrote:
> Jeff Garzik wrote:
>> Speaking of RMW... in one sense, we have to deal with RMW anyway. 
>> Upcoming ATA hard drives will present a normal 512-byte logical sector 
>> interface, but the underlying physical sector size is 1k or 4k.
>>
>> The disk performs the RMW for us, but we must be aware of physical 
>> sector size in order to determine proper alignment of on-disk data, to 
>> minimize RMW cycles.
>>
> 
> Virtualization has the same issue.  OS installers will typically set up 
> the first partition at sector 63, and that means every page-sized block 
> access will be misaligned.  This is particularly bad when the guest's disk 
> is backed by a regular file.
> 
> Windows 2008 aligns partitions on a 1MB boundary, IIRC.

Makes a lot of sense...
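
To make the alignment point concrete, here is a minimal sketch of the arithmetic. It assumes 512-byte logical sectors and a 4096-byte page or physical sector; the helper name is_aligned is hypothetical, and sector 2048 is used only because it equals 1MB at 512 bytes per sector.

```python
LOGICAL_SECTOR = 512  # bytes per logical sector (assumption for this sketch)


def is_aligned(start_sector, boundary_bytes, logical_sector=LOGICAL_SECTOR):
    """Return True if a partition starting at start_sector begins on a
    multiple of boundary_bytes."""
    return (start_sector * logical_sector) % boundary_bytes == 0


if __name__ == "__main__":
    # Sector 63 is the traditional first-partition start; sector 2048 is 1MB.
    for start in (63, 2048):
        byte_offset = start * LOGICAL_SECTOR
        print(start, byte_offset, is_aligned(start, 4096))
```

A start at sector 63 puts the partition at byte 32256, which is not a multiple of 4096, so every 4k access inside it straddles two 4k units. A start at sector 2048 is aligned for any power-of-two unit up to 1MB.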


>> At the moment, it seems like most of the effort to get these ATA 
>> devices to perform efficiently is in getting partition / RAID stripe 
>> offsets set up properly.
>>
>> So perhaps for NVMHCI we could
>> (a) hardcode NVM sector size maximum at 4k
>> (b) do RMW in the driver for sector size >4k, and
> 
> Why not do it in the block layer?  That way it isn't limited to one driver.

Sure.  "in the driver" is a highly relative phrase :)  If there is code 
to be shared among multiple callsites, let's share it.
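
Wherever the code ends up living, the RMW step itself is simple. The following is a toy sketch only, not kernel code: SimDisk and rmw_write are hypothetical names, and the device is simulated in memory with an 8k sector size chosen purely for illustration.

```python
class SimDisk:
    """Toy in-memory device that accepts only whole-sector I/O."""

    def __init__(self, sector_size, num_sectors):
        self.sector_size = sector_size
        self.data = bytearray(sector_size * num_sectors)
        self.reads = 0   # number of sector reads issued
        self.writes = 0  # number of sector writes issued

    def read_sector(self, n):
        self.reads += 1
        start = n * self.sector_size
        return bytes(self.data[start:start + self.sector_size])

    def write_sector(self, n, buf):
        if len(buf) != self.sector_size:
            raise ValueError("device only accepts whole sectors")
        self.writes += 1
        start = n * self.sector_size
        self.data[start:start + self.sector_size] = buf


def rmw_write(disk, offset, payload):
    """Write payload at a byte offset, using read-modify-write for any
    sector that the payload covers only partially."""
    size = disk.sector_size
    pos = 0
    while pos < len(payload):
        sector, within = divmod(offset + pos, size)
        chunk = payload[pos:pos + size - within]
        if len(chunk) == size:
            # Whole sector covered: no read needed.
            disk.write_sector(sector, chunk)
        else:
            # Partial sector: read it, patch it, write it back.
            buf = bytearray(disk.read_sector(sector))
            buf[within:within + len(chunk)] = chunk
            disk.write_sector(sector, bytes(buf))
        pos += len(chunk)


if __name__ == "__main__":
    disk = SimDisk(sector_size=8192, num_sectors=4)
    rmw_write(disk, 4096, b"\xab" * 4096)  # 4k write into an 8k sector
    print(disk.reads, disk.writes)
```

A 4k write into an 8k sector costs one extra read, while a write that covers a whole sector costs none, which is why alignment matters so much.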


>> (c) export information indicating the true sector size, in a manner 
>> similar to how the ATA driver passes that info to userland 
>> partitioning tools.
> 
> Eventually we'll want to allow filesystems to make use of the native 
> sector size.

At the kernel level, you mean?

Filesystems already must deal with issues such as avoiding RAID stripe 
boundaries (man mke2fs, search for 'RAID').

So I hope the same code will be applicable to cases where the 
"logical sector size" (as exported by the storage interface) differs from 
the "physical sector size" (the underlying hardware sector size, not 
directly accessible by the OS).
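
As an illustration of how a userland tool might consume both values, here is a hedged sketch. It assumes a Linux kernel that exposes the queue attributes logical_block_size and physical_block_size under /sys/block/<dev>/queue, which is how later kernels report them and not something proposed in this thread. The function name read_sector_sizes and the device name "sda" are hypothetical.

```python
from pathlib import Path


def read_sector_sizes(device, sysfs_root="/sys/block"):
    """Return (logical, physical) sector sizes in bytes for a block device,
    or None if the attributes are missing or unreadable."""
    queue = Path(sysfs_root) / device / "queue"
    try:
        logical = int((queue / "logical_block_size").read_text())
        physical = int((queue / "physical_block_size").read_text())
    except (OSError, ValueError):
        return None
    return logical, physical


if __name__ == "__main__":
    sizes = read_sector_sizes("sda")  # "sda" is only an example device name
    if sizes is None:
        print("sector sizes not available")
    else:
        logical, physical = sizes
        print("logical:", logical, "physical:", physical)
```

A partitioning tool could then align partition starts to the physical size while still addressing the device in logical sectors.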

But if you are talking about filesystems directly supporting sector 
sizes >4k, well, I'll let Linus and others settle that debate :)  I 
will just write the driver once the dust settles...

	Jeff




