From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jeff Garzik
Subject: Re: Implementing NVMHCI...
Date: Tue, 14 Apr 2009 06:23:51 -0400
Message-ID: <49E46437.5000804@garzik.org>
References: <20090412091228.GA29937@elte.hu>
    <20090412162018.6c1507b4@lxorguk.ukuu.org.uk>
    <49E213AE.4060506@redhat.com> <49E2DC96.6090407@redhat.com>
    <49E45E9C.1020105@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Return-path:
Received: from srv5.dvmed.net ([207.36.208.214]:34351 "EHLO mail.dvmed.net"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1750967AbZDNKY0 (ORCPT ); Tue, 14 Apr 2009 06:24:26 -0400
In-Reply-To: <49E45E9C.1020105@redhat.com>
Sender: linux-ide-owner@vger.kernel.org
List-Id: linux-ide@vger.kernel.org
To: Avi Kivity
Cc: Linus Torvalds, Alan Cox, Szabolcs Szakacsits, Grant Grundler,
    Linux IDE mailing list, LKML, Jens Axboe, Arjan van de Ven

Avi Kivity wrote:
> Well, no one is talking about 64KB granularity for in-core files. Like
> you noticed, Windows uses the mmu page size. We could keep doing that,
> and still have 16KB+ sector sizes. It just means a RMW if you don't
> happen to have the adjoining clean pages in cache.
>
> Sure, on a rotating disk that's a disaster, but we're talking SSD here,
> so while you're doubling your access time, you're doubling a fairly
> small quantity. The controller would do the same if it exposed smaller
> sectors, so there's no huge loss.
>
> We still lose on disk storage efficiency, but I'm guessing that a modern
> tree with some object files with debug information and a .git directory
> it won't be such a great hit. For more mainstream uses, it would be
> negligible.

Speaking of RMW... in one sense, we have to deal with RMW anyway.
Upcoming ATA hard drives will be configured with a normal 512b sector
API interface, but underlying physical sector size is 1k or 4k.
The disk performs the RMW for us, but we must be aware of physical
sector size in order to determine proper alignment of on-disk data, to
minimize RMW cycles.

At the moment, it seems like most of the effort to get these ATA devices
to perform efficiently is in getting partition / RAID stripe offsets set
up properly.

So perhaps for NVMHCI we could
(a) hardcode NVM sector size maximum at 4k
(b) do RMW in the driver for sector size >4k, and
(c) export information indicating the true sector size, in a manner
similar to how the ATA driver passes that info to userland partitioning
tools.

    Jeff
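
To make the alignment and RMW points concrete, here is a minimal sketch
in Python. It is an illustration only, not kernel or driver code: the
function names (is_aligned, rmw_cycles, rmw_write), the bytearray
standing in for the media, and the 16k native sector size in the usage
note are all assumptions made for the example.

```python
LOGICAL = 512  # bytes per logical sector exposed by the drive interface


def is_aligned(start_lba, physical_size, logical_size=LOGICAL):
    """True if start_lba falls on a physical sector boundary."""
    return (start_lba * logical_size) % physical_size == 0


def rmw_cycles(start_lba, n_sectors, physical_size, logical_size=LOGICAL):
    """Count physical sectors that a write covers only partially.

    Each partially covered physical sector forces one read-modify-write.
    """
    if n_sectors <= 0:
        return 0
    start = start_lba * logical_size
    end = start + n_sectors * logical_size  # exclusive byte offset
    first = start // physical_size
    last = (end - 1) // physical_size
    head_partial = start % physical_size != 0
    tail_partial = end % physical_size != 0
    if first == last:
        # The whole write sits inside one physical sector.
        return 1 if (head_partial or tail_partial) else 0
    return int(head_partial) + int(tail_partial)


def rmw_write(media, native_size, offset, data):
    """Hypothetical driver-side RMW for a device whose native sector is
    larger than the block size presented upward (option (b)).

    media is a bytearray standing in for the device. Returns the number
    of native sectors that were read and rewritten.
    """
    if not data:
        return 0
    first = offset // native_size
    last = (offset + len(data) - 1) // native_size
    lo = first * native_size
    hi = (last + 1) * native_size
    buf = bytearray(media[lo:hi])                   # read
    buf[offset - lo:offset - lo + len(data)] = data  # modify
    media[lo:hi] = buf                               # write back
    return last - first + 1
```

With a 4k physical sector, a partition starting at LBA 63 is
misaligned, so an 8-sector (4k) write there straddles two physical
sectors and costs two RMW cycles, while the same write at LBA 64 costs
none. Similarly, rmw_write(media, 16384, 4096, block) with a 4k block
rewrites one full 16k native sector to change a quarter of it.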