From mboxrd@z Thu Jan 1 00:00:00 1970 From: James Bottomley Subject: Re: [RFC PATCH 2/4] pio-mapping: Add ARM support for the PIO mapping API Date: Mon, 08 Feb 2010 13:07:08 -0600 Message-ID: <1265656028.6289.24.camel@mulgrave.site> References: <20100205163044.30827.10915.stgit@pc1117.cambridge.arm.com> <20100205163154.30827.6636.stgit@pc1117.cambridge.arm.com> <1265388234.14404.47.camel@mulgrave.site> <1265390403.7692.101.camel@pc1117.cambridge.arm.com> <1265391384.14404.53.camel@mulgrave.site> <1265645421.4020.119.camel@pc1117.cambridge.arm.com> <20100208165417.GA29551@flint.arm.linux.org.uk> <1265649649.6289.19.camel@mulgrave.site> <20100208173353.GD29551@flint.arm.linux.org.uk> Mime-Version: 1.0 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: 7bit Return-path: Received: from bedivere.hansenpartnership.com ([66.63.167.143]:38945 "EHLO bedivere.hansenpartnership.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751256Ab0BHTHN (ORCPT ); Mon, 8 Feb 2010 14:07:13 -0500 In-Reply-To: <20100208173353.GD29551@flint.arm.linux.org.uk> Sender: linux-arch-owner@vger.kernel.org List-ID: To: Russell King Cc: Catalin Marinas , linux-arch@vger.kernel.org On Mon, 2010-02-08 at 17:33 +0000, Russell King wrote: > On Mon, Feb 08, 2010 at 11:20:49AM -0600, James Bottomley wrote: > > On Mon, 2010-02-08 at 16:54 +0000, Russell King wrote: > > > On Mon, Feb 08, 2010 at 04:10:21PM +0000, Catalin Marinas wrote: > > > > The pio_data_direction could be dropped and use the DMA one. We could > > > > also use pio_kmap_read/pio_kmap_write or similar but we have to triple > > > > the number of functions, so I prefer the additional argument. > > > > > > Do we need to do anything for reading a buffer for PIO _out_ to the > > > device? My understanding is that this has never been a problem. > > > > Before outbound PIO, you need to flush all the aliases before sending. > > However, this should *already* be done in the block path, so for block > > I/O it should be unnecessary, yes. All we need to do on the TO_DEVICE > > case should be map and unmap the page if it's highmem. > > > > > The only problem I'm aware of is where PIO writes to the kernel > > > mapping of a lowmem pages; highmem pages need the data flushed out > > > of the temporary atomic kmap mapping anyway. > > > > Not quite. > > Yes quite; I disagree. > > > in the PIO FROM_DEVICE case, we've created a dirty kernel > > alias by reading the data from the device and writing it via the kernel > > mapping. > > Here, I agree 100%. > > > We have to flush that alias whether it exists as a highmem > > mapping or as a simple offset mapping before userspace will see the > > data. > > And this is where things get more complicated. If you have a CPU > where aliases exist for different virtual addresses, and you have > highmem, when you unmap the highmem mapping, you have to flush the > dirty mapping. > > That flush is there today. Right, but only in the highmem case. > The problem case is where you have a lowmem page cache page which > you're writing into - this is not kmapped, and so on kunmap*, there > is no flush. > > > The block API assumes the FROM_DEVICE transfer went straight from > > the device into main memory and didn't go via the kernel alias, so the > > block use case won't flush it. > > 50% correct. As I point out above, with an aliasing cache + highmem, > kunmap has to already do cache maintainence. To do additional > cache maintainence would be a pure duplication of what kunmap is > already doing. Which is why the API is a proposed combination of pio+kmap ... for highmem, in arch code, if you already do the flush it would be a pure kmap. For lowmem it would do the flush on unmap for FROM_DEVICE transfers. > > Additionally, for architectures that don't promise no movein on a > > flushed but untouched page (unlike parisc, which does), we might need to > > invalidate all the user aliases before passing the page back to kill any > > stale data that may have been speculated into the alias caches). > > For a read-in, there should be zero user mappings of that page. Not necessarily. For a faulted read in, yes. If we're doing SG_IO on a user page, the page will already exist in the user's space. There are other read paradigms that also do overwrite reads. > Do we really allow users to map a page before it's been written with > the corresponding data from disk? What if it previously contained the > data from the passwd or login program? I'd be a nice security hole if > we did allow this to happen. Not map, no ... but if we're doing I/O to existing user areas, then yes. > For a user mapping to exist, we receive a fault. If we don't have the > data in the page cache, we allocate a page and issue a request into the > fs layers to bring the data in from a block device, and wait for the > page to be up to date. Only when it is up to date do we map the page > into userspace. Not necessarily, this doesn't happen for DIO reads which bypass the page cache ... or for SG_IO ones which are essentially commands sent external to the whole of the page/file cache system. > It is my understanding that the situation that you describe can never > occur. James