From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
    id S1754694Ab0CFVJB (ORCPT ); Sat, 6 Mar 2010 16:09:01 -0500
Received: from gate.crashing.org ([63.228.1.57]:55175 "EHLO gate.crashing.org"
    rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
    id S1752552Ab0CFVI7 (ORCPT ); Sat, 6 Mar 2010 16:08:59 -0500
Subject: Re: USB mass storage and ARM cache coherency
From: Benjamin Herrenschmidt
To: James Bottomley
Cc: Russell King - ARM Linux, Pavel Machek, Catalin Marinas,
    FUJITA Tomonori, mdharm-kernel@one-eyed-alien.net,
    linux-usb@vger.kernel.org, x0082077@ti.com, sshtylyov@ru.mvista.com,
    tom.leiming@gmail.com, bigeasy@linutronix.de, oliver@neukum.org,
    linux-kernel@vger.kernel.org, santosh.shilimkar@ti.com, greg@kroah.com,
    linux-arm-kernel@lists.infradead.org
In-Reply-To: <1267872443.8894.1443.camel@mulgrave.site>
References: <20100226210030.GC23933@n2100.arm.linux.org.uk>
    <1267316072.23523.1842.camel@pasglop>
    <1267333263.2762.11.camel@mulgrave.site>
    <20100302211049V.fujita.tomonori@lab.ntt.co.jp>
    <1267549527.15401.78.camel@e102109-lin.cambridge.arm.com>
    <20100303215437.GF2579@ucw.cz>
    <1267709756.6526.380.camel@e102109-lin.cambridge.arm.com>
    <20100304135128.GA12191@atrey.karlin.mff.cuni.cz>
    <1267712512.31654.176.camel@mulgrave.site>
    <20100304142704.GB6622@n2100.arm.linux.org.uk>
    <1267872443.8894.1443.camel@mulgrave.site>
Content-Type: text/plain; charset="UTF-8"
Date: Sun, 07 Mar 2010 08:03:49 +1100
Message-ID: <1267909429.22204.127.camel@pasglop>
Mime-Version: 1.0
X-Mailer: Evolution 2.28.1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 2010-03-06 at 16:17 +0530, James Bottomley wrote:
> On a fault in of exec data, we first try to get the page out of the page
> cache.  If it's not present, we put the faulting process to sleep and
> fetch it in from storage.
> When we do the read, on the PIO path, the
> kernel alias for the page becomes dirty.  Some time later, we place the
> page into the user space (updating the pte entry that caused a fault).
> At this point, we'll call both flush_icache_page() and
> update_mmu_cache() ... this is where the I/D resolution should be done.
> Since it's after any I/O has occurred, it doesn't matter whether the CPU
> speculatively moved anything in or not.  As long as you flush the kernel
> alias and invalidate the user I and D aliases, we're good to go.  Using
> the page arch flags is really only to optimise this process (defer
> kernel D alias flushing).

Ok, so while flush_icache_page() looks like something we could use
instead of set_pte_at() for the icache flushing, it doesn't answer all
the questions. Off the top of my head:

 - I see the calls to flush_icache_page() in mm/memory.c, but I don't see
   them next to all the set_pte_at() calls that insert a valid PTE. For
   example, we don't flush the icache for anonymous pages. While that
   might seem like a good idea, we have been under pressure to "fix" that
   on powerpc to make sure there is no stale icache content from another
   process leaking into userspace.

 - It needs to be done -before- set_pte_at(). I think the code does it
   right; only your explanation above makes it unclear :-)

 - It doesn't take the PTE pointer as an argument, so there goes our
   trick on powerpc of filtering out exec permission rather than
   flushing when a page is accessed by a read fault.

 - We -still- have the problem of tracking whether the icache has been
   flushed or not yet for a given physical page on archs with PIPT (or
   non-aliasing VIPT) caches like powerpc. Without that tracking, we
   flush a lot more than necessary, since we'll end up flushing things
   like glibc text pages for every process they are mapped into, which
   is totally wasteful.

Thus the idea of using a new PG bit to separate D$ from I$ tracking
still makes sense.

Cheers,
Ben.
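
The per-page tracking argued for in the last point can be illustrated with a small userspace simulation. This is only a sketch under stated assumptions, not kernel code and not the powerpc implementation: struct sim_page, its icache_clean field (standing in for the proposed new PG bit), sim_flush_icache(), sim_map_exec(), sim_page_written() and sim_flush_count are all hypothetical names invented for this example. It shows only the bookkeeping: the flush happens once per physical page, before the first executable mapping, and not once per process that maps the page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; the flag models the proposed
 * "new PG bit" that tracks I$ state separately from D$ state. */
struct sim_page {
    bool icache_clean;
};

/* Counts simulated icache flushes so the saving can be observed. */
static unsigned int sim_flush_count;

/* Simulated icache flush; a real arch would flush cache lines here. */
static void sim_flush_icache(struct sim_page *pg)
{
    assert(pg != NULL);
    sim_flush_count++;
}

/* Called before an executable PTE for pg would be installed.
 * Flushes only if this physical page has not been flushed since it
 * was last written, then records that in the per-page flag. */
static void sim_map_exec(struct sim_page *pg)
{
    assert(pg != NULL);
    if (!pg->icache_clean) {
        sim_flush_icache(pg);
        pg->icache_clean = true;
    }
}

/* Called when the page contents change (for example a PIO read into
 * the page); the next executable mapping must flush again. */
static void sim_page_written(struct sim_page *pg)
{
    assert(pg != NULL);
    pg->icache_clean = false;
}
```

With this bookkeeping, mapping the same text page into three processes calls sim_map_exec() three times but flushes once; without the flag, each mapping would flush.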