From: Trond Myklebust <Trond.Myklebust@netapp.com>
To: Russell King - ARM Linux <linux@arm.linux.org.uk>
Cc: "Marc Kleine-Budde" <mkl@pengutronix.de>,
"Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>,
linux-nfs@vger.kernel.org,
"Linus Torvalds" <torvalds@linux-foundation.org>,
linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org,
"Marc Kleine-Budde" <m.kleine-budde@pengutronix.de>
Subject: Re: still nfs problems [Was: Linux 2.6.37-rc8]
Date: Wed, 05 Jan 2011 12:17:27 -0500
Message-ID: <1294247847.2998.23.camel@heimdal.trondhjem.org>
In-Reply-To: <20110105155230.GC8638@n2100.arm.linux.org.uk>

On Wed, 2011-01-05 at 15:52 +0000, Russell King - ARM Linux wrote:
> On Wed, Jan 05, 2011 at 10:14:17AM -0500, Trond Myklebust wrote:
> > OK. So, the new behaviour in 2.6.37 is that we're writing to a series of
> > pages via the usual kmap_atomic()/kunmap_atomic() and kmap()/kunmap()
> > interfaces, but we can end up reading them via a virtual address range
> > that gets set up via vm_map_ram() (that range gets set up before the
> > write occurs).
>
> kmap of lowmem pages will always reuse the existing kernel direct
> mapping, so there won't be a problem there.
>
> > Do we perhaps need an invalidate_kernel_vmap_range() before we can read
> > the data on ARM in this kind of scenario?
>
> Firstly, vm_map_ram() does no cache maintenance of any sort, nor does
> it take care of page colouring - so any architecture where cache aliasing
> can occur will see this problem. It should not be limited to ARM.
>
> Secondly, no, invalidate_kernel_vmap_range() probably isn't sufficient.
> There's two problems here:
>
> addr = kmap(lowmem_page);
> *addr = stuff;
> kunmap(lowmem_page);
>
> Such lowmem pages are accessed through their kernel direct mapping.
>
> ptr = vm_map_ram(lowmem_page);
> read = *ptr;
>
> This creates a new mapping which can alias with the kernel direct mapping.
> Now, as this is a new mapping, there should be no cache lines associated
> with it. (Looking at vm_unmap_ram(), it calls free_unmap_vmap_area_addr(),
> free_unmap_vmap_area(), which then calls flush_cache_vunmap() on the
> region. vb_free() also calls flush_cache_vunmap() too.)
>
> If the write after kmap() hits an already present cache line, the cache
> line will be updated, but it won't be written back to memory. So, on
> a subsequent vm_map_ram(), with any kind of aliasing cache, there's
> no guarantee that you'll hit that cache line and read the data just
> written there.
>
> The kernel direct mapping would need to be flushed.

We should already be flushing the kernel direct mapping after writing by
means of the calls to flush_dcache_page() in xdr_partial_copy_from_skb()
and all the helpers in net/sunrpc/xdr.c.

The only new thing is the read access through the virtual address
mapping. That mapping is created outside the loop in
nfs_readdir_xdr_to_array(), which is why I'm thinking we do need the
invalidate_kernel_vmap_range(): we're essentially doing a series of
writes through the kernel direct mapping (i.e. readdir RPC calls), then
reading the results through the virtual mapping.

i.e. we're doing

	ptr = vm_map_ram(lowmem_pages);
	while (need_more_data) {
		for (i = 0; i < npages; i++) {
			addr = kmap_atomic(lowmem_page[i]);
			*addr = rpc_stuff;
			flush_dcache_page(lowmem_page[i]);
			kunmap_atomic(lowmem_page[i]);
		}
		invalidate_kernel_vmap_range(ptr); // Needed here?
		read = *ptr;
	}
	vm_unmap_ram(lowmem_pages)

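
To make the concern about the loop concrete, here is a rough user-space toy
model, not kernel code. It assumes a cache in which the direct mapping and
the vm_map_ram() alias each get their own cache line for the same word of
RAM. All names (struct line, cache_flush(), cache_invalidate(),
last_value_seen()) are made up for illustration; cache_flush() only roughly
plays the role of flush_dcache_page(), and cache_invalidate() that of
invalidate_kernel_vmap_range().

```c
#include <assert.h>
#include <stdbool.h>

/* One cache line per virtual alias of the same physical word (toy model). */
struct line {
	bool valid;
	bool dirty;
	int data;
};

static int ram;                /* the single word of "physical memory" */
static struct line direct_map; /* stands in for the kernel direct mapping */
static struct line vmap_alias; /* stands in for the vm_map_ram() mapping */

/* Read through an alias: fill the line from RAM only on a miss. */
static int cache_read(struct line *l)
{
	if (!l->valid) {
		l->data = ram;
		l->valid = true;
		l->dirty = false;
	}
	return l->data;
}

/* Write through an alias: the line is updated, RAM is not (write-back). */
static void cache_write(struct line *l, int v)
{
	l->data = v;
	l->valid = true;
	l->dirty = true;
}

/* Write back a dirty line and drop it. */
static void cache_flush(struct line *l)
{
	if (l->valid && l->dirty)
		ram = l->data;
	l->valid = false;
	l->dirty = false;
}

/* Drop a line without writing it back. */
static void cache_invalidate(struct line *l)
{
	l->valid = false;
	l->dirty = false;
}

/*
 * Two rounds of "write via the direct mapping, flush, read via the alias".
 * Returns what the second read through the alias observes.
 */
int last_value_seen(bool invalidate_alias)
{
	int seen = 0;

	ram = 0;
	direct_map = (struct line){0};
	vmap_alias = (struct line){0};

	for (int round = 1; round <= 2; round++) {
		cache_write(&direct_map, round);
		cache_flush(&direct_map);
		assert(ram == round); /* the write-back did reach RAM */
		if (invalidate_alias)
			cache_invalidate(&vmap_alias);
		seen = cache_read(&vmap_alias);
	}
	return seen;
}
```

In this model last_value_seen(false) returns the stale 1, because the second
read hits the line the alias filled in the first round, while
last_value_seen(true) returns 2. Real hardware is more complicated than this,
so take it only as an illustration of why flushing the direct mapping alone
may not be enough once the alias has been read.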
> I'm really getting to the point of hating the proliferation of RAM
> remapping interfaces - it's going to (and is) causing nothing but lots
> of pain on virtual cache architectures, needing more and more cache
> flushing interfaces to be created.
>
> Is there any other solution to this?

Arbitrary sized pages. :-)

The problem here is that we want to read variable sized records (i.e.
readdir() records) from a multi-page buffer. We could do that by copying
those particular records that overlap with page boundaries, but that
would make for a fairly intrusive rewrite too.

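
For illustration only, the copying alternative could look roughly like the
user-space sketch below. It is not the sunrpc code: copy_from_pages() and
TOY_PAGE_SIZE are hypothetical names, and the pages are plain byte arrays
rather than struct page. A record that straddles a page boundary is gathered
into a contiguous destination buffer, so no contiguous virtual mapping of
the pages is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical tiny page size so that boundaries are easy to hit. */
#define TOY_PAGE_SIZE 16

/*
 * Copy len bytes, starting at byte offset off of the logical buffer formed
 * by the discontiguous pages, into dst. The caller must ensure that
 * off + len does not run past the last page.
 */
void copy_from_pages(unsigned char *const pages[], size_t off,
		     void *dst, size_t len)
{
	unsigned char *out = dst;

	while (len > 0) {
		size_t idx = off / TOY_PAGE_SIZE;       /* which page */
		size_t in_page = off % TOY_PAGE_SIZE;   /* offset within it */
		size_t chunk = TOY_PAGE_SIZE - in_page; /* bytes left in page */

		if (chunk > len)
			chunk = len;
		memcpy(out, pages[idx] + in_page, chunk);
		out += chunk;
		off += chunk;
		len -= chunk;
	}
}
```

With two 16-byte pages, copying 8 bytes from offset 12 takes the last 4
bytes of the first page and the first 4 of the second. The cost is the
extra copy, plus having every record decoder go through such a helper.
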
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust@netapp.com
www.netapp.com