Re: [RFC][PATCH] vmalloc: simplify vread()/vwrite()

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Wu Fengguang <fengguang.wu@intel.com>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Tejun Heo <tj@kernel.org>, Ingo Molnar <mingo@elte.hu>,
	Nick Piggin <npiggin@suse.de>, Andi Kleen <andi@firstfloor.org>,
	Hugh Dickins <hugh.dickins@tiscali.co.uk>,
	Christoph Lameter <cl@linux-foundation.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Huang, Ying" <ying.huang@intel.com>
Subject: Re: [RFC][PATCH] vmalloc: simplify vread()/vwrite()
Date: Thu, 7 Jan 2010 11:21:31 +0800	[thread overview]
Message-ID: <20100107032130.GA12815@localhost> (raw)
In-Reply-To: <20100107115736.ee815579.kamezawa.hiroyu@jp.fujitsu.com>

On Thu, Jan 07, 2010 at 10:57:36AM +0800, KAMEZAWA Hiroyuki wrote:
> On Thu, 7 Jan 2010 10:50:54 +0800
> Wu Fengguang <fengguang.wu@intel.com> wrote:
>  
> > > > The changes are:
> > > > - remove the vmlist walk and rely solely on vmalloc_to_page()
> > > > - replace the VM_IOREMAP check with (page && page_is_ram(pfn))
> > > > 
> > > > The VM_IOREMAP check is introduced in commit d0107eb07320b for per-cpu
> > > > alloc. Kame, would you double check if this change is OK for that
> > > > purpose?
> > > > 
> > > I think VM_IOREMAP is for avoiding access to device configuration area and
> > > unexpected breakage in device. Then, VM_IOREMAP are should be skipped by
> > > the caller. (My patch _just_ moves the avoidance of callers to vread()/vwrite())
> > 
> > "device configuration area" is not RAM, so testing of RAM would be
> > able to skip them?
> >
> Sorry, that's an area what I'm not sure. 

I believe the fundamental correctness requirement would be: to avoid
accessing _PAGE_CACHE_UC/WC pages by building _PAGE_CACHE_WB mapping
to them(which kmap_atomic do), whether it be physical RAM or device
address area.

For example, /dev/mem allows access to
- device areas, with proper _PAGE_CACHE_* via ioremap_cache() 
- _low_ physical RAM, which it directly reuse the 1:1 kernel mapping

> But, page_is_ram() implementation other than x86 seems not very safe...
> (And it seems that it's not defiend in some archs.)

Ah didn't know archs other than x86.  And hotplug won't update e820 as
well..

> > > 
> > > > The page_is_ram() check is necessary because kmap_atomic() is not
> > > > designed to work with non-RAM pages.
> > > > 
> > > I think page_is_ram() is not a complete method...on x86, it just check
> > > e820's memory range. checking VM_IOREMAP is better, I think.
> > 
> > (double check) Not complete or not safe?
> > 
> I think not-safe because e820 doesn't seem to be updated.
> 
> > EFI seems to not update e820 table by default.  Ying, do you know why?

Ying just confirmed that the kernel didn't update e820 with EFI memmap
because the bootloader will fake one e820 table based on EFI memmap,
in order to run legacy/unmodified kernel.

> 
> I hope all this kinds can be fixed by kernel/resource.c in generic way....
> Now, each archs have its own.
 
Agreed.

> > > > Even for a RAM page, we don't own the page, and cannot assume it's a
> > > > _PAGE_CACHE_WB page. So I wonder whether it's necessary to do another
> > > > patch to call reserve_memtype() before kmap_atomic() to ensure cache
> > > > consistency?
> > > > 
> > > > TODO: update comments accordingly
> > > > 
> > > 
> > > BTW, f->f_pos problem on 64bit machine still exists and this patch is still
> > > hard to test. I stopped that because anyone doesn't show any interests.
> > 
> > I'm using your patch :)
> > 
> > I feel most inconfident on this patch, so submitted it for RFC first.
> > I'll then submit a full patch series including your f_pos fix.
> > 
> Thank you, it's helpful.

Thanks,
Fengguang

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2010-01-07  3:21 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-01-07  1:24 [RFC][PATCH] vmalloc: simplify vread()/vwrite() Wu Fengguang
2010-01-07  1:38 ` KAMEZAWA Hiroyuki
2010-01-07  2:50   ` Wu Fengguang
2010-01-07  2:57     ` KAMEZAWA Hiroyuki
2010-01-07  3:21       ` Wu Fengguang [this message]
2010-01-07  3:15     ` Huang Ying
2010-01-07  3:23       ` KAMEZAWA Hiroyuki
2010-01-07  5:24         ` Wu Fengguang
2010-01-07  5:39           ` KAMEZAWA Hiroyuki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100107032130.GA12815@localhost \
    --to=fengguang.wu@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=andi@firstfloor.org \
    --cc=cl@linux-foundation.org \
    --cc=hugh.dickins@tiscali.co.uk \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@suse.de \
    --cc=tj@kernel.org \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).