Re: Overlapping ioremap() calls, set_memory_*() semantics

linux-arch.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Ingo Molnar <mingo@kernel.org>
To: Toshi Kani <toshi.kani@hpe.com>
Cc: "Luis R. Rodriguez" <mcgrof@kernel.org>,
	Toshi Kani <toshi.kani@hp.com>,
	Paul McKenney <paulmck@linux.vnet.ibm.com>,
	Dave Airlie <airlied@redhat.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	linux-arch@vger.kernel.org, X86 ML <x86@kernel.org>,
	Daniel Vetter <daniel.vetter@intel.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Borislav Petkov <bp@alien8.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andy Lutomirski <luto@kernel.org>,
	Brian Gerst <brgerst@gmail.com>
Subject: Re: Overlapping ioremap() calls, set_memory_*() semantics
Date: Wed, 9 Mar 2016 10:15:25 +0100	[thread overview]
Message-ID: <20160309091525.GA11866@gmail.com> (raw)
In-Reply-To: <1457483385.15454.519.camel@hpe.com>


* Toshi Kani <toshi.kani@hpe.com> wrote:

> On Tue, 2016-03-08 at 13:16 +0100, Ingo Molnar wrote:
> > * Toshi Kani <toshi.kani@hpe.com> wrote:
> > 
> > > > So where is the problem? The memtype implementation and hence most
> > > > ioremap() users are supposed to be safe. set_memory_*() APIs are
> > > > supposed
> > > > to be safe as well, as they too go via the memtype API.
> > > 
> > > Let me try to summarize...
> > > 
> > > The original issue Luis brought up was that drivers written to work
> > > with MTRR may create a single ioremap range covering multiple cache
> > > attributes since MTRR can overwrite cache attribute of a certain range.
> > >  Converting such drivers with PAT-based ioremap interfaces, i.e.
> > > ioremap_wc() and ioremap_nocache(), requires a separate ioremap map for
> > > each cache attribute, which can be challenging as it may result in
> > > overlapping ioremap ranges (in his term) with different cache
> > > attributes.
> > > 
> > > So, Luis asked about 'sematics of overlapping ioremap()' calls.  Hence,
> > > I responded that aliasing mapping itself is supported, but alias with
> > > different cache attribute is not.  We have checks in place to detect
> > > such condition.  Overlapping ioremap calls with a different cache
> > > attribute either fails or gets redirected to the existing cache
> > > attribute on x86.
> > 
> > Ok, fair enough!
> > 
> > So to go back to the original suggestion from Luis, I've quoted it, but
> > with a s/overlapping/aliased substitution:
> > 
> > > I had suggested long ago then that one possible resolution was for us
> > > to add an API that *enables* aliased ioremap() calls, and only use it
> > > on select locations in the kernel. This means we only have to convert a
> > > few users to that call to white list such semantics, and by default
> > > we'd disable aliased calls. To kick things off -- is this strategy
> > > agreeable for all other architectures?
> > 
> > I'd say that since the overwhelming majority of ioremap() calls are not
> > aliased, ever, thus making it 'harder' to accidentally alias is probably
> > a good idea.
> 
> Did you mean 'aliased' or 'aliased with different cache attribute'?  The former 
> check might be too strict.

I'd say even 'same attribute' aliasing is probably relatively rare.

And 'different but compatible cache attribute' is in fact more of a sign that the 
driver author does the aliasing for a valid _reason_: to have two different types 
of access methods to the same piece of physical address space...

> > The memtype infrastructure of phyisical memory ranges in that case acts as a 
> > security measure, to avoid unintended (not just physically incompatible) 
> > aliasing. I suspect it would be helpful during driver development as well.
> 
> The memtype infrastructure does not track caller interfaces.  So, it will check 
> against to any map, i.e. kernel & user map.  I assume a kernel map is created 
> before user map, though.

I don't understand this, could you give me a specific example?

> > the other problem listed:
> > 
> > > As another problem case, set_memor_*() will not fail on MMIO even
> > > though set_memor_*() is designed only for RAM.
> > 
> > So what does this mean exactly? Having WB caching on MMIO area is not
> > good, but UC, WC and WB sure is still sensible in some cases, right?
> 
> I responded to Luis in other email that:
> | Drivers use ioremap family with a right cache type when mapping MMIO
> | ranges, ex. ioremap_wc().  They do not need to change the type to MMIO.
> | RAM is different since it's already mapped with WB at boot-time.
> | set_memory_*() allows us to change the type from WB, and put it back to
> | WB.

So there's two different uses here:

 - set_memory_x() to set the caching attribute
 - set_memory_x() to set any of the RWX access attributes

I'd in fact suggest we split these two uses to avoid confusion: keep 
set_memory_x() APIs for the RWX access attributes uses, and introduce
a new API that sets caching attributes:

	set_cache_attr_wc()
	set_cache_attr_uc()
	set_cache_attr_wb()
	set_cache_attr_wt()

it has the same arguments, so it's basically just a rename initially.

And at that point we could definitely argue that set_cache_attr_*() APIs should 
probably generate a warning for _RAM_, because they mostly make sense for MMIO 
type of physical addresses, right? Regular RAM should always be WB.

Are there cases where we change the caching attribute of RAM for valid reasons, 
outside of legacy quirks?

> > Basically if ioremap_uc() and ioremap_wc() is allowed on MMIO areas, then
> > I see no reason in principle why it should be invalid to change the area
> > from UC to WC after it has been ioremap()ed.
> 
> The current implementation does not support MMIO.
>  - It does not track cache attribute correctly for MMIO since it uses
> __pa().

Hm, so __pa() works on mmio addresses as well - at least on x86. The whole mmtype 
tree is physical address based - which too is MMIO compatible in principle.

The only problem would be if it was fundamentally struct page * based - but it's 
not AFAICS. We have a few APIs that work on struct page * arrays, but those are 
just vectored helper APIs AFAICS.

What am I missing?

>  - It only supports attribute transition of {WB -> NewType -> WB} for RAM.
>  RAM is tracked differently that WB is treated as "no map".  So, this
> transition does not cause a conflict on RAM.  This will causes a conflict
> on MMIO when it is tracked correctly.   

That looks like a bug?

Thanks,

	Ingo

next prev parent reply	other threads:[~2016-03-09  9:15 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-03 21:28 Overlapping ioremap() calls, set_memory_*() semantics Luis R. Rodriguez
2016-03-03 21:28 ` Luis R. Rodriguez
2016-03-04  9:44 ` Ingo Molnar
2016-03-04 18:18   ` Toshi Kani
2016-03-04 18:18     ` Toshi Kani
2016-03-04 18:51     ` Luis R. Rodriguez
2016-03-04 21:39       ` Toshi Kani
2016-03-05 11:42       ` Ingo Molnar
2016-03-05 11:40     ` Ingo Molnar
2016-03-07 17:03       ` Toshi Kani
2016-03-07 17:03         ` Toshi Kani
2016-03-08 12:16         ` Ingo Molnar
2016-03-09  0:29           ` Toshi Kani
2016-03-09  9:15             ` Ingo Molnar [this message]
2016-03-11 22:13               ` Toshi Kani
2016-03-16  1:45                 ` Luis R. Rodriguez
2016-03-16  1:45                   ` Luis R. Rodriguez
2016-03-17 22:44                   ` Toshi Kani
2016-04-13 21:16                     ` Luis R. Rodriguez
2016-04-15 14:47                       ` Toshi Kani
2016-04-15 14:47                         ` Toshi Kani
2016-04-16  9:20                         ` Ingo Molnar
2016-04-16  9:20                           ` Ingo Molnar
2016-03-21 17:38               ` Maciej W. Rozycki
2016-04-13 21:03                 ` Luis R. Rodriguez
2016-03-11  6:47         ` Andy Lutomirski
2016-03-11 22:36           ` Toshi Kani
2016-03-13  1:02             ` Andy Lutomirski

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160309091525.GA11866@gmail.com \
    --to=mingo@kernel.org \
    --cc=a.p.zijlstra@chello.nl \
    --cc=airlied@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=benh@kernel.crashing.org \
    --cc=bp@alien8.de \
    --cc=brgerst@gmail.com \
    --cc=daniel.vetter@intel.com \
    --cc=hpa@zytor.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=paulmck@linux.vnet.ibm.com \
    --cc=tglx@linutronix.de \
    --cc=torvalds@linux-foundation.org \
    --cc=toshi.kani@hp.com \
    --cc=toshi.kani@hpe.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).