From: Ingo Molnar <mingo@kernel.org>
To: Toshi Kani <toshi.kani@hp.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"hpa@zytor.com" <hpa@zytor.com>,
"tglx@linutronix.de" <tglx@linutronix.de>,
"mingo@redhat.com" <mingo@redhat.com>,
"arnd@arndb.de" <arnd@arndb.de>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"x86@kernel.org" <x86@kernel.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"dave.hansen@intel.com" <dave.hansen@intel.com>,
"Elliott, Robert (Server Storage)" <Elliott@hp.com>
Subject: Re: [PATCH v3 6/6] x86, mm: Support huge KVA mappings on x86
Date: Wed, 4 Mar 2015 21:17:48 +0100 [thread overview]
Message-ID: <20150304201748.GA6634@gmail.com> (raw)
In-Reply-To: <1425486216.17007.236.camel@misato.fc.hp.com>
* Toshi Kani <toshi.kani@hp.com> wrote:
> On Wed, 2015-03-04 at 01:00 +0000, Andrew Morton wrote:
> > On Tue, 03 Mar 2015 16:14:32 -0700 Toshi Kani <toshi.kani@hp.com> wrote:
> >
> > > On Tue, 2015-03-03 at 14:44 -0800, Andrew Morton wrote:
> > > > On Tue, 3 Mar 2015 10:44:24 -0700 Toshi Kani <toshi.kani@hp.com> wrote:
> > > :
> > > > > +
> > > > > +#ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
> > > > > +int pud_set_huge(pud_t *pud, phys_addr_t addr, pgprot_t prot)
> > > > > +{
> > > > > + u8 mtrr;
> > > > > +
> > > > > + /*
> > > > > + * Do not use a huge page when the range is covered by non-WB type
> > > > > + * of MTRRs.
> > > > > + */
> > > > > + mtrr = mtrr_type_lookup(addr, addr + PUD_SIZE);
> > > > > + if ((mtrr != MTRR_TYPE_WRBACK) && (mtrr != 0xFF))
> > > > > + return 0;
> > > >
> > > > It would be good to notify the operator in some way when this happens.
> > > > Otherwise the kernel will run more slowly and there's no way of knowing
> > > > why. I guess slap a pr_info() in there. Or maybe pr_warn()?
> > >
> > > We only use 4KB mappings today, so this case will not make it run
> > > slowly, i.e. it will be the same as today.
> >
> > Yes, but it would be slower than it would be if the operator fixed the
> > mtrr settings! How do we let the operator know this?
> >
> > > Also, adding a message here
> > > can generate a lot of messages when MTRRs cover a large area.
> >
> > Really? This is only going to happen when a device driver
> > requests a huge io mapping, isn't it? That's rare. We could emit
> > a warning, return an error code and fall all the way back to the
> > top-level ioremap code which can then retry with 4k mappings. Or
> > something similar - somehow record the fact that this warning has
> > been emitted or use printk ratelimiting (bad option).
>
> Yes, an IO device with a huge MMIO space that is covered by MTRRs is
> a rare case. BIOS does not need to specify how MMIO of each card
> needs to be accessed with MTRRs (or BIOS should not do it since an
> MMIO address is configurable on each card).
>
> However, PCIe has the MMCONFIG space, PCIe config space, which is
> also memory mapped and must be accessed with UC. The PCI subsystem
> calls ioremap_nocache() to map the entire MMCONFIG space, which
> covers the PCIe config space of all possible cards. Here are boot
> messages on my test system.
>
> :
> PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0xc0000000-0xcfffffff] (base 0xc0000000)
> PCI: MMCONFIG at [mem 0xc0000000-0xcfffffff] reserved in E820
> :
>
> And MTRRs cover this MMCONFIG space with UC to assure that the range is
> always accessed with UC.

So the PCI code ioremap()s this 256 MB mmconfig space in its entirety
currently?

>
> # cat /proc/mtrr
> reg00: base=0x0c0000000 ( 3072MB), size= 1024MB, count=1: uncachable
>
> So, if we add a message into the code, it will be displayed many
> times in this ioremap_nocache() call from PCI.

So, in this specific case, when a single MTRR covers it with a single
cache policy, I think we can safely map it UC using hugepmds?

That will 'shut up' the warning the right way: by making the code
work?

Thanks,
Ingo
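
The suggestion in the reply above can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the patch's code and not the kernel's MTRR API: `struct mtrr_range` and `range_is_uniform()` are hypothetical names, the ranges are assumed not to overlap each other, and fixed-range MTRRs and the default type are ignored. The idea it shows is that a huge mapping would be refused only when the candidate range is partially covered by an MTRR, and allowed when a single MTRR covers the whole range with one cache policy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural MTRR memory type encodings used in this sketch. */
#define MTRR_TYPE_UNCACHABLE 0
#define MTRR_TYPE_WRBACK     6
#define MTRR_TYPE_INVALID    0xFF   /* "no MTRR applies", as in the quoted patch */

/* Hypothetical model of one variable-range MTRR (not a kernel structure). */
struct mtrr_range {
	uint64_t base;   /* physical start address */
	uint64_t size;   /* length in bytes */
	uint8_t  type;   /* cache policy for the range */
};

/*
 * Report whether [start, end) has a single, uniform MTRR policy.
 *
 * Returns 1 and stores the policy in *type when the range lies entirely
 * inside one MTRR, or when no MTRR touches it (type is then INVALID).
 * Returns 0 when an MTRR covers only part of the range, which is the
 * case where a huge mapping would mix cache policies.
 *
 * Assumption: the modelled MTRRs do not overlap each other.
 */
static int range_is_uniform(const struct mtrr_range *regs, size_t nregs,
			    uint64_t start, uint64_t end, uint8_t *type)
{
	size_t i;

	assert(start < end);
	*type = MTRR_TYPE_INVALID;

	for (i = 0; i < nregs; i++) {
		uint64_t rstart = regs[i].base;
		uint64_t rend = rstart + regs[i].size;

		if (end <= rstart || start >= rend)
			continue;               /* no overlap with this MTRR */

		if (start >= rstart && end <= rend) {
			*type = regs[i].type;   /* fully covered: one policy */
			return 1;
		}
		return 0;                       /* partially covered */
	}
	return 1;                               /* untouched by any MTRR */
}
```

With the single register from the quoted /proc/mtrr output (base 0xc0000000, size 1024MB, uncachable), any 2MB-aligned chunk of the MMCONFIG window at 0xc0000000-0xcfffffff falls entirely inside that register, so this model would report it as uniform with the uncachable type.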