From: Fenghua Yu <fenghua.yu@intel.com>
To: "Luck, Tony" <tony.luck@intel.com>
Cc: Thomas Gleixner <tglx@linutronix.de>,
Fenghua Yu <fenghua.yu@intel.com>,
"H. Peter Anvin" <h.peter.anvin@intel.com>,
Ingo Molnar <mingo@elte.hu>,
Peter Zijlstra <peterz@infradead.org>,
Stephane Eranian <eranian@google.com>,
Borislav Petkov <bp@suse.de>, Dave Hansen <dave.hansen@intel.com>,
Nilay Vaish <nilayvaish@gmail.com>, Shaohua Li <shli@fb.com>,
David Carrillo-Cisneros <davidcc@google.com>,
Ravi V Shankar <ravi.v.shankar@intel.com>,
Sai Prakhya <sai.praneeth.prakhya@intel.com>,
Vikas Shivappa <vikas.shivappa@linux.intel.com>,
linux-kernel <linux-kernel@vger.kernel.org>, x86 <x86@kernel.org>
Subject: Re: [PATCH v4 13/18] x86/intel_rdt: Add mkdir to resctrl file system
Date: Mon, 17 Oct 2016 19:56:42 -0700
Message-ID: <20161018025642.GE8999@linux.intel.com>
In-Reply-To: <20161017233729.GA6386@intel.com>
On Mon, Oct 17, 2016 at 04:37:30PM -0700, Luck, Tony wrote:
> On Tue, Oct 18, 2016 at 01:20:36AM +0200, Thomas Gleixner wrote:
> > On Mon, 17 Oct 2016, Fenghua Yu wrote:
> > > part0: L3:0=1;1=1 closid0/cbm=1 on cache0 and closid0/cbm=1 on cache1
> > > (closid 15 on cache0 combined with 16 different closids on cache1)
> > > ...
> > > part254: L3:0=ffff;1=7fff closid15/cbm=ffff on cache0 and closid14/cbm=7fff on cache1
> > > part255: L3:0=ffff;1=ffff closid15/cbm=ffff on cache0 and closid15/cbm=ffff on cache1
> > >
> > > To utilize as many combinations as possible, we may implement a
> > > more complex allocation than the current one.
> > >
> > > Does this make sense?
> >
> > Thanks for the explanation. I knew that I'm missing something.
> >
> > But how is that supposed to work? The schemata files have no idea of
> > closids simply because the closids are assigned automatically. And that
> > makes the whole thing exponentially complex. You must allow to create ALL
> > rdt groups (initially as a copy of the root group) and then when the
> > schemata file is written you have to look whether the particular CBM value
> > for a particular domain is already used and assign the same cosid for this
> > domain. That of course makes the whole L2 business completely diffuse
> > because you might end up with:
> >
> > Dom0 = COSID1 and DOM1 = COSID9
> >
> > So you can set the L2 for Dom0, but not for DOM1 and then if you set L2 for
> > Dom0 you must find a new COSID for Dom0. If there is none, then you must
> > reject the write and leave the admin puzzled.
> >
> > There is a reason why I suggested:
> >
> > https://lkml.kernel.org/r/alpine.DEB.2.11.1511181534450.3761@nanos
> >
> > It's certainly not perfect (missing L2 etc.), but clearly avoids exactly
> > the above issues. And it would allow you to utilize the 256 groups in an
> > understandable way.
>
> If you head down that path someone with a 4-socket system will try to
> make 16x16x16x16 = 65536 groups and "understandable" takes a bit of
> a beating. The eight socket system with 16^8 = 4G groups defies any
> rational hope. Best not to think about 16 sockets.
16^(number of L3 caches) is the theoretical maximum number of partitions
that a sysadmin can create. Beyond that number, allocation fails with a
"no space" error. This is similar to other cases, e.g. repeated mkdir in
one directory eventually fails because the file system runs out of space.
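
A rough sketch of that arithmetic, assuming 16 closids per L3 cache as in
the example above. The helper name max_partitions() is made up for
illustration and is not part of the patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): number of distinct per-cache
 * closid combinations, i.e. closids_per_cache ^ nr_caches.
 * Saturates at UINT64_MAX instead of overflowing.
 */
static uint64_t max_partitions(uint64_t closids_per_cache,
			       unsigned int nr_caches)
{
	uint64_t total = 1;
	unsigned int i;

	for (i = 0; i < nr_caches; i++) {
		/* Would the next multiplication overflow 64 bits? */
		if (closids_per_cache != 0 &&
		    total > UINT64_MAX / closids_per_cache)
			return UINT64_MAX;
		total *= closids_per_cache;
	}
	return total;
}
```

With 16 closids per cache this gives 256 for 2 caches, 65536 for 4 and
4G for 8. For 16 caches the result no longer fits in 64 bits, so the
sketch saturates.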
>
> The L2 + L3 configuration space gets unbelievably messy too.
>
> There's a reason why I ripped out the allocation code and went with
> a simple global allocator in this version. If we decide we need something
> fancier we can adapt later. Some solutions might be transparent to
> applications, others might add a "closid" file into each directory to
> give 2nd generation applications hooks to view (and maybe control)
> which closid is used by each group.
Fully agree with Tony. We understand the complexity of the situation, so
for the first version we just want a simple, working solution.
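
For illustration only, a simple global allocator could look roughly like
the userspace sketch below. It is not the code from the patch: NR_CLOSIDS,
the bitmap name and the choice to reserve closid 0 for the root group are
assumptions made for the example.

```c
#include <errno.h>
#include <stdint.h>

#define NR_CLOSIDS 16	/* assumed, matching the example in this thread */

/*
 * One global bitmap shared by all caches: bit n set means closid n is
 * free. closid 0 is assumed to be reserved for the root group.
 */
static uint32_t closid_free_map = ((1u << NR_CLOSIDS) - 1) & ~1u;

/* Return the lowest free closid, or -ENOSPC when none is left. */
static int closid_alloc(void)
{
	int closid;

	for (closid = 0; closid < NR_CLOSIDS; closid++) {
		if (closid_free_map & (1u << closid)) {
			closid_free_map &= ~(1u << closid);
			return closid;
		}
	}
	return -ENOSPC;
}

/* Give a closid back to the global pool. */
static void closid_free(int closid)
{
	closid_free_map |= 1u << closid;
}
```

In this sketch a mkdir would call closid_alloc() and fail with ENOSPC
once all 15 non-root closids are in use. An rmdir would call closid_free().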
Thanks.
-Fenghua