From: Marcelo Tosatti <mtosatti@redhat.com>
To: "Yu, Fenghua" <fenghua.yu@intel.com>,
Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
"x86@kernel.org" <x86@kernel.org>,
Luiz Capitulino <lcapitulino@redhat.com>,
"Shivappa, Vikas" <vikas.shivappa@intel.com>,
Tejun Heo <tj@kernel.org>,
"Shankar, Ravi V" <ravi.v.shankar@intel.com>,
"Luck, Tony" <tony.luck@intel.com>
Subject: Re: [RFD] CAT user space interface revisited
Date: Wed, 23 Dec 2015 08:28:20 -0200
Message-ID: <20151223102820.GA2559@amt.cnet>
In-Reply-To: <3E5A0FA7E9CA944F9D5414FEC6C712205DF4E157@ORSMSX106.amr.corp.intel.com>
On Tue, Dec 22, 2015 at 06:12:05PM +0000, Yu, Fenghua wrote:
> > From: Thomas Gleixner [mailto:tglx@linutronix.de]
> > Sent: Wednesday, November 18, 2015 10:25 AM
> > Folks!
> >
> > After rereading the mail flood on CAT and staring into the SDM for a while, I
> > think we all should sit back and look at it from scratch again w/o our
> > preconceptions - I certainly had to put my own away.
> >
> > Let's look at the properties of CAT again:
> >
> > - It's a per socket facility
> >
> > - CAT slots can be associated to external hardware. This
> > association is per socket as well, so different sockets can have
> > different behaviour. I missed that detail when staring the first
> > time, thanks for the pointer!
> >
> > - The association itself is per cpu. The COS selection happens on a
> > CPU while the set of masks which are selected via COS are shared
> > by all CPUs on a socket.
> >
> > There are restrictions which CAT imposes in terms of configurability:
> >
> > - The bits which select a cache partition need to be consecutive
> >
> > - The number of possible cache association masks is limited
> >
> > Let's look at the configurations (CDP omitted and size restricted)
> >
> > Default: 1 1 1 1 1 1 1 1
> > 1 1 1 1 1 1 1 1
> > 1 1 1 1 1 1 1 1
> > 1 1 1 1 1 1 1 1
> >
> > Shared: 1 1 1 1 1 1 1 1
> > 0 0 1 1 1 1 1 1
> > 0 0 0 0 1 1 1 1
> > 0 0 0 0 0 0 1 1
> >
> > Isolated: 1 1 1 1 0 0 0 0
> > 0 0 0 0 1 1 0 0
> > 0 0 0 0 0 0 1 0
> > 0 0 0 0 0 0 0 1
> >
> > Or any combination thereof. Surely some combinations will not make any
> > sense, but we really should not make any restrictions on the stupidity of a
> > sysadmin. The worst outcome might be L3 disabled for everything, so what?
> >
> > Now that gets even more convoluted if CDP comes into play and we really
> > need to look at CDP right now. We might end up with something which looks
> > like this:
> >
> > 1 1 1 1 0 0 0 0 Code
> > 1 1 1 1 0 0 0 0 Data
> > 0 0 0 0 0 0 1 0 Code
> > 0 0 0 0 1 1 0 0 Data
> > 0 0 0 0 0 0 0 1 Code
> > 0 0 0 0 1 1 0 0 Data
> > or
> > 0 0 0 0 0 0 0 1 Code
> > 0 0 0 0 1 1 0 0 Data
> > 0 0 0 0 0 0 0 1 Code
> > 0 0 0 0 0 1 1 0 Data
> >
> > Let's look at partitioning itself. We have two options:
> >
> > 1) Per task partitioning
> >
> > 2) Per CPU partitioning
> >
> > So far we only talked about #1, but I think that #2 has a value as well. Let me
> > give you a simple example.
> >
> > Assume that you have isolated a CPU and run your important task on it. You
> > give that task a slice of cache. Now that task needs kernel services which run
> > in kernel threads on that CPU. We really don't want to (and cannot) hunt
> > down random kernel threads (think cpu bound worker threads, softirq
> > threads ....) and give them another slice of cache. What we really want is:
> >
> > 1 1 1 1 0 0 0 0 <- Default cache
> > 0 0 0 0 1 1 1 0 <- Cache for important task
> > 0 0 0 0 0 0 0 1 <- Cache for CPU of important task
> >
> > It would even be sufficient for particular use cases to just associate a piece of
> > cache to a given CPU and do not bother with tasks at all.
> >
> > We really need to make this as configurable as possible from userspace
> > without imposing random restrictions to it. I played around with it on my new
> > intel toy and the restriction to 16 COS ids (that's 8 with CDP
> > enabled) makes it really useless if we force the ids to have the same meaning
> > on all sockets and restrict it to per task partitioning.
> >
> > Even if next generation systems will have more COS ids available, there are
> > not going to be enough to have a system wide consistent view unless we
> > have COS ids > nr_cpus.
> >
> > Aside of that I don't think that a system wide consistent view is useful at all.
> >
> > - If a task migrates between sockets, it's going to suffer anyway.
> > Real sensitive applications will simply pin tasks on a socket to
> > avoid that in the first place. If we make the whole thing
> > configurable enough then the sysadmin can set it up to support
> > even the nonsensical case of identical cache partitions on all
> > sockets and let tasks use the corresponding partitions when
> > migrating.
> >
> > - The number of cache slices is going to be limited no matter what,
> > so one still has to come up with a sensible partitioning scheme.
> >
> > - Even if we have enough cos ids the system wide view will not make
> > the configuration problem any simpler as it remains per socket.
> >
> > It's hard. Policies are hard by definition, but this one is harder than most
> > other policies due to the inherent limitations.
> >
> > So now to the interface part. Unfortunately we need to expose this very
> > close to the hardware implementation as there are really no abstractions
> > which allow us to express the various bitmap combinations. Any abstraction I
> > tried to come up with renders that thing completely useless.
> >
> > I was not able to identify any existing infrastructure where this really fits in. I
> > chose a directory/file based representation. We certainly could do the same
>
> Would this be under /sys/devices/system/?
> Then create a qos/cat directory. In the future, other directories may be
> created, e.g. qos/mbm?
>
> Thanks.
>
> -Fenghua
Fenghua,

I suppose Thomas is talking about the socketmask only, as discussed in
the call with Intel.

Thomas, is that correct? (If you want a change in directory structure,
please explain why, because we don't need that change in directory
structure.)
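
For reference, the mask restrictions and the "Shared" / "Isolated"
layouts quoted above can be sketched in a few lines. This is only an
illustration of the constraints, not the proposed interface: the helper
names (is_contiguous, parse_row, overlaps) are made up for this example,
and treating an all-zero mask as invalid is an assumption of the sketch.

```python
def is_contiguous(mask: int) -> bool:
    """True if the set bits of mask form one consecutive run."""
    if mask == 0:
        # Assumption of this sketch: an empty mask is not a usable partition.
        return False
    # Shift out the trailing zeros; a single run then looks like 0b0111..1,
    # and adding 1 to it gives a value with no bits in common.
    run = mask // (mask & -mask)
    return (run & (run + 1)) == 0


def parse_row(row: str) -> int:
    """Turn a row such as '0 0 1 1 1 1 1 1' into an integer bitmask."""
    return int(row.replace(" ", ""), 2)


def overlaps(masks):
    """Return the index pairs of masks that share at least one cache bit."""
    return [(i, j)
            for i in range(len(masks))
            for j in range(i + 1, len(masks))
            if masks[i] & masks[j]]


# The "Shared" and "Isolated" configurations from the quoted mail.
SHARED = [parse_row(r) for r in (
    "1 1 1 1 1 1 1 1",
    "0 0 1 1 1 1 1 1",
    "0 0 0 0 1 1 1 1",
    "0 0 0 0 0 0 1 1",
)]
ISOLATED = [parse_row(r) for r in (
    "1 1 1 1 0 0 0 0",
    "0 0 0 0 1 1 0 0",
    "0 0 0 0 0 0 1 0",
    "0 0 0 0 0 0 0 1",
)]
```

Every row in both layouts passes is_contiguous(). overlaps(ISOLATED) is
empty, while every pair of rows in SHARED overlaps, which is the
difference between the two examples.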