From: Marcelo Tosatti <mtosatti@redhat.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
x86@kernel.org, Luiz Capitulino <lcapitulino@redhat.com>,
Vikas Shivappa <vikas.shivappa@intel.com>,
Tejun Heo <tj@kernel.org>, Yu Fenghua <fenghua.yu@intel.com>
Subject: Re: [RFD] CAT user space interface revisited
Date: Wed, 18 Nov 2015 23:05:36 -0200
Message-ID: <20151119010535.GA30233@amt.cnet>
In-Reply-To: <20151119000153.GA27997@amt.cnet>
On Wed, Nov 18, 2015 at 10:01:53PM -0200, Marcelo Tosatti wrote:
> On Wed, Nov 18, 2015 at 07:25:03PM +0100, Thomas Gleixner wrote:
> > Folks!
> >
> > After rereading the mail flood on CAT and staring into the SDM for a
> > while, I think we all should sit back and look at it from scratch
> > again w/o our preconceptions - I certainly had to put my own away.
> >
> > Let's look at the properties of CAT again:
> >
> > - It's a per socket facility
> >
> > - CAT slots can be associated to external hardware. This
> > association is per socket as well, so different sockets can have
> > different behaviour. I missed that detail when staring at it the
> > first time, thanks for the pointer!
> >
> > - The association itself is per CPU. The COS selection happens on a
> > CPU while the set of masks which are selected via COS are shared
> > by all CPUs on a socket.
> >
> > There are restrictions which CAT imposes in terms of configurability:
> >
> > - The bits which select a cache partition need to be consecutive
> >
> > - The number of possible cache association masks is limited
> >
> > Let's look at the configurations (CDP omitted and size restricted)
> >
> > Default: 1 1 1 1 1 1 1 1
> > 1 1 1 1 1 1 1 1
> > 1 1 1 1 1 1 1 1
> > 1 1 1 1 1 1 1 1
> >
> > Shared: 1 1 1 1 1 1 1 1
> > 0 0 1 1 1 1 1 1
> > 0 0 0 0 1 1 1 1
> > 0 0 0 0 0 0 1 1
> >
> > Isolated: 1 1 1 1 0 0 0 0
> > 0 0 0 0 1 1 0 0
> > 0 0 0 0 0 0 1 0
> > 0 0 0 0 0 0 0 1
> >
> > Or any combination thereof. Surely some combinations will not make any
> > sense, but we really should not make any restrictions on the stupidity
> > of a sysadmin. The worst outcome might be L3 disabled for everything,
> > so what?
> >
> > Now that gets even more convoluted if CDP comes into play and we
> > really need to look at CDP right now. We might end up with something
> > which looks like this:
> >
> > 1 1 1 1 0 0 0 0 Code
> > 1 1 1 1 0 0 0 0 Data
> > 0 0 0 0 0 0 1 0 Code
> > 0 0 0 0 1 1 0 0 Data
> > 0 0 0 0 0 0 0 1 Code
> > 0 0 0 0 1 1 0 0 Data
> > or
> > 0 0 0 0 0 0 0 1 Code
> > 0 0 0 0 1 1 0 0 Data
> > 0 0 0 0 0 0 0 1 Code
> > 0 0 0 0 0 1 1 0 Data
> >
> > Let's look at partitioning itself. We have two options:
> >
> > 1) Per task partitioning
> >
> > 2) Per CPU partitioning
> >
> > So far we only talked about #1, but I think that #2 has a value as
> > well. Let me give you a simple example.
> >
> > Assume that you have isolated a CPU and run your important task on
> > it. You give that task a slice of cache. Now that task needs kernel
> > services which run in kernel threads on that CPU. We really don't want
> > to (and cannot) hunt down random kernel threads (think cpu bound
> > worker threads, softirq threads ....) and give them another slice of
> > cache. What we really want is:
> >
> > 1 1 1 1 0 0 0 0 <- Default cache
> > 0 0 0 0 1 1 1 0 <- Cache for important task
> > 0 0 0 0 0 0 0 1 <- Cache for CPU of important task
> >
> > It would even be sufficient for particular use cases to just associate
> > a piece of cache with a given CPU and not bother with tasks at all.
> >
> > We really need to make this as configurable as possible from userspace
> > without imposing random restrictions on it. I played around with it on
> > my new Intel toy and the restriction to 16 COS ids (that's 8 with CDP
> > enabled) makes it really useless if we force the ids to have the same
> > meaning on all sockets and restrict it to per task partitioning.
> >
> > Even if next generation systems will have more COS ids available,
> > there are not going to be enough to have a system wide consistent
> > view unless we have COS ids > nr_cpus.
> >
> > Aside from that, I don't think that a system wide consistent view is
> > useful at all.
> >
> > - If a task migrates between sockets, it's going to suffer anyway.
> > Real sensitive applications will simply pin tasks on a socket to
> > avoid that in the first place. If we make the whole thing
> > configurable enough then the sysadmin can set it up to support
> > even the nonsensical case of identical cache partitions on all
> > sockets and let tasks use the corresponding partitions when
> > migrating.
> >
> > - The number of cache slices is going to be limited no matter what,
> > so one still has to come up with a sensible partitioning scheme.
> >
> > - Even if we have enough cos ids the system wide view will not make
> > the configuration problem any simpler as it remains per socket.
> >
> > It's hard. Policies are hard by definition, but this one is harder
> > than most other policies due to the inherent limitations.
> >
> > So now to the interface part. Unfortunately we need to expose this
> > very close to the hardware implementation as there are really no
> > abstractions which allow us to express the various bitmap
> > combinations. Any abstraction I tried to come up with renders that
> > thing completely useless.
>
> No you don't.
Actually, there is one point that is useful: you might want the important
application to share its L3 portion with hardware (that the HW DMAs into),
and have only the application and the HW use that region.

So it's a good point that controlling the exact position of the
reservation is important.
Thread overview: 32+ messages
2015-11-18 18:25 [RFD] CAT user space interface revisited Thomas Gleixner
2015-11-18 19:38 ` Luiz Capitulino
2015-11-18 19:55 ` Auld, Will
2015-11-18 22:34 ` Marcelo Tosatti
2015-11-19 0:34 ` Marcelo Tosatti
2015-11-19 8:35 ` Thomas Gleixner
2015-11-19 13:44 ` Luiz Capitulino
2015-11-20 14:15 ` Marcelo Tosatti
2015-11-19 8:11 ` Thomas Gleixner
2015-11-19 0:01 ` Marcelo Tosatti
2015-11-19 1:05 ` Marcelo Tosatti [this message]
2015-11-19 9:09 ` Thomas Gleixner
2015-11-19 20:59 ` Marcelo Tosatti
2015-11-20 7:53 ` Thomas Gleixner
2015-11-20 17:51 ` Marcelo Tosatti
2015-11-19 20:30 ` Marcelo Tosatti
2015-11-19 9:07 ` Thomas Gleixner
2015-11-24 8:27 ` Chao Peng
[not found] ` <20151124212543.GA11303@amt.cnet>
2015-11-25 1:29 ` Marcelo Tosatti
2015-11-24 7:31 ` Chao Peng
2015-11-24 23:06 ` Marcelo Tosatti
2015-12-22 18:12 ` Yu, Fenghua
2015-12-23 10:28 ` Marcelo Tosatti
2015-12-29 12:44 ` Thomas Gleixner
2015-12-31 19:22 ` Marcelo Tosatti
2015-12-31 22:30 ` Thomas Gleixner
2016-01-04 17:20 ` Marcelo Tosatti
2016-01-04 17:44 ` Marcelo Tosatti
2016-01-05 23:09 ` Thomas Gleixner
2016-01-06 12:46 ` Marcelo Tosatti
2016-01-06 13:10 ` Tejun Heo
2016-01-08 20:21 ` Thomas Gleixner