public inbox for linux-kernel@vger.kernel.org
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Luiz Capitulino <lcapitulino@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vikas Shivappa <vikas.shivappa@intel.com>,
	Tejun Heo <tj@kernel.org>, Yu Fenghua <fenghua.yu@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] ioctl based CAT interface
Date: Mon, 16 Nov 2015 23:01:43 -0200
Message-ID: <20151117010143.GA12871@amt.cnet>
In-Reply-To: <20151116163903.GA28870@amt.cnet>

On Mon, Nov 16, 2015 at 02:39:03PM -0200, Marcelo Tosatti wrote:
> On Mon, Nov 16, 2015 at 10:07:56AM +0100, Peter Zijlstra wrote:
> > On Fri, Nov 13, 2015 at 03:33:04PM -0200, Marcelo Tosatti wrote:
> > > On Fri, Nov 13, 2015 at 05:51:00PM +0100, Peter Zijlstra wrote:
> > > > On Fri, Nov 13, 2015 at 02:39:33PM -0200, Marcelo Tosatti wrote:
> > > > > + * 	* one tcrid entry can be in different locations 
> > > > > + * 	  in different sockets.
> > > > 
> > > > NAK on that without cpuset integration.
> > > > 
> > > > I do not want freely migratable tasks having radically different
> > > > performance profiles depending on which CPU they land.
> > > 
> > > Ok, so, configuration:
> > > 
> > > 
> > > Socket-1				Socket-2
> > > 
> > > pinned thread-A with			100% L3 free
> > > 80% of L3 
> > > reserved
> > > 
> > > 
> > > So it is a problem if a thread running on socket-2 is scheduled to 
> > > socket-1 because performance is radically different, fine.
> > > 
> > > Then one way to avoid that is to not allow freely migratable tasks
> > > to move to Socket-1. Fine.
> > > 
> > > Then you want to use cpusets for that.
> > > 
> > > Can you fill in the blanks what is missing here?
> > 
> > I'm still not seeing what the problem with CAT-cgroup is.
> > 
> > /cgroups/cpuset/
> >   socket-1/cpus = $socket-1
> >            tasks = $thread-A
> > 
> >   socket-2/cpus = $socket-2
> >            tasks = $thread-B
> > 
> > /cgroups/cat/
> >   group-A/bitmap = 0x3F / 0xFF
> >   group-A/tasks = $thread-A
> > 
> >   group-B/bitmap = 0xFF / 0xFF
> >   group-B/tasks = $thread-B
> > 
> > 
> > That gets you thread-A on socket-1 with 6/8 of the L3 and thread-B on
> > socket-2 with 8/8 of the L3.
> 
> Going that route, might as well expose the region shared with HW
> to userspace and let userspace handle the problem of contiguous free regions,
> which means the cgroups bitmask maps one-to-one to HW bitmap.
> 
> All is necessary then is to modify the Intel patches to
> 
> 1) Support bitmaps per socket.

Consider the following scenario, one server with two sockets:

	socket-1 			socket-2
[ [***]	 	    ]		[	          [***]	]
	L3 cache bitmap		    L3 cache bitmap

[*] refers to the region shared with HW, as reported by CPUID (read the
Intel documentation). 

socket-1.shared_region_with_hw = [bit 2, bit 5]
socket-2.shared_region_with_hw = [bit 16, bit 18]

Given that your application is critical, you do not want it to share any
reservation with HW. I was informed that there is no guarantee these
regions end up in the same location for different sockets. Let's say you
need 15 bits of reservation, and the total is 20 bits. One possibility would be:

socket-1.reservation = [bit 5, bit 15]
socket-2.reservation = [bit 1, bit 15]
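
To make the per-socket placement problem concrete, here is a minimal
userspace sketch of how a contiguous reservation could be picked so that
it avoids the HW-shared region of one socket. The helper names
(cbm_run, cbm_find_free_run), the 0-based bit numbering and the 32-bit
mask width are assumptions made for illustration; none of this is taken
from the Intel patchset.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with `len` contiguous bits set, starting at bit `start` (0-based). */
static uint32_t cbm_run(unsigned start, unsigned len)
{
        assert(len >= 1 && len <= 32 && start + len <= 32);
        uint32_t ones = (len == 32) ? UINT32_MAX : ((UINT32_C(1) << len) - 1);
        return ones << start;
}

/*
 * Hypothetical helper: return the lowest contiguous run of `want` bits
 * inside a `total`-bit CBM that does not overlap `hw_shared`,
 * or 0 if no such run exists.
 */
static uint32_t cbm_find_free_run(uint32_t hw_shared, unsigned total,
                                  unsigned want)
{
        assert(total <= 32);
        if (want == 0 || want > total)
                return 0;
        for (unsigned start = 0; start + want <= total; start++) {
                uint32_t candidate = cbm_run(start, want);

                if ((candidate & hw_shared) == 0)
                        return candidate;
        }
        return 0;
}
```

Under these assumptions, a socket whose HW-shared region is bits 16-18
of a 20-bit CBM (mask 0x70000) gets 0x7FFF for a 15-bit request. Running
the same search per socket naturally yields different masks on different
sockets, which is the case the interface has to be able to express.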

For the current Intel CAT patchset, this restriction exists:

static int cbm_validate_rdt_cgroup(struct intel_rdt *ir, unsigned long cbmvalue)
{
        struct cgroup_subsys_state *css;
        struct intel_rdt *par, *c;
        unsigned long cbm_tmp = 0;
        int err = 0;

        if (!cbm_validate(cbmvalue)) {
                err = -EINVAL;
                goto out_err;
        }

        par = parent_rdt(ir);
        clos_cbm_table_read(par->closid, &cbm_tmp);
        if (!bitmap_subset(&cbmvalue, &cbm_tmp, MAX_CBM_LENGTH)) {
                err = -EINVAL;
                goto out_err;
        }


Can you (or the author of the patch) explain why this
restriction is here?
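
For reference, the bitmap_subset() test in the excerpt above requires
that every bit set in the child's mask is also set in the parent's mask.
A rough userspace equivalent on a single 32-bit word is sketched below.
The function name cbm_child_allowed and the cbm_len parameter are
hypothetical, and this is not a reimplementation of the kernel's
cbm_validate().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: a child mask is accepted only if it is non-empty,
 * fits inside a `cbm_len`-bit CBM, and is a subset of the parent mask.
 */
static bool cbm_child_allowed(uint32_t child, uint32_t parent,
                              unsigned cbm_len)
{
        assert(cbm_len >= 1 && cbm_len <= 32);
        uint32_t valid = (cbm_len == 32) ? UINT32_MAX
                                         : ((UINT32_C(1) << cbm_len) - 1);

        if (child == 0 || (child & ~valid))
                return false;
        /* Same condition as bitmap_subset(child, parent): no stray bits. */
        return (child & ~parent) == 0;
}
```

With a parent mask of 0x3F, a child asking for 0x0F passes and a child
asking for 0xC0 is rejected. With a single mask per group, a child can
therefore never be placed outside the parent's bits on any socket.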

If the restriction has to be maintained, then one
hierarchy per socket will be necessary to support different
bitmaps per socket.

If the restriction can be removed, then non-hierarchical support
could look like:

/cgroups/cat/group-A/tasks = $thread-A
/cgroups/cat/group-A/socket-1/bitmap = 0x3F / 0xFF
/cgroups/cat/group-A/socket-2/bitmap = 0x... / 0xFF

Or one l3_cbm file containing one mask per socket,
separated by commas, similar to 

/sys/devices/system/node/node0/cpumap
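
A sketch of how such a comma-separated file could be parsed follows.
The function name parse_socket_masks, the hex encoding without a
mandatory prefix, and the ordering (first mask belongs to the first
socket) are all assumptions for illustration; the actual file format
has not been decided in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser: turn a string such as "3f,ff" into one mask per
 * socket. Returns the number of masks parsed, or -1 on malformed input
 * or when more than `max` masks are present.
 */
static int parse_socket_masks(const char *s, uint32_t *masks, int max)
{
        assert(s != NULL && masks != NULL && max > 0);
        int n = 0;

        for (;;) {
                char *end;
                unsigned long v;

                if (n == max)
                        return -1;      /* more masks than sockets */
                errno = 0;
                v = strtoul(s, &end, 16);
                if (end == s || errno == ERANGE || v > UINT32_MAX)
                        return -1;      /* empty field or out of range */
                masks[n++] = (uint32_t)v;
                if (*end == '\0' || *end == '\n')
                        return n;       /* end of input */
                if (*end != ',')
                        return -1;      /* unexpected separator */
                s = end + 1;
        }
}
```

Writing "3f,ff" would then mean mask 0x3f on the first socket and 0xff
on the second, so a single file per group is enough and no per-socket
directory is needed.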

> 2) Remove hierarchical support.

There is nothing hierarchical in CAT, it's flat.

Each set of tasks is associated with a number of bits
in each socket's L3 CBM mask.

> 3) Lazy enforcement (which can be done later as an improvement).
> 
