From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1755354AbcGZCNK (ORCPT );
        Mon, 25 Jul 2016 22:13:10 -0400
Received: from mx1.redhat.com ([209.132.183.28]:41939 "EHLO mx1.redhat.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1752999AbcGZCNH (ORCPT );
        Mon, 25 Jul 2016 22:13:07 -0400
Date: Sat, 23 Jul 2016 01:31:03 -0300
From: Marcelo Tosatti
To: "Luck, Tony"
Cc: Fenghua Yu, Thomas Gleixner, Ingo Molnar, "H. Peter Anvin",
        Tejun Heo, Borislav Petkov, Stephane Eranian, Peter Zijlstra,
        David Carrillo-Cisneros, Ravi V Shankar, Vikas Shivappa,
        Sai Prakhya, linux-kernel, x86
Subject: Re: [PATCH 04/32] x86/intel_rdt: Add L3 cache capacity bitmask management
Message-ID: <20160723043103.GA22015@amt.cnet>
References: <1468371785-53231-1-git-send-email-fenghua.yu@intel.com>
        <1468371785-53231-5-git-send-email-fenghua.yu@intel.com>
        <20160722071203.GA18422@amt.cnet>
        <20160722214322.GA938@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20160722214322.GA938@intel.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16
        (mx1.redhat.com [10.5.110.39]); Tue, 26 Jul 2016 02:13:05 +0000 (UTC)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 22, 2016 at 02:43:23PM -0700, Luck, Tony wrote:
> On Fri, Jul 22, 2016 at 04:12:04AM -0300, Marcelo Tosatti wrote:
> > How does this patchset handle the following condition:
> >
> > 6) Create reservations in such a way that the sum is larger than
> > total amount of cache, and CPU pinning (example from Karen Noel):
> >
> > VM-1 on socket-1 with 80% of reservation.
> > VM-2 on socket-2 with 80% of reservation.
> > VM-1 pinned to socket-1.
> > VM-2 pinned to socket-2.
>
> That's legal, but perhaps we need a description of
> overlapping cache reservations.
>
> Hardware tells you how finely you can divide the cache (and this
> information is shown in /sys/fs/resctrl/info/l3/max_cbm_len to save
> you from digging in CPUID leaves). E.g. on Broadwell the value is
> 20, so you can control cache allocations in 5% slices.
>
> A bitmask defines which slices you can use (and h/w has the restriction
> that you must have contiguous '1' bits in any mask). So you can pick
> your 80% using 0x0ffff, 0x1fffe, 0x3fffc, 0x7fff8 or 0xffff0.
>
> There is no requirement that masks be exclusive of each other. So
> you might pick the two extremes: 0x0ffff and 0xffff0 for your two
> VM's in this example. Each would be allowed to allocate up to 80%,
> but with a big overlap in the middle. Each has 20% exclusive, but
> there is a 60% range in the middle that they would compete for.

These are different sockets, so there is no competing for or sharing
of L3 cache here. The question is whether the interface allows the
user to specify that 80/80 reservation without complaining: because
the VMs are pinned, they will never actually share the same L3 cache
(I haven't finished reading the patchset to be certain).

> Is this specific case useful? Possibly not. I think the more common
> overlap cases might be between processes that you know have shared
> code/data. Also the case where some rdtgroup has access to allocate
> in the entire cache (mask 0xfffff on Broadwell) and some other
> rdtgroups have limited cache allocation with less bits in the mask.
>
> -Tony

All you have to do is build the bitmask for a given processor from
the union of the bitmasks of the tasks which have been scheduled on
that processor.
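
To make the arithmetic above concrete, here is a rough sketch in Python
(used only for brevity). It is not code from the patchset and does not
touch the resctrl interface: the function names are made up for
illustration, and the 20-bit mask length is simply the Broadwell value
quoted above. It enumerates the contiguous 80% masks, measures the
overlap between the two extremes, and builds a per-processor mask as
the union of the masks of the tasks scheduled there.

```python
CBM_LEN = 20  # assumed: the max_cbm_len value quoted above for Broadwell


def contiguous_masks(nbits, cbm_len=CBM_LEN):
    """All masks with `nbits` contiguous '1' bits that fit in cbm_len bits."""
    base = (1 << nbits) - 1
    return [base << shift for shift in range(cbm_len - nbits + 1)]


def is_contiguous(mask):
    """True if the set bits of a non-zero mask form a single run."""
    if mask == 0:
        return False
    low = mask & -mask   # lowest set bit
    run = mask + low     # the carry ripples through one contiguous run
    return run & mask == 0


def share(mask, cbm_len=CBM_LEN):
    """Percentage of the cache covered by `mask`."""
    return 100 * bin(mask).count("1") // cbm_len


def cpu_mask(task_masks):
    """Union of the masks of the tasks scheduled on one processor."""
    union = 0
    for m in task_masks:
        union |= m
    return union


if __name__ == "__main__":
    # The five ways to place a contiguous 80% (16 of 20 bits) reservation.
    print([hex(m) for m in contiguous_masks(16)])

    vm1, vm2 = 0x0ffff, 0xffff0        # the two extremes
    print(share(vm1 & vm2))            # overlap, if both ran on one socket
    print(share(vm1 & ~vm2))           # part exclusive to VM-1

    # Hypothetical pinned case: only VM-1 tasks ever run on a socket-1 CPU,
    # so that CPU's mask is VM-1's mask alone.
    print(hex(cpu_mask([vm1])))
```

With the VMs pinned to different sockets, each processor only ever sees
one of the two masks, so the 60% overlap computed here never turns into
actual contention.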