Date: Tue, 17 May 2016 13:46:55 +0200
From: luca abeni
To: Tommaso Cucinotta
Cc: linux-kernel, Peter Zijlstra, Juri Lelli, mingo@redhat.com
Subject: Re: SCHED_DEADLINE cpudeadline.{h,c} fixup
Message-ID: <20160517134655.266c7201@utopia>
In-Reply-To: <5739EE84.9070801@sssup.it>
References: <5739EE84.9070801@sssup.it>
Organization: University of Trento

Hi all,

On Mon, 16 May 2016 18:00:04 +0200
Tommaso Cucinotta wrote:

> Hi,
>
> looking at the SCHED_DEADLINE code, I spotted an opportunity to
> make cpudeadline.c faster: we can skip real swaps while
> re-heapifying items after addition/removal. As such operations
> are done under a domain spinlock, it sounded like an interesting
> thing to try.
[...]

I do not know the cpudeadline code too well, but every "dl = 0"
looks like a bug to me... So, I think this hunk actually fixes a
real bug:

[...]
-		cp->elements[cp->size - 1].dl = 0;
-		cp->elements[cp->size - 1].cpu = cpu;
-		cp->elements[cpu].idx = cp->size - 1;
-		cpudl_change_key(cp, cp->size - 1, dl);
-		cpumask_clear_cpu(cpu, cp->free_cpus);
+		cpumask_set_cpu(cpu, cp->free_cpus);
 	} else {
-		cpudl_change_key(cp, old_idx, dl);
+		if (old_idx == IDX_INVALID) {
+			int sz1 = cp->size++;
+			cp->elements[sz1].dl = dl;
[...]

Maybe the "cp->elements[cp->size - 1].dl = 0" ->
"cp->elements[cp->size - 1].dl = dl" change can be split out into a
separate patch, which is a bugfix (and IMHO uncontroversial)?
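To see why "dl = 0" cannot work as a "smaller than anything"
placeholder: the kernel compares absolute deadlines with
dl_time_before(a, b), which is just (s64)(a - b) < 0, i.e. a circular
comparison. Here is a minimal user-space sketch (my own toy
reconstruction, with dl_time_before() copied locally; not the kernel
code) showing that 0 does not sort "before" a wrapped-around deadline:

	#include <stdio.h>
	#include <stdint.h>

	/* Local copy of the kernel's wrap-around-aware comparison:
	 * "a is before b" iff their distance, as s64, is negative. */
	static int dl_time_before(uint64_t a, uint64_t b)
	{
		return (int64_t)(a - b) < 0;
	}

	int main(void)
	{
		/* A deadline with the top bit set: negative as s64. */
		uint64_t wrapped_dl = (uint64_t)-256;

		/* 0 is NOT "before" wrapped_dl circularly... */
		printf("dl_time_before(0, wrapped_dl) = %d\n",
		       dl_time_before(0, wrapped_dl));	/* prints 0 */
		/* ...on the contrary, wrapped_dl is "before" 0. */
		printf("dl_time_before(wrapped_dl, 0) = %d\n",
		       dl_time_before(wrapped_dl, 0));	/* prints 1 */

		/*
		 * So inserting an element with the temporary key
		 * "dl = 0" and only then fixing it via change_key means
		 * the sift-up-vs-down decision is taken against a bogus
		 * old key; with wrapped deadlines in the heap, the
		 * just-inserted element can be left in the wrong place.
		 */
		return 0;
	}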
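On the performance side: if I understand the "skip real swaps" idea
quoted above correctly, it is presumably the classic "hole" technique:
keep the element being moved in a local variable, slide children up
into the hole, and store the element once at the end, so each level
costs one assignment instead of a three-assignment swap. A rough
sketch of that technique on a plain array of deadlines (my own toy
version, not the actual patch, which also has to maintain the
cpu <-> idx bookkeeping of struct cpudl):

	/* Hole-based sift-down for a max-heap of u64 deadlines,
	 * ordered by dl_time_before() (see the sketch above). */
	static void heapify_down_hole(uint64_t *h, int size, int idx)
	{
		uint64_t moved = h[idx];	/* the "hole" holds this */

		while (1) {
			int l = 2 * idx + 1;
			int r = l + 1;
			int largest = idx;
			uint64_t largest_dl = moved;

			if (l < size && dl_time_before(largest_dl, h[l])) {
				largest = l;
				largest_dl = h[l];
			}
			if (r < size && dl_time_before(largest_dl, h[r])) {
				largest = r;
				largest_dl = h[r];
			}
			if (largest == idx)
				break;

			h[idx] = h[largest];	/* slide child up: no swap */
			idx = largest;
		}
		h[idx] = moved;			/* single final store */
	}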
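As an aside, for readers who want to reproduce the numbers: the
"rdtsc among calls" methodology described in the quoted text below
boils down to something like this user-space sketch (x86 only;
op_under_test() and N are placeholder names I made up, not the actual
benchmark, which Tommaso offers to share):

	#include <stdio.h>
	#include <stdint.h>

	/* Read the x86 time-stamp counter. */
	static inline uint64_t rdtsc(void)
	{
		uint32_t lo, hi;
		__asm__ __volatile__("rdtsc" : "=a"(lo), "=d"(hi));
		return ((uint64_t)hi << 32) | lo;
	}

	/* Placeholder for the timed operation, e.g. one cpudl_set(). */
	static void op_under_test(void) { }

	int main(void)
	{
		enum { N = 100000 };
		uint64_t prev, now, total = 0;
		int i;

		prev = rdtsc();
		for (i = 0; i < N; i++) {
			op_under_test();
			now = rdtsc();	/* timestamp between calls */
			total += now - prev;
			prev = now;
		}
		printf("avg ticks per call: %llu\n",
		       (unsigned long long)(total / N));
		return 0;
	}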
Thanks,
	Luca


> Indeed, I've got a speed-up of up to ~6% for the cpudl_set() calls
> on randomly generated workloads of 1K, 10K and 100K random
> insertions and deletions (75% cpudl_set() calls with is_valid=1 and
> 25% with is_valid=0), with randomly generated cpu IDs, for 2, 4,
> ..., 256 CPUs. Details in the attached plot.
>
> The attached patch does this, along with a minimal rework of the
> cpudeadline.c internals, and a final clean-up of the cpudeadline.h
> interface (second patch).
>
> The measurements have been made on an Intel Core2 Duo with the CPU
> frequency fixed at max, by letting cpudeadline.c be initialized
> with various numbers of CPUs, then making many calls sequentially,
> taking the rdtsc among calls, then dumping all numbers through
> printk(); I'm plotting the average number of clock ticks between
> consecutive calls. [ I can share the benchmarking code as well if
> needed ]
>
> Also, this fixes what seems to me to be a bug, which I noticed by
> comparing the whole heap contents as handled by the modified code
> vs the original one, insertion by insertion. The problem is in
> this code:
>
>	cp->elements[cp->size - 1].dl = 0;
>	cp->elements[cp->size - 1].cpu = cpu;
>	cp->elements[cpu].idx = cp->size - 1;
>	mycpudl_change_key(cp, cp->size - 1, dl);
>
> when it is fed an absolute deadline so large that it is negative as
> a s64. In such a case, as per dl_time_before(), the kernel should
> handle the absolute-deadline wrap-around correctly; however, the
> current code in cpudeadline.c goes mad and doesn't re-heapify the
> just-inserted element correctly... That said, if these are ns, such
> a bug would only be hit after ~292 years of uptime :-D...
>
> I'd be happy to hear comments from others. I can provide additional
> info / make additional experiments as needed.
>
> Please reply-all to this e-mail; I'm not subscribed to
> linux-kernel@.
>
> Thanks,
>
>	Tommaso