From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 25 Apr 2019 10:36:44 -0400
From: Julien Desfossez
To: Vineeth Remanan Pillai
Cc: Nishanth Aravamudan, Peter Zijlstra, Tim Chen, mingo@kernel.org,
	tglx@linutronix.de, pjt@google.com, torvalds@linux-foundation.org,
	linux-kernel@vger.kernel.org, subhra.mazumdar@oracle.com,
	fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com,
	Phil Auld, Aaron Lu, Aubrey Li, Valentin Schneider, Mel Gorman,
	Pawan Gupta, Paolo Bonzini
Subject: Re: [RFC PATCH v2 00/17] Core scheduling v2
Message-ID: <20190425143644.GA13531@sinkpad>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

On 23-Apr-2019 04:18:05 PM, Vineeth Remanan Pillai wrote:
> Second iteration of the core-scheduling feature.
>
> This version fixes apparent bugs and performance issues in v1.
> This doesn't fully address the issue of core sharing between processes
> with different tags. Core sharing still happens 1% to 5% of the time,
> depending on the nature of the workload and the timing of the runnable
> processes.
>
> Changes in v2
> -------------
> - rebased on mainline commit: 6d906f99817951e2257d577656899da02bb33105

Here are our benchmark results.

Environment setup:
------------------
Skylake server, 2 NUMA nodes, 72 CPUs total with HT on.
Workload in KVM virtual machines, one cpu cgroup per VM (including qemu
and vhost threads).

Case 1: MySQL TPC-C
-------------------
1 12-vcpus-32gb MySQL server per NUMA node (clients on another physical
machine)
96 semi-idle 1-vcpu-512mb VMs per NUMA node (sending metrics over a VPN
every 15 seconds)
--> 3 vcpus per physical CPU

Average of 10 5-minute runs.
- baseline:
  - avg tps: 1878
  - stdev tps: 47
- nosmt:
  - avg tps: 959 (-49% from baseline)
  - stdev tps: 35
- core scheduling:
  - avg tps: 1406 (-25% from baseline)
  - stdev tps: 48
  - Co-scheduling stats (5-minute sample):
    - 48.9% VM threads
    - 49.6% idle
    - 1.3% foreign threads

So in v2, even this case with a very noisy test benefits from core
scheduling (the baseline is also better than in v1, so we probably
benefit from other changes in the kernel).

Case 2: linpack with enough room
--------------------------------
2 12-vcpus-32gb linpack VMs, both pinned to the same NUMA node (36
hardware threads with SMT on).
100k context switches/sec.

Average of 5 15-minute runs.
- baseline:
  - avg gflops: 403
  - stdev: 20
- nosmt:
  - avg gflops: 355 (-12% from baseline)
  - stdev: 28
- core scheduling:
  - avg gflops: 364 (-9% from baseline)
  - stdev: 59
  - Co-scheduling stats (5-minute sample):
    - 39.3% VM threads
    - 59.3% idle
    - 0.07% foreign threads

No real difference between nosmt and core scheduling when there is
enough room to run a cpu-intensive workload, even with SMT off.
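For reference, the percentage drops quoted in these results are relative
to the baseline average. A minimal sketch of that computation, using the
Case 1 averages from above (small differences from the quoted figures can
come from rounding of the reported averages):

```python
# Sanity-check the relative deltas quoted in the results above.
# The tps averages are taken directly from the Case 1 report.

def pct_delta(value, baseline):
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100.0

case1_avg_tps = {"baseline": 1878, "nosmt": 959, "core scheduling": 1406}

for config in ("nosmt", "core scheduling"):
    delta = pct_delta(case1_avg_tps[config], case1_avg_tps["baseline"])
    print(f"Case 1 {config}: {delta:+.1f}% from baseline")
# nosmt comes out around -49% and core scheduling around -25%,
# matching the figures quoted in the report.
```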
Case 3: full node linpack
-------------------------
3 12-vcpus-32gb linpack VMs, all pinned to the same NUMA node (36
hardware threads with SMT on).
155k context switches/sec.

Average of 5 15-minute runs.
- baseline:
  - avg gflops: 270
  - stdev: 5
- nosmt (switching to a 2:1 ratio of vcpus to hardware threads):
  - avg gflops: 209 (-22.46% from baseline)
  - stdev: 6.2
- core scheduling:
  - avg gflops: 269 (-0.11% from baseline)
  - stdev: 5.7
  - Co-scheduling stats (5-minute sample):
    - 93.7% VM threads
    - 6.3% idle
    - 0.04% foreign threads

Here core scheduling is a major improvement in terms of performance
compared to nosmt.

Julien