From: Paul Durrant
Reply-To: paul@xen.org
To: 'Jürgen Groß', 'Edwin Torok', 'Anthony Perard'
Subject: RE: oxenstored performance issue when starting VMs in parallel
Date: Tue, 22 Sep 2020 15:20:00 +0100
Message-ID: <00e901d690eb$759ff5a0$60dfe0e0$@xen.org>
In-Reply-To: <816d5bd8-6794-7fcd-bd08-6eb5a2248328@suse.com>
References: <46f1f50dc02c53391958d9d4bb5fc57d23ba6ede.camel@citrix.com> <00a101d690e6$33a88bd0$9af9a370$@xen.org> <816d5bd8-6794-7fcd-bd08-6eb5a2248328@suse.com>
List-Id: Xen developer discussion

> -----Original Message-----
> From: Jürgen Groß
> Sent: 22 September 2020 15:18
> To: paul@xen.org; 'Edwin Torok'; sstabellini@kernel.org; 'Anthony Perard';
> xen-devel@lists.xenproject.org
> Cc: xen-users@lists.xenproject.org; jerome.leseinne@gmail.com; julien@xen.org
> Subject: Re: oxenstored performance issue when starting VMs in parallel
>
> On 22.09.20 15:42, Paul Durrant wrote:
> >> -----Original Message-----
> >> From: Edwin Torok
> >> Sent: 22 September 2020 14:29
> >> To: sstabellini@kernel.org; Anthony Perard; xen-devel@lists.xenproject.org;
> >> paul@xen.org
> >> Cc: xen-users@lists.xenproject.org; jerome.leseinne@gmail.com; julien@xen.org
> >> Subject: Re: oxenstored performance issue when starting VMs in parallel
> >>
> >> On Tue, 2020-09-22 at 15:17 +0200, jerome leseinne wrote:
> >>> Hi,
> >>>
> >>> Edwin you rock ! This call in qemu is effectively the culprit !
> >>> I have disabled this xen_bus_add_watch call and re-run test on our
> >>> big server:
> >>>
> >>> - oxenstored is now between 10% to 20% CPU usage (previously was
> >>> 100% all the time)
> >>> - All our VMs are responsive
> >>> - All our VM start in less than 10 seconds (before the fix some VMs
> >>> could take more than one minute to be fully up
> >>> - Dom0 is more responsive
> >>>
> >>> Disabling the watch may not be the ideal solution ( I let the qemu
> >>> experts answer this and the possible side effects),
> >>
> >> Hi,
> >>
> >> CC-ed Qemu maintainer of Xen code, please see this discussion about
> >> scalability issues with the backend watching code in qemu 4.1+.
> >>
> >> I think the scalability issue is due to this code in qemu, which causes
> >> an instance of qemu to see watches from all devices (even those
> >> belonging to other qemu instances), such that adding a single device
> >> causes N watches to be fired on each N instances of qemu:
> >>
> >>     xenbus->backend_watch =
> >>         xen_bus_add_watch(xenbus, "", /* domain root node */
> >>                           "backend", xen_bus_backend_changed,
> >>                           &local_err);
> >>
> >> I can understand that for backwards compatibility you might need this
> >> code, but is there a way that an up-to-date (xl) toolstack could tell
> >> qemu what it needs to look at (e.g. via QMP, or other keys in xenstore)
> >> instead of relying on an overly broad watch?
> >
> > I think this could be made more efficient. The call to
> > "module_call_init(MODULE_INIT_XEN_BACKEND)" just prior to this watch will
> > register backends that do auto-creation so we could register individual
> > watches for the various backend types instead of this single one.
>
> The watch should be on guest domain level, e.g. for:
>
> /local/domain/0/backend/vbd/5
>
> We have one qemu process per guest, after all.
>

I'll see if I can spin a patch this afternoon.

  Paul

>
> Juergen
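
To make the scaling argument in the thread concrete, here is a small toy model, not QEMU or oxenstored code. It assumes the usual xenstore rule that a watch fires for changes to the watched node or anything below it, and it ignores the initial event a watch produces when it is registered. The names `watch_matches`, `count_events` and `simulate`, and the device id `51712`, are made up for illustration.

```python
# Toy model of xenstore watch fan-out; all names here are hypothetical.
BACKEND_ROOT = "/local/domain/0/backend"


def watch_matches(watch_path: str, changed_path: str) -> bool:
    """Assumed rule: a watch fires for its own node and for any node below it."""
    return changed_path == watch_path or changed_path.startswith(watch_path + "/")


def count_events(watches, writes):
    """Count watch events delivered across all watchers for the given writes."""
    return sum(1 for w in watches for p in writes if watch_matches(w, p))


def simulate(n_guests: int):
    """Each guest gets one vbd backend node written; return (broad, narrow) event counts."""
    domids = range(1, n_guests + 1)
    # One write per guest under its own backend subtree (device id is illustrative).
    writes = [f"{BACKEND_ROOT}/vbd/{d}/51712/state" for d in domids]
    # Broad: every qemu instance watches the whole "backend" subtree.
    broad = [BACKEND_ROOT for _ in domids]
    # Narrow: each qemu instance watches only its own guest's subtree.
    narrow = [f"{BACKEND_ROOT}/vbd/{d}" for d in domids]
    return count_events(broad, writes), count_events(narrow, writes)


if __name__ == "__main__":
    for n in (10, 50, 100):
        broad, narrow = simulate(n)
        print(f"{n} guests: broad watch events={broad}, per-guest watch events={narrow}")
```

Under these assumptions the broad watch delivers N*N events for N guests (2500 for 50 guests), while per-guest watches deliver N (50). A real per-guest scheme would need one such watch per backend type, as suggested above; the model uses only vbd to keep it short.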