From: Uladzislau Rezki
Date: Tue, 8 Oct 2019 18:04:59 +0200
To: Uladzislau Rezki
Cc: Sebastian Andrzej Siewior, Daniel Wagner, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-rt-users@vger.kernel.org,
 Andrew Morton
Subject: Re: [PATCH] mm: vmalloc: Use the vmap_area_lock to protect ne_fit_preload_node
Message-ID: <20191008160459.GA5487@pc636>
In-Reply-To: <20191007214420.GA3212@pc636>

On Mon, Oct 07, 2019 at 11:44:20PM +0200, Uladzislau Rezki wrote:
> On Mon, Oct 07, 2019 at 07:36:44PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2019-10-07 18:56:11 [+0200], Uladzislau Rezki wrote:
> > > Actually there is high lock contention on vmap_area_lock, because it
> > > is still global. You can have a look at the last slide:
> > >
> > > https://linuxplumbersconf.org/event/4/contributions/547/attachments/287/479/Reworking_of_KVA_allocator_in_Linux_kernel.pdf
> > >
> > > so this change will make it a bit higher. On the other hand, I agree
> > > that for RT it should be fixed. Probably it could be done like:
> > >
> > > #ifdef PREEMPT_RT
> > > 	migrate_disable()
> > > #else
> > > 	preempt_disable()
> > > #endif
> > > ...
> > >
> > > but I am not sure that is good either.
> >
> > What is to be expected on average? Is the lock acquired and then
> > released again because the slot is empty and memory needs to be
> > allocated, or can it be assumed that this hardly ever happens?
> >
> The lock is not released (we are not allowed to); instead we just try
> to allocate with the GFP_NOWAIT flag. That can happen if the earlier
> preallocation with GFP_KERNEL failed:
>
> ...
> } else if (type == NE_FIT_TYPE) {
> 	/*
> 	 * Split no edge of fit VA.
> 	 *
> 	 *     |       |
> 	 *   L V  NVA  V R
> 	 * |---|-------|---|
> 	 */
> 	lva = __this_cpu_xchg(ne_fit_preload_node, NULL);
> 	if (unlikely(!lva)) {
> 		...
> 		lva = kmem_cache_alloc(vmap_area_cachep, GFP_NOWAIT);
> 		...
> 	}
> ...
>
> How often we need an extra object for a split depends on the
> workload; the fork() path, for example, falls into that pattern.
>
> I think we can assume that migration can hardly ever happen, so it
> should be considered a rare case.
> Thus we can do the preloading without worrying much about the rare
> case when it does occur:
>
>
> urezki@pc636:~/data/ssd/coding/linux-stable$ git diff
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index e92ff5f7dd8b..bc782edcd1fd 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1089,20 +1089,16 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
>  	 * Even if it fails we do not really care about that. Just proceed
>  	 * as it is. "overflow" path will refill the cache we allocate from.
>  	 */
> -	preempt_disable();
> -	if (!__this_cpu_read(ne_fit_preload_node)) {
> -		preempt_enable();
> +	if (!this_cpu_read(ne_fit_preload_node)) {
>  		pva = kmem_cache_alloc_node(vmap_area_cachep, GFP_KERNEL, node);
> -		preempt_disable();
>
> -		if (__this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
> +		if (this_cpu_cmpxchg(ne_fit_preload_node, NULL, pva)) {
>  			if (pva)
>  				kmem_cache_free(vmap_area_cachep, pva);
>  		}
>  	}
>
>  	spin_lock(&vmap_area_lock);
> -	preempt_enable();
>
>  	/*
>  	 * If an allocation fails, the "vend" address is
> urezki@pc636:~/data/ssd/coding/linux-stable$
>
>
> So we do not guarantee a preloaded object; instead we minimize the
> number of allocations done with the GFP_NOWAIT flag. For example, on
> my 4xCPU machine I am not even able to trigger the case where a CPU
> is not preloaded.
>
> I can test it tomorrow on my 12xCPU system to see how it behaves there.
>
Tested it on different systems. For example, on my 8xCPU system that
runs a PREEMPT kernel I see only a few GFP_NOWAIT allocations, i.e.
they happen when we land on another CPU that was not preloaded.

I ran a special test case that follows the preload pattern and path:
20 "unbound" threads run it, and each does 1000000 allocations. As a
result, on average only 3.5 times per 1000000 allocations the CPU was
not preloaded during a split, so GFP_NOWAIT was used to obtain an
extra object.

So the slightly modified approach still minimizes allocations in
atomic context: they can happen, but their number is negligible and
can be ignored, I think.

--
Vlad Rezki
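P.S. For anyone who wants to play with this pattern outside of the
kernel, here is a minimal user-space sketch of it. This is only an
illustration, not kernel code and not the actual test case: malloc()
stands in for kmem_cache_alloc(), sched_getcpu() picks the "per-CPU"
slot, a mutex replaces vmap_area_lock, a counter replaces the
GFP_NOWAIT fallback, and the thread/iteration counts are made up. It
preloads outside the lock, publishes the object with a cmpxchg, and
counts how often the slot turns out empty under the lock because the
thread migrated in between:

/* preload_sketch.c - build with: gcc -std=c11 -pthread preload_sketch.c */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_CPUS   64
#define NR_THREADS 20
#define NR_ITERS   1000000L

static _Atomic(void *) preload_slot[MAX_CPUS];	/* ne_fit_preload_node stand-in */
static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER; /* vmap_area_lock stand-in */
static atomic_long nowait_fallbacks;		/* times the preload was missed */

static int this_cpu(void)
{
	int cpu = sched_getcpu();

	return cpu < 0 ? 0 : cpu % MAX_CPUS;
}

static void *worker(void *arg)
{
	for (long i = 0; i < NR_ITERS; i++) {
		void *expected = NULL;
		int cpu = this_cpu();

		/* Preload outside the lock, i.e. in "GFP_KERNEL" context. */
		if (!atomic_load(&preload_slot[cpu])) {
			void *obj = malloc(64);

			/* Slot may have been filled meanwhile; drop the spare. */
			if (!atomic_compare_exchange_strong(&preload_slot[cpu],
							    &expected, obj))
				free(obj);
		}

		pthread_mutex_lock(&big_lock);

		/*
		 * We may have migrated since the preload, so the slot of
		 * the CPU we hold the lock on can be empty. This is the
		 * case the kernel covers with the GFP_NOWAIT allocation.
		 */
		void *obj = atomic_exchange(&preload_slot[this_cpu()], NULL);
		if (!obj) {
			atomic_fetch_add(&nowait_fallbacks, 1);
			obj = malloc(64);
		}
		pthread_mutex_unlock(&big_lock);
		free(obj);
	}
	return arg;
}

int main(void)
{
	pthread_t t[NR_THREADS];

	for (int i = 0; i < NR_THREADS; i++)
		pthread_create(&t[i], NULL, worker, NULL);
	for (int i = 0; i < NR_THREADS; i++)
		pthread_join(t[i], NULL);

	printf("missed preloads: %ld out of %ld allocations\n",
	       atomic_load(&nowait_fallbacks),
	       (long)NR_THREADS * NR_ITERS);
	return 0;
}

The miss counter corresponds to how often the kernel would have to
fall back to GFP_NOWAIT; the exact number depends on the scheduler
and the machine, but the point is that the preload is opportunistic
and the atomic fallback is the rare path.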