Date: Wed, 23 Sep 2020 22:13:07 -0700
From: Chris Goldsworthy <cgoldswo@codeaurora.org>
To: David Hildenbrand
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
 pratikp@codeaurora.org, pdaly@codeaurora.org, sudaraja@codeaurora.org,
 iamjoonsoo.kim@lge.com, linux-arm-msm-owner@vger.kernel.org,
 Vinayak Menon, linux-kernel-owner@vger.kernel.org
Subject: Re: [PATCH v2] mm: cma: indefinitely retry allocations in cma_alloc
In-Reply-To: <5cfa914fca107d884aa845b9273ec656@codeaurora.org>
References: <06489716814387e7f147cf53d1b185a8@codeaurora.org>
 <1599851809-4342-1-git-send-email-cgoldswo@codeaurora.org>
 <010101747e998731-e49f209f-8232-4496-a9fc-2465334e70d7-000000@us-west-2.amazonses.com>
 <72ae0f361df527cf70946992e4ab1eb3@codeaurora.org>
 <5cfa914fca107d884aa845b9273ec656@codeaurora.org>
Message-ID: <23a1565b9bb5f6002bd3e529d533f22b@codeaurora.org>

On 2020-09-17 10:54, Chris Goldsworthy wrote:
> On 2020-09-15 00:53, David Hildenbrand wrote:
>> On 14.09.20 20:33, Chris Goldsworthy wrote:
>>> On 2020-09-14 02:31, David Hildenbrand wrote:
>>>> On 11.09.20 21:17, Chris Goldsworthy wrote:
>>>>>
>>>>> So, inside of cma_alloc(), instead of giving up when
>>>>> alloc_contig_range() returns -EBUSY after having scanned a whole
>>>>> CMA-region bitmap, perform retries indefinitely, with sleeps, to
>>>>> give the system an opportunity to unpin any pinned pages.
>>>>>
>>>>> Signed-off-by: Chris Goldsworthy
>>>>> Co-developed-by: Vinayak Menon
>>>>> Signed-off-by: Vinayak Menon
>>>>> ---
>>>>>  mm/cma.c | 25 +++++++++++++++++++++++--
>>>>>  1 file changed, 23 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/mm/cma.c b/mm/cma.c
>>>>> index 7f415d7..90bb505 100644
>>>>> --- a/mm/cma.c
>>>>> +++ b/mm/cma.c
>>>>> @@ -442,8 +443,28 @@ struct page *cma_alloc(struct cma *cma, size_t count, unsigned int align,
>>>>>  				bitmap_maxno, start, bitmap_count, mask,
>>>>>  				offset);
>>>>>  		if (bitmap_no >= bitmap_maxno) {
>>>>> -			mutex_unlock(&cma->lock);
>>>>> -			break;
>>>>> +			if (ret == -EBUSY) {
>>>>> +				mutex_unlock(&cma->lock);
>>>>> +
>>>>> +				/*
>>>>> +				 * Page may be momentarily pinned by some other
>>>>> +				 * process which has been scheduled out, e.g.
>>>>> +				 * in exit path, during unmap call, or process
>>>>> +				 * fork and so cannot be freed there. Sleep
>>>>> +				 * for 100ms and retry the allocation.
>>>>> +				 */
>>>>> +				start = 0;
>>>>> +				ret = -ENOMEM;
>>>>> +				msleep(100);
>>>>> +				continue;
>>>>> +			} else {
>>>>> +				/*
>>>>> +				 * ret == -ENOMEM - all bits in cma->bitmap are
>>>>> +				 * set, so we break accordingly.
>>>>> +				 */
>>>>> +				mutex_unlock(&cma->lock);
>>>>> +				break;
>>>>> +			}
>>>>>  		}
>>>>>  		bitmap_set(cma->bitmap, bitmap_no, bitmap_count);
>>>>>  		/*
>>>>>
>>>>
>>>> What about long-term pinnings? IIRC, that can happen easily e.g.,
>>>> with vfio (and I remember there is a way via vmsplice).
>>>>
>>>> Not convinced trying forever is a sane approach in the general
>>>> case ...
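For reference, the control flow the patch adds boils down to the
following standalone sketch. It is simplified: the real cma_alloc()
rescans cma->bitmap under cma->lock between attempts, and
try_cma_alloc_once() here is a hypothetical helper standing in for one
"scan the bitmap, then call alloc_contig_range()" pass.

    #include <linux/cma.h>
    #include <linux/delay.h>
    #include <linux/errno.h>
    #include <linux/mm.h>

    /* Hypothetical helper: one bitmap scan plus alloc_contig_range();
     * returns a page on success, NULL on failure with *ret set. */
    static struct page *try_cma_alloc_once(struct cma *cma, size_t count,
                                           unsigned int align, int *ret);

    static struct page *cma_alloc_retry_sketch(struct cma *cma, size_t count,
                                               unsigned int align)
    {
            struct page *page;
            int ret;

            for (;;) {
                    page = try_cma_alloc_once(cma, count, align, &ret);
                    if (page)
                            return page;    /* success */
                    if (ret != -EBUSY)
                            return NULL;    /* e.g. -ENOMEM: region exhausted */
                    /*
                     * -EBUSY: a page in the range is transiently pinned,
                     * e.g. by a task scheduled out mid copy_one_pte() or
                     * exit_mmap().  Back off, then retry with no upper
                     * bound on attempts.
                     */
                    msleep(100);
            }
    }

A transient pin only delays this loop; a long-term pin (vfio,
vmsplice) would spin it forever, which is the concern raised above.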
>>>
>>> V1:
>>> [1] https://lkml.org/lkml/2020/8/5/1097
>>> [2] https://lkml.org/lkml/2020/8/6/1040
>>> [3] https://lkml.org/lkml/2020/8/11/893
>>> [4] https://lkml.org/lkml/2020/8/21/1490
>>> [5] https://lkml.org/lkml/2020/9/11/1072
>>>
>>> We're fine with doing indefinite retries, on the grounds that if
>>> there is some long-term pinning that occurs when
>>> alloc_contig_range returns -EBUSY, it should be debugged and
>>> fixed. Would it be possible to make this infinite-retrying
>>> something that could be enabled or disabled by a defconfig option?
>>
>> Two thoughts:
>>
>> This means I strongly prefer something like [3] if feasible.
>
> _Resending so that this ends up on LKML_
>
> I can give [3] some further thought then.  Also, I realized [3] will
> not completely solve the problem; it just reduces the window in
> which _refcount > _mapcount (as mentioned in earlier threads, we
> encountered the pinning when a task in copy_one_pte() or in the
> exit_mmap() path gets context switched out).  If we were to try a
> sleeping-lock based solution, do you think it would be permissible
> to add another lock to struct page?

I have not been able to think of a clean way of introducing calls to
preempt_disable() in exit_mmap(), which is the more problematic case.
We would need to track state across multiple invocations of
zap_pte_range() (which is called for each entry in a PMD when a
process's memory is being unmapped), and would also need to extend
this to tlb_finish_mmu(), which is called after all of the process's
memory has been unmapped:
https://elixir.bootlin.com/linux/v5.8.10/source/mm/mmap.c#L3164.

As a follow-up to this patch, I'm submitting a patch that
re-introduces the GFP mask for cma_alloc(), which will perform
indefinite retries if __GFP_NOFAIL is passed to the function
(sketched below).

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, a Linux Foundation Collaborative Project
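For concreteness, that __GFP_NOFAIL gating might look something like
the sketch below. The signature is speculative (cma_alloc() at this
point in time takes a no_warn bool rather than a gfp_t), and
try_cma_alloc_once() is the same hypothetical helper as in the
earlier sketch.

    #include <linux/cma.h>
    #include <linux/delay.h>
    #include <linux/errno.h>
    #include <linux/gfp.h>
    #include <linux/mm.h>

    /* Hypothetical helper: one bitmap scan plus alloc_contig_range();
     * returns a page on success, NULL on failure with *ret set. */
    static struct page *try_cma_alloc_once(struct cma *cma, size_t count,
                                           unsigned int align, int *ret);

    struct page *cma_alloc_gfp_sketch(struct cma *cma, size_t count,
                                      unsigned int align, gfp_t gfp_mask)
    {
            struct page *page;
            int ret;

            for (;;) {
                    page = try_cma_alloc_once(cma, count, align, &ret);
                    if (page)
                            break;  /* success */
                    if (ret != -EBUSY || !(gfp_mask & __GFP_NOFAIL))
                            break;  /* non-retryable, or caller did not opt in */
                    msleep(100);    /* transient pin: back off, then retry */
            }
            return page;
    }

Gating the unbounded retry on __GFP_NOFAIL keeps the default
behaviour bounded, so only callers that explicitly cannot tolerate
failure take on the livelock risk David points out.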