Date: Fri, 09 Oct 2020 13:26:59 -0700
From: Chris Goldsworthy
To: Christoph Hellwig
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, minchan@kernel.org,
 linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org,
 pratikp@codeaurora.org, pdaly@codeaurora.org,
 sudaraja@codeaurora.org, iamjoonsoo.kim@lge.com, david@redhat.com,
 vinmenon@codeaurora.org, minchan.kim@gmail.com
Subject: Re: [PATCH v4] mm: cma: indefinitely retry allocations in cma_alloc
In-Reply-To: <20200929055937.GA5332@infradead.org>
References: <20200929055937.GA5332@infradead.org>
Message-ID: <3cdd6c30c062cf11eb1a7e3c47ff111e@codeaurora.org>

On 2020-09-28 22:59, Christoph Hellwig wrote:
> On Mon, Sep 28, 2020 at 01:30:27PM -0700, Chris Goldsworthy wrote:
>> CMA allocations will fail if 'pinned' pages are in a CMA area, since
>> we cannot migrate pinned pages. The _refcount of a struct page being
>> greater than _mapcount for that page can cause pinning for anonymous
>> pages. This is because try_to_unmap(), which (1) is called in the CMA
>> allocation path, and (2) decrements both _refcount and _mapcount for
>> a page, will stop unmapping a page from VMAs once the _mapcount for a
>> page reaches 0. This implies that after try_to_unmap() has finished
>> successfully for a page where _refcount > _mapcount, _refcount will
>> be greater than 0. Later in the CMA allocation path, in
>> migrate_page_move_mapping(), we will have one more reference count
>> than intended for anonymous pages, meaning the allocation will fail
>> for that page.
>>
>> If a process ends up causing _refcount > _mapcount for a page (by
>> either incrementing _refcount or decrementing _mapcount), such that
>> the process is context switched out after modifying one count but
>> before modifying the other, the page will be temporarily pinned.
>>
>> One example of where _refcount can be greater than _mapcount is
>> inside of zap_pte_range(), which is called for all the entries of a
>> PMD when a process is exiting, to unmap the process's memory. Inside
>> of zap_pte_range(), after unmapping a page with page_remove_rmap(),
>> we have that _refcount > _mapcount. _refcount can only be decremented
>> after a TLB flush is performed for the page - this doesn't occur
>> until enough pages have been batched together for flushing. The flush
>> can either occur inside of zap_pte_range() (during the same
>> invocation or a later one), or if there aren't enough pages collected
>> by the time we unmap all of the pages in a process, the flush will
>> occur in tlb_finish_mmu() in exit_mmap(). After the flush has
>> occurred, tlb_batch_pages_flush() will decrement the references on
>> the flushed pages.
>>
>> Another such example is inside of copy_one_pte(), which is called
>> during a fork. For PTEs for which pte_present(pte) == true,
>> copy_one_pte() will increment the _refcount field followed by the
>> _mapcount field of a page.
>>
>> So, inside of cma_alloc(), add the option of letting users pass in
>> __GFP_NOFAIL to indicate that we should retry CMA allocations
>> indefinitely, in the event that alloc_contig_range() returns -EBUSY
>> after having scanned a whole CMA-region bitmap.
>
> And who is going to use this? As-is this just seems to add code that
> isn't actually used and thus not actually tested. (In addition to
> being a really bad idea as discussed before)

Hi Christoph,

That had slipped my mind - what we would have submitted would have been
a modified drivers/dma-buf/heaps/cma_heap.c, which would have created a
"linux,cma-nofail" heap that, when allocated from, passes __GFP_NOFAIL
to cma_alloc().
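
For anyone skimming the thread, the retry semantics under discussion can be
modelled roughly in user space as below. This is only a sketch, not the patch
and not kernel code: model_page, model_alloc_contig_range(),
model_cma_alloc(), MODEL_GFP_NOFAIL and the release_after parameter are all
made-up names for illustration, and each loop iteration stands in for one
full scan of the CMA-region bitmap.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's __GFP_NOFAIL bit. */
#define MODEL_GFP_NOFAIL 0x1u

/* Toy page carrying only the two counters discussed in the thread. */
struct model_page {
	int refcount;
	int mapcount;
};

/* Models the transient pin: a reference beyond the page's mappings. */
static bool model_page_pinned(const struct model_page *page)
{
	return page->refcount > page->mapcount;
}

/* Stand-in for alloc_contig_range(): -EBUSY while the page is pinned. */
static int model_alloc_contig_range(const struct model_page *page)
{
	return model_page_pinned(page) ? -EBUSY : 0;
}

/*
 * Stand-in for the retry loop proposed for cma_alloc(). One iteration
 * models one scan of the bitmap. release_after is an assumption of this
 * model: the number of failed passes after which the other task gets to
 * run and drops its extra reference (e.g. the batched TLB flush finishes).
 */
static int model_cma_alloc(struct model_page *page, unsigned int gfp,
			   int release_after, int *passes)
{
	int ret;

	assert(page != NULL && passes != NULL && release_after > 0);
	*passes = 0;

	for (;;) {
		ret = model_alloc_contig_range(page);
		(*passes)++;

		/* Without the no-fail flag, the first -EBUSY is final. */
		if (ret != -EBUSY || !(gfp & MODEL_GFP_NOFAIL))
			return ret;

		/* The pinning task eventually drops its reference. */
		if (*passes >= release_after)
			page->refcount = page->mapcount;
	}
}
```

Starting from a page with refcount 1 and mapcount 0, a call with gfp 0 gives
up with -EBUSY after a single pass, while a call with MODEL_GFP_NOFAIL keeps
scanning until the extra reference is gone. If that reference is never
dropped, the loop never terminates.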
But, since this retry approach (finite and infinite) has effectively
been nacked, I've gone back to the drawing board to find either (1) a
lock-based approach to solving this (as proposed by Andrew Morton here:
https://lkml.org/lkml/2020/8/21/1490), or (2) an approach using
preempt_disable() calls.

Thanks,

Chris.

>> --- a/kernel/dma/contiguous.c
>> +++ b/kernel/dma/contiguous.c
>> @@ -196,7 +196,7 @@ struct page *dma_alloc_from_contiguous(struct device *dev, size_t count,
>>  	if (align > CONFIG_CMA_ALIGNMENT)
>>  		align = CONFIG_CMA_ALIGNMENT;
>>
>> -	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn);
>> +	return cma_alloc(dev_get_cma_area(dev), count, align, no_warn ? __GFP_NOWARN : 0);
>
> Also don't add pointlessly overlong lines.

-- 
The Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project