From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-5.8 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id BEC19C433DF for ; Fri, 21 Aug 2020 22:01:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9AAD920855 for ; Fri, 21 Aug 2020 22:01:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1598047305; bh=aq7qmIMUGdMaoUlA+6jQpfeWcG91tpxIXFUSsaIi1FU=; h=Date:From:To:Cc:Subject:In-Reply-To:References:List-ID:From; b=vLjlYd2a746c/Ox11UH7jfoppRUf1yisXQ5b7yExPSNJFI3V+3lQJ1C5g7v0drNre vqgftk5rPiPjGkLOlYFaeeulZhLYetUNCmPzME3mSp+f62rEz5BT9bXLnulw1i+Ndg MeL5v7Girm4OC1Uq4zaMGX9Ee9IBRhO1Qqk/x8V8= Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726873AbgHUWBp (ORCPT ); Fri, 21 Aug 2020 18:01:45 -0400 Received: from mail.kernel.org ([198.145.29.99]:37208 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726731AbgHUWBo (ORCPT ); Fri, 21 Aug 2020 18:01:44 -0400 Received: from localhost.localdomain (c-73-231-172-41.hsd1.ca.comcast.net [73.231.172.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 95B9D207C3; Fri, 21 Aug 2020 22:01:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1598047304; bh=aq7qmIMUGdMaoUlA+6jQpfeWcG91tpxIXFUSsaIi1FU=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=yQNLm87wvpu/PSjqkUXCuyDFfqEx5vmk96zjhAzeD2wDj86JlCP+M142C36DgZQkP pvwYdh0H9KVArGe0M6Ah0Rd8+GCxjWa9LNEE+BeYyaOuaPAENds6qAanbnF/rZP3Kv vcsfzrgfrj5bsVjBAnLSE9AsG9/HbFjoj11LyXIc= Date: Fri, 21 Aug 2020 15:01:43 -0700 From: Andrew Morton To: cgoldswo@codeaurora.org Cc: linux-mm@kvack.org, linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, pratikp@codeaurora.org, pdaly@codeaurora.org, sudraja@codeaurora.org, iamjoonsoo.kim@lge.com, linux-arm-msm-owner@vger.kernel.org Subject: Re: cma_alloc(), add sleep-and-retry for temporary page pinning Message-Id: <20200821150143.8a8645b3fabc11016311b78d@linux-foundation.org> In-Reply-To: <896f92e8c37936e7cb2914e79273e9e8@codeaurora.org> References: <896f92e8c37936e7cb2914e79273e9e8@codeaurora.org> X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.31; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-arm-msm-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-arm-msm@vger.kernel.org On Tue, 11 Aug 2020 15:20:47 -0700 cgoldswo@codeaurora.org wrote: > On 2020-08-06 18:31, Andrew Morton wrote: > > On Wed, 5 Aug 2020 19:56:21 -0700 Chris Goldsworthy > > wrote: > > > >> On mobile devices, failure to allocate from a CMA area constitutes a > >> functional failure. Sometimes during CMA allocations, we have > >> observed > >> that pages in a CMA area allocated through alloc_pages(), that we're > >> trying > >> to migrate away to make room for a CMA allocation, are temporarily > >> pinned. > >> This temporary pinning can occur when a process that owns the pinned > >> page > >> is being forked (the example is explained further in the commit text). > >> This patch addresses this issue by adding a sleep-and-retry loop in > >> cma_alloc() . There's another example we know of similar to the above > >> that > >> occurs during exit_mmap() (in zap_pte_range() specifically), but I > >> need to > >> determine if this is still relevant today. > > > > > Sounds fairly serious but boy, we're late for 5.9. > > > > I can queue it for 5.10 with a cc:stable so that it gets backported > > into earlier kernels a couple of months from now, if we think the > > seriousness justifies backporting(?). > > > > Queuing this seems like the best way to proceed I'd really prefer not. It's very hacky and it isn't a fix - it's a likelihood-reducer. > > > > And... it really is a sad little patch, isn't it? Instead of fixing > > the problem, it reduces the problem's probability by 5x. Can't we do > > better than this? > > I have one alternative in mind. I have been able to review the > exit_mmap() > case, so before proceeding, let's do a breakdown of the problem: we can > categorize the pinning issue we're trying to address here as being one > of > (1) incrementing _refcount and getting context-switched out before > incrementing _mapcount (applies to forking a process / copy_one_pte()), > and > (2) decrementing _mapcount and getting context-switched out before > decrementing _refcount (applies to tearing down a process / > exit_mmap()). > So, one alternative would be to insert preempt_disable/enable() calls at > affected sites. So, for the copy_one_pte() pinning case, we could do the > following inside of copy_one_pte(): > > if (page) { > + preempt_disable(); > get_page(page); > page_dup_rmap(page, false); > + preempt_enable(); > rss[mm_counter(page)]++; > } This would make the retry loop much more reliable, and preempt_disable() is fast. Such additions should be clearly commented (a bare preempt_disable() explains nothing). Perhaps by wrapping the prerempt_disable() in a suitably-named wrapper function. But it's still really unpleasant. > > The good thing about this patch is that it has been stable in our kernel > for four years (though for some SoCs we increased the retry counts). That's discouraging! > One > thing to stress is that there are other instances of CMA page pinning, > that > this patch isn't attempting to address. Oh. How severe are these? > Please let me know if you're > okay > with queuing this for the 5.10 merge window - if you are, I can add an > option to configure the number of retries, and will resend the patch > once > the 5.9 merge window closes. Well. Why not wait infinitely? Because there are other sources of CMA page pinning, I guess. Could we take a sleeping lock on the exit_mmap() path and on the migration path? So that the migration path automatically waits for the exact amount of time?