Date: Sat, 21 May 2022 09:36:33 -0700
From: Minchan Kim
To: David Hildenbrand
Cc: Mike Kravetz, John Hubbard, Andrew Morton, syzbot, linux-kernel@vger.kernel.org, linux-mm@kvack.org, llvm@lists.linux.dev, nathan@kernel.org, ndesaulniers@google.com, syzkaller-bugs@googlegroups.com, trix@redhat.com, Matthew Wilcox, Stephen Rothwell
Subject: Re: [syzbot] WARNING in follow_hugetlb_page
References: <6d281052-485c-5e17-4f1c-ef5689831450@oracle.com> <0be9132d-a928-9ebe-a9cf-6d140b907d59@nvidia.com>

On Sat, May 21, 2022 at 05:51:58PM +0200, David Hildenbrand wrote:
> On 21.05.22 17:24, Minchan Kim wrote:
> > On Fri, May 20, 2022 at 05:04:22PM -0700, Mike Kravetz wrote:
> >> On 5/20/22 16:43, Minchan Kim wrote:
> >>> On Fri, May 20, 2022 at 04:31:31PM -0700, Mike Kravetz wrote:
> >>>> On 5/20/22 15:56, John Hubbard wrote:
> >>>>> On 5/20/22 15:19, Minchan Kim wrote:
> >>>>>> The memory offline would be an issue so we shouldn't allow pinning of any
> >>>>>> pages in *movable zone*.
> >>>>>>
> >>>>>> Isn't alloc_contig_range just best effort? Then, it wouldn't be a big
> >>>>>> problem to allow pinning on those area. The matter is what target range
> >>>>>> on alloc_contig_range is backed by CMA or movable zone and usecases.
> >>>>>>
> >>>>>> IOW, movable zone should be never allowed. But CMA case, if pages
> >>>>>> are used by normal process memory instead of hugeTLB, we shouldn't
> >>>>>> allow longterm pinning since someone can claim those memory suddenly.
> >>>>>> However, we are fine to allow longterm pinning if the CMA memory
> >>>>>> already claimed and mapped at userspace(hugeTLB case IIUC).
> >>>>>>
> >>>>>
> >>>>> From Mike's comments and yours, plus a rather quick reading of some
> >>>>> CMA-related code in mm/hugetlb.c (free_gigantic_page(), alloc_gigantic_pages()),
> >>>>> the following seems true:
> >>>>>
> >>>>> a) hugetlbfs can allocate pages *from* CMA, via cma_alloc()
> >>>>>
> >>>>> b) while hugetlbfs is using those CMA-allocated pages, it is debatable
> >>>>> whether those pages should be allowed to be long term pinned. That's
> >>>>> because there are two cases:
> >>>>>
> >>>>>     Case 1: pages are longterm pinned, then released, all while
> >>>>>             owned by hugetlbfs. No problem.
> >>>>>
> >>>>>     Case 2: pages are longterm pinned, but then hugetlbfs releases the
> >>>>>             pages entirely (via unmounting hugetlbfs, I presume). In
> >>>>>             this case, we now have CMA page that are long-term pinned,
> >>>>>             and that's the state we want to avoid.
> >>>>
> >>>> I do not think case 2 can happen. A hugetlb page can only be changed back
> >>>> to 'normal' (buddy) pages when ref count goes to zero.
> >>>>
> >>>> It should also be noted that hugetlb code sets up the CMA area from which
> >>>> hugetlb pages can be allocated. This area is never unreserved/freed.
> >>>>
> >>>> I do not think there is a reason to disallow long term pinning of hugetlb
> >>>> pages allocated from THE hugetlb CMA area.
>
> Hm. We primarily use CMA for gigantic pages only IIRC. Ordinary huge
> pages come via the buddy.
>
> Assume we allocated a (movable) 2MiB huge page ordinarily via the buddy
> and it ended up on that CMA area by pure luck (as it's movable). If we'd
> allow to pin it long-term, allocating a gigantic page from the
> designated CMA area would fail.

If we allow a longterm pin against a hugetlb page that came via the
buddy, it should be migrated out of CMA before the longterm pinning by
check_and_migrate_movable_pages, IIUC. If so, why would allocating a
gigantic page from the designated CMA area fail?

>
> So we'd want to allow long-term pinning a gigantic page but we'd not
> want to allow long-term pinning an ordinary huge page. We'd want to
> migrate the latter away.

Sure. A gigantic page is already a CMA-claimed page, so there is no
future user that could claim the memory again, and it is fine to allow
a longterm pin. An ordinary huge page shouldn't be allowed, since the
CMA owner could claim the memory some day.

>
>
> The general rules are:
>
> ZONE_MOVABLE: nobody is allowed to place unmovable allocations there; it
> could prevent memory offlining/unplug.
>
> CMA: nobody *but the designated owner* is allowed to place unmovable
> memory there; it could prevent the actual owner to allocate contiguous
> memory.

I am confused about what you mean by designated owner and actual owner
in your context. Here is how I understand the issue, based on your
explanation:

HugeTLB allocates its pages by two types of allocation:

1. alloc_pages(GFP_MOVABLE)

It could allocate the hugetlb page from the CMA area, but a longterm
pin should migrate it out of CMA before the pinning, so allowing the
pin on the page is no problem, and the current code works like that
(check_and_migrate_movable_pages).

2. cma_alloc

cma_alloc is used only for *gigantic pages*, and hugetlbfs is the very
owner of the page. IOW, once hugetlbfs has succeeded in allocating the
gigantic page by cma_alloc, there is no other owner able to claim the
page any longer, so it's fine to allow longterm pinning against the
gigantic page. However, the current code doesn't work like that due to
is_pinnable_page. IOW, hugetlbfs needs a way to distinguish whether
the page owner is hugetlbfs or not.

Are we on the same page?
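
To make the two cases above concrete, here is a rough userspace sketch
of the rule as I understand it. It is not kernel code: the enums, the
struct and longterm_pin_action() are names made up for illustration,
and it only models the decision, not how is_pinnable_page() or
check_and_migrate_movable_pages() are actually implemented.

```c
#include <stdbool.h>

/* Hypothetical model: where the page currently sits. */
enum page_area {
	AREA_NORMAL,	/* neither ZONE_MOVABLE nor a CMA area */
	AREA_MOVABLE,	/* ZONE_MOVABLE */
	AREA_CMA,	/* inside a CMA area */
};

/* Hypothetical model: how the page was obtained. */
enum page_source {
	SRC_BUDDY,	/* movable allocation via the buddy allocator */
	SRC_CMA_ALLOC,	/* claimed by the CMA owner via cma_alloc() */
};

struct page_model {
	enum page_area area;
	enum page_source source;
};

enum pin_action {
	PIN_ALLOW,		/* longterm pin in place */
	PIN_MIGRATE_FIRST,	/* migrate out of the area, then pin */
};

/* Decide what a longterm pin should do with the modelled page. */
static enum pin_action longterm_pin_action(const struct page_model *p)
{
	/* ZONE_MOVABLE: never pin in place, it could block offlining. */
	if (p->area == AREA_MOVABLE)
		return PIN_MIGRATE_FIRST;

	/*
	 * CMA: only a page the owner already claimed via cma_alloc()
	 * may stay; anything that landed there via the buddy must move.
	 */
	if (p->area == AREA_CMA && p->source != SRC_CMA_ALLOC)
		return PIN_MIGRATE_FIRST;

	return PIN_ALLOW;
}
```

In this model a gigantic page obtained by cma_alloc is pinned in place,
while a 2MiB page that ended up in the CMA area via the buddy is
migrated first, which is the distinction the current check cannot make.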