From: David Hildenbrand
Organization: Red Hat
To: Minchan Kim
Cc: Mike Kravetz, John Hubbard, Andrew Morton, syzbot,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org, llvm@lists.linux.dev,
 nathan@kernel.org, ndesaulniers@google.com,
 syzkaller-bugs@googlegroups.com, trix@redhat.com, Matthew Wilcox,
 Stephen Rothwell
Subject: Re: [syzbot] WARNING in follow_hugetlb_page
Date: Sat, 21 May 2022 18:46:27 +0200
Message-ID: <000a117a-694d-d3a9-a192-14d08d50c884@redhat.com>
References: <6d281052-485c-5e17-4f1c-ef5689831450@oracle.com>
 <0be9132d-a928-9ebe-a9cf-6d140b907d59@nvidia.com>

>>>>>> It should also be noted that hugetlb code sets up the CMA area from which
>>>>>> hugetlb pages can be allocated. This area is never unreserved/freed.
>>>>>>
>>>>>> I do not think there is a reason to disallow long term pinning of hugetlb
>>>>>> pages allocated from THE hugetlb CMA area.
>>
>> Hm. We primarily use CMA for gigantic pages only IIRC. Ordinary huge
>> pages come via the buddy.
>>
>> Assume we allocated a (movable) 2MiB huge page ordinarily via the buddy
>> and it ended up on that CMA area by pure luck (as it's movable). If we'd
>> allow to pin it long-term, allocating a gigantic page from the
>> designated CMA area would fail.
>
> If we allow the longterm pin against the hugetlb page come via buddy,
> it should be migrated out of CMA before the longterm pinning by
> check_and_migrate_movable_pages, IIUC.

Yes.

> If so, what would make allocating a gigantic page from the designated
> CMA area fail?

Nothing, I just summarized it.

>
>>
>> So we'd want to allow long-term pinning a gigantic page but we'd not
>> want to allow long-term pinning an ordinary huge page. We'd want to
>> migrate the latter away.
>
> Sure. Gigantic page was already CMA claimed page so there is no user
> in the future to claim the memory again so fine to allow longterm pin
> but ordinary huge page shouldn't be allowed since CMA owner could
> claim the memory some day.
>

Right.

>>
>>
>> The general rules are:
>>
>> ZONE_MOVABLE: nobody is allowed to place unmovable allocations there; it
>> could prevent memory offlining/unplug.
>>
>> CMA: nobody *but the designated owner* is allowed to place unmovable
>> memory there; it could prevent the actual owner to allocate contiguous
>> memory.
>
> I am confused what's the meaning of designated owner and actual owner
> in your context.

designated==actual here. I just wanted to distinguish between the
current temporary owner of the page ("allocated it via a movable
allocation") and the actual designated owner (e.g., hugetlb CMA).

The page/memory owner terminology is just confusing. Let's rephrase to:
only the CMA area owner is allowed to place unmovable allocations there.

>
> What I thought about the issue based on your explanation:
>
> HugeTLB allocates its page by two types of allocation
>
> 1. alloc_pages(GFP_MOVABLE)
>
> It could allocate the hugetlb page from CMA area but longterm pin
> should migrate them out of cma before the pinning so allowing
> the pinning on the page is no problem and current code works like
> that.
>
>     check_and_migrate_movable_pages
>

Yes.

> 2. cma_alloc
>
> The cma_alloc is used only for *gigantic page* and the hugetlbfs
> is the very owner of the page. IOW, if the hugetlbfs was succeeded
> to allocate the gigantic page by cma_alloc, there is no other
> owner to be able to claim the page any longer so it's fine to
> allow longterm pinning against the gigantic page.
> However, current code doesn't work like that due to
> is_pinnable_page. IOW, hugetlbfs need a way to distinguish
> whether the page owner is hugetlbfs or not.
>
> Are we on same page?

Yes, exactly. What I wanted to express is: for huge pages we have to
make a smarter decision because there are cases where we want to
migrate, and cases where we don't want to migrate.

-- 
Thanks,

David / dhildenb
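
To make the rules from this thread concrete, here is a toy model of the
decision we want for long-term pinning. It is plain Python, not kernel
code, and every name in it (Region, Page, allocated_by_cma_area_owner,
longterm_pin_action) is made up for illustration. It sketches the
behavior discussed above; as noted, the current is_pinnable_page check
does not make this distinction.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Region(Enum):
    """Where the page physically lives (illustrative, not kernel types)."""
    NORMAL = auto()
    ZONE_MOVABLE = auto()
    CMA = auto()


class PinAction(Enum):
    PIN_IN_PLACE = auto()      # safe to long-term pin where it is
    MIGRATE_THEN_PIN = auto()  # move the page out first, then pin


@dataclass(frozen=True)
class Page:
    region: Region
    # Hypothetical flag: True only if the CMA area owner itself handed out
    # this page (e.g. a gigantic hugetlb page obtained via cma_alloc).
    allocated_by_cma_area_owner: bool = False


def longterm_pin_action(page: Page) -> PinAction:
    """Apply the two general rules from the discussion."""
    # ZONE_MOVABLE: nobody may place unmovable allocations there.
    if page.region is Region.ZONE_MOVABLE:
        return PinAction.MIGRATE_THEN_PIN
    # CMA: only the CMA area owner may place unmovable allocations there.
    # A movable page that merely landed in the area must be migrated away.
    if page.region is Region.CMA and not page.allocated_by_cma_area_owner:
        return PinAction.MIGRATE_THEN_PIN
    return PinAction.PIN_IN_PLACE


# An ordinary 2MiB huge page from the buddy that landed in the CMA area.
ordinary_huge_in_cma = Page(Region.CMA)
# A gigantic page that hugetlb allocated from its own CMA area.
gigantic_from_cma = Page(Region.CMA, allocated_by_cma_area_owner=True)
```

With this model, ordinary_huge_in_cma gets migrated before the pin while
gigantic_from_cma is pinned in place, which is the "smarter decision"
for huge pages mentioned above.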