All of lore.kernel.org
 help / color / mirror / Atom feed
From: Tarun Sahu <tsahu@linux.ibm.com>
To: Mike Kravetz <mike.kravetz@oracle.com>,
	Matthew Wilcox <willy@infradead.org>
Cc: linux-mm@kvack.org, akpm@linux-foundation.org,
	muchun.song@linux.dev, aneesh.kumar@linux.ibm.com,
	sidhartha.kumar@oracle.com, gerald.schaefer@linux.ibm.com,
	linux-kernel@vger.kernel.org, jaypatel@linux.ibm.com
Subject: Re: [PATCH] mm/folio: Avoid special handling for order value 0 in folio_set_order
Date: Mon, 24 Apr 2023 21:10:23 +0530	[thread overview]
Message-ID: <87354p5lw8.fsf@linux.ibm.com> (raw)
In-Reply-To: <20230418185608.GA4907@monkey>


Hi Mike,


Mike Kravetz <mike.kravetz@oracle.com> writes:

> On 04/14/23 21:12, Matthew Wilcox wrote:
>> On Sat, Apr 15, 2023 at 01:18:32AM +0530, Tarun Sahu wrote:
>> > folio_set_order(folio, 0); which is an abuse of folio_set_order as 0-order
>> > folio does not have any tail page to set order.
>> 
>> I think you're missing the point of how folio_set_order() is used.
>> When splitting a large folio, we need to zero out the folio_nr_pages
>> in the tail, so it does have a tail page, and that tail page needs to
>> be zeroed.  We even assert that there is a tail page:
>> 
>>         if (WARN_ON_ONCE(!folio_test_large(folio)))
>>                 return;
>> 
>> Or maybe you need to explain yourself better.
>> 
>> > folio->_folio_nr_pages is
>> > set to 0 for order 0 in folio_set_order. It is required because
>> > _folio_nr_pages overlapped with page->mapping and leaving it non zero
>> > caused "bad page" error while freeing gigantic hugepages. This was fixed in
>> > Commit ba9c1201beaa ("mm/hugetlb: clear compound_nr before freeing gigantic
>> > pages"). Also commit a01f43901cfb ("hugetlb: be sure to free demoted CMA
>> > pages to CMA") now explicitly clear page->mapping and hence we won't see
>> > the bad page error even if _folio_nr_pages remains unset. Also the order 0
>> > folios are not supposed to call folio_set_order, So now we can get rid of
>> > folio_set_order(folio, 0) from hugetlb code path to clear the confusion.
>> 
>> ... this is all very confusing.
>> 
>> > The patch also moves _folio_set_head and folio_set_order calls in
>> > __prep_compound_gigantic_folio() such that we avoid clearing them in the
>> > error path.
>> 
>> But don't we need those bits set while we operate on the folio to set it
>> up?  It makes me nervous if we don't have those bits set because we can
>> end up with speculative references that point to a head page while that
>> page is not marked as a head page.  It may not be a problem, but I want
>> to see some air-tight analysis of that.
>
> I am fairly certain we are 'safe'.  Here is code before setting up the
> pointer to the head page.
>
> 		 * In the case of demote, the ref count will be zero.
> 		 */
> 		if (!demote) {
> 			if (!page_ref_freeze(p, 1)) {
> 				pr_warn("HugeTLB page can not be used due to unexpected inflated ref count\n");
> 				goto out_error;
> 			}
> 		} else {
> 			VM_BUG_ON_PAGE(page_count(p), p);
> 		}
> 		if (i != 0)
> 			set_compound_head(p, &folio->page);
>
> So, before setting the pointer to head page ref count will be zero.
>
> I 'think' it would actually be better to move the calls to _folio_set_head and
> folio_set_order in __prep_compound_gigantic_folio() as suggested here.  Why?
> In the current code, the ref count on the 'head page' is still 1 (or more)
> while those calls are made.  So, someone could take a speculative ref on the
> page BEFORE the tail pages are set up.
>

Thanks, for confirming the correctness of moving these calls. Also I
didn't look at it this way while moving them. Thanks for the comment.
I will update the commit msg and send the v2.

~Tarun

> TBH, I do not have much of an opinion about potential confusion surrounding
> folio_set_compound_order(folio, 0).  IIUC, hugetlb gigantic page setup is the
> only place outside the page allocation code that sets up compound pages/large
> folios.  So, it is going to be a bit 'special'.  As mentioned,  when this was
> originally discussed I suggested folio_clear_order().  I would be happy with
> either.


> -- 
> Mike Kravetz


  reply	other threads:[~2023-04-24 15:41 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-14 19:48 [PATCH] mm/folio: Avoid special handling for order value 0 in folio_set_order Tarun Sahu
2023-04-14 20:12 ` Matthew Wilcox
2023-04-18  9:03   ` Tarun Sahu
2023-04-18 18:56   ` Mike Kravetz
2023-04-24 15:40     ` Tarun Sahu [this message]
2023-04-24 15:31   ` Tarun Sahu
2023-04-14 21:35 ` Sidhartha Kumar
2023-04-18  9:20   ` Tarun Sahu
2023-04-18  9:25     ` Tarun Sahu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87354p5lw8.fsf@linux.ibm.com \
    --to=tsahu@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=gerald.schaefer@linux.ibm.com \
    --cc=jaypatel@linux.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=muchun.song@linux.dev \
    --cc=sidhartha.kumar@oracle.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.