linux-mm.kvack.org archive mirror
From: Alistair Popple <apopple@nvidia.com>
To: David Hildenbrand <david@redhat.com>
Cc: <linux-kernel@vger.kernel.org>, <akpm@linux-foundation.org>,
	<daniel.vetter@ffwll.ch>, <dan.j.williams@intel.com>,
	<gregkh@linuxfoundation.org>, <jhubbard@nvidia.com>,
	<jglisse@redhat.com>, <linux-mm@kvack.org>
Subject: Re: [PATCH v2] kernel/resource: Fix locking in request_free_mem_region
Date: Mon, 29 Mar 2021 12:37:49 +1100
Message-ID: <3158185.bARUjMUeyn@nvdebian>
In-Reply-To: <9eef1283-28a3-845e-0e3e-80b763c9ec59@redhat.com>

On Friday, 26 March 2021 7:57:51 PM AEDT David Hildenbrand wrote:
> On 26.03.21 02:20, Alistair Popple wrote:
> > request_free_mem_region() is used to find an empty range of physical
> > addresses for hotplugging ZONE_DEVICE memory. It does this by iterating
> > over the range of possible addresses using region_intersects() to see if
> > the range is free.
> 
> Just a high-level question: how does this interact with memory
> hot(un)plug? IOW, who defines and manages the "range of possible
> addresses"?

Both the driver and the maximum physical address bits available define the 
range of possible addresses for device private memory. From 
__request_free_mem_region():

end = min_t(unsigned long, base->end, (1UL << MAX_PHYSMEM_BITS) - 1);
addr = end - size + 1UL;

There is no lower address range bound here so it is effectively zero. The code 
will try to allocate the highest possible physical address first and continue 
searching down for a free block. Does that answer your question?
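
To make the search order concrete, here is a minimal userspace sketch of the
top-down walk. It is not the kernel code: DEMO_PHYSMEM_BITS, the demo_* names
and the flat array of busy ranges are all assumptions made for illustration,
standing in for MAX_PHYSMEM_BITS and the resource tree. It also ignores the
locking entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for MAX_PHYSMEM_BITS, kept small for the demo. */
#define DEMO_PHYSMEM_BITS 16

/* Inclusive [start, end] range that is already in use. */
struct demo_range {
	uint64_t start;
	uint64_t end;
};

/* Return true if [start, end] overlaps any entry in busy[0..n). */
bool demo_intersects(const struct demo_range *busy, size_t n,
		     uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < n; i++)
		if (busy[i].start <= end && start <= busy[i].end)
			return true;
	return false;
}

/*
 * Start at the highest possible block and step down by size until a
 * free block is found. Returns its start address, or UINT64_MAX if
 * nothing is free.
 */
uint64_t demo_find_free_top_down(const struct demo_range *busy, size_t n,
				 uint64_t base_end, uint64_t size)
{
	uint64_t limit = (UINT64_C(1) << DEMO_PHYSMEM_BITS) - 1;
	uint64_t end = base_end < limit ? base_end : limit;
	uint64_t addr;

	assert(size > 0);
	if (size - 1 > end)
		return UINT64_MAX;	/* block larger than the whole space */

	for (addr = end - size + 1; ; addr -= size) {
		if (!demo_intersects(busy, n, addr, addr + size - 1))
			return addr;
		if (addr < size)	/* next step down would underflow */
			break;
	}
	return UINT64_MAX;
}
```

With the top 0x1000 bytes already busy and a size of 0x1000, the sketch skips
the block at 0xF000 and returns 0xE000, the next block down.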

> >
> > region_intersects() obtains a read lock before walking the resource tree
> > to protect against concurrent changes. However it drops the lock prior
> > to returning. This means by the time request_mem_region() is called in
> > request_free_mem_region() another thread may have already reserved the
> > requested region resulting in unexpected failures and a message in the
> > kernel log from hitting this condition:
> 
> I am confused. Why can't we return an error to the caller and let the
> caller continue searching? This feels much simpler than what you propose
> here. What am I missing?

The search occurs as part of the allocation. To allocate memory, free space 
needs to be located and allocated as a single operation. However, in this case 
the lock is dropped between locating a free region and allocating it, resulting 
in an extra debug check firing and a subsequent failure.
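
A small userspace sketch of the difference, in case it helps. Everything here
is hypothetical (the demo_* names, the fixed slot array, a pthread mutex
standing in for the resource lock); it only illustrates the shape of the
problem, not the kernel implementation. demo_find_free() followed by
demo_reserve() drops the lock in between, so two threads can both be told the
same slot is free. demo_alloc() does the search and the reservation under a
single lock hold.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical: the address space is split into four equal slots. */
#define DEMO_SLOTS 4

struct demo_space {
	pthread_mutex_t lock;
	bool busy[DEMO_SLOTS];
};

/* Racy step 1: find the highest free slot, then drop the lock. */
int demo_find_free(struct demo_space *s)
{
	int found = -1;

	pthread_mutex_lock(&s->lock);
	for (int i = DEMO_SLOTS - 1; i >= 0; i--) {
		if (!s->busy[i]) {
			found = i;
			break;
		}
	}
	pthread_mutex_unlock(&s->lock);
	return found;	/* may be stale by the time the caller uses it */
}

/* Racy step 2: reserve a slot found earlier; fails if someone beat us. */
bool demo_reserve(struct demo_space *s, int slot)
{
	bool ok;

	assert(slot >= 0 && slot < DEMO_SLOTS);
	pthread_mutex_lock(&s->lock);
	ok = !s->busy[slot];
	if (ok)
		s->busy[slot] = true;
	pthread_mutex_unlock(&s->lock);
	return ok;
}

/* Search and reserve under one lock hold; -1 means no free slot. */
int demo_alloc(struct demo_space *s)
{
	int found = -1;

	pthread_mutex_lock(&s->lock);
	for (int i = DEMO_SLOTS - 1; i >= 0; i--) {
		if (!s->busy[i]) {
			s->busy[i] = true;
			found = i;
			break;
		}
	}
	pthread_mutex_unlock(&s->lock);
	return found;
}
```

With the split version, the second caller to reserve the slot they both found
gets a failure even though other slots are still free. With demo_alloc() the
only failure is a genuine lack of free space.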

I did originally consider just allowing the caller to retry, but in the end it 
didn't seem any simpler. Callers would have to differentiate between transient 
and permanent failures and figure out how often to retry, and no doubt each 
caller would do this differently. There is also the issue of starvation if one 
thread constantly loses the race to allocate after the search. Overall it 
seems simpler to me to just have a call that allocates a region (or fails due 
to lack of free space).

I also don't think what I am proposing is particularly complex. I agree the 
diff makes it look complex, but at a high level all I'm doing is moving the 
locking to outer function calls. It ends up looking more complex because there 
are some memory allocations which need reordering, but if things had originally 
been written this way I don't think it would be considered complex.
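
Roughly, the pattern looks like the sketch below. This is a userspace
illustration under my own assumptions, not the patch: the demo_* names, the
linked list and the pthread mutex are all made up for the example. The inner
helper expects the caller to already hold the lock, and the outer function
does the memory allocation first so that nothing needs to be allocated while
the lock is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reserved region, inclusive [start, end]. */
struct demo_res {
	uint64_t start;
	uint64_t end;
	struct demo_res *next;
};

struct demo_tree {
	pthread_mutex_t lock;
	struct demo_res *head;
};

/* Inner helper: the caller must hold t->lock. */
static bool demo_intersects_locked(const struct demo_tree *t,
				   uint64_t start, uint64_t end)
{
	for (const struct demo_res *r = t->head; r; r = r->next)
		if (r->start <= end && start <= r->end)
			return true;
	return false;
}

/*
 * Outer function: allocate before locking, then search and insert
 * under a single lock hold. Returns NULL if no free block exists.
 */
struct demo_res *demo_request_free_region(struct demo_tree *t,
					  uint64_t top, uint64_t size)
{
	struct demo_res *res;
	uint64_t addr;

	assert(size > 0 && size - 1 <= top);

	res = malloc(sizeof(*res));	/* done up front, outside the lock */
	if (!res)
		return NULL;

	pthread_mutex_lock(&t->lock);
	for (addr = top - size + 1; ; addr -= size) {
		if (!demo_intersects_locked(t, addr, addr + size - 1)) {
			res->start = addr;
			res->end = addr + size - 1;
			res->next = t->head;
			t->head = res;
			pthread_mutex_unlock(&t->lock);
			return res;
		}
		if (addr < size)	/* next step down would underflow */
			break;
	}
	pthread_mutex_unlock(&t->lock);

	free(res);	/* the allocation went unused */
	return NULL;
}
```

The cost of the reordering is the extra free() on the failure path, which is
where most of the apparent complexity in the diff comes from.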

 - Alistair

> --
> Thanks,
> 
> David / dhildenb
>

Thread overview: 14+ messages
2021-03-26  1:20 [PATCH v2] kernel/resource: Fix locking in request_free_mem_region Alistair Popple
2021-03-26  5:15 ` Balbir Singh
2021-03-29  1:55   ` Alistair Popple
2021-03-29  5:39     ` Balbir Singh
2021-03-26  8:57 ` David Hildenbrand
2021-03-29  1:37   ` Alistair Popple [this message]
2021-03-29  9:27     ` David Hildenbrand
2021-03-30  9:13     ` David Hildenbrand
2021-03-31  6:19       ` Alistair Popple
2021-03-31  6:41         ` David Hildenbrand
2021-03-29  5:42 ` [kernel/resource] cf1e4e12c9: WARNING:possible_recursive_locking_detected kernel test robot
2021-03-29  7:53   ` Alistair Popple
2021-04-01  4:56 ` [PATCH v2] kernel/resource: Fix locking in request_free_mem_region Muchun Song
2021-04-01  5:03   ` Alistair Popple
