All of lore.kernel.org
 help / color / mirror / Atom feed
From: Carlos Llamas <cmllamas@google.com>
To: Suren Baghdasaryan <surenb@google.com>
Cc: "Greg Kroah-Hartman" <gregkh@linuxfoundation.org>,
	"Arve Hjønnevåg" <arve@android.com>,
	"Todd Kjos" <tkjos@android.com>,
	"Martijn Coenen" <maco@android.com>,
	"Joel Fernandes" <joel@joelfernandes.org>,
	"Christian Brauner" <brauner@kernel.org>,
	linux-kernel@vger.kernel.org, kernel-team@android.com,
	"David Hildenbrand" <david@redhat.com>,
	"Barry Song" <v-songbaohua@oppo.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>
Subject: Re: [PATCH v2 2/8] binder: concurrent page installation
Date: Thu, 7 Nov 2024 15:31:31 +0000	[thread overview]
Message-ID: <ZyzdUyHr7S_cpsso@google.com> (raw)
In-Reply-To: <CAJuCfpGAr=Tm=JJtnrJ0ipFo2yqYSEVAMtx5aUU-k1F6prjYjQ@mail.gmail.com>

On Thu, Nov 07, 2024 at 07:10:28AM -0800, Suren Baghdasaryan wrote:
> On Wed, Nov 6, 2024 at 8:02 PM Carlos Llamas <cmllamas@google.com> wrote:
> >
> > Allow multiple callers to install pages simultaneously by downgrading
> > the mmap_sem to non-exclusive mode. Races to the same PTE are handled
> > using get_user_pages_remote() to retrieve the already installed page.
> > This method significantly reduces contention in the mmap semaphore.
> >
> > To ensure safety, vma_lookup() is used (instead of alloc->vma) to avoid
> > operating on an isolated VMA. In addition, zap_page_range_single() is
> > called under the alloc->mutex to avoid racing with the shrinker.
> >
> > Many thanks to Barry Song who posted a similar approach [1].
> >
> > Link: https://lore.kernel.org/all/20240902225009.34576-1-21cnbao@gmail.com/ [1]
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Barry Song <v-songbaohua@oppo.com>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Cc: Liam R. Howlett <Liam.Howlett@oracle.com>
> > Signed-off-by: Carlos Llamas <cmllamas@google.com>
> > ---
> >  drivers/android/binder_alloc.c | 64 +++++++++++++++++++++-------------
> >  1 file changed, 40 insertions(+), 24 deletions(-)
> >
> > diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> > index 7241bf4a3ff2..acdc05552603 100644
> > --- a/drivers/android/binder_alloc.c
> > +++ b/drivers/android/binder_alloc.c
> > @@ -221,26 +221,14 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> >                                       struct binder_lru_page *lru_page,
> >                                       unsigned long addr)
> >  {
> > +       struct vm_area_struct *vma;
> >         struct page *page;
> > -       int ret = 0;
> > +       long npages;
> > +       int ret;
> >
> >         if (!mmget_not_zero(alloc->mm))
> >                 return -ESRCH;
> >
> > -       /*
> > -        * Protected with mmap_sem in write mode as multiple tasks
> > -        * might race to install the same page.
> > -        */
> > -       mmap_write_lock(alloc->mm);
> > -       if (binder_get_installed_page(lru_page))
> > -               goto out;
> > -
> > -       if (!alloc->vma) {
> > -               pr_err("%d: %s failed, no vma\n", alloc->pid, __func__);
> > -               ret = -ESRCH;
> > -               goto out;
> > -       }
> > -
> >         page = alloc_page(GFP_KERNEL | __GFP_HIGHMEM | __GFP_ZERO);
> >         if (!page) {
> >                 pr_err("%d: failed to allocate page\n", alloc->pid);
> > @@ -248,19 +236,47 @@ static int binder_install_single_page(struct binder_alloc *alloc,
> >                 goto out;
> >         }
> >
> > -       ret = vm_insert_page(alloc->vma, addr, page);
> > -       if (ret) {
> > -               pr_err("%d: %s failed to insert page at offset %lx with %d\n",
> > -                      alloc->pid, __func__, addr - alloc->buffer, ret);
> > +       mmap_read_lock(alloc->mm);
> > +       vma = vma_lookup(alloc->mm, addr);
> > +       if (!vma || vma != alloc->vma) {
> > +               mmap_read_unlock(alloc->mm);
> 
> nit: instead of unlocking here you could have another label before
> mmap_read_unlock() at the end and jump to it.

Sounds good, I'll do this.

> 
> >                 __free_page(page);
> > -               ret = -ENOMEM;
> > +               pr_err("%d: %s failed, no vma\n", alloc->pid, __func__);
> > +               ret = -ESRCH;
> >                 goto out;
> >         }
> >
> > -       /* Mark page installation complete and safe to use */
> > -       binder_set_installed_page(lru_page, page);
> > +       ret = vm_insert_page(vma, addr, page);
> > +       switch (ret) {
> > +       case -EBUSY:
> > +               /*
> > +                * EBUSY is ok. Someone installed the pte first but the
> > +                * lru_page->page_ptr has not been updated yet. Discard
> > +                * our page and look up the one already installed.
> > +                */
> > +               ret = 0;
> > +               __free_page(page);
> > +               npages = get_user_pages_remote(alloc->mm, addr, 1, 0, &page, NULL);
> > +               if (npages <= 0) {
> > +                       pr_err("%d: failed to find page at offset %lx\n",
> > +                              alloc->pid, addr - alloc->buffer);
> > +                       ret = -ESRCH;
> > +                       break;
> > +               }
> > +               fallthrough;
> > +       case 0:
> > +               /* Mark page installation complete and safe to use */
> > +               binder_set_installed_page(lru_page, page);
> > +               break;
> > +       default:
> > +               __free_page(page);
> > +               pr_err("%d: %s failed to insert page at offset %lx with %d\n",
> > +                      alloc->pid, __func__, addr - alloc->buffer, ret);
> > +               ret = -ENOMEM;
> 
> vm_insert_page() can return EINVAL (see
> validate_page_before_insert()). Instead of converting other codes into
> ENOMEM why not return "ret" as is?

This is purely historical, binder has always propagated -ENOMEM to
userspace on errors from vm_insert_page() and I'm not sure why.

FWIW, I've dropped the behavior in the last patch and now I just forward
whatever vm_insert_page() returns. I had a look at libbinder code and it
doesn't really make a difference.

Perhaps, I should be explicit about this move and do it in a separate
commit.

Thanks,
--
Carlos Llamas

  reply	other threads:[~2024-11-07 15:31 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-07  4:02 [PATCH v2 0/8] binder: faster page installations Carlos Llamas
2024-11-07  4:02 ` [PATCH v2 1/8] Revert "binder: switch alloc->mutex to spinlock_t" Carlos Llamas
2024-11-07  8:56   ` Greg Kroah-Hartman
2024-11-07 14:30     ` Carlos Llamas
2024-11-07  4:02 ` [PATCH v2 2/8] binder: concurrent page installation Carlos Llamas
2024-11-07 15:10   ` Suren Baghdasaryan
2024-11-07 15:31     ` Carlos Llamas [this message]
2024-11-07  4:02 ` [PATCH v2 3/8] binder: select correct nid for pages in LRU Carlos Llamas
2024-11-07  4:02 ` [PATCH v2 4/8] binder: remove struct binder_lru_page Carlos Llamas
2024-11-07  4:02 ` [PATCH v2 5/8] binder: use alloc->mapped to save the vma state Carlos Llamas
2024-11-07  4:02 ` [PATCH v2 6/8] binder: remove cached alloc->vma pointer Carlos Llamas
2024-11-07 15:33   ` Suren Baghdasaryan
2024-11-07  4:02 ` [PATCH v2 7/8] binder: rename alloc->buffer to vm_start Carlos Llamas
2024-11-07  4:02 ` [PATCH v2 8/8] binder: use per-vma lock in page installation Carlos Llamas
2024-11-07 16:16   ` Suren Baghdasaryan
2024-11-07 17:55     ` Carlos Llamas
2024-11-07 18:04       ` Suren Baghdasaryan
2024-11-07 18:19         ` Carlos Llamas
2024-11-07 18:27           ` Suren Baghdasaryan
2024-11-07 18:52             ` Suren Baghdasaryan
2024-11-07 19:33               ` Carlos Llamas
2024-11-07 19:40                 ` Carlos Llamas
2024-11-07 16:18 ` [PATCH v2 0/8] binder: faster page installations Suren Baghdasaryan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZyzdUyHr7S_cpsso@google.com \
    --to=cmllamas@google.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=arve@android.com \
    --cc=brauner@kernel.org \
    --cc=david@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=joel@joelfernandes.org \
    --cc=kernel-team@android.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=maco@android.com \
    --cc=surenb@google.com \
    --cc=tkjos@android.com \
    --cc=v-songbaohua@oppo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.