public inbox for linux-kernel@vger.kernel.org
* [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA"
@ 2023-04-24 20:55 Carlos Llamas
  2023-04-24 20:55 ` [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to " Carlos Llamas
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Carlos Llamas @ 2023-04-24 20:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan
  Cc: linux-kernel, kernel-team, Liam Howlett

This reverts commit 44e602b4e52f70f04620bbbf4fe46ecb40170bde.

This caused a performance regression particularly when pages are getting
reclaimed. We don't need to acquire the mmap_lock to determine when the
binder buffer has been fully initialized. A subsequent patch will bring
back the lockless approach for this.

[cmllamas: resolved trivial conflicts with renaming of alloc->mm]

Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 drivers/android/binder_alloc.c | 31 ++++++++++---------------------
 1 file changed, 10 insertions(+), 21 deletions(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 55a3c3c2409f..92c814ec44fe 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -380,15 +380,12 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
 	size_t size, data_offsets_size;
 	int ret;
 
-	mmap_read_lock(alloc->mm);
 	if (!binder_alloc_get_vma(alloc)) {
-		mmap_read_unlock(alloc->mm);
 		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
 				   "%d: binder_alloc_buf, no vma\n",
 				   alloc->pid);
 		return ERR_PTR(-ESRCH);
 	}
-	mmap_read_unlock(alloc->mm);
 
 	data_offsets_size = ALIGN(data_size, sizeof(void *)) +
 		ALIGN(offsets_size, sizeof(void *));
@@ -916,25 +913,17 @@ void binder_alloc_print_pages(struct seq_file *m,
 	 * Make sure the binder_alloc is fully initialized, otherwise we might
 	 * read inconsistent state.
 	 */
-
-	mmap_read_lock(alloc->mm);
-	if (binder_alloc_get_vma(alloc) == NULL) {
-		mmap_read_unlock(alloc->mm);
-		goto uninitialized;
-	}
-
-	mmap_read_unlock(alloc->mm);
-	for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
-		page = &alloc->pages[i];
-		if (!page->page_ptr)
-			free++;
-		else if (list_empty(&page->lru))
-			active++;
-		else
-			lru++;
+	if (binder_alloc_get_vma(alloc) != NULL) {
+		for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) {
+			page = &alloc->pages[i];
+			if (!page->page_ptr)
+				free++;
+			else if (list_empty(&page->lru))
+				active++;
+			else
+				lru++;
+		}
 	}
-
-uninitialized:
 	mutex_unlock(&alloc->mutex);
 	seq_printf(m, "  pages: %d:%d:%d\n", active, lru, free);
 	seq_printf(m, "  pages high watermark: %zu\n", alloc->pages_high);
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to the VMA"
  2023-04-24 20:55 [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA" Carlos Llamas
@ 2023-04-24 20:55 ` Carlos Llamas
  2023-04-24 22:34   ` Liam R. Howlett
  2023-04-24 20:55 ` [RFC PATCH 3/3] binder: add lockless binder_alloc_(set|get)_vma() Carlos Llamas
  2023-04-25  5:19 ` [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA" Greg Kroah-Hartman
  2 siblings, 1 reply; 11+ messages in thread
From: Carlos Llamas @ 2023-04-24 20:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan
  Cc: linux-kernel, kernel-team, Liam Howlett

This reverts commit a43cfc87caaf46710c8027a8c23b8a55f1078f19.

This patch fixed an issue reported by syzkaller in [1]. However, this
turned out to be only a band-aid in binder. The root cause, as bisected
by syzkaller, was fixed by commit 5789151e48ac ("mm/mmap: undo ->mmap()
when mas_preallocate() fails"). We no longer need the patch for binder.

Reverting this patch allows us to have lockless access to alloc->vma
in specific cases where the mmap_lock is not required. This approach
avoids the contention that caused a performance regression.

[1] https://lore.kernel.org/all/0000000000004a0dbe05e1d749e0@google.com

[cmllamas: resolved conflicts with rework of alloc->mm and removal of
 binder_alloc_set_vma() also fixed comment section]

Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 drivers/android/binder_alloc.c          | 17 +++++++++--------
 drivers/android/binder_alloc.h          |  4 ++--
 drivers/android/binder_alloc_selftest.c |  2 +-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 92c814ec44fe..eb082b33115b 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -213,7 +213,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
 
 	if (mm) {
 		mmap_read_lock(mm);
-		vma = vma_lookup(mm, alloc->vma_addr);
+		vma = alloc->vma;
 	}
 
 	if (!vma && need_mm) {
@@ -314,9 +314,11 @@ static inline struct vm_area_struct *binder_alloc_get_vma(
 {
 	struct vm_area_struct *vma = NULL;
 
-	if (alloc->vma_addr)
-		vma = vma_lookup(alloc->mm, alloc->vma_addr);
-
+	if (alloc->vma) {
+		/* Look at description in binder_alloc_set_vma */
+		smp_rmb();
+		vma = alloc->vma;
+	}
 	return vma;
 }
 
@@ -775,7 +777,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
 	buffer->free = 1;
 	binder_insert_free_buffer(alloc, buffer);
 	alloc->free_async_space = alloc->buffer_size / 2;
-	alloc->vma_addr = vma->vm_start;
+	alloc->vma = vma;
 
 	return 0;
 
@@ -805,8 +807,7 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
 
 	buffers = 0;
 	mutex_lock(&alloc->mutex);
-	BUG_ON(alloc->vma_addr &&
-	       vma_lookup(alloc->mm, alloc->vma_addr));
+	BUG_ON(alloc->vma);
 
 	while ((n = rb_first(&alloc->allocated_buffers))) {
 		buffer = rb_entry(n, struct binder_buffer, rb_node);
@@ -958,7 +959,7 @@ int binder_alloc_get_allocated_count(struct binder_alloc *alloc)
  */
 void binder_alloc_vma_close(struct binder_alloc *alloc)
 {
-	alloc->vma_addr = 0;
+	alloc->vma = 0;
 }
 
 /**
diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
index 0f811ac4bcff..138d1d5af9ce 100644
--- a/drivers/android/binder_alloc.h
+++ b/drivers/android/binder_alloc.h
@@ -75,7 +75,7 @@ struct binder_lru_page {
 /**
  * struct binder_alloc - per-binder proc state for binder allocator
  * @mutex:              protects binder_alloc fields
- * @vma_addr:           vm_area_struct->vm_start passed to mmap_handler
+ * @vma:                vm_area_struct passed to mmap_handler
  *                      (invariant after mmap)
  * @mm:                 copy of task->mm (invariant after open)
  * @buffer:             base of per-proc address space mapped via mmap
@@ -99,7 +99,7 @@ struct binder_lru_page {
  */
 struct binder_alloc {
 	struct mutex mutex;
-	unsigned long vma_addr;
+	struct vm_area_struct *vma;
 	struct mm_struct *mm;
 	void __user *buffer;
 	struct list_head buffers;
diff --git a/drivers/android/binder_alloc_selftest.c b/drivers/android/binder_alloc_selftest.c
index 43a881073a42..c2b323bc3b3a 100644
--- a/drivers/android/binder_alloc_selftest.c
+++ b/drivers/android/binder_alloc_selftest.c
@@ -287,7 +287,7 @@ void binder_selftest_alloc(struct binder_alloc *alloc)
 	if (!binder_selftest_run)
 		return;
 	mutex_lock(&binder_selftest_lock);
-	if (!binder_selftest_run || !alloc->vma_addr)
+	if (!binder_selftest_run || !alloc->vma)
 		goto done;
 	pr_info("STARTED\n");
 	binder_selftest_alloc_offset(alloc, end_offset, 0);
-- 
2.40.0.634.g4ca3ef3211-goog



* [RFC PATCH 3/3] binder: add lockless binder_alloc_(set|get)_vma()
  2023-04-24 20:55 [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA" Carlos Llamas
  2023-04-24 20:55 ` [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to " Carlos Llamas
@ 2023-04-24 20:55 ` Carlos Llamas
  2023-04-25  5:19 ` [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA" Greg Kroah-Hartman
  2 siblings, 0 replies; 11+ messages in thread
From: Carlos Llamas @ 2023-04-24 20:55 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
	Suren Baghdasaryan
  Cc: linux-kernel, kernel-team, Liam Howlett

Bring back the original lockless design in binder_alloc to determine
whether the buffer setup has been completed by the ->mmap() handler.
However, this time use smp_load_acquire() and smp_store_release() to
wrap all the ordering in a single macro call.

Also, add comments to make it evident that binder uses alloc->vma to
determine when the binder_alloc has been fully initialized. In these
scenarios acquiring the mmap_lock is not required.

Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
---
 drivers/android/binder_alloc.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index eb082b33115b..9d166e10315e 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -309,17 +309,16 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
 	return vma ? -ENOMEM : -ESRCH;
 }
 
+static inline void binder_alloc_set_vma(struct binder_alloc *alloc,
+		struct vm_area_struct *vma)
+{
+	smp_store_release(&alloc->vma, vma);
+}
+
 static inline struct vm_area_struct *binder_alloc_get_vma(
 		struct binder_alloc *alloc)
 {
-	struct vm_area_struct *vma = NULL;
-
-	if (alloc->vma) {
-		/* Look at description in binder_alloc_set_vma */
-		smp_rmb();
-		vma = alloc->vma;
-	}
-	return vma;
+	return smp_load_acquire(&alloc->vma);
 }
 
 static bool debug_low_async_space_locked(struct binder_alloc *alloc, int pid)
@@ -382,6 +381,7 @@ static struct binder_buffer *binder_alloc_new_buf_locked(
 	size_t size, data_offsets_size;
 	int ret;
 
+	/* Check binder_alloc is fully initialized */
 	if (!binder_alloc_get_vma(alloc)) {
 		binder_alloc_debug(BINDER_DEBUG_USER_ERROR,
 				   "%d: binder_alloc_buf, no vma\n",
@@ -777,7 +777,9 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
 	buffer->free = 1;
 	binder_insert_free_buffer(alloc, buffer);
 	alloc->free_async_space = alloc->buffer_size / 2;
-	alloc->vma = vma;
+
+	/* Signal binder_alloc is fully initialized */
+	binder_alloc_set_vma(alloc, vma);
 
 	return 0;
 
@@ -959,7 +961,7 @@ int binder_alloc_get_allocated_count(struct binder_alloc *alloc)
  */
 void binder_alloc_vma_close(struct binder_alloc *alloc)
 {
-	alloc->vma = 0;
+	binder_alloc_set_vma(alloc, NULL);
 }
 
 /**
-- 
2.40.0.634.g4ca3ef3211-goog



* Re: [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to the VMA"
  2023-04-24 20:55 ` [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to " Carlos Llamas
@ 2023-04-24 22:34   ` Liam R. Howlett
  2023-04-24 23:11     ` Carlos Llamas
  0 siblings, 1 reply; 11+ messages in thread
From: Liam R. Howlett @ 2023-04-24 22:34 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, linux-mm

Cc linux-mm

* Carlos Llamas <cmllamas@google.com> [230424 16:56]:
> This reverts commit a43cfc87caaf46710c8027a8c23b8a55f1078f19.
> 
> This patch fixed an issue reported by syzkaller in [1]. However, this
> turned out to be only a band-aid in binder. The root cause, as bisected
> by syzkaller, was fixed by commit 5789151e48ac ("mm/mmap: undo ->mmap()
> when mas_preallocate() fails"). We no longer need the patch for binder.
> 
> Reverting this patch allows us to have lockless access to alloc->vma
> in specific cases where the mmap_lock is not required.

Can you elaborate on the situation where recording a VMA pointer and
later accessing it outside the mmap_lock is okay?


>This approach
> avoids the contention that caused a performance regression.
> 
> [1] https://lore.kernel.org/all/0000000000004a0dbe05e1d749e0@google.com
> 
> [cmllamas: resolved conflicts with rework of alloc->mm and removal of
>  binder_alloc_set_vma() also fixed comment section]
> 
> Cc: Liam Howlett <liam.howlett@oracle.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Carlos Llamas <cmllamas@google.com>
> ---
>  drivers/android/binder_alloc.c          | 17 +++++++++--------
>  drivers/android/binder_alloc.h          |  4 ++--
>  drivers/android/binder_alloc_selftest.c |  2 +-
>  3 files changed, 12 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index 92c814ec44fe..eb082b33115b 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -213,7 +213,7 @@ static int binder_update_page_range(struct binder_alloc *alloc, int allocate,
>  
>  	if (mm) {
>  		mmap_read_lock(mm);
> -		vma = vma_lookup(mm, alloc->vma_addr);
> +		vma = alloc->vma;
>  	}
>  
>  	if (!vma && need_mm) {
> @@ -314,9 +314,11 @@ static inline struct vm_area_struct *binder_alloc_get_vma(
>  {
>  	struct vm_area_struct *vma = NULL;
>  
> -	if (alloc->vma_addr)
> -		vma = vma_lookup(alloc->mm, alloc->vma_addr);
> -
> +	if (alloc->vma) {
> +		/* Look at description in binder_alloc_set_vma */
> +		smp_rmb();
> +		vma = alloc->vma;
> +	}
>  	return vma;
>  }
>  
> @@ -775,7 +777,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
>  	buffer->free = 1;
>  	binder_insert_free_buffer(alloc, buffer);
>  	alloc->free_async_space = alloc->buffer_size / 2;
> -	alloc->vma_addr = vma->vm_start;
> +	alloc->vma = vma;
>  
>  	return 0;
>  
> @@ -805,8 +807,7 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
>  
>  	buffers = 0;
>  	mutex_lock(&alloc->mutex);
> -	BUG_ON(alloc->vma_addr &&
> -	       vma_lookup(alloc->mm, alloc->vma_addr));
> +	BUG_ON(alloc->vma);
>  
>  	while ((n = rb_first(&alloc->allocated_buffers))) {
>  		buffer = rb_entry(n, struct binder_buffer, rb_node);
> @@ -958,7 +959,7 @@ int binder_alloc_get_allocated_count(struct binder_alloc *alloc)
>   */
>  void binder_alloc_vma_close(struct binder_alloc *alloc)
>  {
> -	alloc->vma_addr = 0;
> +	alloc->vma = 0;
>  }
>  
>  /**
> diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
> index 0f811ac4bcff..138d1d5af9ce 100644
> --- a/drivers/android/binder_alloc.h
> +++ b/drivers/android/binder_alloc.h
> @@ -75,7 +75,7 @@ struct binder_lru_page {
>  /**
>   * struct binder_alloc - per-binder proc state for binder allocator
>   * @mutex:              protects binder_alloc fields
> - * @vma_addr:           vm_area_struct->vm_start passed to mmap_handler
> + * @vma:                vm_area_struct passed to mmap_handler
>   *                      (invariant after mmap)
>   * @mm:                 copy of task->mm (invariant after open)
>   * @buffer:             base of per-proc address space mapped via mmap
> @@ -99,7 +99,7 @@ struct binder_lru_page {
>   */
>  struct binder_alloc {
>  	struct mutex mutex;
> -	unsigned long vma_addr;
> +	struct vm_area_struct *vma;
>  	struct mm_struct *mm;
>  	void __user *buffer;
>  	struct list_head buffers;
> diff --git a/drivers/android/binder_alloc_selftest.c b/drivers/android/binder_alloc_selftest.c
> index 43a881073a42..c2b323bc3b3a 100644
> --- a/drivers/android/binder_alloc_selftest.c
> +++ b/drivers/android/binder_alloc_selftest.c
> @@ -287,7 +287,7 @@ void binder_selftest_alloc(struct binder_alloc *alloc)
>  	if (!binder_selftest_run)
>  		return;
>  	mutex_lock(&binder_selftest_lock);
> -	if (!binder_selftest_run || !alloc->vma_addr)
> +	if (!binder_selftest_run || !alloc->vma)
>  		goto done;
>  	pr_info("STARTED\n");
>  	binder_selftest_alloc_offset(alloc, end_offset, 0);
> -- 
> 2.40.0.634.g4ca3ef3211-goog
> 


* Re: [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to the VMA"
  2023-04-24 22:34   ` Liam R. Howlett
@ 2023-04-24 23:11     ` Carlos Llamas
  2023-04-25  1:43       ` Liam R. Howlett
  0 siblings, 1 reply; 11+ messages in thread
From: Carlos Llamas @ 2023-04-24 23:11 UTC (permalink / raw)
  To: Liam R. Howlett, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, linux-mm

On Mon, Apr 24, 2023 at 06:34:19PM -0400, Liam R. Howlett wrote:
> Cc linux-mm
> 
> * Carlos Llamas <cmllamas@google.com> [230424 16:56]:
> > This reverts commit a43cfc87caaf46710c8027a8c23b8a55f1078f19.
> > 
> > This patch fixed an issue reported by syzkaller in [1]. However, this
> > turned out to be only a band-aid in binder. The root cause, as bisected
> > by syzkaller, was fixed by commit 5789151e48ac ("mm/mmap: undo ->mmap()
> > when mas_preallocate() fails"). We no longer need the patch for binder.
> > 
> > Reverting such patch allows us to have a lockless access to alloc->vma
> > in specific cases where the mmap_lock is not required.
> 
> Can you elaborate on the situation where recording a VMA pointer and
> later accessing it outside the mmap_lock is okay?

The specifics are in the third patch of this patchset but the gist of it
is that during the ->mmap() handler, binder will complete the
initialization of the binder_alloc structure, with the last step of this
process being the caching of the vma pointer. Since the ordering is
protected with a barrier, we can then check alloc->vma to determine if
the initialization has been completed.

Since this check is part of the critical path for every single binder
transaction, the performance plummeted when we started contending for
the mmap_lock. In this particular case, binder doesn't actually use the
vma. It only needs to know if the internal structure has been fully
initialized and it is safe to use it.

FWIW, this had been the design for ~15 years. The original patch is
this: https://git.kernel.org/torvalds/c/457b9a6f09f0


* Re: [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to the VMA"
  2023-04-24 23:11     ` Carlos Llamas
@ 2023-04-25  1:43       ` Liam R. Howlett
  2023-04-26 21:17         ` Carlos Llamas
  0 siblings, 1 reply; 11+ messages in thread
From: Liam R. Howlett @ 2023-04-25  1:43 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, linux-mm

* Carlos Llamas <cmllamas@google.com> [230424 19:11]:
> On Mon, Apr 24, 2023 at 06:34:19PM -0400, Liam R. Howlett wrote:
> > Cc linux-mm
> > 
> > * Carlos Llamas <cmllamas@google.com> [230424 16:56]:
> > > This reverts commit a43cfc87caaf46710c8027a8c23b8a55f1078f19.
> > > 
> > > This patch fixed an issue reported by syzkaller in [1]. However, this
> > > turned out to be only a band-aid in binder. The root cause, as bisected
> > > by syzkaller, was fixed by commit 5789151e48ac ("mm/mmap: undo ->mmap()
> > > when mas_preallocate() fails"). We no longer need the patch for binder.
> > > 
> > > Reverting this patch allows us to have lockless access to alloc->vma
> > > in specific cases where the mmap_lock is not required.
> > 
> > Can you elaborate on the situation where recording a VMA pointer and
> > later accessing it outside the mmap_lock is okay?
> 
> The specifics are in the third patch of this patchset but the gist of it
> is that during ->mmap() handler, binder will complete the initialization
> of the binder_alloc structure. With the last step of this process being
> the caching of the vma pointer. Since the ordering is protected with a
> barrier we can then check alloc->vma to determine if the initialization
> has been completed.
> 
> Since this check is part of the critical path for every single binder
> transaction, the performance plummeted when we started contending for
> the mmap_lock. In this particular case, binder doesn't actually use the
> vma.

So why does binder_update_page_range() take the mmap_read_lock then use
the cached vma in the reverted patch?

If you want to use it as a flag to see if the driver is initialized, why
not use the cached address != 0?

Or better yet,

>It only needs to know if the internal structure has been fully
> initialized and it is safe to use it.

This seems like a good reason to use your own rwsem.  This is,
essentially, rolling your own lock with
smp_store_release()/smp_load_acquire() and a pointer which should not be
cached.

> 
> FWIW, this had been the design for ~15 years. The original patch is
> this: https://git.kernel.org/torvalds/c/457b9a6f09f0


* Re: [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA"
  2023-04-24 20:55 [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA" Carlos Llamas
  2023-04-24 20:55 ` [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to " Carlos Llamas
  2023-04-24 20:55 ` [RFC PATCH 3/3] binder: add lockless binder_alloc_(set|get)_vma() Carlos Llamas
@ 2023-04-25  5:19 ` Greg Kroah-Hartman
  2023-04-26 21:22   ` Carlos Llamas
  2 siblings, 1 reply; 11+ messages in thread
From: Greg Kroah-Hartman @ 2023-04-25  5:19 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, Suren Baghdasaryan,
	linux-kernel, kernel-team, Liam Howlett

On Mon, Apr 24, 2023 at 08:55:46PM +0000, Carlos Llamas wrote:
> This reverts commit 44e602b4e52f70f04620bbbf4fe46ecb40170bde.
> 
> This caused a performance regression particularly when pages are getting
> reclaimed. We don't need to acquire the mmap_lock to determine when the
> binder buffer has been fully initialized. A subsequent patch will bring
> back the lockless approach for this.
> 
> [cmllamas: resolved trivial conflicts with renaming of alloc->mm]
> 
> Cc: Liam Howlett <liam.howlett@oracle.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Carlos Llamas <cmllamas@google.com>
> ---
>  drivers/android/binder_alloc.c | 31 ++++++++++---------------------
>  1 file changed, 10 insertions(+), 21 deletions(-)

Why is this series "RFC"?  What needs to be done to be able to submit it
as a real patch series?

Also, as the commits you are reverting are in older kernels, please
properly cc: stable in the signed-off-by area, and add a fixes: tag for
the commit you are reverting when you resend these as a real series.

thanks,

greg k-h


* Re: [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to the VMA"
  2023-04-25  1:43       ` Liam R. Howlett
@ 2023-04-26 21:17         ` Carlos Llamas
  2023-05-18 14:40           ` Liam R. Howlett
  0 siblings, 1 reply; 11+ messages in thread
From: Carlos Llamas @ 2023-04-26 21:17 UTC (permalink / raw)
  To: Liam R. Howlett, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, linux-mm

On Mon, Apr 24, 2023 at 09:43:28PM -0400, Liam R. Howlett wrote:
> * Carlos Llamas <cmllamas@google.com> [230424 19:11]:
> > 
> > The specifics are in the third patch of this patchset but the gist of it
> > is that during ->mmap() handler, binder will complete the initialization
> > of the binder_alloc structure. With the last step of this process being
> > the caching of the vma pointer. Since the ordering is protected with a
> > barrier we can then check alloc->vma to determine if the initialization
> > has been completed.
> > 
> > Since this check is part of the critical path for every single binder
> > transaction, the performance plummeted when we started contending for
> > the mmap_lock. In this particular case, binder doesn't actually use the
> > vma.
> 
> So why does binder_update_page_range() take the mmap_read_lock then use
> the cached vma in the reverted patch?
> 
> If you want to use it as a flag to see if the driver is initialized, why
> not use the cached address != 0?
> 
> Or better yet,
> 
> >It only needs to know if the internal structure has been fully
> > initialized and it is safe to use it.
> 
> This seems like a good reason to use your own rwsem.  This is,
> essentially, rolling your own lock with
> smp_store_release()/smp_load_acquire() and a pointer which should not be
> cached.

We can't use an rwsem to protect the initialization. We already have an
alloc->mutex which would be an option. However, using it under ->mmap()
would only lead to deadlocks with the mmap_lock.

I agree with you that we could use some other flag instead of the vma
pointer to signal the initialization. I've actually tried several times
to come up with a scenario in which caching the vma pointer becomes an
issue, which would justify stopping this altogether. However, I can't
find anything concrete.

I don't think the current solution, in which we do all these unnecessary
vma lookups, is correct. Instead, I'm currently working on a redesign of
this section in which binder stops allocating/inserting pages manually.
We should make use of the page-fault handler and let the infra handle
all the work. The overall idea is here:
https://lore.kernel.org/all/ZEGh4mliGHvyWIvo@google.com/

It's hard to make the case for just dropping the vma pointer after ~15
years and taking the performance hit without having an actual issue to
support this idea. So I'll revert this for now and keep working on the
page-fault solution.

Thanks Liam, I'll keep you in the loop.


* Re: [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA"
  2023-04-25  5:19 ` [RFC PATCH 1/3] Revert "binder_alloc: add missing mmap_lock calls when using the VMA" Greg Kroah-Hartman
@ 2023-04-26 21:22   ` Carlos Llamas
  0 siblings, 0 replies; 11+ messages in thread
From: Carlos Llamas @ 2023-04-26 21:22 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Arve Hjønnevåg, Todd Kjos, Martijn Coenen,
	Joel Fernandes, Christian Brauner, Suren Baghdasaryan,
	linux-kernel, kernel-team, Liam Howlett

On Tue, Apr 25, 2023 at 07:19:20AM +0200, Greg Kroah-Hartman wrote:
> On Mon, Apr 24, 2023 at 08:55:46PM +0000, Carlos Llamas wrote:
> > This reverts commit 44e602b4e52f70f04620bbbf4fe46ecb40170bde.
> > 
> > This caused a performance regression particularly when pages are getting
> > reclaimed. We don't need to acquire the mmap_lock to determine when the
> > binder buffer has been fully initialized. A subsequent patch will bring
> > back the lockless approach for this.
> > 
> > [cmllamas: resolved trivial conflicts with renaming of alloc->mm]
> > 
> > Cc: Liam Howlett <liam.howlett@oracle.com>
> > Cc: Suren Baghdasaryan <surenb@google.com>
> > Signed-off-by: Carlos Llamas <cmllamas@google.com>
> > ---
> >  drivers/android/binder_alloc.c | 31 ++++++++++---------------------
> >  1 file changed, 10 insertions(+), 21 deletions(-)
> 
> Why is this series "RFC"?  What needs to be done to be able to submit it
> as a real patch series?

No real reason. Just wanted to get feedback from the mm folks first.
I'll go ahead and submit the v1.

> 
> Also, as the commits you are reverting are in older kernels, please
> properly cc: stable in the signed-off-by area, and add a fixes: tag for
> the commit you are reverting when you resend these as a real series.

Sounds good, thanks!


* Re: [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to the VMA"
  2023-04-26 21:17         ` Carlos Llamas
@ 2023-05-18 14:40           ` Liam R. Howlett
  2023-05-18 17:03             ` Carlos Llamas
  0 siblings, 1 reply; 11+ messages in thread
From: Liam R. Howlett @ 2023-05-18 14:40 UTC (permalink / raw)
  To: Carlos Llamas
  Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
	Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, linux-mm

* Carlos Llamas <cmllamas@google.com> [230426 17:17]:
> On Mon, Apr 24, 2023 at 09:43:28PM -0400, Liam R. Howlett wrote:
> > * Carlos Llamas <cmllamas@google.com> [230424 19:11]:
> > > 
> > > The specifics are in the third patch of this patchset but the gist of it
> > > is that during ->mmap() handler, binder will complete the initialization
> > > of the binder_alloc structure. With the last step of this process being
> > > the caching of the vma pointer. Since the ordering is protected with a
> > > barrier we can then check alloc->vma to determine if the initialization
> > > has been completed.
> > > 
> > > Since this check is part of the critical path for every single binder
> > > transaction, the performance plummeted when we started contending for
> > > the mmap_lock. In this particular case, binder doesn't actually use the
> > > vma.
> > 
> > So why does binder_update_page_range() take the mmap_read_lock then use
> > the cached vma in the reverted patch?
> > 
> > If you want to use it as a flag to see if the driver is initialized, why
> > not use the cached address != 0?
> > 
> > Or better yet,
> > 
> > >It only needs to know if the internal structure has been fully
> > > initialized and it is safe to use it.
> > 
> > This seems like a good reason to use your own rwsem.  This is,
> > essentially, rolling your own lock with
> > smp_store_release()/smp_load_acquire() and a pointer which should not be
> > cached.
> 
> We can't use an rwsem to protect the initialization. We already have an
> alloc->mutex which would be an option. However, using it under ->mmap()
> would only lead to dead-locks with the mmap_lock.
> 
> I agree with you that we could use some other flag instead of the vma
> pointer to signal the initialization. I've actually tried several times
> to come up with a scenario in which caching the vma pointer becomes an
> issue, which would justify stopping this altogether. However, I can't
> find anything concrete.
> 
> I don't think the current solution, in which we do all these unnecessary
> vma lookups, is correct. Instead, I'm currently working on a redesign of
> this section in which binder stops allocating/inserting pages manually.
> We should make use of the page-fault handler and let the infra handle
> all the work. The overall idea is here:
> https://lore.kernel.org/all/ZEGh4mliGHvyWIvo@google.com/
> 
> It's hard to make the case for just dropping the vma pointer after ~15
> years and taking the performance hit without having an actual issue to
> support this idea. So I'll revert this for now and keep working on the
> page-fault solution.
> 

I came across this [1] when I was looking into something else and
thought I'd double back and make sure your fix for this UAF is also
included, since your revert will restore this bug.

I do still see the mmap_read_lock() in binder_update_page_range() vs the
required mmap_write_lock(), at least in my branch.

[1] https://lore.kernel.org/all/20221104175450.306810-1-cmllamas@google.com/

Thanks,
Liam


* Re: [RFC PATCH 2/3] Revert "android: binder: stop saving a pointer to the VMA"
  2023-05-18 14:40           ` Liam R. Howlett
@ 2023-05-18 17:03             ` Carlos Llamas
  0 siblings, 0 replies; 11+ messages in thread
From: Carlos Llamas @ 2023-05-18 17:03 UTC (permalink / raw)
  To: Liam R. Howlett, Greg Kroah-Hartman, Arve Hjønnevåg,
	Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner,
	Suren Baghdasaryan, linux-kernel, kernel-team, linux-mm

On Thu, May 18, 2023 at 10:40:52AM -0400, Liam R. Howlett wrote:
> 
> I came across this [1] when I was looking into something else and
> thought I'd double back and make sure your fix for this UAF is also
> included, since your revert will restore this bug.
> 
> I do still see the mmap_read_lock() in binder_update_page_range() vs the
> required mmap_write_lock(), at least in my branch.
> 
> [1] https://lore.kernel.org/all/20221104175450.306810-1-cmllamas@google.com/
> 

Thanks Liam, I believe you are correct. The UAF should trigger on newer
releases after the revert of your patch. I'll try to reproduce the issue
to confirm and will send the fix afterwards. This was a nice find!

Thanks,
--
Carlos Llamas


