* [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
@ 2024-06-14 4:09 Lei Liu
2024-06-14 18:38 ` Carlos Llamas
0 siblings, 1 reply; 9+ messages in thread
From: Lei Liu @ 2024-06-14 4:09 UTC (permalink / raw)
To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas,
Suren Baghdasaryan, linux-kernel
Cc: opensource.kernel, Lei Liu
1. In binder_alloc, there is a frequent need for order-3 memory
allocations, especially on small-memory mobile devices, where they can
lead to OOM and cause foreground applications to be killed and visibly
crash (force-close). The kernel call stack after the issue occurred is
as follows:
dumpsys invoked oom-killer:
gfp_mask=0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), order=3,
oom_score_adj=-950
CPU: 6 PID: 31329 Comm: dumpsys Tainted: G WC O
5.10.168-android12-9-00003-gc873b6b86254-ab10823632 #1
Call trace:
dump_backtrace.cfi_jt+0x0/0x8
dump_stack_lvl+0xdc/0x138
dump_header+0x5c/0x2ac
oom_kill_process+0x124/0x304
out_of_memory+0x25c/0x5e0
__alloc_pages_slowpath+0x690/0xf6c
__alloc_pages_nodemask+0x1f4/0x3dc
kmalloc_order+0x54/0x338
kmalloc_order_trace+0x34/0x1bc
__kmalloc+0x5e8/0x9c0
binder_alloc_mmap_handler+0x88/0x1f8
binder_mmap+0x90/0x10c
mmap_region+0x44c/0xc14
do_mmap+0x518/0x680
vm_mmap_pgoff+0x15c/0x378
ksys_mmap_pgoff+0x80/0x108
__arm64_sys_mmap+0x38/0x48
el0_svc_common+0xd4/0x270
el0_svc+0x28/0x98
el0_sync_handler+0x8c/0xf0
el0_sync+0x1b4/0x1c0
Mem-Info:
active_anon:47096 inactive_anon:57927 isolated_anon:100
active_file:43790 inactive_file:44434 isolated_file:0
unevictable:14693 dirty:171 writeback:0
slab_reclaimable:21676 slab_unreclaimable:81771
mapped:84485 shmem:4275 pagetables:33367 bounce:0
free:3772 free_pcp:198 free_cma:11
Node 0 active_anon:188384kB inactive_anon:231708kB active_file:175160kB
inactive_file:177736kB unevictable:58772kB isolated(anon):400kB
isolated(file):0kB mapped:337940kB dirty:684kB writeback:0kB
shmem:17100kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB
writeback_tmp:0kB kernel_stack:84960kB shadow_call_stack:21340kB
Normal free:15088kB min:8192kB low:42616kB high:46164kB
reserved_highatomic:4096KB active_anon:187644kB inactive_anon:231608kB
active_file:174552kB inactive_file:178012kB unevictable:58772kB
writepending:684kB present:3701440kB managed:3550144kB mlocked:58508kB
pagetables:133468kB bounce:0kB free_pcp:1048kB local_pcp:12kB
free_cma:44kB
Normal: 3313*4kB (UMEH) 165*8kB (UMEH) 35*16kB (H) 15*32kB (H) 0*64kB
0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15612kB
108356 total pagecache pages
2. Using kvcalloc to allocate this memory reduces system OOM
occurrences and lowers both the latency and the failure probability of
order-3 allocations. It can also improve binder throughput (as verified
with Google's binder_benchmark test tool).
3. We conducted multiple tests on a phone with 12 GB of memory, and
kvcalloc performed better. Below is a partial excerpt of the test data.
throughput = (size * Iterations)/Time
kvcalloc->kvmalloc:
Benchmark-kvcalloc Time CPU Iterations throughput(Gb/s)
----------------------------------------------------------------
BM_sendVec_binder-4096 30926 ns 20481 ns 34457 4563.66↑
BM_sendVec_binder-8192 42667 ns 30837 ns 22631 4345.11↑
BM_sendVec_binder-16384 67586 ns 52381 ns 13318 3228.51↑
BM_sendVec_binder-32768 116496 ns 94893 ns 7416 2085.97↑
BM_sendVec_binder-65536 265482 ns 209214 ns 3530 871.40↑
kcalloc->kmalloc:
Benchmark-kcalloc Time CPU Iterations throughput(Gb/s)
----------------------------------------------------------------
BM_sendVec_binder-4096 39070 ns 24207 ns 31063 3256.56
BM_sendVec_binder-8192 49476 ns 35099 ns 18817 3115.62
BM_sendVec_binder-16384 76866 ns 58924 ns 11883 2532.86
BM_sendVec_binder-32768 134022 ns 102788 ns 6535 1597.78
BM_sendVec_binder-65536 281004 ns 220028 ns 3135 731.14
Signed-off-by: Lei Liu <liulei.rjpt@vivo.com>
---
Changelog:
v2->v3:
1. Reword the commit message, as the v2 description was unclear.
---
drivers/android/binder_alloc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 2e1f261ec5c8..5dcab4a5e341 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -836,7 +836,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
alloc->buffer = vma->vm_start;
- alloc->pages = kcalloc(alloc->buffer_size / PAGE_SIZE,
+ alloc->pages = kvcalloc(alloc->buffer_size / PAGE_SIZE,
sizeof(alloc->pages[0]),
GFP_KERNEL);
if (alloc->pages == NULL) {
@@ -869,7 +869,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
return 0;
err_alloc_buf_struct_failed:
- kfree(alloc->pages);
+ kvfree(alloc->pages);
alloc->pages = NULL;
err_alloc_pages_failed:
alloc->buffer = 0;
@@ -939,7 +939,7 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
__free_page(alloc->pages[i].page_ptr);
page_count++;
}
- kfree(alloc->pages);
+ kvfree(alloc->pages);
}
spin_unlock(&alloc->lock);
if (alloc->mm)
--
2.34.1
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-14 4:09 [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues Lei Liu
@ 2024-06-14 18:38 ` Carlos Llamas
2024-06-17 4:01 ` Lei Liu
0 siblings, 1 reply; 9+ messages in thread
From: Carlos Llamas @ 2024-06-14 18:38 UTC (permalink / raw)
To: Lei Liu
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On Fri, Jun 14, 2024 at 12:09:29PM +0800, Lei Liu wrote:
> 1.In binder_alloc, there is a frequent need for order3 memory
> allocation, especially on small-memory mobile devices, which can lead
> to OOM and cause foreground applications to be killed, resulting in
> flashbacks.The kernel call stack after the issue occurred is as follows:
> dumpsys invoked oom-killer:
> gfp_mask=0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), order=3,
> oom_score_adj=-950
> CPU: 6 PID: 31329 Comm: dumpsys Tainted: G WC O
> 5.10.168-android12-9-00003-gc873b6b86254-ab10823632 #1
> Call trace:
> dump_backtrace.cfi_jt+0x0/0x8
> dump_stack_lvl+0xdc/0x138
> dump_header+0x5c/0x2ac
> oom_kill_process+0x124/0x304
> out_of_memory+0x25c/0x5e0
> __alloc_pages_slowpath+0x690/0xf6c
> __alloc_pages_nodemask+0x1f4/0x3dc
> kmalloc_order+0x54/0x338
> kmalloc_order_trace+0x34/0x1bc
> __kmalloc+0x5e8/0x9c0
> binder_alloc_mmap_handler+0x88/0x1f8
> binder_mmap+0x90/0x10c
> mmap_region+0x44c/0xc14
> do_mmap+0x518/0x680
> vm_mmap_pgoff+0x15c/0x378
> ksys_mmap_pgoff+0x80/0x108
> __arm64_sys_mmap+0x38/0x48
> el0_svc_common+0xd4/0x270
> el0_svc+0x28/0x98
> el0_sync_handler+0x8c/0xf0
> el0_sync+0x1b4/0x1c0
> Mem-Info:
> active_anon:47096 inactive_anon:57927 isolated_anon:100
> active_file:43790 inactive_file:44434 isolated_file:0
> unevictable:14693 dirty:171 writeback:0
> slab_reclaimable:21676 slab_unreclaimable:81771
> mapped:84485 shmem:4275 pagetables:33367 bounce:0
> free:3772 free_pcp:198 free_cma:11
> Node 0 active_anon:188384kB inactive_anon:231708kB active_file:175160kB
> inactive_file:177736kB unevictable:58772kB isolated(anon):400kB
> isolated(file):0kB mapped:337940kB dirty:684kB writeback:0kB
> shmem:17100kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB
> writeback_tmp:0kB kernel_stack:84960kB shadow_call_stack:21340kB
> Normal free:15088kB min:8192kB low:42616kB high:46164kB
> reserved_highatomic:4096KB active_anon:187644kB inactive_anon:231608kB
> active_file:174552kB inactive_file:178012kB unevictable:58772kB
> writepending:684kB present:3701440kB managed:3550144kB mlocked:58508kB
> pagetables:133468kB bounce:0kB free_pcp:1048kB local_pcp:12kB
> free_cma:44kB
> Normal: 3313*4kB (UMEH) 165*8kB (UMEH) 35*16kB (H) 15*32kB (H) 0*64kB
> 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15612kB
> 108356 total pagecache pages
Think about indenting this stacktrace. IMO, the v1 had a commit log that
was much easier to follow.
>
> 2.We use kvcalloc to allocate memory, which can reduce system OOM
> occurrences, as well as decrease the time and probability of failure
> for order3 memory allocations. Additionally, it can also improve the
> throughput of binder (as verified by Google's binder_benchmark testing
> tool).
>
> 3.We have conducted multiple tests on an 12GB memory phone, and the
> performance of kvcalloc is better. Below is a partial excerpt of the
> test data.
> throughput = (size * Iterations)/Time
Huh? Do you have an explanation for this performance improvement?
Did you test this under memory pressure?
My understanding is that kvcalloc() == kcalloc() if there is enough
contiguous memory no?
I would expect the performance to be the same at best.
> kvcalloc->kvmalloc:
> Benchmark-kvcalloc Time CPU Iterations throughput(Gb/s)
> ----------------------------------------------------------------
> BM_sendVec_binder-4096 30926 ns 20481 ns 34457 4563.66↑
> BM_sendVec_binder-8192 42667 ns 30837 ns 22631 4345.11↑
> BM_sendVec_binder-16384 67586 ns 52381 ns 13318 3228.51↑
> BM_sendVec_binder-32768 116496 ns 94893 ns 7416 2085.97↑
> BM_sendVec_binder-65536 265482 ns 209214 ns 3530 871.40↑
>
> kcalloc->kmalloc
> Benchmark-kcalloc Time CPU Iterations throughput(Gb/s)
> ----------------------------------------------------------------
> BM_sendVec_binder-4096 39070 ns 24207 ns 31063 3256.56
> BM_sendVec_binder-8192 49476 ns 35099 ns 18817 3115.62
> BM_sendVec_binder-16384 76866 ns 58924 ns 11883 2532.86
> BM_sendVec_binder-32768 134022 ns 102788 ns 6535 1597.78
> BM_sendVec_binder-65536 281004 ns 220028 ns 3135 731.14
>
> Signed-off-by: Lei Liu <liulei.rjpt@vivo.com>
> ---
> Changelog:
> v2->v3:
> 1.Modify the commit message description as the description for the V2
> version is unclear.
The complete history of the changelog would be better.
> ---
> drivers/android/binder_alloc.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
> index 2e1f261ec5c8..5dcab4a5e341 100644
> --- a/drivers/android/binder_alloc.c
> +++ b/drivers/android/binder_alloc.c
> @@ -836,7 +836,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
>
> alloc->buffer = vma->vm_start;
>
> - alloc->pages = kcalloc(alloc->buffer_size / PAGE_SIZE,
> + alloc->pages = kvcalloc(alloc->buffer_size / PAGE_SIZE,
> sizeof(alloc->pages[0]),
> GFP_KERNEL);
I believe Greg had asked for these to be aligned to the parenthesis.
You can double check by running checkpatch with the -strict flag.
> if (alloc->pages == NULL) {
> @@ -869,7 +869,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
> return 0;
>
> err_alloc_buf_struct_failed:
> - kfree(alloc->pages);
> + kvfree(alloc->pages);
> alloc->pages = NULL;
> err_alloc_pages_failed:
> alloc->buffer = 0;
> @@ -939,7 +939,7 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
> __free_page(alloc->pages[i].page_ptr);
> page_count++;
> }
> - kfree(alloc->pages);
> + kvfree(alloc->pages);
> }
> spin_unlock(&alloc->lock);
> if (alloc->mm)
> --
> 2.34.1
>
I'm not so sure about the results and performance improvements that are
claimed here. However, the switch to kvcalloc() itself seems reasonable
to me.
I'll run these tests myself as the results might have some noise. I'll
get back with the results.
Thanks,
Carlos Llamas
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-14 18:38 ` Carlos Llamas
@ 2024-06-17 4:01 ` Lei Liu
2024-06-17 18:43 ` Carlos Llamas
0 siblings, 1 reply; 9+ messages in thread
From: Lei Liu @ 2024-06-17 4:01 UTC (permalink / raw)
To: Carlos Llamas
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On 6/15/2024 at 2:38, Carlos Llamas wrote:
> On Fri, Jun 14, 2024 at 12:09:29PM +0800, Lei Liu wrote:
>> 1.In binder_alloc, there is a frequent need for order3 memory
>> allocation, especially on small-memory mobile devices, which can lead
>> to OOM and cause foreground applications to be killed, resulting in
>> flashbacks.The kernel call stack after the issue occurred is as follows:
>> dumpsys invoked oom-killer:
>> gfp_mask=0x40dc0(GFP_KERNEL|__GFP_COMP|__GFP_ZERO), order=3,
>> oom_score_adj=-950
>> CPU: 6 PID: 31329 Comm: dumpsys Tainted: G WC O
>> 5.10.168-android12-9-00003-gc873b6b86254-ab10823632 #1
>> Call trace:
>> dump_backtrace.cfi_jt+0x0/0x8
>> dump_stack_lvl+0xdc/0x138
>> dump_header+0x5c/0x2ac
>> oom_kill_process+0x124/0x304
>> out_of_memory+0x25c/0x5e0
>> __alloc_pages_slowpath+0x690/0xf6c
>> __alloc_pages_nodemask+0x1f4/0x3dc
>> kmalloc_order+0x54/0x338
>> kmalloc_order_trace+0x34/0x1bc
>> __kmalloc+0x5e8/0x9c0
>> binder_alloc_mmap_handler+0x88/0x1f8
>> binder_mmap+0x90/0x10c
>> mmap_region+0x44c/0xc14
>> do_mmap+0x518/0x680
>> vm_mmap_pgoff+0x15c/0x378
>> ksys_mmap_pgoff+0x80/0x108
>> __arm64_sys_mmap+0x38/0x48
>> el0_svc_common+0xd4/0x270
>> el0_svc+0x28/0x98
>> el0_sync_handler+0x8c/0xf0
>> el0_sync+0x1b4/0x1c0
>> Mem-Info:
>> active_anon:47096 inactive_anon:57927 isolated_anon:100
>> active_file:43790 inactive_file:44434 isolated_file:0
>> unevictable:14693 dirty:171 writeback:0
>> slab_reclaimable:21676 slab_unreclaimable:81771
>> mapped:84485 shmem:4275 pagetables:33367 bounce:0
>> free:3772 free_pcp:198 free_cma:11
>> Node 0 active_anon:188384kB inactive_anon:231708kB active_file:175160kB
>> inactive_file:177736kB unevictable:58772kB isolated(anon):400kB
>> isolated(file):0kB mapped:337940kB dirty:684kB writeback:0kB
>> shmem:17100kB shmem_thp: 0kB shmem_pmdmapped: 0kB anon_thp: 0kB
>> writeback_tmp:0kB kernel_stack:84960kB shadow_call_stack:21340kB
>> Normal free:15088kB min:8192kB low:42616kB high:46164kB
>> reserved_highatomic:4096KB active_anon:187644kB inactive_anon:231608kB
>> active_file:174552kB inactive_file:178012kB unevictable:58772kB
>> writepending:684kB present:3701440kB managed:3550144kB mlocked:58508kB
>> pagetables:133468kB bounce:0kB free_pcp:1048kB local_pcp:12kB
>> free_cma:44kB
>> Normal: 3313*4kB (UMEH) 165*8kB (UMEH) 35*16kB (H) 15*32kB (H) 0*64kB
>> 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 15612kB
>> 108356 total pagecache pages
> Think about indenting this stacktrace. IMO, the v1 had a commit log that
> was much easier to follow.
Okay, that is a good suggestion. I will update the next version
accordingly and trim the stack trace.
>> 2.We use kvcalloc to allocate memory, which can reduce system OOM
>> occurrences, as well as decrease the time and probability of failure
>> for order3 memory allocations. Additionally, it can also improve the
>> throughput of binder (as verified by Google's binder_benchmark testing
>> tool).
>>
>> 3.We have conducted multiple tests on an 12GB memory phone, and the
>> performance of kvcalloc is better. Below is a partial excerpt of the
>> test data.
>> throughput = (size * Iterations)/Time
> Huh? Do you have an explanation for this performance improvement?
> Did you test this under memory pressure?
Hmm, in our mobile project, we often encounter OOM and application
crashes under stress testing.
> My understanding is that kvcalloc() == kcalloc() if there is enough
> contiguous memory no?
>
> I would expect the performance to be the same at best.
1. The main reason is memory fragmentation, which leaves us unable to
allocate contiguous order-3 memory. In addition, with the GFP_KERNEL
allocation flag, the kernel's __alloc_pages_slowpath() makes multiple
retry attempts, and if direct reclaim and memory compaction are
unsuccessful, OOM occurs.
2. When fragmentation is severe, we observed that kvmalloc is faster
than kmalloc, as it avoids the repeated retries needed to satisfy an
order-3 allocation. In such cases, falling back to order-0 pages can
give higher allocation efficiency.
3. Another crucial point is that order-3 sits right at the kernel's
PAGE_ALLOC_COSTLY_ORDER boundary, which affects how many retry attempts
__alloc_pages_slowpath() makes and explains the increased time for
order-3 allocations in fragmented scenarios.
In summary, under high memory pressure the system is prone to
fragmentation. Rather than waiting on an order-3 allocation, it is more
efficient to let kvmalloc automatically choose between order-0 and
order-3, reducing wait times. This is also why kvmalloc can improve
throughput.
>> kvcalloc->kvmalloc:
>> Benchmark-kvcalloc Time CPU Iterations throughput(Gb/s)
>> ----------------------------------------------------------------
>> BM_sendVec_binder-4096 30926 ns 20481 ns 34457 4563.66↑
>> BM_sendVec_binder-8192 42667 ns 30837 ns 22631 4345.11↑
>> BM_sendVec_binder-16384 67586 ns 52381 ns 13318 3228.51↑
>> BM_sendVec_binder-32768 116496 ns 94893 ns 7416 2085.97↑
>> BM_sendVec_binder-65536 265482 ns 209214 ns 3530 871.40↑
>>
>> kcalloc->kmalloc
>> Benchmark-kcalloc Time CPU Iterations throughput(Gb/s)
>> ----------------------------------------------------------------
>> BM_sendVec_binder-4096 39070 ns 24207 ns 31063 3256.56
>> BM_sendVec_binder-8192 49476 ns 35099 ns 18817 3115.62
>> BM_sendVec_binder-16384 76866 ns 58924 ns 11883 2532.86
>> BM_sendVec_binder-32768 134022 ns 102788 ns 6535 1597.78
>> BM_sendVec_binder-65536 281004 ns 220028 ns 3135 731.14
>>
>> Signed-off-by: Lei Liu <liulei.rjpt@vivo.com>
>> ---
>> Changelog:
>> v2->v3:
>> 1.Modify the commit message description as the description for the V2
>> version is unclear.
> The complete history of the changelog would be better.
>
>> ---
>> drivers/android/binder_alloc.c | 6 +++---
>> 1 file changed, 3 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
>> index 2e1f261ec5c8..5dcab4a5e341 100644
>> --- a/drivers/android/binder_alloc.c
>> +++ b/drivers/android/binder_alloc.c
>> @@ -836,7 +836,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
>>
>> alloc->buffer = vma->vm_start;
>>
>> - alloc->pages = kcalloc(alloc->buffer_size / PAGE_SIZE,
>> + alloc->pages = kvcalloc(alloc->buffer_size / PAGE_SIZE,
>> sizeof(alloc->pages[0]),
>> GFP_KERNEL);
> I believe Greg had asked for these to be aligned to the parenthesis.
> You can double check by running checkpatch with the -strict flag.
Okay, I'll double-check the formatting of the patch again.
>> if (alloc->pages == NULL) {
>> @@ -869,7 +869,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
>> return 0;
>>
>> err_alloc_buf_struct_failed:
>> - kfree(alloc->pages);
>> + kvfree(alloc->pages);
>> alloc->pages = NULL;
>> err_alloc_pages_failed:
>> alloc->buffer = 0;
>> @@ -939,7 +939,7 @@ void binder_alloc_deferred_release(struct binder_alloc *alloc)
>> __free_page(alloc->pages[i].page_ptr);
>> page_count++;
>> }
>> - kfree(alloc->pages);
>> + kvfree(alloc->pages);
>> }
>> spin_unlock(&alloc->lock);
>> if (alloc->mm)
>> --
>> 2.34.1
>>
> I'm not so sure about the results and performance improvements that are
> claimed here. However, the switch to kvcalloc() itself seems reasonable
> to me.
>
> I'll run these tests myself as the results might have some noise. I'll
> get back with the results.
>
> Thanks,
> Carlos Llamas
Okay, thank you for the suggestion. I look forward to receiving your
test results and continuing our discussion.
My testing tool is the binder throughput testing tool provided by
Google. You can give it a try here:
https://source.android.com/docs/core/tests/vts/performance
Thanks,
Lei Liu
>
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-17 4:01 ` Lei Liu
@ 2024-06-17 18:43 ` Carlos Llamas
2024-06-18 2:50 ` Lei Liu
0 siblings, 1 reply; 9+ messages in thread
From: Carlos Llamas @ 2024-06-17 18:43 UTC (permalink / raw)
To: Lei Liu
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On Mon, Jun 17, 2024 at 12:01:26PM +0800, Lei Liu wrote:
> On 6/15/2024 at 2:38, Carlos Llamas wrote:
> > My understanding is that kvcalloc() == kcalloc() if there is enough
> > contiguous memory no?
> >
> > I would expect the performance to be the same at best.
>
> 1.The main reason is memory fragmentation, where we are unable to
> allocate contiguous order3 memory. Additionally, using the GFP_KERNEL
> allocation flag in the kernel's __alloc_pages_slowpath function results
> in multiple retry attempts, and if direct_reclaim and memory_compact
> are unsuccessful, OOM occurs.
>
> 2.When fragmentation is severe, we observed that kvmalloc is faster
> than kmalloc, as it eliminates the need for multiple retry attempts to
> allocate order3. In such cases, falling back to order0 may result in
> higher allocation efficiency.
>
> 3.Another crucial point is that in the kernel, allocations greater than
> order3 are considered PAGE_ALLOC_COSTLY_ORDER. This leads to a reduced
> number of retry attempts in __alloc_pages_slowpath, which explains the
> increased time for order3 allocation in fragmented scenarios.
>
> In summary, under high memory pressure scenarios, the system is prone
> to fragmentation. Instead of waiting for order3 allocation, it is more
> efficient to allow kvmalloc to automatically select between order0 and
> order3, reducing wait times in high memory pressure scenarios. This is
> also the reason why kvmalloc can improve throughput.
Yes, all this makes sense. What I don't understand is the claim that
the "performance of kvcalloc is better". That is not supposed to happen.
> > I'm not so sure about the results and performance improvements that are
> > claimed here. However, the switch to kvcalloc() itself seems reasonable
> > to me.
> >
> > I'll run these tests myself as the results might have some noise. I'll
> > get back with the results.
> >
> > Thanks,
> > Carlos Llamas
>
> Okay, thank you for the suggestion. I look forward to receiving your
> test results and continuing our discussion.
>
I ran several iterations of the benchmark test on a Pixel device and as
expected I didn't see any significant differences. This is a good thing,
but either we need to understand how you obtained a better performance
from using kvcalloc(), or it would be better to drop this claim from the
commit log.
The following are two individual samples of each form. However, if we
could average the output and get rid of the noise it seems the numbers
are pretty much the same.
Sample with kcalloc():
------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------
BM_sendVec_binder/4 19983 ns 9832 ns 60255
BM_sendVec_binder/8 19766 ns 9690 ns 71699
BM_sendVec_binder/16 19785 ns 9722 ns 72086
BM_sendVec_binder/32 20067 ns 9864 ns 71535
BM_sendVec_binder/64 20077 ns 9941 ns 69141
BM_sendVec_binder/128 20147 ns 9944 ns 71016
BM_sendVec_binder/256 20424 ns 10044 ns 69451
BM_sendVec_binder/512 20518 ns 10064 ns 69179
BM_sendVec_binder/1024 21073 ns 10319 ns 67599
BM_sendVec_binder/2048 21482 ns 10502 ns 66767
BM_sendVec_binder/4096 22308 ns 10809 ns 63841
BM_sendVec_binder/8192 24022 ns 11649 ns 60795
BM_sendVec_binder/16384 27172 ns 13426 ns 51940
BM_sendVec_binder/32768 32853 ns 16345 ns 42211
BM_sendVec_binder/65536 80177 ns 39787 ns 17557
Sample with kvcalloc():
------------------------------------------------------------------
Benchmark Time CPU Iterations
------------------------------------------------------------------
BM_sendVec_binder/4 19900 ns 9711 ns 68626
BM_sendVec_binder/8 19903 ns 9756 ns 71524
BM_sendVec_binder/16 19601 ns 9541 ns 71069
BM_sendVec_binder/32 19514 ns 9530 ns 72469
BM_sendVec_binder/64 20042 ns 10006 ns 69753
BM_sendVec_binder/128 20142 ns 9965 ns 70392
BM_sendVec_binder/256 20274 ns 9958 ns 70173
BM_sendVec_binder/512 20305 ns 9966 ns 70347
BM_sendVec_binder/1024 20883 ns 10250 ns 67813
BM_sendVec_binder/2048 21364 ns 10455 ns 67366
BM_sendVec_binder/4096 22350 ns 10888 ns 65689
BM_sendVec_binder/8192 24113 ns 11707 ns 58149
BM_sendVec_binder/16384 27122 ns 13346 ns 52515
BM_sendVec_binder/32768 32158 ns 15901 ns 44139
BM_sendVec_binder/65536 87594 ns 43627 ns 16040
To reiterate, the switch to kvcalloc() sounds good to me. Let's just fix
the commit log and Greg's suggestions too.
Thanks,
Carlos Llamas
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-17 18:43 ` Carlos Llamas
@ 2024-06-18 2:50 ` Lei Liu
2024-06-18 4:37 ` Carlos Llamas
0 siblings, 1 reply; 9+ messages in thread
From: Lei Liu @ 2024-06-18 2:50 UTC (permalink / raw)
To: Carlos Llamas
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On 2024/6/18 2:43, Carlos Llamas wrote:
> On Mon, Jun 17, 2024 at 12:01:26PM +0800, Lei Liu wrote:
>> On 6/15/2024 at 2:38, Carlos Llamas wrote:
>>> My understanding is that kvcalloc() == kcalloc() if there is enough
>>> contiguous memory no?
>>>
>>> I would expect the performance to be the same at best.
>> 1.The main reason is memory fragmentation, where we are unable to
>> allocate contiguous order3 memory. Additionally, using the GFP_KERNEL
>> allocation flag in the kernel's __alloc_pages_slowpath function results
>> in multiple retry attempts, and if direct_reclaim and memory_compact
>> are unsuccessful, OOM occurs.
>>
>> 2.When fragmentation is severe, we observed that kvmalloc is faster
>> than kmalloc, as it eliminates the need for multiple retry attempts to
>> allocate order3. In such cases, falling back to order0 may result in
>> higher allocation efficiency.
>>
>> 3.Another crucial point is that in the kernel, allocations greater than
>> order3 are considered PAGE_ALLOC_COSTLY_ORDER. This leads to a reduced
>> number of retry attempts in __alloc_pages_slowpath, which explains the
>> increased time for order3 allocation in fragmented scenarios.
>>
>> In summary, under high memory pressure scenarios, the system is prone
>> to fragmentation. Instead of waiting for order3 allocation, it is more
>> efficient to allow kvmalloc to automatically select between order0 and
>> order3, reducing wait times in high memory pressure scenarios. This is
>> also the reason why kvmalloc can improve throughput.
> Yes, all this makes sense. What I don't understand is how "performance
> of kvcalloc is better". This is not supposed to be.
Based on my current understanding:
1. kvmalloc may allocate memory faster than kmalloc under memory
fragmentation, which could improve binder performance.
2. Memory allocated by kvmalloc may not be physically contiguous, which
could degrade binder's data read/write speed.
I'm uncertain about the relative impact of these two effects and would
be interested in your perspective.
>>> I'm not so sure about the results and performance improvements that are
>>> claimed here. However, the switch to kvcalloc() itself seems reasonable
>>> to me.
>>>
>>> I'll run these tests myself as the results might have some noise. I'll
>>> get back with the results.
>>>
>>> Thanks,
>>> Carlos Llamas
>> Okay, thank you for the suggestion. I look forward to receiving your
>> test results and continuing our discussion.
>>
> I ran several iterations of the benchmark test on a Pixel device and as
> expected I didn't see any significant differences. This is a good thing,
> but either we need to understand how you obtained a better performance
> from using kvcalloc(), or it would be better to drop this claim from the
> commit log.
>
> The following are two individual samples of each form. However, if we
> could average the output and get rid of the noise it seems the numbers
> are pretty much the same.
>
> Sample with kcalloc():
> ------------------------------------------------------------------
> Benchmark Time CPU Iterations
> ------------------------------------------------------------------
> BM_sendVec_binder/4 19983 ns 9832 ns 60255
> BM_sendVec_binder/8 19766 ns 9690 ns 71699
> BM_sendVec_binder/16 19785 ns 9722 ns 72086
> BM_sendVec_binder/32 20067 ns 9864 ns 71535
> BM_sendVec_binder/64 20077 ns 9941 ns 69141
> BM_sendVec_binder/128 20147 ns 9944 ns 71016
> BM_sendVec_binder/256 20424 ns 10044 ns 69451
> BM_sendVec_binder/512 20518 ns 10064 ns 69179
> BM_sendVec_binder/1024 21073 ns 10319 ns 67599
> BM_sendVec_binder/2048 21482 ns 10502 ns 66767
> BM_sendVec_binder/4096 22308 ns 10809 ns 63841
> BM_sendVec_binder/8192 24022 ns 11649 ns 60795
> BM_sendVec_binder/16384 27172 ns 13426 ns 51940
> BM_sendVec_binder/32768 32853 ns 16345 ns 42211
> BM_sendVec_binder/65536 80177 ns 39787 ns 17557
>
> Sample with kvalloc():
> ------------------------------------------------------------------
> Benchmark Time CPU Iterations
> ------------------------------------------------------------------
> BM_sendVec_binder/4 19900 ns 9711 ns 68626
> BM_sendVec_binder/8 19903 ns 9756 ns 71524
> BM_sendVec_binder/16 19601 ns 9541 ns 71069
> BM_sendVec_binder/32 19514 ns 9530 ns 72469
> BM_sendVec_binder/64 20042 ns 10006 ns 69753
> BM_sendVec_binder/128 20142 ns 9965 ns 70392
> BM_sendVec_binder/256 20274 ns 9958 ns 70173
> BM_sendVec_binder/512 20305 ns 9966 ns 70347
> BM_sendVec_binder/1024 20883 ns 10250 ns 67813
> BM_sendVec_binder/2048 21364 ns 10455 ns 67366
> BM_sendVec_binder/4096 22350 ns 10888 ns 65689
> BM_sendVec_binder/8192 24113 ns 11707 ns 58149
> BM_sendVec_binder/16384 27122 ns 13346 ns 52515
> BM_sendVec_binder/32768 32158 ns 15901 ns 44139
> BM_sendVec_binder/65536 87594 ns 43627 ns 16040
>
> To reiterate, the switch to kvcalloc() sounds good to me. Let's just fix
> the commit log and Greg's suggestions too.
>
> Thanks,
> Carlos Llamas
Hmm, this is really good news. From the current test results, it seems
that kvmalloc does not degrade binder performance.
I will retest on our phone to see if we reach the same conclusion. If
kvmalloc still proves better, we will provide you with the reproduction
steps.
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-18 2:50 ` Lei Liu
@ 2024-06-18 4:37 ` Carlos Llamas
2024-06-19 8:35 ` Lei Liu
2024-06-19 8:44 ` Lei Liu
0 siblings, 2 replies; 9+ messages in thread
From: Carlos Llamas @ 2024-06-18 4:37 UTC (permalink / raw)
To: Lei Liu
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On Tue, Jun 18, 2024 at 10:50:17AM +0800, Lei Liu wrote:
>
> On 2024/6/18 2:43, Carlos Llamas wrote:
> > On Mon, Jun 17, 2024 at 12:01:26PM +0800, Lei Liu wrote:
> > > On 6/15/2024 at 2:38, Carlos Llamas wrote:
> > Yes, all this makes sense. What I don't understand is how "performance
> > of kvcalloc is better". This is not supposed to be.
>
> Based on my current understanding:
> 1.kvmalloc may allocate memory faster than kmalloc in cases of memory
> fragmentation, which could potentially improve the performance of binder.
I think there is a misunderstanding of the allocations performed in this
benchmark test. Yes, in general when there is heavy memory pressure it
can be faster to use kvmalloc() and not try too hard to reclaim
contiguous memory.
In the case of binder though, this is the mmap() allocation. This call
is part of the "initial setup". In the test, there should only be two
calls to kvmalloc(), since the benchmark is done across two processes.
That's it.
So the time it takes to allocate this memory is irrelevant to the
performance results. Does this make sense?
> 2.Memory allocated by kvmalloc may not be contiguous, which could
> potentially degrade the data read and write speed of binder.
This _is_ what is being considered in the benchmark test instead. There
are repeated accesses to alloc->pages[n]. Your point is then the reason
why I was expecting "same performance at best".
> Hmm, this is really good news. From the current test results, it seems that
> kvmalloc does not degrade performance for binder.
Yeah, not in the "happy" case anyways. I'm not sure what the numbers
look like under some memory pressure.
> I will retest the data on our phone to see if we reach the same conclusion.
> If kvmalloc still proves to be better, we will provide you with the
> reproduction method.
>
Ok, thanks. I would suggest you do an "adb shell stop" before running
these tests. This might help with the noise.
Thanks,
Carlos Llamas
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-18 4:37 ` Carlos Llamas
@ 2024-06-19 8:35 ` Lei Liu
2024-06-19 8:44 ` Lei Liu
1 sibling, 0 replies; 9+ messages in thread
From: Lei Liu @ 2024-06-19 8:35 UTC (permalink / raw)
To: Carlos Llamas
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On 2024/6/18 12:37, Carlos Llamas wrote:
> On Tue, Jun 18, 2024 at 10:50:17AM +0800, Lei Liu wrote:
>> On 2024/6/18 2:43, Carlos Llamas wrote:
>>> On Mon, Jun 17, 2024 at 12:01:26PM +0800, Lei Liu wrote:
>>>> On 6/15/2024 at 2:38, Carlos Llamas wrote:
>>> Yes, all this makes sense. What I don't understand is how "performance
>>> of kvcalloc is better". This is not supposed to be the case.
>> Based on my current understanding:
>> 1.kvmalloc may allocate memory faster than kmalloc in cases of memory
>> fragmentation, which could potentially improve the performance of binder.
> I think there is a misunderstanding of the allocations performed in this
> benchmark test. Yes, in general when there is heavy memory pressure it
> can be faster to use kvmalloc() and not try too hard to reclaim
> contiguous memory.
>
> In the case of binder though, this is the mmap() allocation. This call
> is part of the "initial setup". In the test, there should only be two
> calls to kvmalloc(), since the benchmark is done across two processes.
> That's it.
>
> So the time it takes to allocate this memory is irrelevant to the
> performance results. Does this make sense?
>
>> 2.Memory allocated by kvmalloc may not be contiguous, which could
>> potentially degrade the data read and write speed of binder.
> This _is_ what is being considered in the benchmark test instead. There
> are repeated accesses to alloc->pages[n]. Your point is then the reason
> why I was expecting "same performance at best".
>
>> Hmm, this is really good news. From the current test results, it seems that
>> kvmalloc does not degrade performance for binder.
> Yeah, not in the "happy" case anyways. I'm not sure what the numbers
> look like under some memory pressure.
>
>> I will retest the data on our phone to see if we reach the same conclusion.
>> If kvmalloc still proves to be better, we will provide you with the
>> reproduction method.
>>
> Ok, thanks. I would suggest you do an "adb shell stop" before running
> these tests. This might help with the noise.
>
> Thanks,
> Carlos Llamas
We used the "adb shell stop" command to retest the data.
Now, the test data for kmalloc and vmalloc are basically consistent.
There are a few instances where vmalloc may be slightly inferior, but
the difference is not significant, within 3%.
adb shell stop / kmalloc / 8+256G
----------------------------------------------------------------------
Benchmark Time CPU Iterations OUTPUT OUTPUTCPU
----------------------------------------------------------------------
BM_sendVec_binder4 39126 18550 38894 3.976282 8.38684
BM_sendVec_binder8 38924 18542 37786 7.766108 16.3028
BM_sendVec_binder16 38328 18228 36700 15.32039 32.2141
BM_sendVec_binder32 38154 18215 38240 32.07213 67.1798
BM_sendVec_binder64 39093 18809 36142 59.16885 122.977
BM_sendVec_binder128 40169 19188 36461 116.1843 243.2253
BM_sendVec_binder256 40695 19559 35951 226.1569 470.5484
BM_sendVec_binder512 41446 20211 34259 423.2159 867.8743
BM_sendVec_binder1024 44040 22939 28904 672.0639 1290.278
BM_sendVec_binder2048 47817 25821 26595 1139.063 2109.393
BM_sendVec_binder4096 54749 30905 22742 1701.423 3014.115
BM_sendVec_binder8192 68316 42017 16684 2000.634 3252.858
BM_sendVec_binder16384 95435 64081 10961 1881.752 2802.469
BM_sendVec_binder32768 148232 107504 6510 1439.093 1984.295
BM_sendVec_binder65536 326499 229874 3178 637.8991 906.0329
NORMAL TEST SUM 10355.79 17188.15
stressapptest eat 2G SUM 10088.39 16625.97
adb shell stop / kvmalloc / 8+256G
-----------------------------------------------------------------------
Benchmark Time CPU Iterations OUTPUT OUTPUTCPU
-----------------------------------------------------------------------
BM_sendVec_binder4 39673 18832 36598 3.689965 7.773577
BM_sendVec_binder8 39869 18969 37188 7.462038 15.68369
BM_sendVec_binder16 39774 18896 36627 14.73405 31.01355
BM_sendVec_binder32 40225 19125 36995 29.43045 61.90013
BM_sendVec_binder64 40549 19529 35148 55.47544 115.1862
BM_sendVec_binder128 41580 19892 35384 108.9262 227.6871
BM_sendVec_binder256 41584 20059 34060 209.6806 434.6857
BM_sendVec_binder512 42829 20899 32493 388.4381 796.0389
BM_sendVec_binder1024 45037 23360 29251 665.0759 1282.236
BM_sendVec_binder2048 47853 25761 27091 1159.433 2153.735
BM_sendVec_binder4096 55574 31745 22405 1651.328 2890.877
BM_sendVec_binder8192 70706 43693 16400 1900.105 3074.836
BM_sendVec_binder16384 96161 64362 10793 1838.921 2747.468
BM_sendVec_binder32768 147875 107292 6296 1395.147 1922.858
BM_sendVec_binder65536 330324 232296 3053 605.7126 861.3209
NORMAL TEST SUM 10033.56 16623.35
stressapptest eat 2G SUM 9958.43 16497.55
Can I prepare the V4 version of the patch now? Do I need to modify
anything else in the V4 version, in addition to addressing the following
two points?
1. Shorten the "backtrace" in the commit message.
2. Modify the code indentation to comply with the community's code style
requirements.
Thanks,
Lei Liu
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-18 4:37 ` Carlos Llamas
2024-06-19 8:35 ` Lei Liu
@ 2024-06-19 8:44 ` Lei Liu
2024-06-19 23:41 ` Carlos Llamas
1 sibling, 1 reply; 9+ messages in thread
From: Lei Liu @ 2024-06-19 8:44 UTC (permalink / raw)
To: Carlos Llamas
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On 2024/6/18 12:37, Carlos Llamas wrote:
> On Tue, Jun 18, 2024 at 10:50:17AM +0800, Lei Liu wrote:
>> On 2024/6/18 2:43, Carlos Llamas wrote:
>>> On Mon, Jun 17, 2024 at 12:01:26PM +0800, Lei Liu wrote:
>>>> On 6/15/2024 at 2:38, Carlos Llamas wrote:
>>> Yes, all this makes sense. What I don't understand is how "performance
>>> of kvcalloc is better". This is not supposed to be the case.
>> Based on my current understanding:
>> 1.kvmalloc may allocate memory faster than kmalloc in cases of memory
>> fragmentation, which could potentially improve the performance of binder.
> I think there is a misunderstanding of the allocations performed in this
> benchmark test. Yes, in general when there is heavy memory pressure it
> can be faster to use kvmalloc() and not try too hard to reclaim
> contiguous memory.
>
> In the case of binder though, this is the mmap() allocation. This call
> is part of the "initial setup". In the test, there should only be two
> calls to kvmalloc(), since the benchmark is done across two processes.
> That's it.
>
> So the time it takes to allocate this memory is irrelevant to the
> performance results. Does this make sense?
>
>> 2.Memory allocated by kvmalloc may not be contiguous, which could
>> potentially degrade the data read and write speed of binder.
> This _is_ what is being considered in the benchmark test instead. There
> are repeated accesses to alloc->pages[n]. Your point is then the reason
> why I was expecting "same performance at best".
>
>> Hmm, this is really good news. From the current test results, it seems that
>> kvmalloc does not degrade performance for binder.
> Yeah, not in the "happy" case anyways. I'm not sure what the numbers
> look like under some memory pressure.
>
>> I will retest the data on our phone to see if we reach the same conclusion.
>> If kvmalloc still proves to be better, we will provide you with the
>> reproduction method.
>>
> Ok, thanks. I would suggest you do an "adb shell stop" before running
> these tests. This might help with the noise.
>
> Thanks,
> Carlos Llamas
We used the "adb shell stop" command to retest the data.
Now, the test data for kmalloc and vmalloc are basically consistent.
There are a few instances where vmalloc may be slightly inferior, but
the difference is not significant, within 3%.
adb shell stop/ kmalloc /8+256G
----------------------------------------------------------------------
Benchmark Time CPU Iterations OUTPUT OUTPUTCPU
----------------------------------------------------------------------
BM_sendVec_binder4 39126 18550 38894 3.976282 8.38684
BM_sendVec_binder8 38924 18542 37786 7.766108 16.3028
BM_sendVec_binder16 38328 18228 36700 15.32039 32.2141
BM_sendVec_binder32 38154 18215 38240 32.07213 67.1798
BM_sendVec_binder64 39093 18809 36142 59.16885 122.977
BM_sendVec_binder128 40169 19188 36461 116.1843 243.2253
BM_sendVec_binder256 40695 19559 35951 226.1569 470.5484
BM_sendVec_binder512 41446 20211 34259 423.2159 867.8743
BM_sendVec_binder1024 44040 22939 28904 672.0639 1290.278
BM_sendVec_binder2048 47817 25821 26595 1139.063 2109.393
BM_sendVec_binder4096 54749 30905 22742 1701.423 3014.115
BM_sendVec_binder8192 68316 42017 16684 2000.634 3252.858
BM_sendVec_binder16384 95435 64081 10961 1881.752 2802.469
BM_sendVec_binder32768 148232 107504 6510 1439.093 1984.295
BM_sendVec_binder65536 326499 229874 3178 637.8991 906.0329
NORMAL TEST SUM 10355.79 17188.15
stressapptest eat 2G SUM 10088.39 16625.97
adb shell stop/ kvmalloc /8+256G
-----------------------------------------------------------------------
Benchmark Time CPU Iterations OUTPUT OUTPUTCPU
-----------------------------------------------------------------------
BM_sendVec_binder4 39673 18832 36598 3.689965 7.773577
BM_sendVec_binder8 39869 18969 37188 7.462038 15.68369
BM_sendVec_binder16 39774 18896 36627 14.73405 31.01355
BM_sendVec_binder32 40225 19125 36995 29.43045 61.90013
BM_sendVec_binder64 40549 19529 35148 55.47544 115.1862
BM_sendVec_binder128 41580 19892 35384 108.9262 227.6871
BM_sendVec_binder256 41584 20059 34060 209.6806 434.6857
BM_sendVec_binder512 42829 20899 32493 388.4381 796.0389
BM_sendVec_binder1024 45037 23360 29251 665.0759 1282.236
BM_sendVec_binder2048 47853 25761 27091 1159.433 2153.735
BM_sendVec_binder4096 55574 31745 22405 1651.328 2890.877
BM_sendVec_binder8192 70706 43693 16400 1900.105 3074.836
BM_sendVec_binder16384 96161 64362 10793 1838.921 2747.468
BM_sendVec_binder32768 147875 107292 6296 1395.147 1922.858
BM_sendVec_binder65536 330324 232296 3053 605.7126 861.3209
NORMAL TEST SUM 10033.56 16623.35
stressapptest eat 2G SUM 9958.43 16497.55
Can I prepare the V4 version of the patch now? Do I need to modify
anything else in the V4 version, in addition to addressing the following
two points?
1. Shorten the "backtrace" in the commit message.
2. Modify the code indentation to comply with the community's code style
requirements.
Thanks,
Lei Liu
* Re: [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues
2024-06-19 8:44 ` Lei Liu
@ 2024-06-19 23:41 ` Carlos Llamas
0 siblings, 0 replies; 9+ messages in thread
From: Carlos Llamas @ 2024-06-19 23:41 UTC (permalink / raw)
To: Lei Liu
Cc: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos,
Martijn Coenen, Joel Fernandes, Christian Brauner,
Suren Baghdasaryan, linux-kernel, opensource.kernel
On Wed, Jun 19, 2024 at 04:44:07PM +0800, Lei Liu wrote:
> We used the "adb shell stop" command to retest the data.
>
> Now, the test data for kmalloc and vmalloc are basically consistent.
Ok, this matches my observations too.
> Can I prepare the V4 version of the patch now? Do I need to modify anything
> else in the V4 version, in addition to addressing the following two points?
>
> 1.Shorten the "backtrace" in the commit message.
>
> 2.Modify the code indentation to comply with the community's code style
> requirements.
Yeap, that would be all. Thanks.
Carlos Llamas
Thread overview: 9+ messages
2024-06-14 4:09 [PATCH v3] binder_alloc: Replace kcalloc with kvcalloc to mitigate OOM issues Lei Liu
2024-06-14 18:38 ` Carlos Llamas
2024-06-17 4:01 ` Lei Liu
2024-06-17 18:43 ` Carlos Llamas
2024-06-18 2:50 ` Lei Liu
2024-06-18 4:37 ` Carlos Llamas
2024-06-19 8:35 ` Lei Liu
2024-06-19 8:44 ` Lei Liu
2024-06-19 23:41 ` Carlos Llamas