* Regression on linux-next (next-20240625)
@ 2024-06-26 15:22 Borah, Chaitanya Kumar
2024-06-28 4:45 ` Borah, Chaitanya Kumar
0 siblings, 1 reply; 4+ messages in thread
From: Borah, Chaitanya Kumar @ 2024-06-26 15:22 UTC (permalink / raw)
To: sidhartha.kumar@oracle.com
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
linux-mm@kvack.org, maple-tree@lists.infradead.org, Nikula, Jani,
Saarinen, Jani, Kurmi, Suresh Kumar
Hello Sidhartha,
Hope you are doing well. I am Chaitanya from the Linux graphics team at Intel.
This mail is regarding a regression we are seeing in our CI runs [1] on the linux-next repository.
Since the version next-20240625 [2], we are seeing the following regression:
`````````````````````````````````````````````````````````````````````````````````
<3>[ 2.336948] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:337
<3>[ 2.336974] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 95, name: kdevtmpfs
<3>[ 2.336989] preempt_count: 1, expected: 0
<3>[ 2.336998] RCU nest depth: 0, expected: 0
<4>[ 2.337006] 3 locks held by kdevtmpfs/95:
<4>[ 2.337015] #0: ffff888100d2c3f0 (sb_writers){.+.+}-{0:0}, at: filename_create+0x5d/0x160
<4>[ 2.337041] #1: ffff888100800840 (&type->i_mutex_dir_key/1){+.+.}-{3:3}, at: filename_create+0x9d/0x160
<4>[ 2.337065] #2: ffff888100800658 (&simple_offset_lock_class){+.+.}-{2:2}, at: mtree_alloc_cyclic+0x71/0xf0
<3>[ 2.337089] Preemption disabled at:
<3>[ 2.337091] [<0000000000000000>] 0x0
<4>[ 2.337105] CPU: 13 UID: 0 PID: 95 Comm: kdevtmpfs Not tainted 6.10.0-rc5-next-20240625-next-20240625-g0fc4bfab2cd4+ #1
<4>[ 2.337126] Hardware name: ASUS System Product Name/PRIME Z790-P WIFI, BIOS 0812 02/24/2023
<4>[ 2.337141] Call Trace:
<4>[ 2.337147] <TASK>
<4>[ 2.337152] dump_stack_lvl+0xb0/0xd0
<4>[ 2.337163] __might_resched+0x194/0x2b0
<4>[ 2.337175] kmem_cache_alloc_noprof+0x20c/0x280
<4>[ 2.337186] ? mas_alloc_nodes+0x173/0x230
<4>[ 2.337197] mas_alloc_nodes+0x173/0x230
<4>[ 2.337207] mas_alloc_cyclic+0x27b/0x550
<4>[ 2.337220] mtree_alloc_cyclic+0x92/0xf0
`````````````````````````````````````````````````````````````````````````````````
The detailed log can be found in [3].
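As background on this class of warning: the trace shows kmem_cache_alloc_noprof reaching __might_resched while preempt_count is 1 and lock #2 is held in mtree_alloc_cyclic, i.e. an allocation that is allowed to sleep was attempted in atomic context. The following is a minimal userspace sketch of that general hazard, not the maple tree code and not a diagnosis of the bisected patch. Every sketch_* name and both alloc_* helpers are hypothetical stand-ins, and the counter only imitates what the kernel debug check reports.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
enum sketch_gfp { SKETCH_NOWAIT, SKETCH_MAY_SLEEP };

int sketch_preempt_count;   /* > 0 models "in atomic context" */
int sketch_bug_reports;     /* how many times the check fired */

void sketch_spin_lock(void)   { sketch_preempt_count++; }
void sketch_spin_unlock(void) { sketch_preempt_count--; }

/* Models the debug check: a may-sleep allocation in atomic context is counted as a bug. */
void *sketch_alloc(size_t size, enum sketch_gfp gfp)
{
    if (gfp == SKETCH_MAY_SLEEP && sketch_preempt_count > 0)
        sketch_bug_reports++;
    return malloc(size);
}

/* Problematic shape: a may-sleep allocation while the lock is still held. */
void *alloc_under_lock(size_t size)
{
    sketch_spin_lock();
    void *p = sketch_alloc(size, SKETCH_MAY_SLEEP);
    sketch_spin_unlock();
    return p;
}

/* Safe shape: drop the lock, allocate, then retake the lock and retry. */
void *alloc_with_lock_dropped(size_t size)
{
    sketch_spin_lock();
    /* ... work under the lock finds that more memory is needed ... */
    sketch_spin_unlock();
    void *p = sketch_alloc(size, SKETCH_MAY_SLEEP);
    sketch_spin_lock();
    /* ... the operation would be retried here using p ... */
    sketch_spin_unlock();
    return p;
}
```

In this sketch only alloc_under_lock() increments sketch_bug_reports; the variant that drops the lock before allocating does not, which is the property the kernel check enforces.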
After bisecting the tree, the following patch [4] seems to be the first "bad" commit:
`````````````````````````````````````````````````````````````````````````````````````````````````````````
maple_tree: remove mas_destroy() from mas_nomem()
Separate call to mas_destroy() from mas_nomem() so we can check for no
memory errors without destroying the current maple state in
mas_store_gfp(). We then add calls to mas_destroy() to callers of
mas_nomem().
Link: https://lkml.kernel.org/r/20240618204750.79512-6-sidhartha.kumar@oracle.com
Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
`````````````````````````````````````````````````````````````````````````````````````````````````````````
We could not revert the patch because of merge conflicts, but resetting to the parent of the commit seems to fix the issue.
Could you please check why the patch causes this regression and provide a fix if necessary?
Thank you.
Regards
Chaitanya
[1] https://intel-gfx-ci.01.org/tree/linux-next/combined-alt.html?
[2] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20240625
[3] https://intel-gfx-ci.01.org/tree/linux-next/next-20240625/bat-rpls-4/boot0.txt
[4] https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=187827d2dc3749d66546696b78584ee4c54687b0
* RE: Regression on linux-next (next-20240625)
2024-06-26 15:22 Regression on linux-next (next-20240625) Borah, Chaitanya Kumar
@ 2024-06-28 4:45 ` Borah, Chaitanya Kumar
2024-06-28 14:53 ` Sidhartha Kumar
0 siblings, 1 reply; 4+ messages in thread
From: Borah, Chaitanya Kumar @ 2024-06-28 4:45 UTC (permalink / raw)
To: sidhartha.kumar@oracle.com
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
linux-mm@kvack.org, maple-tree@lists.infradead.org, Nikula, Jani,
Saarinen, Jani, Kurmi, Suresh Kumar,
intel-gfx@lists.freedesktop.org
[converted to plain text]
+intel-gfx
Gentle Reminder.
* Re: Regression on linux-next (next-20240625)
2024-06-28 4:45 ` Borah, Chaitanya Kumar
@ 2024-06-28 14:53 ` Sidhartha Kumar
2024-06-29 4:11 ` Borah, Chaitanya Kumar
0 siblings, 1 reply; 4+ messages in thread
From: Sidhartha Kumar @ 2024-06-28 14:53 UTC (permalink / raw)
To: Borah, Chaitanya Kumar
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
linux-mm@kvack.org, maple-tree@lists.infradead.org, Nikula, Jani,
Saarinen, Jani, Kurmi, Suresh Kumar,
intel-gfx@lists.freedesktop.org
On 6/27/24 9:45 PM, Borah, Chaitanya Kumar wrote:
> [converted to plain text]
> +intel-gfx
>
> Gentle Reminder.
>
Hello,
This patch will be dropped from mm-unstable and will not be in linux-next after
that. I am working on a fix to include for the next version of this series.
Thanks,
Sid
* RE: Regression on linux-next (next-20240625)
2024-06-28 14:53 ` Sidhartha Kumar
@ 2024-06-29 4:11 ` Borah, Chaitanya Kumar
0 siblings, 0 replies; 4+ messages in thread
From: Borah, Chaitanya Kumar @ 2024-06-29 4:11 UTC (permalink / raw)
To: Sidhartha Kumar
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
linux-mm@kvack.org, maple-tree@lists.infradead.org, Nikula, Jani,
Saarinen, Jani, Kurmi, Suresh Kumar,
intel-gfx@lists.freedesktop.org
> -----Original Message-----
> From: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Sent: Friday, June 28, 2024 8:24 PM
> To: Borah, Chaitanya Kumar <chaitanya.kumar.borah@intel.com>
> Cc: Liam.Howlett@oracle.com; akpm@linux-foundation.org;
> linux-mm@kvack.org; maple-tree@lists.infradead.org; Nikula, Jani
> <jani.nikula@intel.com>; Saarinen, Jani <jani.saarinen@intel.com>; Kurmi,
> Suresh Kumar <suresh.kumar.kurmi@intel.com>;
> intel-gfx@lists.freedesktop.org
> Subject: Re: Regression on linux-next (next-20240625)
>
> On 6/27/24 9:45 PM, Borah, Chaitanya Kumar wrote:
> > [converted to plain text]
> > +intel-gfx
> >
> > Gentle Reminder.
> >
>
> Hello,
>
> This patch will be dropped from mm-unstable and will not be in linux-next
> after that. I am working on a fix to include for the next version of this series.
Thank you, Sidhartha. We don't see the regression anymore.
Regards
Chaitanya
>
> Thanks,
> Sid