* Deadlock in Ubuntu 5.4 kernel
From: Shyam Prasad N @ 2022-12-29 3:26 UTC
To: Paulo Alcantara, CIFS, Enzo Matsumiya
Hi Paulo/Enzo,
A customer reported this deadlock in a Kubernetes setup running on Ubuntu-18.04.
This must be a 5.4 kernel, running this code:
https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/bionic
Based on the stack, it appears to be a hang in DFS reconnect codepath,
trying to access the DFS cache lock in dfs_cache_update_vol.
Can you tell if this is a known issue that has been fixed since?
And if Ubuntu should backport any fix to 5.4?
I could not find the function in the mainline codebase.
dmesg:
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.066765] INFO: task cifsd:981715 blocked for more than 604 seconds.
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.073365] Not tainted 5.4.0-1091-azure #96~18.04.1-Ubuntu
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.080279] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E node-problem-detector-startup.sh[562959]: I1210 09:07:48.142691 562992 log_monitor.go:160] New status generated: &{Source:kernel-monitor Events:[{Severity:warn Timestamp:2022-12-10 09:07:47.636272174 +0000 UTC m=+63284.134685121 Reason:TaskHung Message:INFO: task cifsd:981715 blocked for more than 604 seconds.}] Conditions:[{Type:KernelDeadlock Status:False Transition:2022-12-09 15:33:03.569676476 +0000 UTC m=+0.068089323 Reason:KernelHasNoDeadlock Message:kernel has no deadlock} {Type:ReadonlyFilesystem Status:False Transition:2022-12-09 15:33:03.569676576 +0000 UTC m=+0.068089423 Reason:FilesystemIsNotReadOnly Message:Filesystem is not read-only}]}
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086811] cifsd D 0 981715 2 0x80004002
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086814] Call Trace:
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086826] __schedule+0x277/0x710
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086829] ? __next_timer_interrupt+0xe0/0xe0
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086836] schedule+0x33/0xa0
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086838] schedule_preempt_disabled+0xe/0x10
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086840] __mutex_lock.isra.10+0x24c/0x4a0
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086870] ? do_dfs_cache_find+0x1be/0xea0 [cifs]
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086873] __mutex_lock_slowpath+0x13/0x20
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086874] ? __mutex_lock_slowpath+0x13/0x20
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086875] mutex_lock+0x2f/0x40
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086891] dfs_cache_update_vol+0x4a/0x290 [cifs]
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086904] cifs_reconnect+0x597/0xd50 [cifs]
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086916] cifs_handle_standard+0x198/0x1c0 [cifs]
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086928] cifs_demultiplex_thread+0x9ed/0xc70 [cifs]
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086931] kthread+0x121/0x140
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086942] ? cifs_handle_standard+0x1c0/0x1c0 [cifs]
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086944] ? kthread_park+0x90/0x90
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.086946] ret_from_fork+0x35/0x40
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.087014] INFO: task kworker/0:2:2562230 blocked for more than 604 seconds.
Dec 10 09:07:48 aks-corew26-13626357-vmss00000E kernel: [5653610.092927] Not tainted 5.4.0-1091-azure #96~18.04.1-Ubuntu
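For anyone triaging similar reports, here is a minimal Python sketch that pulls the blocked tasks and their stack frames out of a hung-task log like the one above. It is only an illustration: the function names `parse_hung_tasks` and `reliable_frames`, the regular expressions, and the shortened `host` name in the sample are my own assumptions, not part of any kernel or Ubuntu tooling. Frames prefixed with `?` are treated as unverified by the unwinder and can be filtered out.

```python
import re

# Hung-task header emitted by the kernel, anchored on the "[seconds.micros]"
# timestamp so that copies of the message relayed by userspace monitors
# (e.g. "Message:INFO: task ...") are not counted twice.
HUNG_RE = re.compile(
    r"\[\s*\d+\.\d+\]\s+INFO: task (\S+):(\d+) blocked for more than\s+(\d+)\s+seconds"
)

# One stack frame, e.g. "? do_dfs_cache_find+0x1be/0xea0 [cifs]".
FRAME_RE = re.compile(
    r"(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+(?:\s+\[(\w+)\])?"
)


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text."""
    # Collapse whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    headers = list(HUNG_RE.finditer(flat))
    tasks = []
    for i, m in enumerate(headers):
        # A report's frames run until the next report starts.
        end = headers[i + 1].start() if i + 1 < len(headers) else len(flat)
        frames = [
            {"func": f.group(2), "module": f.group(3), "reliable": f.group(1) is None}
            for f in FRAME_RE.finditer(flat[m.end():end])
        ]
        tasks.append(
            {"comm": m.group(1), "pid": int(m.group(2)),
             "secs": int(m.group(3)), "frames": frames}
        )
    return tasks


def reliable_frames(task):
    """Function names of frames not marked with '?'."""
    return [f["func"] for f in task["frames"] if f["reliable"]]


# Shortened excerpt of the log above; "host" is a placeholder hostname.
SAMPLE = """
Dec 10 09:07:48 host kernel:
[5653610.066765] INFO: task cifsd:981715 blocked for more than 604
seconds.
Dec 10 09:07:48 host kernel:
[5653610.086870] ? do_dfs_cache_find+0x1be/0xea0 [cifs]
Dec 10 09:07:48 host kernel:
[5653610.086875] mutex_lock+0x2f/0x40
Dec 10 09:07:48 host kernel:
[5653610.086891] dfs_cache_update_vol+0x4a/0x290 [cifs]
"""
```

Calling `parse_hung_tasks(SAMPLE)` yields a single entry for `cifsd`, and `reliable_frames()` on it keeps `mutex_lock` and `dfs_cache_update_vol`, which is the part of the stack that points at the lock taken in `dfs_cache_update_vol`.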
--
Regards,
Shyam
* Re: Deadlock in Ubuntu 5.4 kernel
From: Paulo Alcantara @ 2022-12-29 12:16 UTC
To: Shyam Prasad N, CIFS, Enzo Matsumiya
Shyam Prasad N <nspmangalore@gmail.com> writes:
> A customer reported this deadlock in a Kubernetes setup running on Ubuntu-18.04.
> This must be a 5.4 kernel, running this code:
> https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/bionic
>
> Based on the stack, it appears to be a hang in DFS reconnect codepath,
> trying to access the DFS cache lock in dfs_cache_update_vol.
>
> Can you tell if this is a known issue that has been fixed since?
Looks like this has been fixed by
06d57378bcc9 ("cifs: Fix potential deadlock when updating vol in cifs_reconnect()")
> And if Ubuntu should backport any fix to 5.4?
I would say so. It would probably also require other DFS-related
patches to be backported in addition to the above.
> I could not find the function in the mainline codebase.
Yes, it has changed a lot.
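On the backport question, a backported patch normally gets a new commit hash in the distro tree, so searching for the upstream hash alone can miss it. As a rough sketch, the helper below scans `git log --oneline` output for the commit subject instead. The function name `find_by_subject` is hypothetical, and matching on the subject line is an assumption about how the backport would be recorded, not a statement about Ubuntu's process.

```python
# Subject of the upstream fix named above.
SUBJECT = "cifs: Fix potential deadlock when updating vol in cifs_reconnect()"


def find_by_subject(oneline_log, subject=SUBJECT):
    """Return abbreviated hashes whose subject contains `subject`.

    `oneline_log` is the text produced by `git log --oneline`, where each
    line is "<abbreviated-hash> <subject>".
    """
    hits = []
    for line in oneline_log.splitlines():
        sha, _, title = line.strip().partition(" ")
        # Substring match tolerates a prefix a distro may add to the subject.
        if sha and subject in title:
            hits.append(sha)
    return hits
```

The input could come from running `git log --oneline -- fs/cifs` in the 5.4 tree. An empty result only means no commit with that subject was found; it does not prove the code is unaffected.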
* Re: Deadlock in Ubuntu 5.4 kernel
From: Shyam Prasad N @ 2022-12-30 3:33 UTC
To: Paulo Alcantara; +Cc: CIFS, Enzo Matsumiya
On Thu, Dec 29, 2022 at 5:47 PM Paulo Alcantara <pc@cjr.nz> wrote:
>
> Shyam Prasad N <nspmangalore@gmail.com> writes:
>
> > A customer reported this deadlock in a Kubernetes setup running on Ubuntu-18.04.
> > This must be a 5.4 kernel, running this code:
> > https://git.launchpad.net/~canonical-kernel/ubuntu/+source/linux-azure/+git/bionic
> >
> > Based on the stack, it appears to be a hang in DFS reconnect codepath,
> > trying to access the DFS cache lock in dfs_cache_update_vol.
> >
> > Can you tell if this is a known issue that has been fixed since?
>
> Looks like this has been fixed by
>
> 06d57378bcc9 ("cifs: Fix potential deadlock when updating vol in cifs_reconnect()")
>
Thanks for this.
> > And if Ubuntu should backport any fix to 5.4?
>
> I would say so. It would probably also require other DFS-related
> patches to be backported in addition to the above.
If you could point me to a list of other patches that could be
backported to a 5.4 kernel, that would be great.
>
> > I could not find the function in the mainline codebase.
>
> Yes, it has changed a lot.
--
Regards,
Shyam