public inbox for linux-kernel@vger.kernel.org
* [RFC PATCH v2 0/2] ceph_check_delayed_caps() softlockup
@ 2021-06-29  9:47 Luis Henriques
  2021-06-29  9:47 ` [RFC PATCH v2 1/2] ceph: allow schedule_delayed() callers to set delay for workqueue Luis Henriques
  2021-06-29  9:47 ` [RFC PATCH v2 2/2] ceph: reduce contention in ceph_check_delayed_caps() Luis Henriques
  0 siblings, 2 replies; 5+ messages in thread
From: Luis Henriques @ 2021-06-29  9:47 UTC
  To: Jeff Layton, Ilya Dryomov; +Cc: ceph-devel, linux-kernel, Luis Henriques

This is an attempt to fix the softlockup on the delayed_work workqueue.  As
stated in patch 0002:

  Function ceph_check_delayed_caps() is called from the mdsc->delayed_work
  workqueue and it can be kept looping for quite some time if caps keep being
  added back to the mdsc->cap_delay_list.  This may result in the watchdog
  tainting the kernel with the softlockup flag.

v2 of this fix modifies the approach by time-bounding the loop in this
function, so that any caps added to the list *after* the loop starts will
be postponed to the next wq run.
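
To illustrate the time-bounding idea outside the kernel, here is a minimal
userspace sketch.  It is not the code from patch 0002: the struct and
function names (delayed_entry, delay_queue, run_once), the fake tick clock
and the ring buffer are all assumptions made for the example.  Each run
records when it started and stops as soon as it reaches an entry that was
queued after that point, so entries re-added during the run wait for the
next one.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* Hypothetical stand-in for an entry on mdsc->cap_delay_list. */
struct delayed_entry {
	int id;
	unsigned long queued_at; /* tick at which the entry was (re)queued */
	int requeues_left;       /* how many more times processing re-adds it */
};

/* Simple FIFO ring buffer modelling the delay list (assumption). */
struct delay_queue {
	struct delayed_entry slots[QUEUE_CAP];
	size_t head, len;
	unsigned long now; /* fake clock, advanced one tick per processed entry */
};

static void queue_push(struct delay_queue *q, struct delayed_entry e)
{
	assert(q->len < QUEUE_CAP);
	e.queued_at = q->now; /* stamp the entry with the current tick */
	q->slots[(q->head + q->len) % QUEUE_CAP] = e;
	q->len++;
}

static struct delayed_entry queue_pop(struct delay_queue *q)
{
	assert(q->len > 0);
	struct delayed_entry e = q->slots[q->head];
	q->head = (q->head + 1) % QUEUE_CAP;
	q->len--;
	return e;
}

/*
 * One run of the worker.  Returns the number of entries processed.
 * Without the queued_at check, entries that keep re-adding themselves
 * would keep this loop spinning within a single run.
 */
static size_t run_once(struct delay_queue *q)
{
	unsigned long loop_start = q->now;
	size_t processed = 0;

	while (q->len > 0) {
		/* Added after this run began: leave it for the next run. */
		if (q->slots[q->head].queued_at > loop_start)
			break;

		struct delayed_entry e = queue_pop(q);
		q->now++; /* model the time spent processing the entry */
		processed++;

		if (e.requeues_left > 0) {
			e.requeues_left--;
			queue_push(q, e); /* re-added with queued_at > loop_start */
		}
	}
	return processed;
}
```

With three entries that each re-add themselves twice, every call to
run_once() handles exactly three entries and returns, rather than draining
all nine visits in a single run.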

An extra change in 0001 (suggested by Jeff) allows scheduling runs for
periods shorter than the default 5-second period.  This way,
delayed_work() can schedule its next run for the next list element's
ci->i_hold_caps_max instead of always waiting 5 secs.
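
The delay selection could look roughly like the userspace sketch below.
The helper name next_run_delay(), its parameters and the use of plain
seconds are assumptions for illustration (the kernel code works in
jiffies), and capping the result at the default period is my reading of
the description above, not something taken from patch 0001.

```c
#include <assert.h>

#define DEFAULT_DELAY_SECS 5UL

/*
 * Hypothetical helper: pick the delay until the next workqueue run.
 *   now        - current time in seconds
 *   have_entry - non-zero if the delay list still has an element
 *   hold_max   - that element's hold deadline (models ci->i_hold_caps_max)
 */
static unsigned long next_run_delay(unsigned long now, int have_entry,
				    unsigned long hold_max)
{
	unsigned long wait;

	if (!have_entry)
		return DEFAULT_DELAY_SECS; /* nothing pending: default period */
	if (hold_max <= now)
		return 0;                  /* deadline already passed: run now */

	wait = hold_max - now;
	/* Never wait longer than the default period (assumption). */
	return wait < DEFAULT_DELAY_SECS ? wait : DEFAULT_DELAY_SECS;
}

/* A few example inputs and the delays they yield. */
static void next_run_delay_examples(void)
{
	assert(next_run_delay(100, 0, 0) == 5);   /* empty list */
	assert(next_run_delay(100, 1, 102) == 2); /* deadline in 2 secs */
	assert(next_run_delay(100, 1, 130) == 5); /* far deadline, capped */
	assert(next_run_delay(100, 1, 90) == 0);  /* overdue entry */
}
```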

This patchset should fix the issue reported here [1], although a quick
search for "ceph_check_delayed_caps" in the tracker returns a few more
bugs, possibly duplicates.

[1] https://tracker.ceph.com/issues/46284

Luis Henriques (2):
  ceph: allow schedule_delayed() callers to set delay for workqueue
  ceph: reduce contention in ceph_check_delayed_caps()

 fs/ceph/caps.c       | 17 ++++++++++++++++-
 fs/ceph/mds_client.c | 24 +++++++++++++++---------
 fs/ceph/super.h      |  2 +-
 3 files changed, 32 insertions(+), 11 deletions(-)


