linux-fsdevel.vger.kernel.org archive mirror
From: Trond Myklebust <trondmy@hammerspace.com>
To: "djwong@kernel.org" <djwong@kernel.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"hch@infradead.org" <hch@infradead.org>,
	"trondmy@kernel.org" <trondmy@kernel.org>
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>
Subject: Re: [PATCH] iomap: Address soft lockup in iomap_finish_ioend()
Date: Thu, 30 Dec 2021 22:25:14 +0000
Message-ID: <a42acb06152b0ecba3e99aec38349e1f29304b1e.camel@hammerspace.com>
In-Reply-To: <05bac8cb-e36a-b043-5ac3-82c585f76bbe@kernel.dk>

On Thu, 2021-12-30 at 13:24 -0800, Jens Axboe wrote:
> On 12/30/21 11:35 AM, trondmy@kernel.org wrote:
> > From: Trond Myklebust <trond.myklebust@hammerspace.com>
> > 
> > We're observing the following stack trace using various kernels when
> > running in the Azure cloud.
> > 
> >  watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [kworker/12:1:3106]
> >  Modules linked in: raid0 ipt_MASQUERADE nf_conntrack_netlink
> > xt_addrtype nft_chain_nat nf_nat br_netfilter bridge stp llc ext4
> > mbcache jbd2 overlay xt_conntrack nf_conntrack nf_defrag_ipv6
> > nf_defrag_ipv4 nft_counter rpcrdma rdma_ucm xt_owner ib_srpt
> > nft_compat intel_rapl_msr ib_isert intel_rapl_common nf_tables
> > iscsi_target_mod isst_if_mbox_msr isst_if_common nfnetlink
> > target_core_mod nfit ib_iser libnvdimm libiscsi
> > scsi_transport_iscsi ib_umad kvm_intel ib_ipoib rdma_cm iw_cm vfat
> > ib_cm fat kvm irqbypass crct10dif_pclmul crc32_pclmul mlx5_ib
> > ghash_clmulni_intel rapl ib_uverbs ib_core i2c_piix4 pcspkr
> > hyperv_fb hv_balloon hv_utils joydev nfsd auth_rpcgss nfs_acl lockd
> > grace sunrpc ip_tables xfs libcrc32c mlx5_core mlxfw tls pci_hyperv
> > pci_hyperv_intf sd_mod t10_pi sg ata_generic hv_storvsc hv_netvsc
> > scsi_transport_fc hyperv_keyboard hid_hyperv ata_piix libata
> > crc32c_intel hv_vmbus serio_raw fuse
> >  CPU: 12 PID: 3106 Comm: kworker/12:1 Not tainted 4.18.0-305.10.2.el8_4.x86_64 #1
> >  Hardware name: Microsoft Corporation Virtual Machine/Virtual Machine, BIOS 090008  12/07/2018
> >  Workqueue: xfs-conv/md127 xfs_end_io [xfs]
> >  RIP: 0010:_raw_spin_unlock_irqrestore+0x11/0x20
> >  Code: 7c ff 48 29 e8 4c 39 e0 76 cf 80 0b 08 eb 8c 90 90 90 90 90
> > 90 90 90 90 90 0f 1f 44 00 00 e8 e6 db 7e ff 66 90 48 89 f7 57 9d
> > <0f> 1f 44 00 00 c3 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 8b 07
> >  RSP: 0018:ffffac51d26dfd18 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff12
> >  RAX: 0000000000000001 RBX: ffffffff980085a0 RCX: dead000000000200
> >  RDX: ffffac51d3893c40 RSI: 0000000000000202 RDI: 0000000000000202
> >  RBP: 0000000000000202 R08: ffffac51d3893c40 R09: 0000000000000000
> >  R10: 00000000000000b9 R11: 00000000000004b3 R12: 0000000000000a20
> >  R13: ffffd228f3e5a200 R14: ffff963cf7f58d10 R15: ffffd228f3e5a200
> >  FS:  0000000000000000(0000) GS:ffff9625bfb00000(0000) knlGS:0000000000000000
> >  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >  CR2: 00007f5035487500 CR3: 0000000432810004 CR4: 00000000003706e0
> >  DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >  DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> >  Call Trace:
> >   wake_up_page_bit+0x8a/0x110
> >   iomap_finish_ioend+0xd7/0x1c0
> >   iomap_finish_ioends+0x7f/0xb0
> >   xfs_end_ioend+0x6b/0x100 [xfs]
> >   ? xfs_setfilesize_ioend+0x60/0x60 [xfs]
> >   xfs_end_io+0xb9/0xe0 [xfs]
> >   process_one_work+0x1a7/0x360
> >   worker_thread+0x1fa/0x390
> >   ? create_worker+0x1a0/0x1a0
> >   kthread+0x116/0x130
> >   ? kthread_flush_work_fn+0x10/0x10
> >   ret_from_fork+0x35/0x40
> > 
> > Jens suggested adding a latency-reducing cond_resched() to the loop in
> > iomap_finish_ioends().
> 
> The patch doesn't add it there though, I was suggesting:
> 
> diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
> index 71a36ae120ee..4ad2436a936a 100644
> --- a/fs/iomap/buffered-io.c
> +++ b/fs/iomap/buffered-io.c
> @@ -1078,6 +1078,7 @@ iomap_finish_ioends(struct iomap_ioend *ioend, int error)
>                 ioend = list_first_entry(&tmp, struct iomap_ioend, io_list);
>                 list_del_init(&ioend->io_list);
>                 iomap_finish_ioend(ioend, error);
> +               cond_resched();
>         }
>  }
>  EXPORT_SYMBOL_GPL(iomap_finish_ioends);
> 
> as I don't think you need it once-per-vec. But not sure if you tested
> that variant or not...
> 

Yes, we did test that variant, but we were still seeing the soft lockups
on Azure, which is why I moved it into the inner loop.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com


