linux-fsdevel.vger.kernel.org archive mirror
From: Trond Myklebust <trondmy@hammerspace.com>
To: "bfoster@redhat.com" <bfoster@redhat.com>
Cc: "david@fromorbit.com" <david@fromorbit.com>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"trondmy@kernel.org" <trondmy@kernel.org>,
	"hch@infradead.org" <hch@infradead.org>,
	"axboe@kernel.dk" <axboe@kernel.dk>,
	"linux-xfs@vger.kernel.org" <linux-xfs@vger.kernel.org>,
	"djwong@kernel.org" <djwong@kernel.org>,
	"willy@infradead.org" <willy@infradead.org>
Subject: Re: [PATCH] iomap: Address soft lockup in iomap_finish_ioend()
Date: Fri, 7 Jan 2022 03:08:48 +0000
Message-ID: <b7eb3f2cf7a2c819f38c647f4247ff1de80e19b9.camel@hammerspace.com>
In-Reply-To: <YddMGRQrYOWr6V9A@bfoster>

On Thu, 2022-01-06 at 15:07 -0500, Brian Foster wrote:
> On Thu, Jan 06, 2022 at 06:36:52PM +0000, Trond Myklebust wrote:
> > On Thu, 2022-01-06 at 09:48 +1100, Dave Chinner wrote:
> > > On Wed, Jan 05, 2022 at 08:45:05PM +0000, Trond Myklebust wrote:
> > > > On Tue, 2022-01-04 at 21:09 -0500, Trond Myklebust wrote:
> > > > > On Tue, 2022-01-04 at 12:22 +1100, Dave Chinner wrote:
> > > > > > > On Tue, Jan 04, 2022 at 12:04:23AM +0000, Trond Myklebust wrote:
> > > > > > > > We have different reproducers. The common feature appears to be the
> > > > > > > > need for a decently fast box with fairly large memory (128GB in one
> > > > > > > > case, 400GB in the other). It has been reproduced with HDs, SSDs and
> > > > > > > > NVME systems.
> > > > > > > >
> > > > > > > > On the 128GB box, we had it set up with 10+ disks in a JBOD
> > > > > > > > configuration and were running the AJA system tests.
> > > > > > > >
> > > > > > > > On the 400GB box, we were just serially creating large (> 6GB) files
> > > > > > > > using fio and that was occasionally triggering the issue. However doing
> > > > > > > > an strace of that workload to disk reproduced the problem faster :-).
> > > > > > 
> > > > > > > Ok, that matches up with the "lots of logically sequential dirty
> > > > > > > data on a single inode in cache" vector that is required to create
> > > > > > > really long bio chains on individual ioends.
> > > > > > >
> > > > > > > Can you try the patch below and see if it addresses the issue?
> > > > > > 
> > > > > 
> > > > > That patch does seem to fix the soft lockups.
> > > > > 
> > > > 
> > > > > Oops... Strike that, apparently our tests just hit the following when
> > > > > running on AWS with that patch.
> > > 
> > > OK, so there are also large contiguous physical extents being
> > > allocated in some cases here.
> > > 
> > > > So it was harder to hit, but we still did eventually.
> > > 
> > > > Yup, that's what I wanted to know - it indicates that both the
> > > > filesystem completion processing and the iomap page processing play
> > > > a role in the CPU usage. More complex patch for you to try below...
> > > 
> > > Cheers,
> > > 
> > > Dave.
> > 
> > Hi Dave,
> > 
> > > This patch got further than the previous one. However it too failed on
> > > the same AWS setup after we started creating larger (in this case 52GB)
> > > files. The previous patch failed at 15GB.
> > 
> 
> Care to try my old series [1] that attempted to address this, assuming
> it still applies to your kernel? You should only need patches 1 and 2.
> You can toss in patch 3 if you'd like, but as Dave's earlier patch has
> shown, this can just make it harder to reproduce.
>
> I don't know if this will go anywhere as is, but I was never able to get
> any sort of confirmation from the previous reporter to understand at
> least whether it is effective. I agree with Jens' earlier concern that
> the per-page yields are probably overkill, but if it were otherwise
> effective it shouldn't be that hard to add filtering. Patch 3 could also
> technically be used in place of patch 1 if we really wanted to go that
> route, but I wouldn't take that step until there was some verification
> that the yielding heuristic is effective.
> 
> Brian
> 
> [1] https://lore.kernel.org/linux-xfs/20210517171722.1266878-1-bfoster@redhat.com/

Hi Brian,

I would expect those to work, since the first patch is essentially
identical to the one I wrote and tested before trying Dave's first
patch version (at least for the special case of XFS). However, we never
did test that patch on the AWS setup, so let me try your patches 1 & 2
and see if they get us further than 52GB.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



Thread overview: 51+ messages
2021-12-30 19:35 [PATCH] iomap: Address soft lockup in iomap_finish_ioend() trondmy
2021-12-30 21:24 ` Jens Axboe
2021-12-30 22:25   ` Trond Myklebust
2021-12-30 22:27     ` Jens Axboe
2021-12-30 22:55       ` Trond Myklebust
2021-12-31  1:42 ` Matthew Wilcox
2021-12-31  6:16   ` Trond Myklebust
2022-01-01  3:55     ` Dave Chinner
2022-01-01 17:39       ` Trond Myklebust
2022-01-03 22:03         ` Dave Chinner
2022-01-04  0:04           ` Trond Myklebust
2022-01-04  1:22             ` Dave Chinner
2022-01-04  3:01               ` Trond Myklebust
2022-01-04  7:08               ` hch
2022-01-04 18:08                 ` Matthew Wilcox
2022-01-04 18:14                   ` hch
2022-01-04 19:22                     ` Darrick J. Wong
2022-01-04 21:52                       ` Dave Chinner
2022-01-04 23:12                         ` Darrick J. Wong
2022-01-05  2:10                           ` Dave Chinner
2022-01-05 13:56                             ` Brian Foster
2022-01-05 22:04                               ` Dave Chinner
2022-01-06 16:44                                 ` Brian Foster
2022-01-10  8:18                                   ` Dave Chinner
2022-01-10 17:45                                     ` Brian Foster
2022-01-10 18:11                                       ` hch
2022-01-11 14:33                                       ` Trond Myklebust
2022-01-05 13:42                           ` hch
2022-01-04 21:16                 ` Dave Chinner
2022-01-05 13:43                   ` hch
2022-01-05 22:34                     ` Dave Chinner
2022-01-05  2:09               ` Trond Myklebust
2022-01-05 20:45                 ` Trond Myklebust
2022-01-05 22:48                   ` Dave Chinner
2022-01-05 23:29                     ` Trond Myklebust
2022-01-06  0:01                     ` Darrick J. Wong
2022-01-09 23:09                       ` Dave Chinner
2022-01-06 18:36                     ` Trond Myklebust
2022-01-06 18:38                       ` Trond Myklebust
2022-01-06 20:07                       ` Brian Foster
2022-01-07  3:08                         ` Trond Myklebust [this message]
2022-01-07 15:15                           ` Brian Foster
2022-01-09 23:34                       ` Dave Chinner
2022-01-10 23:37                       ` Dave Chinner
2022-01-11  0:08                         ` Dave Chinner
2022-01-13 17:01                         ` Trond Myklebust
2022-01-17 17:24                           ` Trond Myklebust
2022-01-17 17:36                             ` Darrick J. Wong
2022-01-04 13:36         ` Brian Foster
2022-01-04 19:23           ` Darrick J. Wong
2022-01-05  2:31 ` [iomap] f5934dda54: BUG:sleeping_function_called_from_invalid_context_at_fs/iomap/buffered-io.c kernel test robot
