public inbox for linux-xfs@vger.kernel.org
From: Andre Noll <maan@tuebingen.mpg.de>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Sasha Levin <sashal@kernel.org>,
	Dave Chinner <david@fromorbit.com>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	linux-xfs@vger.kernel.org, stable@vger.kernel.org
Subject: Re: xfs: Assertion failed in xfs_ag_resv_init()
Date: Thu, 2 May 2019 19:45:16 +0200
Message-ID: <20190502174516.GY2780@tuebingen.mpg.de>
In-Reply-To: <20190502165244.GB14995@kroah.com>

On Thu, May 02, 18:52, Greg Kroah-Hartman wrote
> On Thu, May 02, 2019 at 05:27:36PM +0200, Andre Noll wrote:
> > On Thu, May 02, 16:10, Greg Kroah-Hartman wrote
> > > Ok, then how about we hold off on this patch for 4.9.y then.  "no one"
> > > should be using 4.9.y in a "server system" anymore, unless you happen to
> > > have an enterprise kernel based on it.  So we should be fine as the
> > > users of the older kernels don't run xfs.
> > 
> > Well, we do run xfs on top of bcache on vanilla 4.9 kernels on a few
> > dozen production servers here. Mainly because we ran into all sorts
> > of issues with newer kernels (not necessarily related to xfs). 4.9,
> > OTOH, appears to be rock solid for our workload.
> 
> Great, but what is wrong with 4.14.y or better yet, 4.19.y?  Do those
> also work for your workload?  If not, we should fix that, and soon :)

Some months ago we tried 4.14 and it was a real disaster: random
crashes with nothing in the logs on the file servers and unkillable
hung processes on the compute machines. The thing is, I can't afford
an extended downtime of these production systems, nor can I test patches
or enable debugging options which slow the systems down too much. Also,
10 of the compute nodes load the nvidia module, so all bets are off
anyway. But we've seen the hung processes also on the non-gpu nodes
where the nvidia module is not loaded.

As for 4.19, xfs on bcache was broken until a couple of weeks
ago. Meanwhile the fix (e578f90d8a9c) went in, so I benchmarked 4.19.x
on one system briefly. To my surprise the results were *worse* than
with 4.9. This seems to be another cache bypass issue, but I need to
have a closer look, and more reliable numbers.

> I would _STRONGLY_ recommend moving off of 4.9 on any non-SoC-based
> system at this point in time, there should not be any reason to stick
> with it, unless you are paying a company to provide support for it.

That's really bad news :(

Thanks for sharing your thoughts about the future of 4.9, though. I'll
try to spend some time on the bcache issue on 4.19.

Best
Andre
-- 
Max Planck Institute for Developmental Biology
Max-Planck-Ring 5, 72076 Tübingen, Germany. Phone: (+49) 7071 601 829
http://people.tuebingen.mpg.de/maan/

