From: Dave Chinner <david@fromorbit.com>
To: Cyril Peponnet <cyril.peponnet@nuagenetworks.net>
Cc: linux-xfs@vger.kernel.org
Subject: Re: XFS: possible memory allocation deadlock in kmem_alloc on glusterfs setup
Date: Mon, 5 Dec 2016 09:46:04 +1100
Message-ID: <20161204224604.GN31101@dastard>
In-Reply-To: <9B23CFED-4AFC-46FC-8E35-AD85B11FEA02@nuagenetworks.net>

On Sun, Dec 04, 2016 at 02:07:18PM -0800, Cyril Peponnet wrote:
> Hi, here are the details. The issue is on the scratch RAID array
> (used to store kvm snapshots). The other RAID array is fine (no
> snapshot storage).

How do you know that? There's no indication which filesystem is
generating the warnings....

FWIW, the only gluster snapshot proposal that I was aware of
was this one:

https://lists.gnu.org/archive/html/gluster-devel/2013-08/msg00004.html

That proposal used LVM snapshots to snapshot the entire brick. I
don't see any LVM in your config, so I'm not sure what snapshot
implementation you are using here. What are you using to take the
snapshots of your VM image files? Are you actually using qemu's
qcow2 snapshot functionality rather than anything native to
gluster?
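
If it's qcow2 internal snapshots, something like this should
confirm it (path illustrative):

# qemu-img snapshot -l /export/raid/scratch/images/<vm>.qcow2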

Also, can you attach the 'xfs_bmap -vp' output of some of these
image files and their snapshots?
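
For example (path illustrative), redirecting the output so it's
easy to attach:

# xfs_bmap -vp /export/raid/scratch/images/<vm>.qcow2 > /tmp/<vm>.bmap

The extent count and layout in that output will show how badly
fragmented the image files are.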

> MemTotal:       65699268 kB

64GB RAM...

> MemFree:         2058304 kB
> MemAvailable:   62753028 kB
> Buffers:              12 kB
> Cached:         57664044 kB

56GB of cached file data. If you're getting high-order allocation
failures (which I suspect is the problem) then this is a memory
fragmentation problem more than anything else.
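
A quick way to check is /proc/buddyinfo, which lists the number of
free blocks of each order (doubling from 4k at the left):

# cat /proc/buddyinfo

If the higher-order columns are all zero while the low orders have
plenty free, then free memory is badly fragmented.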

> ----------------------------------------------------------------
> DG/VD TYPE  State Access Consist Cache Cac sCC     Size Name
> ----------------------------------------------------------------
> 0/0   RAID0 Optl  RW     Yes     RAWBC -   ON  7.275 TB scratch
> ----------------------------------------------------------------
> 
> Cac=CacheCade|Rec=Recovery|OfLn=OffLine|Pdgd=Partially Degraded|dgrd=Degraded
> Optl=Optimal|RO=Read Only|RW=Read Write|HD=Hidden|B=Blocked|Consist=Consistent|
> R=Read Ahead Always|NR=No Read Ahead|WB=WriteBack|
> AWB=Always WriteBack|WT=WriteThrough|C=Cached IO|D=Direct IO|sCC=Scheduled

IIRC, AWB means the controller keeps writeback caching enabled even
if the cache/BBU goes into degraded/offline mode, which leaves you
vulnerable to data corruption/loss on power failure...
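
Worth checking the BBU state on that controller, e.g. (assuming
storcli and controller 0):

# storcli /c0/bbu show

If there's no healthy BBU behind that writeback cache, that's worth
fixing first.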

> xfs_info /export/raid/scratch/
> meta-data=/dev/sdb               isize=256    agcount=32, agsize=61030368 blks
>          =                       sectsz=512   attr=2, projid32bit=1
>          =                       crc=0
> data     =                       bsize=4096   blocks=1952971776, imaxpct=5
>          =                       sunit=32     swidth=128 blks
> naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
> log      =internal               bsize=4096   blocks=521728, version=2
>          =                       sectsz=512   sunit=32 blks, lazy-count=1
> realtime =none                   extsz=4096   blocks=0, rtextents=0

Nothing unusual there.
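
(For reference, the stripe geometry is consistent: sunit = 32 x 4k
blocks = 128k per-disk stripe unit, and swidth = 128 blocks = 512k,
i.e. a 4-disk-wide stripe.)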

> Nothing relevant in dmesg except several occurrences of the following.
> 
> [7649583.386283] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
> [7649585.370830] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
> [7649587.241290] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)
> [7649589.243881] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

Ah, the kernel is old enough that it doesn't have the later
reporting that tells us which process is doing the allocation and
what size is being requested.

Hmm - it's an xfs_err() call, so that means we should be able to
get a stack trace out of the kernel if we turn the error level up
to 11.

# echo 11 > /proc/sys/fs/xfs/error_level

Then wait for it to happen again. That should give us a stack trace
telling us where the issue is.
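
If you want that setting to persist across reboots, the equivalent
sysctl is fs.xfs.error_level, e.g.:

# sysctl fs.xfs.error_level=11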

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
