Date: Mon, 1 Jun 2015 16:57:41 +0200
From: Anders Ossowicki
Subject: "XFS: possible memory allocation deadlock in kmem_alloc" on high memory machine
Message-ID: <20150601145741.GA16608@otto>
Reply-To: aowi@novozymes.com
To: xfs@oss.sgi.com

Hi,

We've started seeing a slew of these messages in dmesg:

    XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250)

First question: Is this cause for alarm at all? Should we expect the disk to blow up in our faces? Should we expect loss of performance?

This is from a machine under heavy load (database server, large dataset, lots of I/O). It seems to happen only when we hit 15k-20k+ iops on the disk. We're running on 3.18.13, built from kernel.org git.

The machine has 3TB of memory and after googling the message for a while, I guess memory fragmentation could be a likely cause.
Looking at /proc/buddyinfo when these messages show up, we see that there are almost no fragments of order 1 and none of higher orders. My completely uneducated guess would be that the kernel can't reap pages fast enough, so XFS gets impatient waiting for them. That would seem to be an issue for mm, but I'd like to confirm whether my understanding of what XFS does is correct.

Most of the memory is used by disk cache:

$ free -g
             total       used       free     shared    buffers     cached
Mem:          3023       3001         22          0          0       2840

Let me know if there is any more info I should provide.

-- 
Anders Ossowicki
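
The /proc/buddyinfo check described above can be scripted. What follows is a minimal sketch in Python, assuming the usual buddyinfo line layout ("Node N, zone NAME" followed by one free-block count per order, starting at order 0). The function names and the sample line are made up for illustration; the numbers are not taken from the machine in question.

```python
import os


def parse_buddyinfo(text):
    """Return {(node, zone): [free block count per order, order 0 first]}."""
    zones = {}
    for line in text.splitlines():
        parts = line.split()
        # Assumed shape: "Node", "0,", "zone", "Normal", count0, count1, ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        node = int(parts[1].rstrip(","))
        zones[(node, parts[3])] = [int(n) for n in parts[4:]]
    return zones


def free_pages_at_or_above(counts, min_order):
    """Free pages held in blocks of order >= min_order (a block is 2**order pages)."""
    return sum(
        count << order
        for order, count in enumerate(counts)
        if order >= min_order
    )


# Hypothetical sample line: many order-0 pages, almost nothing larger.
SAMPLE = "Node 0, zone   Normal  52000  3  0  0  0  0  0  0  0  0  0\n"

if __name__ == "__main__":
    path = "/proc/buddyinfo"
    if os.path.exists(path):
        with open(path) as f:
            text = f.read()
    else:
        text = SAMPLE  # fall back to the illustrative sample off Linux
    for (node, zone), counts in sorted(parse_buddyinfo(text).items()):
        print(
            f"node {node} zone {zone}: "
            f"{free_pages_at_or_above(counts, 0)} free pages, "
            f"{free_pages_at_or_above(counts, 1)} of them in order>=1 blocks"
        )
```

A zone where the order>=1 figure is close to zero while the total stays large matches the pattern described: single free pages exist, but few contiguous multi-page blocks do.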