public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Keith Busch <keith.busch@intel.com>
To: Thomas Fjellstrom <thomas@fjellstrom.ca>
Cc: linux-kernel <linux-kernel@vger.kernel.org>,
	linux-nvme@lists.infradead.org
Subject: Re: nvme, allocation failures, and aborted commands
Date: Mon, 6 Mar 2017 17:46:33 -0500	[thread overview]
Message-ID: <20170306224633.GA21202@localhost.localdomain> (raw)
In-Reply-To: <24015519.RaUGNvtkS7@natasha>

On Sun, Mar 05, 2017 at 11:11:45PM -0700, Thomas Fjellstrom wrote:
> Tonight I decided to try kernel 4.11-rc1. Still getting page allocation
> failures and aborted nvme commands once iozone gets to the fwrite/fread
> testing.
> 
> The taint seems to be comming from previos warnings from the radeon kernel 
> driver.
> 
> Its a pretty simple: iozone -a -s 64G  run. 
> 
> new log:
> 
> [ 5710.312195] kworker/u16:1: page allocation failure: order:2, mode:0x1084020(GFP_ATOMIC|__GFP_COMP), nodemask=(null)
> [ 5710.312202] kworker/u16:1 cpuset=/ mems_allowed=0
> [ 5710.312208] CPU: 1 PID: 30166 Comm: kworker/u16:1 Tainted: G        W       4.11.0-rc1 #2
> [ 5710.312209] Hardware name: To be filled by O.E.M. To be filled by O.E.M./970 PRO GAMING/AURA, BIOS 0901 11/07/2016
> [ 5710.312214] Workqueue: writeback wb_workfn (flush-259:0)
> [ 5710.312217] Call Trace:
> [ 5710.312223]  dump_stack+0x4f/0x73
> [ 5710.312225]  warn_alloc+0xde/0x170
> [ 5710.312227]  ? wake_all_kswapds+0x60/0xa0
> [ 5710.312229]  __alloc_pages_slowpath+0xe21/0xe50
> [ 5710.312232]  ? __inc_zone_state+0x42/0xb0
> [ 5710.312235]  ? preempt_count_add+0xa3/0xc0
> [ 5710.312236]  __alloc_pages_nodemask+0x1c6/0x1e0
> [ 5710.312239]  alloc_pages_current+0x90/0x140
> [ 5710.312241]  kmalloc_order_trace+0x29/0xf0
> [ 5710.312243]  ? alloc_pages_current+0x90/0x140
> [ 5710.312245]  __kmalloc+0x17a/0x1a0
> [ 5710.312250]  nvme_queue_rq+0x12f/0x860 [nvme]
> [ 5710.312253]  ? __sbitmap_queue_get+0x2b/0xa0
> [ 5710.312255]  ? __blk_mq_get_tag.isra.7+0x62/0x70
> [ 5710.312256]  ? blk_mq_get_tag+0x82/0x2b0
> [ 5710.312258]  blk_mq_try_issue_directly+0x8f/0xe0
> [ 5710.312260]  blk_mq_make_request+0x1ef/0x750

I'm not familiar with iozone, but I'll offer what the nvme driver is
doing here.

The error says we failed to allocate 2 physically continguous pages from
nvme_queue_rq. The only thing we allocate is a driver IO descriptor. Order
2 means your data transfer for this IO is pretty large, somewhere between
1 and 4MB. I don't think we need contiguous memory here, though; I'll
see if we can use vmalloc in case memory is heavily fragmented.

On the timeout/abort, this appears to be happening on a functioning
device completing commands just slower than the kernel's timeout.
If your io test creates a very high outstanding depth with very large
transfer sizes, it's possible to create tail latencies higher than the
default 30 second timeout (YMMV, check your device's capabilities).

Could you throttle the device's max transfer and queue depth prior to
running your IO test and see if the error messages clear up? For example:

  echo 128 > /sys/block/nvme0n1/queue/max_hw_sectors_kb
  echo 128 > /sys/block/nvme0n1/queue/max_nr_requests

  reply	other threads:[~2017-03-06 22:41 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-03-03 15:27 nvme, allocation failures, and aborted commands Thomas Fjellstrom
2017-03-06  2:57 ` Thomas Fjellstrom
2017-03-06  6:11   ` Thomas Fjellstrom
2017-03-06 22:46     ` Keith Busch [this message]
2017-03-09 22:01       ` Thomas Fjellstrom
2017-03-09 23:03         ` Keith Busch
2017-03-11 16:31           ` Thomas Fjellstrom

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170306224633.GA21202@localhost.localdomain \
    --to=keith.busch@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=thomas@fjellstrom.ca \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox