From: Jens Axboe <axboe@kernel.dk>
To: Keith Busch <kbusch@kernel.org>
Cc: kernel test robot <oliver.sang@intel.com>,
lkp@lists.01.org, lkp@intel.com, linux-block@vger.kernel.org,
hch@lst.de
Subject: Re: [nvme] f9c499bbbf: nvme nvme0: Identify Controller failed (16641)
Date: Wed, 3 Nov 2021 17:28:25 -0600
Message-ID: <cf4c8341-591c-8207-98af-82bdfb2c1054@kernel.dk>
In-Reply-To: <20211103214748.GA2654474@dhcp-10-100-145-180.wdc.com>

On 11/3/21 3:47 PM, Keith Busch wrote:
> On Wed, Nov 03, 2021 at 02:38:53PM -0700, Keith Busch wrote:
>> On Wed, Nov 03, 2021 at 01:51:18PM -0600, Jens Axboe wrote:
>>> On 11/3/21 8:14 AM, kernel test robot wrote:
>>>>
>>>>
>>>> Greeting,
>>>>
>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>
>>>> commit: f9c499bbbf603389abad60d1931c16b2f96dee06 ("[PATCH 1/2] nvme: move command clear into the various setup helpers")
>>>> url: https://github.com/0day-ci/linux/commits/Jens-Axboe/nvme-move-command-clear-into-the-various-setup-helpers/20211018-214956
>>>> base: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git 519d81956ee277b4419c723adfb154603c2565ba
>>>> patch link: https://lore.kernel.org/linux-block/20211018124934.235658-2-axboe@kernel.dk
>>>>
>>>> in testcase: will-it-scale
>>>> version: will-it-scale-x86_64-a34a85c-1_20211029
>>>> with following parameters:
>>>>
>>>> nr_task: 50%
>>>> mode: process
>>>> test: readseek1
>>>> cpufreq_governor: performance
>>>> ucode: 0x700001e
>>>>
>>>> test-description: Will It Scale takes a testcase and runs it from 1 through to n parallel copies to see if the testcase will scale. It builds both a process and threads based test in order to see any differences between the two.
>>>> test-url: https://github.com/antonblanchard/will-it-scale
>>>>
>>>>
>>>> on test machine: 144 threads 4 sockets Intel(R) Xeon(R) Gold 5318H CPU @ 2.50GHz with 128G memory
>>>>
>>>> caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
>>>>
>>>>
>>>>
>>>>
>>>> If you fix the issue, kindly add following tag
>>>> Reported-by: kernel test robot <oliver.sang@intel.com>
>>>>
>>>>
>>>> [ 38.907274][ T868] nvme nvme0: pci function 0000:24:00.0
>>>> [ 38.924627][ T1103] scsi host0: ahci
>>>> [ 38.948010][ T773] nvme nvme0: Identify Controller failed (16641)
>>>> [ 38.951220][ T1103] scsi host1: ahci
>>>> [ 38.954193][ T773] nvme nvme0: Removing after probe failure status: -5
>>>
>>> This is odd, looks like it's saying invalid opcode. Looking at the probe
>>> path, it's pretty standard and the command passed in is cleared already.
>>> So not quite sure why the patch would make a difference here. I'll
>>> poke at it.
>>
>> It's actually an Invalid Queue Identifier error (0x4101). That error
>> makes no sense for an Identify command, so it sounds like the controller
>> observed a different opcode than the driver intended to send, which
>> seems odd; I didn't observe any problems and I'm pretty sure I'm running
>> the same code. I'll take a second look as well.
>
> The git url that was used in this test points to commit:
>
> https://github.com/0day-ci/linux/commit/f9c499bbbf603389abad60d1931c16b2f96dee06
>
> And that commit has an extra memset in the REQ_OP_DRV_IN/OUT case, and
> it doesn't belong there. I don't see that memset in the upstream commit.
> Did the bot pick up the wrong patch?

Ah, good catch, it's picking up a previous broken version. Good question on
why that might be, that's counterproductive...
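
A rough sketch of why a stray memset in the passthrough case would matter,
assuming the command for REQ_OP_DRV_IN/OUT has already been filled in by the
submitter before the setup path runs. The struct and helper names here are
hypothetical and heavily simplified, not the driver's; only the opcode values
(admin 0x06 Identify, admin 0x00 Delete I/O Submission Queue) and CNS 01h for
Identify Controller come from the NVMe spec.

```c
#include <stdint.h>
#include <string.h>

/* Admin opcodes per the NVMe spec. */
enum { OPC_DELETE_SQ = 0x00, OPC_IDENTIFY = 0x06 };

/* Hypothetical, cut-down stand-in for a submission queue entry. */
struct toy_cmd {
	uint8_t  opcode;
	uint8_t  flags;
	uint16_t command_id;
	uint32_t nsid;
	uint32_t cdw10;
};

/* The submitter builds the command before the request is queued. */
static void toy_prepare_identify(struct toy_cmd *c)
{
	memset(c, 0, sizeof(*c));
	c->opcode = OPC_IDENTIFY;
	c->cdw10 = 1;	/* CNS 01h: Identify Controller */
}

/*
 * Passthrough setup: the command is already complete, so the right
 * thing is to leave it alone. stray_memset models the extra clear.
 */
static void toy_setup_passthrough(struct toy_cmd *c, int stray_memset)
{
	if (stray_memset)
		memset(c, 0, sizeof(*c));
}
```

With the stray clear, the opcode that goes out is 0x00 rather than 0x06. On
the admin queue that is Delete I/O Submission Queue with a queue ID of 0,
which would be consistent with the Invalid Queue Identifier status Keith
decoded, though nothing in this thread confirms that exact chain.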
In any case, we can ignore it.
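
For anyone decoding the number in the log: 16641 is 0x4101. Below is a minimal
sketch of how it splits into fields, assuming the value the driver prints is
the completion status with the phase bit already shifted out (Status Code in
bits 7:0, Status Code Type in bits 10:8, More in bit 13, Do Not Retry in
bit 14). The macro, struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define ST_SC_MASK	0x00ffu	/* Status Code, bits 7:0 */
#define ST_SCT_SHIFT	8	/* Status Code Type, bits 10:8 */
#define ST_SCT_MASK	0x7u
#define ST_MORE		0x2000u	/* More, bit 13 */
#define ST_DNR		0x4000u	/* Do Not Retry, bit 14 */

/* Hypothetical container for the decoded fields. */
struct status_fields {
	unsigned int sc;
	unsigned int sct;
	int more;
	int dnr;
};

static struct status_fields decode_status(uint16_t status)
{
	struct status_fields f;

	/* Only 15 bits are meaningful once the phase bit is gone. */
	assert(status < 0x8000u);

	f.sc   = status & ST_SC_MASK;
	f.sct  = (status >> ST_SCT_SHIFT) & ST_SCT_MASK;
	f.more = !!(status & ST_MORE);
	f.dnr  = !!(status & ST_DNR);
	return f;
}
```

decode_status(16641) gives sct = 1 (command specific), sc = 0x01 and dnr = 1.
Command-specific status 0x01 is Invalid Queue Identifier, matching Keith's
reading. Looking at the low byte alone gives 0x01, which under the generic
status type is Invalid Command Opcode, so the two are easy to mix up.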
--
Jens Axboe