* Re: [block] 4e366a7962: blktests.nvme.005.fail [not found] <20190325010409.GB6379@shao2-debian> @ 2019-03-28 16:31 ` Johannes Thumshirn 2019-03-28 17:48 ` Christoph Hellwig 2019-03-29 0:33 ` Ming Lei 0 siblings, 2 replies; 6+ messages in thread From: Johannes Thumshirn @ 2019-03-28 16:31 UTC (permalink / raw) To: Linux Block Layer Mailinglist Cc: Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux FSDEVEL Mailinglist, Jens Axboe, lkp, Ming Lei On 25/03/2019 02:04, kernel test robot wrote: [...] > nvme/005 (reset local loopback target) > runtime ... > nvme/005 (reset local loopback target) [failed] > runtime ... 0.596s > something found in dmesg: > [ 24.160182] run blktests nvme/005 at 2019-03-24 07:35:16 > [ 24.346189] nvmet: adding nsid 1 to subsystem blktests-subsystem-1 > [ 24.373089] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:dbd85962-80d1-4872-ac0f-d0214f3b1131. > [ 24.394247] WARNING: CPU: 1 PID: 881 at /kbuild/src/x86_64/block/blk-merge.c:608 blk_rq_map_sg+0x5f0/0x6c0 > [ 24.396082] Modules linked in: nvme_loop nvme_fabrics nvmet nvme_core loop crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ata_generic sd_mod pata_acpi sg bochs_drm ttm ppdev drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops snd_pcm drm ata_piix snd_timer aesni_intel crypto_simd snd cryptd glue_helper libata joydev soundcore pcspkr serio_raw virtio_scsi i2c_piix4 parport_pc floppy parport ip_tables > [ 24.403009] CPU: 1 PID: 881 Comm: nvme Not tainted 5.0.0-rc6-00148-g4e366a7 #1 > [ 24.404372] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 > [ 24.405930] RIP: 0010:blk_rq_map_sg+0x5f0/0x6c0 > [ 24.406808] Code: 41 8b 91 10 01 00 00 8b 43 24 85 c2 0f 84 e3 fb ff ff f7 d0 21 d0 83 c0 01 41 01 40 0c 01 83 d0 00 00 00 e9 cd fb ff ff 0f 0b <0f> 0b e9 c3 fc ff ff 80 3d 66 f3 1d 01 00 0f 85 88 fb ff ff 48 c7 > [ 24.410274] RSP: 0000:ffffb95c409d7908 EFLAGS: 00010202 > ... > (See '/lkp/benchmarks/blktests/blktests/results/nodev/nvme/005.dmesg' for the entire message) OK we're tripping over this in blk_rq_map_sg(): 607 /* 608 * Something must have been wrong if the figured number of 609 * segment is bigger than number of req's physical segments 610 */ 611 WARN_ON(nsegs > blk_rq_nr_phys_segments(rq)); In all cases I could reproduce it I had nsegs == 2 and rq->nr_phys_segments == 1 (rq->bio->bi_phys_segments == 1). Adding this into nvme-loop "fixes" the issue but I don't think this is the correct way to do --- a/drivers/nvme/target/loop.c +++ b/drivers/nvme/target/loop.c @@ -387,6 +387,7 @@ static int nvme_loop_configure_admin_que goto out_free_tagset; } + blk_queue_segment_boundary(ctrl->ctrl.admin_q, PAGE_SIZE - 1); error = nvmf_connect_admin_queue(&ctrl->ctrl); if (error) goto out_cleanup_queue; What raises my suspicion is this code fragment from bio_add_pc_page(): 731 732 /* If we may be able to merge these biovecs, force a recount */ 733 if (bio->bi_vcnt > 1 && biovec_phys_mergeable(q, bvec - 1, bvec)) 734 bio->bi_phys_segments = -1; This -1 (from this patch) would fit the one less than 'nsegs' but after days of searching and staring at the code, I cannot find where we loose the one. Before this patch set we would only have cleared the BIO_SEG_VALID flag and thus we never run out of sync. Adding a call to blk_recount_segments() doesn't do any difference here. It can of be a read herring as well. What makes me suspicious is the fact that I can't reliably trigger it by running nvme/005 but have to run it in a loop ranging between several seconds to some minutes. Anyone any ideas? I've been staring at this for several days now but can't find out whats wrong. Thanks a lot, Johannes -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [block] 4e366a7962: blktests.nvme.005.fail 2019-03-28 16:31 ` [block] 4e366a7962: blktests.nvme.005.fail Johannes Thumshirn @ 2019-03-28 17:48 ` Christoph Hellwig 2019-03-28 17:52 ` Jens Axboe 2019-03-29 0:33 ` Ming Lei 1 sibling, 1 reply; 6+ messages in thread From: Christoph Hellwig @ 2019-03-28 17:48 UTC (permalink / raw) To: Johannes Thumshirn Cc: Linux Block Layer Mailinglist, Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux FSDEVEL Mailinglist, Jens Axboe, lkp, Ming Lei On Thu, Mar 28, 2019 at 05:31:29PM +0100, Johannes Thumshirn wrote: > In all cases I could reproduce it I had nsegs == 2 and > rq->nr_phys_segments == 1 (rq->bio->bi_phys_segments == 1). And all cases are pass through as well? Can you check if this still shows up with the series from Ming which switches the passthrough requests to use multi-page biovecs aswell and gets rid of the hairy code you quote below. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [block] 4e366a7962: blktests.nvme.005.fail 2019-03-28 17:48 ` Christoph Hellwig @ 2019-03-28 17:52 ` Jens Axboe 2019-03-29 8:06 ` Johannes Thumshirn 0 siblings, 1 reply; 6+ messages in thread From: Jens Axboe @ 2019-03-28 17:52 UTC (permalink / raw) To: Christoph Hellwig, Johannes Thumshirn Cc: Linux Block Layer Mailinglist, Hannes Reinecke, Bart Van Assche, Jan Kara, Linux FSDEVEL Mailinglist, lkp, Ming Lei On 3/28/19 11:48 AM, Christoph Hellwig wrote: > On Thu, Mar 28, 2019 at 05:31:29PM +0100, Johannes Thumshirn wrote: >> In all cases I could reproduce it I had nsegs == 2 and >> rq->nr_phys_segments == 1 (rq->bio->bi_phys_segments == 1). > > And all cases are pass through as well? > > Can you check if this still shows up with the series from Ming which > switches the passthrough requests to use multi-page biovecs aswell > and gets rid of the hairy code you quote below. Since I already merged that in my 5.2 but didn't push it out yet, I've made a temporary branch available if you want to test through that: git://git.kernel.dk/linux-block 5.2-tmp -- Jens Axboe ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [block] 4e366a7962: blktests.nvme.005.fail 2019-03-28 17:52 ` Jens Axboe @ 2019-03-29 8:06 ` Johannes Thumshirn 0 siblings, 0 replies; 6+ messages in thread From: Johannes Thumshirn @ 2019-03-29 8:06 UTC (permalink / raw) To: Jens Axboe, Christoph Hellwig Cc: Linux Block Layer Mailinglist, Hannes Reinecke, Bart Van Assche, Jan Kara, Linux FSDEVEL Mailinglist, lkp, Ming Lei On 28/03/2019 18:52, Jens Axboe wrote: > On 3/28/19 11:48 AM, Christoph Hellwig wrote: >> On Thu, Mar 28, 2019 at 05:31:29PM +0100, Johannes Thumshirn wrote: >>> In all cases I could reproduce it I had nsegs == 2 and >>> rq->nr_phys_segments == 1 (rq->bio->bi_phys_segments == 1). >> >> And all cases are pass through as well? >> >> Can you check if this still shows up with the series from Ming which >> switches the passthrough requests to use multi-page biovecs aswell >> and gets rid of the hairy code you quote below. Good idea, thanks. > Since I already merged that in my 5.2 but didn't push it out yet, I've > made a temporary branch available if you want to test through that: > > git://git.kernel.dk/linux-block 5.2-tmp Awesome thanks. -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [block] 4e366a7962: blktests.nvme.005.fail 2019-03-28 16:31 ` [block] 4e366a7962: blktests.nvme.005.fail Johannes Thumshirn 2019-03-28 17:48 ` Christoph Hellwig @ 2019-03-29 0:33 ` Ming Lei 2019-03-29 8:18 ` Johannes Thumshirn 1 sibling, 1 reply; 6+ messages in thread From: Ming Lei @ 2019-03-29 0:33 UTC (permalink / raw) To: Johannes Thumshirn Cc: Linux Block Layer Mailinglist, Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux FSDEVEL Mailinglist, Jens Axboe, LKP ML, Ming Lei On Fri, Mar 29, 2019 at 12:32 AM Johannes Thumshirn <jthumshirn@suse.de> wrote: > > On 25/03/2019 02:04, kernel test robot wrote: > [...] > > nvme/005 (reset local loopback target) > > runtime ... > > nvme/005 (reset local loopback target) [failed] > > runtime ... 0.596s > > something found in dmesg: > > [ 24.160182] run blktests nvme/005 at 2019-03-24 07:35:16 > > [ 24.346189] nvmet: adding nsid 1 to subsystem blktests-subsystem-1 > > [ 24.373089] nvmet: creating controller 1 for subsystem blktests-subsystem-1 for NQN nqn.2014-08.org.nvmexpress:uuid:dbd85962-80d1-4872-ac0f-d0214f3b1131. > > [ 24.394247] WARNING: CPU: 1 PID: 881 at /kbuild/src/x86_64/block/blk-merge.c:608 blk_rq_map_sg+0x5f0/0x6c0 > > [ 24.396082] Modules linked in: nvme_loop nvme_fabrics nvmet nvme_core loop crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel ata_generic sd_mod pata_acpi sg bochs_drm ttm ppdev drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops snd_pcm drm ata_piix snd_timer aesni_intel crypto_simd snd cryptd glue_helper libata joydev soundcore pcspkr serio_raw virtio_scsi i2c_piix4 parport_pc floppy parport ip_tables > > [ 24.403009] CPU: 1 PID: 881 Comm: nvme Not tainted 5.0.0-rc6-00148-g4e366a7 #1 > > [ 24.404372] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 > > [ 24.405930] RIP: 0010:blk_rq_map_sg+0x5f0/0x6c0 > > [ 24.406808] Code: 41 8b 91 10 01 00 00 8b 43 24 85 c2 0f 84 e3 fb ff ff f7 d0 21 d0 83 c0 01 41 01 40 0c 01 83 d0 00 00 00 e9 cd fb ff ff 0f 0b <0f> 0b e9 c3 fc ff ff 80 3d 66 f3 1d 01 00 0f 85 88 fb ff ff 48 c7 > > [ 24.410274] RSP: 0000:ffffb95c409d7908 EFLAGS: 00010202 > > ... > > (See '/lkp/benchmarks/blktests/blktests/results/nodev/nvme/005.dmesg' for the entire message) > > OK we're tripping over this in blk_rq_map_sg(): > > 607 /* > 608 * Something must have been wrong if the figured number of > 609 * segment is bigger than number of req's physical segments > 610 */ > 611 WARN_ON(nsegs > blk_rq_nr_phys_segments(rq)); > > In all cases I could reproduce it I had nsegs == 2 and > rq->nr_phys_segments == 1 (rq->bio->bi_phys_segments == 1). > > Adding this into nvme-loop "fixes" the issue but I don't think this is > the correct way to do > > --- a/drivers/nvme/target/loop.c > +++ b/drivers/nvme/target/loop.c > @@ -387,6 +387,7 @@ static int nvme_loop_configure_admin_que > goto out_free_tagset; > } > > + blk_queue_segment_boundary(ctrl->ctrl.admin_q, PAGE_SIZE - 1); The above isn't needed for linus tree, however it is required for 5.0 and older releases. But nvme-loop target have other problems, and I'd suggest you to apply the following patch: https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus&id=02db99548d3608a625cf481cff2bb7b626829b3f > error = nvmf_connect_admin_queue(&ctrl->ctrl); > if (error) > goto out_cleanup_queue; > > What raises my suspicion is this code fragment from bio_add_pc_page(): > > 731 > 732 /* If we may be able to merge these biovecs, force a recount */ > 733 if (bio->bi_vcnt > 1 && biovec_phys_mergeable(q, bvec - 1, bvec)) > 734 bio->bi_phys_segments = -1; The above change from you might cause issue given some queue's .bi_phys_segments is 255. > > This -1 (from this patch) would fit the one less than 'nsegs' but after > days of searching and staring at the code, I cannot find where we loose > the one. Before this patch set we would only have cleared the > BIO_SEG_VALID flag and thus we never run out of sync. I'd suggest you to address the following comments first before working on your patch: https://lore.kernel.org/linux-block/CACVXFVNPW1TOsCaAibc-YJHDEdK95Y8-v1rHNaab7oV=eb+8TA@mail.gmail.com/ https://lore.kernel.org/linux-block/CACVXFVOq1ngjpD62SreSZda5k_WJjxOZRHwpqt6si7bRv=O46Q@mail.gmail.com/ > > Adding a call to blk_recount_segments() doesn't do any difference here. > > It can of be a read herring as well. > > What makes me suspicious is the fact that I can't reliably trigger it by > running nvme/005 but have to run it in a loop ranging between several > seconds to some minutes. > > Anyone any ideas? I've been staring at this for several days now but > can't find out whats wrong. Can you trigger this issue on linus tree or 5.2-tmp of block tree? Thanks, Ming Lei ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [block] 4e366a7962: blktests.nvme.005.fail 2019-03-29 0:33 ` Ming Lei @ 2019-03-29 8:18 ` Johannes Thumshirn 0 siblings, 0 replies; 6+ messages in thread From: Johannes Thumshirn @ 2019-03-29 8:18 UTC (permalink / raw) To: Ming Lei Cc: Linux Block Layer Mailinglist, Hannes Reinecke, Bart Van Assche, Christoph Hellwig, Jan Kara, Linux FSDEVEL Mailinglist, Jens Axboe, LKP ML, Ming Lei On 29/03/2019 01:33, Ming Lei wrote: [...] >> + blk_queue_segment_boundary(ctrl->ctrl.admin_q, PAGE_SIZE - 1); > > The above isn't needed for linus tree, however it is required for 5.0 and older > releases. Yes but it was worth a try after days of failing to find the issue. > But nvme-loop target have other problems, and I'd suggest you to apply > the following patch: > > https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit/?h=for-linus&id=02db99548d3608a625cf481cff2bb7b626829b3f This only applies to the file backed target, doesn't it? My testcase uses bdev backed namespaces. [...] >> 731 >> 732 /* If we may be able to merge these biovecs, force a recount */ >> 733 if (bio->bi_vcnt > 1 && biovec_phys_mergeable(q, bvec - 1, bvec)) >> 734 bio->bi_phys_segments = -1; > > The above change from you might cause issue given some queue's .bi_phys_segments > is 255. 65535 but yes there are drivers setting it to USHRT_MAX. I'll have to see if I can get hold of one for testing. [...] > I'd suggest you to address the following comments first before working on > your patch: > > https://lore.kernel.org/linux-block/CACVXFVNPW1TOsCaAibc-YJHDEdK95Y8-v1rHNaab7oV=eb+8TA@mail.gmail.com/ > > https://lore.kernel.org/linux-block/CACVXFVOq1ngjpD62SreSZda5k_WJjxOZRHwpqt6si7bRv=O46Q@mail.gmail.com/ I'll have a look, thanks. [...] > Can you trigger this issue on linus tree or 5.2-tmp of block tree? I haven't been able to trigger it on vanilla so far, so it's most likely related to the patch. Will rebase the patchset to 5.2-tmp and retest from there. Thanks, Johannes -- Johannes Thumshirn SUSE Labs Filesystems jthumshirn@suse.de +49 911 74053 689 SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Felix Imendörffer, Mary Higgins, Sri Rasiah HRB 21284 (AG Nürnberg) Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850 ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-03-29 8:19 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <20190325010409.GB6379@shao2-debian>
2019-03-28 16:31 ` [block] 4e366a7962: blktests.nvme.005.fail Johannes Thumshirn
2019-03-28 17:48 ` Christoph Hellwig
2019-03-28 17:52 ` Jens Axboe
2019-03-29 8:06 ` Johannes Thumshirn
2019-03-29 0:33 ` Ming Lei
2019-03-29 8:18 ` Johannes Thumshirn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox