* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
[not found] ` <084e7c5a-f98d-d61e-de81-83525851ecf9@acm.org>
@ 2022-08-12 10:48 ` Geert Uytterhoeven
2022-08-12 15:53 ` Bart Van Assche
0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2022-08-12 10:48 UTC (permalink / raw)
To: Bart Van Assche
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide
Hi Bart,
CC linux-ide
On Fri, Jul 22, 2022 at 7:56 PM Bart Van Assche <bvanassche@acm.org> wrote:
> On 7/22/22 01:53, Geert Uytterhoeven wrote:
> > During s2idle, the following trace data is generated:
> >
> > kworker/u16:9-325 [000] ...2. 230.478731: block_rq_issue: 8,0
> > N 0 () 0 + 0 [kworker/u16:9]
> > kworker/u16:9-325 [000] ...2. 230.478745:
> > scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=0
> > prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=0
> > cmnd=(SYNCHRONIZE_CACHE - raw=35 00 00 00 00 00 00 00 00 00)
> > <idle>-0 [007] d.h3. 230.478832:
> > scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=0
> > prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=0
> > cmnd=(SYNCHRONIZE_CACHE - raw=35 00 00 00 00 00 00 00 00 00)
> > result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE
> > status=SAM_STAT_GOOD)
> > <idle>-0 [000] ..s2. 230.478851: block_rq_complete:
> > 8,0 N () 18446744073709551615 + 0 [0]
> > kworker/u16:9-325 [000] ...2. 230.483134: block_rq_issue: 8,0
> > N 0 () 0 + 0 [kworker/u16:9]
> > kworker/u16:9-325 [000] ...2. 230.483136:
> > scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=0
> > prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=1
> > cmnd=(START_STOP - raw=1b 00 00 00 00 00)
> > <idle>-0 [007] d.h3. 230.624530:
> > scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=0
> > prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=1
> > cmnd=(START_STOP - raw=1b 00 00 00 00 00) result=(driver=DRIVER_OK
> > host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)
> > <idle>-0 [000] d.s4. 230.624634: scsi_eh_wakeup: host_no=0
> > <idle>-0 [000] ..s2. 230.624642: block_rq_complete:
> > 8,0 N () 18446744073709551615 + 0 [0]
> > kworker/u16:14-1027 [007] d..3. 231.393642: scsi_eh_wakeup: host_no=0
> >
> > When reading from hard drive after s2idle, no more trace data
> > is generated.
>
> I think the above commands come from the suspend sequence. '1b 00 00 00
> 00 00' stops a block device. The lowest bit in byte 4 needs to be set to
> start a block device.
>
> Something that is not yet clear is whether or not sd_submit_start()
> hangs during the resume process. How about verifying whether or not
> sd_submit_start() hangs by either issuing SysRq-t or by adding pr_info()
> statements in that function?
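[Editorial note] The START STOP UNIT decoding described in the quoted paragraph above can be checked with a few lines of code. This is only a minimal sketch: the function name start_stop_action() is made up for illustration, and it looks at nothing but the opcode (0x1b) and the START bit (bit 0 of byte 4) of a 6-byte CDB.

```python
def start_stop_action(cdb: bytes) -> str:
    """Classify a 6-byte START STOP UNIT CDB as 'start' or 'stop'.

    Hypothetical helper for illustration only; it ignores every field
    other than the opcode and the START bit.
    """
    if len(cdb) != 6 or cdb[0] != 0x1B:
        raise ValueError("not a 6-byte START STOP UNIT CDB")
    # Bit 0 of byte 4 is the START bit: 1 starts the unit, 0 stops it.
    return "start" if cdb[4] & 0x01 else "stop"


# The CDB from the trace above, and the same CDB with the START bit set.
traced = start_stop_action(bytes.fromhex("1b0000000000"))
with_start_bit = start_stop_action(bytes.fromhex("1b0000000100"))
```

Applied to the traced bytes 1b 00 00 00 00 00, the helper reports a stop, which matches the reading that the command belongs to the suspend sequence.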
sd_submit_start() is called once during suspend, and once during
resume. It does not hang.
Reading from /dev/sda hangs after resume (not in sd_submit_start(),
which is never called for reading).
Two tasks are blocked in blk_mq_get_tag() calling io_schedule():
task:kworker/7:1 state:D stack: 0 pid: 122 ppid: 2 flags:0x00000008
Workqueue: events ata_scsi_dev_rescan
Call trace:
__switch_to+0xbc/0x124
__schedule+0x540/0x71c
schedule+0x58/0xa0
io_schedule+0x18/0x34
blk_mq_get_tag+0x138/0x244
__blk_mq_alloc_requests+0x130/0x2f0
blk_mq_alloc_request+0x74/0xa8
scsi_alloc_request+0x10/0x30
__scsi_execute+0x5c/0x18c
scsi_vpd_inquiry+0x7c/0xdc
scsi_get_vpd_size+0x34/0xa8
scsi_get_vpd_buf+0x28/0xf4
scsi_attach_vpd+0x44/0x170
scsi_rescan_device+0x30/0x98
ata_scsi_dev_rescan+0xc8/0xfc
process_one_work+0x2e0/0x474
worker_thread+0x1cc/0x270
kthread+0xd8/0xe8
ret_from_fork+0x10/0x20
task:hd state:D stack: 0 pid: 1163 ppid: 1076 flags:0x00000000
Call trace:
__switch_to+0xbc/0x124
__schedule+0x540/0x71c
schedule+0x58/0xa0
io_schedule+0x18/0x34
blk_mq_get_tag+0x138/0x244
__blk_mq_alloc_requests+0x130/0x2f0
blk_mq_submit_bio+0x44c/0x5b4
__submit_bio+0x24/0x5c
submit_bio_noacct_nocheck+0x8c/0x178
submit_bio_noacct+0x380/0x3b0
submit_bio+0x34/0x3c
mpage_bio_submit+0x28/0x38
mpage_readahead+0xa8/0x178
blkdev_readahead+0x14/0x1c
read_pages+0x4c/0x158
page_cache_ra_unbounded+0xd8/0x174
do_page_cache_ra+0x40/0x4c
page_cache_ra_order+0x14/0x1c
ondemand_readahead+0x124/0x2fc
page_cache_sync_ra+0x50/0x54
filemap_read+0x130/0x6e8
blkdev_read_iter+0xf0/0x164
new_sync_read+0x74/0xc0
vfs_read+0xbc/0xd8
ksys_read+0x6c/0xd4
__arm64_sys_read+0x14/0x1c
invoke_syscall+0x70/0xf4
el0_svc_common.constprop.0+0xbc/0xf0
do_el0_svc+0x18/0x20
el0_svc+0x30/0x84
el0t_64_sync_handler+0x90/0xf8
el0t_64_sync+0x14c/0x150
I hope this helps.
Thanks!
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-12 10:48 ` [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support Geert Uytterhoeven
@ 2022-08-12 15:53 ` Bart Van Assche
2022-08-15 10:13 ` Geert Uytterhoeven
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2022-08-12 15:53 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide
On 8/12/22 03:48, Geert Uytterhoeven wrote:
> sd_submit_start() is called once during suspend, and once during
> resume. It does not hang.
>
> Reading from /dev/sda hangs after resume (not in sd_submit_start(),
> which is never called for reading).
>
> Two tasks are blocked in blk_mq_get_tag() calling io_schedule():
>
> task:kworker/7:1 state:D stack: 0 pid: 122 ppid: 2 flags:0x00000008
> Workqueue: events ata_scsi_dev_rescan
> Call trace:
> __switch_to+0xbc/0x124
> __schedule+0x540/0x71c
> schedule+0x58/0xa0
> io_schedule+0x18/0x34
> blk_mq_get_tag+0x138/0x244
> __blk_mq_alloc_requests+0x130/0x2f0
> blk_mq_alloc_request+0x74/0xa8
> scsi_alloc_request+0x10/0x30
> __scsi_execute+0x5c/0x18c
> scsi_vpd_inquiry+0x7c/0xdc
> scsi_get_vpd_size+0x34/0xa8
> scsi_get_vpd_buf+0x28/0xf4
> scsi_attach_vpd+0x44/0x170
> scsi_rescan_device+0x30/0x98
> ata_scsi_dev_rescan+0xc8/0xfc
> process_one_work+0x2e0/0x474
> worker_thread+0x1cc/0x270
> kthread+0xd8/0xe8
> ret_from_fork+0x10/0x20
>
>
> task:hd state:D stack: 0 pid: 1163 ppid: 1076 flags:0x00000000
> Call trace:
> __switch_to+0xbc/0x124
> __schedule+0x540/0x71c
> schedule+0x58/0xa0
> io_schedule+0x18/0x34
> blk_mq_get_tag+0x138/0x244
> __blk_mq_alloc_requests+0x130/0x2f0
> blk_mq_submit_bio+0x44c/0x5b4
> __submit_bio+0x24/0x5c
> submit_bio_noacct_nocheck+0x8c/0x178
> submit_bio_noacct+0x380/0x3b0
> submit_bio+0x34/0x3c
> mpage_bio_submit+0x28/0x38
> mpage_readahead+0xa8/0x178
> blkdev_readahead+0x14/0x1c
> read_pages+0x4c/0x158
> page_cache_ra_unbounded+0xd8/0x174
> do_page_cache_ra+0x40/0x4c
> page_cache_ra_order+0x14/0x1c
> ondemand_readahead+0x124/0x2fc
> page_cache_sync_ra+0x50/0x54
> filemap_read+0x130/0x6e8
> blkdev_read_iter+0xf0/0x164
> new_sync_read+0x74/0xc0
> vfs_read+0xbc/0xd8
> ksys_read+0x6c/0xd4
> __arm64_sys_read+0x14/0x1c
> invoke_syscall+0x70/0xf4
> el0_svc_common.constprop.0+0xbc/0xf0
> do_el0_svc+0x18/0x20
> el0_svc+0x30/0x84
> el0t_64_sync_handler+0x90/0xf8
> el0t_64_sync+0x14c/0x150
Hi Geert,
All that can be concluded from the above is that blk_mq_get_tag() is
waiting for other I/O request(s) to finish. One or more other requests
are in progress and either scsi_done() has not been called for these
requests or the error handler got stuck. Since the issue reported above
is not observed with other ATA interfaces, this may be related to the
ATA interface driver used in your test setup.
Bart.
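[Editorial note] The situation described in the message above, where a tag allocation waits because an earlier request never completed, can be mimicked with a toy model. This is a loose analogy under stated assumptions, not the blk-mq implementation: TagSet, get_tag() and put_tag() are hypothetical names, the depth of 1 is arbitrary, and a timeout replaces the indefinite sleep seen in the traces.

```python
import threading


class TagSet:
    """Toy stand-in for a fixed pool of request tags (not kernel code)."""

    def __init__(self, depth: int) -> None:
        self._free = threading.Semaphore(depth)

    def get_tag(self, timeout: float) -> bool:
        """Return True if a tag became available within `timeout` seconds."""
        return self._free.acquire(timeout=timeout)

    def put_tag(self) -> None:
        """Stands in for a request completing and its tag being released."""
        self._free.release()


tags = TagSet(depth=1)
first = tags.get_tag(timeout=0.1)   # the first request takes the only tag
second = tags.get_tag(timeout=0.1)  # no completion yet, so this times out
tags.put_tag()                      # the outstanding request finally completes
third = tags.get_tag(timeout=0.1)   # a tag is available again
```

In the model, the second allocation fails only because the first request was never completed, which is the same inference drawn from the two blocked tasks: the waiters are a symptom, and the request that holds the tag is what needs to be found.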
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-12 15:53 ` Bart Van Assche
@ 2022-08-15 10:13 ` Geert Uytterhoeven
2022-08-15 13:49 ` Bart Van Assche
0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2022-08-15 10:13 UTC (permalink / raw)
To: Bart Van Assche
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide, James Bottomley
Hoi Bart,
On Fri, Aug 12, 2022 at 5:53 PM Bart Van Assche <bvanassche@acm.org> wrote:
> On 8/12/22 03:48, Geert Uytterhoeven wrote:
> > sd_submit_start() is called once during suspend, and once during
> > resume. It does not hang.
> >
> > Reading from /dev/sda hangs after resume (not in sd_submit_start(),
> > which is never called for reading).
FTR, this issue is now present in v6.0-rc1. Reverting commit
88f1669019bd62b3 ("scsi: sd: Rework asynchronous resume support")
fixes it.
> > Two tasks are blocked in blk_mq_get_tag() calling io_schedule():
> >
> > task:kworker/7:1 state:D stack: 0 pid: 122 ppid: 2 flags:0x00000008
> > Workqueue: events ata_scsi_dev_rescan
> > Call trace:
> > __switch_to+0xbc/0x124
> > __schedule+0x540/0x71c
> > schedule+0x58/0xa0
> > io_schedule+0x18/0x34
> > blk_mq_get_tag+0x138/0x244
> > __blk_mq_alloc_requests+0x130/0x2f0
> > blk_mq_alloc_request+0x74/0xa8
> > scsi_alloc_request+0x10/0x30
> > __scsi_execute+0x5c/0x18c
> > scsi_vpd_inquiry+0x7c/0xdc
> > scsi_get_vpd_size+0x34/0xa8
> > scsi_get_vpd_buf+0x28/0xf4
> > scsi_attach_vpd+0x44/0x170
> > scsi_rescan_device+0x30/0x98
> > ata_scsi_dev_rescan+0xc8/0xfc
> > process_one_work+0x2e0/0x474
> > worker_thread+0x1cc/0x270
> > kthread+0xd8/0xe8
> > ret_from_fork+0x10/0x20
> >
> >
> > task:hd state:D stack: 0 pid: 1163 ppid: 1076 flags:0x00000000
> > Call trace:
> > __switch_to+0xbc/0x124
> > __schedule+0x540/0x71c
> > schedule+0x58/0xa0
> > io_schedule+0x18/0x34
> > blk_mq_get_tag+0x138/0x244
> > __blk_mq_alloc_requests+0x130/0x2f0
> > blk_mq_submit_bio+0x44c/0x5b4
> > __submit_bio+0x24/0x5c
> > submit_bio_noacct_nocheck+0x8c/0x178
> > submit_bio_noacct+0x380/0x3b0
> > submit_bio+0x34/0x3c
> > mpage_bio_submit+0x28/0x38
> > mpage_readahead+0xa8/0x178
> > blkdev_readahead+0x14/0x1c
> > read_pages+0x4c/0x158
> > page_cache_ra_unbounded+0xd8/0x174
> > do_page_cache_ra+0x40/0x4c
> > page_cache_ra_order+0x14/0x1c
> > ondemand_readahead+0x124/0x2fc
> > page_cache_sync_ra+0x50/0x54
> > filemap_read+0x130/0x6e8
> > blkdev_read_iter+0xf0/0x164
> > new_sync_read+0x74/0xc0
> > vfs_read+0xbc/0xd8
> > ksys_read+0x6c/0xd4
> > __arm64_sys_read+0x14/0x1c
> > invoke_syscall+0x70/0xf4
> > el0_svc_common.constprop.0+0xbc/0xf0
> > do_el0_svc+0x18/0x20
> > el0_svc+0x30/0x84
> > el0t_64_sync_handler+0x90/0xf8
> > el0t_64_sync+0x14c/0x150
>
> All that can be concluded from the above is that blk_mq_get_tag() is
> waiting for other I/O request(s) to finish. One or more other requests
> are in progress and either scsi_done() has not been called for these
> requests or the error handler got stuck. Since the issue reported above
> is not observed with other ATA interfaces, this may be related to the
> ATA interface driver used in your test setup.
I have added debug prints to all ata_port_operations in
sata_rcar_port_ops. After s2idle, running "hd /dev/sda | head -70"
hangs before any of these functions are called.
Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/10:
#0: ffff800009575c38 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at:
rcu_tasks_one_gp+0x34/0x4c8
4 locks held by kworker/0:10/104:
#0: ffff0004c0008738 ((wq_completion)events){+.+.}-{0:0}, at:
process_one_work+0x1f4/0x6a0
#1: ffff80000a90bde0
((work_completion)(&ap->scsi_rescan_task)){+.+.}-{0:0}, at:
process_one_work+0x1f4/0x6a0
#2: ffff0004c2b6bf60 (&ap->scsi_scan_mutex){+.+.}-{3:3}, at:
ata_scsi_dev_rescan+0x28/0x118
#3: ffff0004c2902368 (&dev->mutex){....}-{3:3}, at:
scsi_rescan_device+0x28/0x78
1 lock held by in:imklog/636:
#0: ffff0004c5ee86e8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x54/0x68
1 lock held by hd/1013:
#0: ffff0004c06388b8 (mapping.invalidate_lock#2){.+.+}-{3:3}, at:
page_cache_ra_unbounded+0x64/0x1a8
I've just tried with a USB storage device on the same platform,
and it can be read fine after s2idle. So it looks like the issue
is related to SATA.
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-15 10:13 ` Geert Uytterhoeven
@ 2022-08-15 13:49 ` Bart Van Assche
2022-08-15 18:26 ` Geert Uytterhoeven
2022-08-17 19:07 ` Vlastimil Babka
0 siblings, 2 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-08-15 13:49 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide, James Bottomley
On 8/15/22 03:13, Geert Uytterhoeven wrote:
> Showing all locks held in the system:
> 1 lock held by rcu_tasks_kthre/10:
> #0: ffff800009575c38 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at:
> rcu_tasks_one_gp+0x34/0x4c8
> 4 locks held by kworker/0:10/104:
> #0: ffff0004c0008738 ((wq_completion)events){+.+.}-{0:0}, at:
> process_one_work+0x1f4/0x6a0
> #1: ffff80000a90bde0
> ((work_completion)(&ap->scsi_rescan_task)){+.+.}-{0:0}, at:
> process_one_work+0x1f4/0x6a0
> #2: ffff0004c2b6bf60 (&ap->scsi_scan_mutex){+.+.}-{3:3}, at:
> ata_scsi_dev_rescan+0x28/0x118
> #3: ffff0004c2902368 (&dev->mutex){....}-{3:3}, at:
> scsi_rescan_device+0x28/0x78
> 1 lock held by in:imklog/636:
> #0: ffff0004c5ee86e8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x54/0x68
> 1 lock held by hd/1013:
> #0: ffff0004c06388b8 (mapping.invalidate_lock#2){.+.+}-{3:3}, at:
> page_cache_ra_unbounded+0x64/0x1a8
Thank you for having shared this information. I will take a closer look
and see what I can derive from the above information.
> I've just tried with a USB storage device on the same platform,
> and it can be read fine after s2idle. So it looks like the issue
> is related to SATA.
Unfortunately the above does not teach us anything new. The code
modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
support") is only called if sdev->manage_start_stop != 0. Only the SATA
code, the Firewire code and the manage_start_stop sysfs attribute store
method set that member variable:
$ git grep -nH 'manage_start_stop = '
drivers/ata/libata-scsi.c:1083: sdev->manage_start_stop = 1;
drivers/firewire/sbp2.c:1521: sdev->manage_start_stop = 1;
drivers/scsi/sd.c:240: sdp->manage_start_stop = v;
Would it be possible to share the output of the command below? That
should reveal which ATA driver is active on the test setup.
find /sys -name proc_name | xargs grep -aH .
Thanks,
Bart.
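[Editorial note] For setups where running the find/xargs pipeline is inconvenient, roughly the same information can be collected with a short script. This is a sketch under assumptions: read_proc_names() is a hypothetical helper, and it simply walks a directory tree and reports every non-empty proc_name file, which is what the command in the message above prints.

```python
import os


def read_proc_names(root="/sys"):
    """Return sorted (path, value) pairs for every non-empty proc_name file.

    Hypothetical helper for illustration; unreadable files are skipped.
    """
    results = []
    # os.walk() does not follow symbolic links by default.
    for dirpath, _dirnames, filenames in os.walk(root):
        if "proc_name" not in filenames:
            continue
        path = os.path.join(dirpath, "proc_name")
        try:
            with open(path, "r", errors="replace") as f:
                value = f.read().strip()
        except OSError:
            continue
        if value:
            results.append((path, value))
    return sorted(results)


if __name__ == "__main__":
    for path, value in read_proc_names():
        print(f"{path}:{value}")
```

The root parameter exists only so that the function can be pointed at a scratch directory instead of the real /sys.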
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-15 13:49 ` Bart Van Assche
@ 2022-08-15 18:26 ` Geert Uytterhoeven
2022-08-16 20:21 ` Bart Van Assche
2022-08-17 19:07 ` Vlastimil Babka
1 sibling, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2022-08-15 18:26 UTC (permalink / raw)
To: Bart Van Assche
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide, James Bottomley
Hoi Bart,
On Mon, Aug 15, 2022 at 3:49 PM Bart Van Assche <bvanassche@acm.org> wrote:
> On 8/15/22 03:13, Geert Uytterhoeven wrote:
> > Showing all locks held in the system:
> > 1 lock held by rcu_tasks_kthre/10:
> > #0: ffff800009575c38 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at:
> > rcu_tasks_one_gp+0x34/0x4c8
> > 4 locks held by kworker/0:10/104:
> > #0: ffff0004c0008738 ((wq_completion)events){+.+.}-{0:0}, at:
> > process_one_work+0x1f4/0x6a0
> > #1: ffff80000a90bde0
> > ((work_completion)(&ap->scsi_rescan_task)){+.+.}-{0:0}, at:
> > process_one_work+0x1f4/0x6a0
> > #2: ffff0004c2b6bf60 (&ap->scsi_scan_mutex){+.+.}-{3:3}, at:
> > ata_scsi_dev_rescan+0x28/0x118
> > #3: ffff0004c2902368 (&dev->mutex){....}-{3:3}, at:
> > scsi_rescan_device+0x28/0x78
> > 1 lock held by in:imklog/636:
> > #0: ffff0004c5ee86e8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x54/0x68
> > 1 lock held by hd/1013:
> > #0: ffff0004c06388b8 (mapping.invalidate_lock#2){.+.+}-{3:3}, at:
> > page_cache_ra_unbounded+0x64/0x1a8
>
> Thank you for having shared this information. I will take a closer look
> and see what I can derive from the above information.
>
> > I've just tried with a USB storage device on the same platform,
> > and it can be read fine after s2idle. So it looks like the issue
> > is related to SATA.
>
> Unfortunately the above does not teach us anything new. The code
> modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
> support") is only called if sdev->manage_start_stop != 0. Only the SATA
> code, the Firewire code and the manage_start_stop sysfs attribute store
> method set that member variable:
>
> $ git grep -nH 'manage_start_stop = '
> drivers/ata/libata-scsi.c:1083: sdev->manage_start_stop = 1;
> drivers/firewire/sbp2.c:1521: sdev->manage_start_stop = 1;
> drivers/scsi/sd.c:240: sdp->manage_start_stop = v;
>
> Would it be possible to share the output of the command below? That
> should reveal which ATA driver is active on the test setup.
>
> find /sys -name proc_name | xargs grep -aH .
/sys/devices/platform/soc/ee300000.sata/ata1/host0/scsi_host/host0/proc_name:sata_rcar
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-15 18:26 ` Geert Uytterhoeven
@ 2022-08-16 20:21 ` Bart Van Assche
2022-08-17 8:53 ` Sergey Shtylyov
0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2022-08-16 20:21 UTC (permalink / raw)
To: Geert Uytterhoeven
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide, James Bottomley
On 8/15/22 11:26, Geert Uytterhoeven wrote:
> On Mon, Aug 15, 2022 at 3:49 PM Bart Van Assche <bvanassche@acm.org> wrote:
>> Would it be possible to share the output of the command below? That
>> should reveal which ATA driver is active on the test setup.
>>
>> find /sys -name proc_name | xargs grep -aH .
>
> /sys/devices/platform/soc/ee300000.sata/ata1/host0/scsi_host/host0/proc_name:sata_rcar
Thanks Geert for the help. Although I already posted a revert, I'm still
trying to root-cause this issue. Do you perhaps know whether sata_rcar
controllers support NCQ and if so, what queue depth these controllers
support? I think that information is available in sysfs. Here is an
example for a VM:
# (cd /sys/class/scsi_device && for a in */device/*/*/ncq_prio_enable; do p=${a%/ncq_prio_enable}; grep -qi ata $p/inquiry || continue; grep -aH . $p/{queue_depth,ncq*}; done)
2:0:0:0/device/driver/2:0:0:0/queue_depth:32
2:0:0:0/device/driver/2:0:0:0/ncq_prio_enable:0
2:0:0:0/device/driver/2:0:0:0/ncq_prio_supported:0
2:0:0:0/device/generic/device/queue_depth:32
2:0:0:0/device/generic/device/ncq_prio_enable:0
2:0:0:0/device/generic/device/ncq_prio_supported:0
6:0:0:1/device/driver/2:0:0:0/queue_depth:32
6:0:0:1/device/driver/2:0:0:0/ncq_prio_enable:0
6:0:0:1/device/driver/2:0:0:0/ncq_prio_supported:0
Thanks,
Bart.
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-16 20:21 ` Bart Van Assche
@ 2022-08-17 8:53 ` Sergey Shtylyov
0 siblings, 0 replies; 10+ messages in thread
From: Sergey Shtylyov @ 2022-08-17 8:53 UTC (permalink / raw)
To: Bart Van Assche, Geert Uytterhoeven
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide, James Bottomley
Hello!
On 8/16/22 11:21 PM, Bart Van Assche wrote:
>> On Mon, Aug 15, 2022 at 3:49 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>> Would it be possible to share the output of the command below? That
>>> should reveal which ATA driver is active on the test setup.
>>>
>>> find /sys -name proc_name | xargs grep -aH .
>>
>> /sys/devices/platform/soc/ee300000.sata/ata1/host0/scsi_host/host0/proc_name:sata_rcar
>
> Thanks Geert for the help. Although I already posted a revert, I'm still trying to
> root-cause this issue. Do you perhaps know whether sata_rcar controllers support NCQ
They don't. :-)
> and if so, what queue depth these controllers support? I think that information is available in sysfs. Here is an example for a VM:
>
> # (cd /sys/class/scsi_device && for a in */device/*/*/ncq_prio_enable; do p=${a%/ncq_prio_enable}; grep -qi ata $p/inquiry || continue; grep -aH . $p/{queue_depth,ncq*}; done)
> 2:0:0:0/device/driver/2:0:0:0/queue_depth:32
> 2:0:0:0/device/driver/2:0:0:0/ncq_prio_enable:0
> 2:0:0:0/device/driver/2:0:0:0/ncq_prio_supported:0
> 2:0:0:0/device/generic/device/queue_depth:32
> 2:0:0:0/device/generic/device/ncq_prio_enable:0
> 2:0:0:0/device/generic/device/ncq_prio_supported:0
> 6:0:0:1/device/driver/2:0:0:0/queue_depth:32
> 6:0:0:1/device/driver/2:0:0:0/ncq_prio_enable:0
> 6:0:0:1/device/driver/2:0:0:0/ncq_prio_supported:0
>
> Thanks,
>
> Bart.
MBR, Sergey
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-15 13:49 ` Bart Van Assche
2022-08-15 18:26 ` Geert Uytterhoeven
@ 2022-08-17 19:07 ` Vlastimil Babka
2022-08-17 19:28 ` Bart Van Assche
2022-08-28 11:52 ` [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support #forregzbot Thorsten Leemhuis
1 sibling, 2 replies; 10+ messages in thread
From: Vlastimil Babka @ 2022-08-17 19:07 UTC (permalink / raw)
To: Bart Van Assche, Geert Uytterhoeven
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide, James Bottomley,
regressions
Hi, I have a T460 hanging on resume from suspend to ram in 6.0-rc1 that
I bisected to this commit.
> Unfortunately the above does not teach us anything new. The code
> modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
> support") is only called if sdev->manage_start_stop != 0. Only the SATA
> code, the Firewire code and the manage_start_stop sysfs attribute store
> method set that member variable:
>
> $ git grep -nH 'manage_start_stop = '
> drivers/ata/libata-scsi.c:1083: sdev->manage_start_stop = 1;
> drivers/firewire/sbp2.c:1521: sdev->manage_start_stop = 1;
> drivers/scsi/sd.c:240: sdp->manage_start_stop = v;
>
> Would it be possible to share the output of the command below? That
> should reveal which ATA driver is active on the test setup.
>
> find /sys -name proc_name | xargs grep -aH .
In my case it's
/sys/devices/pci0000:00/0000:00:17.0/ata1/host0/scsi_host/host0/proc_name:ahci
/sys/devices/pci0000:00/0000:00:17.0/ata2/host1/scsi_host/host1/proc_name:ahci
Some more details from dmesg
[ 0.849373] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 0.852849] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
[ 0.854671] ata2.00: supports DRM functions and may not be fully accessible
[ 0.856181] ata2.00: ATA-9: SAMSUNG MZ7LN512HMJP-000L7, MAV01L6Q, max UDMA/133
[ 0.858115] ata2.00: 1000215216 sectors, multi 1: LBA48 NCQ (depth 32), AA
[ 0.861584] ata2.00: Features: Trust Dev-Sleep NCQ-sndrcv
[ 0.863749] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
[ 0.865481] ata2.00: supports DRM functions and may not be fully accessible
[ 0.870043] ata2.00: configured for UDMA/133
[ 0.871871] scsi 1:0:0:0: Direct-Access ATA SAMSUNG MZ7LN512 1L6Q PQ: 0 ANSI: 5
Please Cc me on further questions/steps to try/patches to test.
#regzbot introduced: 88f1669019bd62b3
#regzbot monitor: https://lore.kernel.org/all/20220816172638.538734-1-bvanassche@acm.org/
> Thanks,
>
> Bart.
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
2022-08-17 19:07 ` Vlastimil Babka
@ 2022-08-17 19:28 ` Bart Van Assche
2022-08-28 11:52 ` [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support #forregzbot Thorsten Leemhuis
1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-08-17 19:28 UTC (permalink / raw)
To: Vlastimil Babka, Geert Uytterhoeven
Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
John Garry, ericspero, jason600.groome, Linux-Renesas,
Linux Kernel Mailing List, linux-ide, James Bottomley,
regressions
On 8/17/22 12:07, Vlastimil Babka wrote:
> In my case it's
> /sys/devices/pci0000:00/0000:00:17.0/ata1/host0/scsi_host/host0/proc_name:ahci
> /sys/devices/pci0000:00/0000:00:17.0/ata2/host1/scsi_host/host1/proc_name:ahci
>
> Some more details from dmesg
>
> [ 0.849373] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [ 0.852849] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
> [ 0.854671] ata2.00: supports DRM functions and may not be fully accessible
> [ 0.856181] ata2.00: ATA-9: SAMSUNG MZ7LN512HMJP-000L7, MAV01L6Q, max UDMA/133
> [ 0.858115] ata2.00: 1000215216 sectors, multi 1: LBA48 NCQ (depth 32), AA
> [ 0.861584] ata2.00: Features: Trust Dev-Sleep NCQ-sndrcv
> [ 0.863749] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
> [ 0.865481] ata2.00: supports DRM functions and may not be fully accessible
> [ 0.870043] ata2.00: configured for UDMA/133
> [ 0.871871] scsi 1:0:0:0: Direct-Access ATA SAMSUNG MZ7LN512 1L6Q PQ: 0 ANSI: 5
>
> Please Cc me on further questions/steps to try/patches to test.
Hi Vlastimil,
Thank you for having provided the above information. The root cause of
the hang is not yet clear to me. I was wondering whether the hang
perhaps would be triggered by controllers that only support queue depth
1. However, in the above output I see "depth 32".
As already reported in this email thread a revert for commit
88f1669019bd62b3 has been posted on the linux-scsi mailing list.
Additionally, Greg KH has been asked to drop that patch from the stable
trees.
Thanks,
Bart.
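[Editorial note] The queue-depth check made by eye in the message above can also be scripted when many boot logs need to be compared. This is a sketch that assumes the log line has exactly the shape quoted in this thread, "ataX.YY: ... NCQ (depth N)"; ncq_depth() is a hypothetical helper, and any other wording of the line makes it return None.

```python
import re

# Matches e.g. "ata2.00: ... LBA48 NCQ (depth 32), AA" as quoted in the thread.
NCQ_RE = re.compile(r"(ata\d+\.\d+): .*\bNCQ \(depth (\d+)\)")


def ncq_depth(line: str):
    """Return (device, depth) for a matching dmesg line, else None."""
    m = NCQ_RE.search(line)
    if m is None:
        return None
    return m.group(1), int(m.group(2))


sample = "[    0.858115] ata2.00: 1000215216 sectors, multi 1: LBA48 NCQ (depth 32), AA"
parsed = ncq_depth(sample)
```

A line without the NCQ annotation, such as the "configured for UDMA/133" line, yields None, so it cannot be mistaken for a queue depth of 1.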
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support #forregzbot
2022-08-17 19:07 ` Vlastimil Babka
2022-08-17 19:28 ` Bart Van Assche
@ 2022-08-28 11:52 ` Thorsten Leemhuis
1 sibling, 0 replies; 10+ messages in thread
From: Thorsten Leemhuis @ 2022-08-28 11:52 UTC (permalink / raw)
To: regressions; +Cc: scsi, Linux-Renesas, Linux Kernel Mailing List, linux-ide
TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.
On 17.08.22 21:07, Vlastimil Babka wrote:
> Hi, I have a T460 hanging on resume from suspend to ram in 6.0-rc1 that
> I bisected to this commit.
>
>> Unfortunately the above does not teach us anything new. The code
>> modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
>> support") is only called if sdev->manage_start_stop != 0. Only the SATA
>> code, the Firewire code and the manage_start_stop sysfs attribute store
>> method set that member variable:
> [...]
> #regzbot introduced: 88f1669019bd62b3
> #regzbot monitor: https://lore.kernel.org/all/20220816172638.538734-1-bvanassche@acm.org/
#regzbot fixed-by: 785538bfdd682c8e962341d585f9b88262a0475
Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.