public inbox for linux-ide@vger.kernel.org
* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
       [not found]                       ` <084e7c5a-f98d-d61e-de81-83525851ecf9@acm.org>
@ 2022-08-12 10:48                         ` Geert Uytterhoeven
  2022-08-12 15:53                           ` Bart Van Assche
  0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2022-08-12 10:48 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide

Hi Bart,

CC linux-ide

On Fri, Jul 22, 2022 at 7:56 PM Bart Van Assche <bvanassche@acm.org> wrote:
> On 7/22/22 01:53, Geert Uytterhoeven wrote:
> > During s2idle, the following trace data is generated:
> >
> >     kworker/u16:9-325     [000] ...2.   230.478731: block_rq_issue: 8,0 N 0 () 0 + 0 [kworker/u16:9]
> >     kworker/u16:9-325     [000] ...2.   230.478745: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=0 prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=0 cmnd=(SYNCHRONIZE_CACHE - raw=35 00 00 00 00 00 00 00 00 00)
> >            <idle>-0       [007] d.h3.   230.478832: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=0 prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=0 cmnd=(SYNCHRONIZE_CACHE - raw=35 00 00 00 00 00 00 00 00 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)
> >            <idle>-0       [000] ..s2.   230.478851: block_rq_complete: 8,0 N () 18446744073709551615 + 0 [0]
> >     kworker/u16:9-325     [000] ...2.   230.483134: block_rq_issue: 8,0 N 0 () 0 + 0 [kworker/u16:9]
> >     kworker/u16:9-325     [000] ...2.   230.483136: scsi_dispatch_cmd_start: host_no=0 channel=0 id=0 lun=0 data_sgl=0 prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=1 cmnd=(START_STOP - raw=1b 00 00 00 00 00)
> >            <idle>-0       [007] d.h3.   230.624530: scsi_dispatch_cmd_done: host_no=0 channel=0 id=0 lun=0 data_sgl=0 prot_sgl=0 prot_op=SCSI_PROT_NORMAL driver_tag=0 scheduler_tag=1 cmnd=(START_STOP - raw=1b 00 00 00 00 00) result=(driver=DRIVER_OK host=DID_OK message=COMMAND_COMPLETE status=SAM_STAT_GOOD)
> >            <idle>-0       [000] d.s4.   230.624634: scsi_eh_wakeup: host_no=0
> >            <idle>-0       [000] ..s2.   230.624642: block_rq_complete: 8,0 N () 18446744073709551615 + 0 [0]
> >    kworker/u16:14-1027    [007] d..3.   231.393642: scsi_eh_wakeup: host_no=0
> >
> > When reading from hard drive after s2idle, no more trace data
> > is generated.
>
> I think the above commands come from the suspend sequence. '1b 00 00 00
> 00 00' stops a block device. The lowest bit in byte 4 needs to be set to
> start a block device.
>
> Something that is not yet clear is whether or not sd_submit_start()
> hangs during the resume process. How about verifying whether or not
> sd_submit_start() hangs by either issuing SysRq-t or by adding pr_info()
> statements in that function?
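
For reference, the CDB interpretation quoted above can be checked mechanically. The following is a minimal sketch in POSIX shell, assuming the CDB is given as six space-separated hex bytes exactly as printed in the trace; decode_start_stop is a hypothetical helper name, not an existing tool.

```shell
# Sketch: report whether a 6-byte START STOP UNIT CDB (opcode 0x1b)
# starts or stops the unit. decode_start_stop is a hypothetical helper.
decode_start_stop() {
    # Word-split the hex string into positional parameters $1..$6.
    set -- $1
    if [ "$#" -ne 6 ] || [ "$((0x$1))" -ne 27 ]; then
        echo "not a START STOP UNIT CDB"
        return 1
    fi
    # Bit 0 of byte 4 is the START bit: 1 starts the unit, 0 stops it.
    if [ "$((0x$5 & 1))" -eq 1 ]; then
        echo "start"
    else
        echo "stop"
    fi
}

decode_start_stop "1b 00 00 00 00 00"   # CDB from the trace above
decode_start_stop "1b 00 00 00 01 00"   # same CDB with the START bit set
```

With the CDB from the trace the function reports "stop"; with bit 0 of byte 4 set it reports "start".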

sd_submit_start() is called once during suspend, and once during
resume.  It does not hang.

Reading from /dev/sda hangs after resume (not in sd_submit_start(),
which is never called for reading).

Two tasks are blocked in blk_mq_get_tag() calling io_schedule():

task:kworker/7:1     state:D stack:    0 pid:  122 ppid:     2 flags:0x00000008
Workqueue: events ata_scsi_dev_rescan
Call trace:
 __switch_to+0xbc/0x124
 __schedule+0x540/0x71c
 schedule+0x58/0xa0
 io_schedule+0x18/0x34
 blk_mq_get_tag+0x138/0x244
 __blk_mq_alloc_requests+0x130/0x2f0
 blk_mq_alloc_request+0x74/0xa8
 scsi_alloc_request+0x10/0x30
 __scsi_execute+0x5c/0x18c
 scsi_vpd_inquiry+0x7c/0xdc
 scsi_get_vpd_size+0x34/0xa8
 scsi_get_vpd_buf+0x28/0xf4
 scsi_attach_vpd+0x44/0x170
 scsi_rescan_device+0x30/0x98
 ata_scsi_dev_rescan+0xc8/0xfc
 process_one_work+0x2e0/0x474
 worker_thread+0x1cc/0x270
 kthread+0xd8/0xe8
 ret_from_fork+0x10/0x20


task:hd              state:D stack:    0 pid: 1163 ppid:  1076 flags:0x00000000
Call trace:
 __switch_to+0xbc/0x124
 __schedule+0x540/0x71c
 schedule+0x58/0xa0
 io_schedule+0x18/0x34
 blk_mq_get_tag+0x138/0x244
 __blk_mq_alloc_requests+0x130/0x2f0
 blk_mq_submit_bio+0x44c/0x5b4
 __submit_bio+0x24/0x5c
 submit_bio_noacct_nocheck+0x8c/0x178
 submit_bio_noacct+0x380/0x3b0
 submit_bio+0x34/0x3c
 mpage_bio_submit+0x28/0x38
 mpage_readahead+0xa8/0x178
 blkdev_readahead+0x14/0x1c
 read_pages+0x4c/0x158
 page_cache_ra_unbounded+0xd8/0x174
 do_page_cache_ra+0x40/0x4c
 page_cache_ra_order+0x14/0x1c
 ondemand_readahead+0x124/0x2fc
 page_cache_sync_ra+0x50/0x54
 filemap_read+0x130/0x6e8
 blkdev_read_iter+0xf0/0x164
 new_sync_read+0x74/0xc0
 vfs_read+0xbc/0xd8
 ksys_read+0x6c/0xd4
 __arm64_sys_read+0x14/0x1c
 invoke_syscall+0x70/0xf4
 el0_svc_common.constprop.0+0xbc/0xf0
 do_el0_svc+0x18/0x20
 el0_svc+0x30/0x84
 el0t_64_sync_handler+0x90/0xf8
 el0t_64_sync+0x14c/0x150
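
Tasks stuck like the two above can also be spotted from user space before asking for full stack dumps. This is a small sketch, assuming a procps-style ps whose second column is the process state; list_blocked is a hypothetical helper name.

```shell
# Sketch: keep the header line plus every task whose state column starts
# with "D" (uninterruptible sleep), as for the two tasks shown above.
# list_blocked is a hypothetical helper; it filters "pid stat comm" lines
# read from standard input.
list_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Usage on a live system:
#   ps -eo pid,stat,comm | list_blocked
```

This only shows that a task is blocked, not where; the call traces above are still needed for that.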

I hope this helps.
Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds


* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-12 10:48                         ` [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support Geert Uytterhoeven
@ 2022-08-12 15:53                           ` Bart Van Assche
  2022-08-15 10:13                             ` Geert Uytterhoeven
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2022-08-12 15:53 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide

On 8/12/22 03:48, Geert Uytterhoeven wrote:
> sd_submit_start() is called once during suspend, and once during
> resume.  It does not hang.
> 
> Reading from /dev/sda hangs after resume (not in sd_submit_start(),
> which is never called for reading).
> 
> Two tasks are blocked in blk_mq_get_tag() calling io_schedule():
> 
> task:kworker/7:1     state:D stack:    0 pid:  122 ppid:     2 flags:0x00000008
> Workqueue: events ata_scsi_dev_rescan
> Call trace:
>   __switch_to+0xbc/0x124
>   __schedule+0x540/0x71c
>   schedule+0x58/0xa0
>   io_schedule+0x18/0x34
>   blk_mq_get_tag+0x138/0x244
>   __blk_mq_alloc_requests+0x130/0x2f0
>   blk_mq_alloc_request+0x74/0xa8
>   scsi_alloc_request+0x10/0x30
>   __scsi_execute+0x5c/0x18c
>   scsi_vpd_inquiry+0x7c/0xdc
>   scsi_get_vpd_size+0x34/0xa8
>   scsi_get_vpd_buf+0x28/0xf4
>   scsi_attach_vpd+0x44/0x170
>   scsi_rescan_device+0x30/0x98
>   ata_scsi_dev_rescan+0xc8/0xfc
>   process_one_work+0x2e0/0x474
>   worker_thread+0x1cc/0x270
>   kthread+0xd8/0xe8
>   ret_from_fork+0x10/0x20
> 
> 
> task:hd              state:D stack:    0 pid: 1163 ppid:  1076 flags:0x00000000
> Call trace:
>   __switch_to+0xbc/0x124
>   __schedule+0x540/0x71c
>   schedule+0x58/0xa0
>   io_schedule+0x18/0x34
>   blk_mq_get_tag+0x138/0x244
>   __blk_mq_alloc_requests+0x130/0x2f0
>   blk_mq_submit_bio+0x44c/0x5b4
>   __submit_bio+0x24/0x5c
>   submit_bio_noacct_nocheck+0x8c/0x178
>   submit_bio_noacct+0x380/0x3b0
>   submit_bio+0x34/0x3c
>   mpage_bio_submit+0x28/0x38
>   mpage_readahead+0xa8/0x178
>   blkdev_readahead+0x14/0x1c
>   read_pages+0x4c/0x158
>   page_cache_ra_unbounded+0xd8/0x174
>   do_page_cache_ra+0x40/0x4c
>   page_cache_ra_order+0x14/0x1c
>   ondemand_readahead+0x124/0x2fc
>   page_cache_sync_ra+0x50/0x54
>   filemap_read+0x130/0x6e8
>   blkdev_read_iter+0xf0/0x164
>   new_sync_read+0x74/0xc0
>   vfs_read+0xbc/0xd8
>   ksys_read+0x6c/0xd4
>   __arm64_sys_read+0x14/0x1c
>   invoke_syscall+0x70/0xf4
>   el0_svc_common.constprop.0+0xbc/0xf0
>   do_el0_svc+0x18/0x20
>   el0_svc+0x30/0x84
>   el0t_64_sync_handler+0x90/0xf8
>   el0t_64_sync+0x14c/0x150

Hi Geert,

All that can be concluded from the above is that blk_mq_get_tag() is 
waiting for other I/O request(s) to finish. One or more other requests 
are in progress and either scsi_done() has not been called for these 
requests or the error handler got stuck. Since the issue reported above 
is not observed with other ATA interfaces, this may be related to the 
ATA interface driver used in your test setup.
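
One way to see which requests still hold tags is the blk-mq debugfs tree. The sketch below assumes CONFIG_BLK_DEBUG_FS is enabled and debugfs is mounted at /sys/kernel/debug; dump_hctx_tags is a hypothetical helper name, and the optional second argument exists only so the function can be pointed at another root.

```shell
# Sketch: print the "tags" and "busy" files of every hardware queue of a
# block device. dump_hctx_tags is a hypothetical helper.
#   $1: block device name, e.g. sda
#   $2: debugfs mount point (assumed default: /sys/kernel/debug)
dump_hctx_tags() {
    dev=$1
    root=${2:-/sys/kernel/debug}
    for hctx in "$root/block/$dev"/hctx*; do
        # Skip the unexpanded glob when the device has no debugfs entry.
        [ -d "$hctx" ] || continue
        for f in tags busy; do
            [ -r "$hctx/$f" ] || continue
            echo "== ${hctx##*/}/$f"
            cat "$hctx/$f"
        done
    done
}
```

Running "dump_hctx_tags sda" as root while the read hangs should list any request that was issued but never completed.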

Bart.



* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-12 15:53                           ` Bart Van Assche
@ 2022-08-15 10:13                             ` Geert Uytterhoeven
  2022-08-15 13:49                               ` Bart Van Assche
  0 siblings, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2022-08-15 10:13 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide, James Bottomley

Hoi Bart,

On Fri, Aug 12, 2022 at 5:53 PM Bart Van Assche <bvanassche@acm.org> wrote:
> On 8/12/22 03:48, Geert Uytterhoeven wrote:
> > sd_submit_start() is called once during suspend, and once during
> > resume.  It does not hang.
> >
> > Reading from /dev/sda hangs after resume (not in sd_submit_start(),
> > which is never called for reading).

FTR, this issue is now present in v6.0-rc1. Reverting commit
88f1669019bd62b3 ("scsi: sd: Rework asynchronous resume support")
fixes it.

> > Two tasks are blocked in blk_mq_get_tag() calling io_schedule():
> >
> > task:kworker/7:1     state:D stack:    0 pid:  122 ppid:     2 flags:0x00000008
> > Workqueue: events ata_scsi_dev_rescan
> > Call trace:
> >   __switch_to+0xbc/0x124
> >   __schedule+0x540/0x71c
> >   schedule+0x58/0xa0
> >   io_schedule+0x18/0x34
> >   blk_mq_get_tag+0x138/0x244
> >   __blk_mq_alloc_requests+0x130/0x2f0
> >   blk_mq_alloc_request+0x74/0xa8
> >   scsi_alloc_request+0x10/0x30
> >   __scsi_execute+0x5c/0x18c
> >   scsi_vpd_inquiry+0x7c/0xdc
> >   scsi_get_vpd_size+0x34/0xa8
> >   scsi_get_vpd_buf+0x28/0xf4
> >   scsi_attach_vpd+0x44/0x170
> >   scsi_rescan_device+0x30/0x98
> >   ata_scsi_dev_rescan+0xc8/0xfc
> >   process_one_work+0x2e0/0x474
> >   worker_thread+0x1cc/0x270
> >   kthread+0xd8/0xe8
> >   ret_from_fork+0x10/0x20
> >
> >
> > task:hd              state:D stack:    0 pid: 1163 ppid:  1076 flags:0x00000000
> > Call trace:
> >   __switch_to+0xbc/0x124
> >   __schedule+0x540/0x71c
> >   schedule+0x58/0xa0
> >   io_schedule+0x18/0x34
> >   blk_mq_get_tag+0x138/0x244
> >   __blk_mq_alloc_requests+0x130/0x2f0
> >   blk_mq_submit_bio+0x44c/0x5b4
> >   __submit_bio+0x24/0x5c
> >   submit_bio_noacct_nocheck+0x8c/0x178
> >   submit_bio_noacct+0x380/0x3b0
> >   submit_bio+0x34/0x3c
> >   mpage_bio_submit+0x28/0x38
> >   mpage_readahead+0xa8/0x178
> >   blkdev_readahead+0x14/0x1c
> >   read_pages+0x4c/0x158
> >   page_cache_ra_unbounded+0xd8/0x174
> >   do_page_cache_ra+0x40/0x4c
> >   page_cache_ra_order+0x14/0x1c
> >   ondemand_readahead+0x124/0x2fc
> >   page_cache_sync_ra+0x50/0x54
> >   filemap_read+0x130/0x6e8
> >   blkdev_read_iter+0xf0/0x164
> >   new_sync_read+0x74/0xc0
> >   vfs_read+0xbc/0xd8
> >   ksys_read+0x6c/0xd4
> >   __arm64_sys_read+0x14/0x1c
> >   invoke_syscall+0x70/0xf4
> >   el0_svc_common.constprop.0+0xbc/0xf0
> >   do_el0_svc+0x18/0x20
> >   el0_svc+0x30/0x84
> >   el0t_64_sync_handler+0x90/0xf8
> >   el0t_64_sync+0x14c/0x150
>
> All that can be concluded from the above is that blk_mq_get_tag() is
> waiting for other I/O request(s) to finish. One or more other requests
> are in progress and either scsi_done() has not been called for these
> requests or the error handler got stuck. Since the issue reported above
> is not observed with other ATA interfaces, this may be related to the
> ATA interface driver used in your test setup.

I have added debug prints to all ata_port_operations in
sata_rcar_port_ops.  After s2idle, running "hd /dev/sda | head -70"
hangs before any of these functions are called.

Showing all locks held in the system:
1 lock held by rcu_tasks_kthre/10:
 #0: ffff800009575c38 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x34/0x4c8
4 locks held by kworker/0:10/104:
 #0: ffff0004c0008738 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6a0
 #1: ffff80000a90bde0 ((work_completion)(&ap->scsi_rescan_task)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6a0
 #2: ffff0004c2b6bf60 (&ap->scsi_scan_mutex){+.+.}-{3:3}, at: ata_scsi_dev_rescan+0x28/0x118
 #3: ffff0004c2902368 (&dev->mutex){....}-{3:3}, at: scsi_rescan_device+0x28/0x78
1 lock held by in:imklog/636:
 #0: ffff0004c5ee86e8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x54/0x68
1 lock held by hd/1013:
 #0: ffff0004c06388b8 (mapping.invalidate_lock#2){.+.+}-{3:3}, at: page_cache_ra_unbounded+0x64/0x1a8

I've just tried with a USB storage device on the same platform,
and it can be read fine after s2idle.  So it looks like the issue
is related to SATA.

Gr{oetje,eeting}s,

                        Geert



* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-15 10:13                             ` Geert Uytterhoeven
@ 2022-08-15 13:49                               ` Bart Van Assche
  2022-08-15 18:26                                 ` Geert Uytterhoeven
  2022-08-17 19:07                                 ` Vlastimil Babka
  0 siblings, 2 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-08-15 13:49 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide, James Bottomley

On 8/15/22 03:13, Geert Uytterhoeven wrote:
> Showing all locks held in the system:
> 1 lock held by rcu_tasks_kthre/10:
>   #0: ffff800009575c38 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x34/0x4c8
> 4 locks held by kworker/0:10/104:
>   #0: ffff0004c0008738 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6a0
>   #1: ffff80000a90bde0 ((work_completion)(&ap->scsi_rescan_task)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6a0
>   #2: ffff0004c2b6bf60 (&ap->scsi_scan_mutex){+.+.}-{3:3}, at: ata_scsi_dev_rescan+0x28/0x118
>   #3: ffff0004c2902368 (&dev->mutex){....}-{3:3}, at: scsi_rescan_device+0x28/0x78
> 1 lock held by in:imklog/636:
>   #0: ffff0004c5ee86e8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x54/0x68
> 1 lock held by hd/1013:
>   #0: ffff0004c06388b8 (mapping.invalidate_lock#2){.+.+}-{3:3}, at: page_cache_ra_unbounded+0x64/0x1a8

Thank you for having shared this information. I will take a closer look 
and see what I can derive from the above information.

> I've just tried with a USB storage device on the same platform,
> and it can be read fine after s2idle.  So it looks like the issue
> is related to SATA.

Unfortunately the above does not teach us anything new. The code
modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
support") is only called if sdev->manage_start_stop != 0. Only the SATA
code, the Firewire code and the manage_start_stop sysfs attribute store
method set that member variable:

$ git grep -nH 'manage_start_stop = '
drivers/ata/libata-scsi.c:1083:		sdev->manage_start_stop = 1;
drivers/firewire/sbp2.c:1521:		sdev->manage_start_stop = 1;
drivers/scsi/sd.c:240:	sdp->manage_start_stop = v;
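
The runtime value of that member can be read back through the sysfs attribute mentioned above. A minimal sketch, assuming the attribute is exposed as /sys/class/scsi_disk/<h:c:t:l>/manage_start_stop (true for kernels around v6.0; other releases may differ); show_manage_start_stop is a hypothetical helper name.

```shell
# Sketch: print the manage_start_stop setting of every SCSI disk.
# show_manage_start_stop is a hypothetical helper.
#   $1: sysfs class directory (assumed default: /sys/class/scsi_disk)
show_manage_start_stop() {
    root=${1:-/sys/class/scsi_disk}
    for attr in "$root"/*/manage_start_stop; do
        # Skip the unexpanded glob when no disk exposes the attribute.
        [ -r "$attr" ] || continue
        d=${attr%/manage_start_stop}
        echo "${d##*/}: $(cat "$attr")"
    done
}
```

A disk reporting 0 does not go through the resume path changed by the commit, which is consistent with the USB storage device being unaffected.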

Would it be possible to share the output of the command below? That 
should reveal which ATA driver is active on the test setup.

find /sys -name proc_name | xargs grep -aH .

Thanks,

Bart.


* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-15 13:49                               ` Bart Van Assche
@ 2022-08-15 18:26                                 ` Geert Uytterhoeven
  2022-08-16 20:21                                   ` Bart Van Assche
  2022-08-17 19:07                                 ` Vlastimil Babka
  1 sibling, 1 reply; 10+ messages in thread
From: Geert Uytterhoeven @ 2022-08-15 18:26 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide, James Bottomley

Hoi Bart,

On Mon, Aug 15, 2022 at 3:49 PM Bart Van Assche <bvanassche@acm.org> wrote:
> On 8/15/22 03:13, Geert Uytterhoeven wrote:
> > Showing all locks held in the system:
> > 1 lock held by rcu_tasks_kthre/10:
> >   #0: ffff800009575c38 (rcu_tasks.tasks_gp_mutex){+.+.}-{3:3}, at: rcu_tasks_one_gp+0x34/0x4c8
> > 4 locks held by kworker/0:10/104:
> >   #0: ffff0004c0008738 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6a0
> >   #1: ffff80000a90bde0 ((work_completion)(&ap->scsi_rescan_task)){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6a0
> >   #2: ffff0004c2b6bf60 (&ap->scsi_scan_mutex){+.+.}-{3:3}, at: ata_scsi_dev_rescan+0x28/0x118
> >   #3: ffff0004c2902368 (&dev->mutex){....}-{3:3}, at: scsi_rescan_device+0x28/0x78
> > 1 lock held by in:imklog/636:
> >   #0: ffff0004c5ee86e8 (&f->f_pos_lock){+.+.}-{3:3}, at: __fdget_pos+0x54/0x68
> > 1 lock held by hd/1013:
> >   #0: ffff0004c06388b8 (mapping.invalidate_lock#2){.+.+}-{3:3}, at: page_cache_ra_unbounded+0x64/0x1a8
>
> Thank you for having shared this information. I will take a closer look
> and see what I can derive from the above information.
>
> > I've just tried with a USB storage device on the same platform,
> > and it can be read fine after s2idle.  So it looks like the issue
> > is related to SATA.
>
> Unfortunately the above does not teach us anything new. The code
> modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
> support") is only called if sdev->manage_start_stop != 0. Only the SATA
> code, the Firewire code and the manage_start_stop sysfs attribute store
> method set that member variable:
>
> $ git grep -nH 'manage_start_stop = '
> drivers/ata/libata-scsi.c:1083:         sdev->manage_start_stop = 1;
> drivers/firewire/sbp2.c:1521:           sdev->manage_start_stop = 1;
> drivers/scsi/sd.c:240:  sdp->manage_start_stop = v;
>
> Would it be possible to share the output of the command below? That
> should reveal which ATA driver is active on the test setup.
>
> find /sys -name proc_name | xargs grep -aH .

/sys/devices/platform/soc/ee300000.sata/ata1/host0/scsi_host/host0/proc_name:sata_rcar

Gr{oetje,eeting}s,

                        Geert



* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-15 18:26                                 ` Geert Uytterhoeven
@ 2022-08-16 20:21                                   ` Bart Van Assche
  2022-08-17  8:53                                     ` Sergey Shtylyov
  0 siblings, 1 reply; 10+ messages in thread
From: Bart Van Assche @ 2022-08-16 20:21 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide, James Bottomley

On 8/15/22 11:26, Geert Uytterhoeven wrote:
> On Mon, Aug 15, 2022 at 3:49 PM Bart Van Assche <bvanassche@acm.org> wrote:
>> Would it be possible to share the output of the command below? That
>> should reveal which ATA driver is active on the test setup.
>>
>> find /sys -name proc_name | xargs grep -aH .
> 
> /sys/devices/platform/soc/ee300000.sata/ata1/host0/scsi_host/host0/proc_name:sata_rcar

Thanks Geert for the help. Although I already posted a revert, I'm still 
trying to root-cause this issue. Do you perhaps know whether sata_rcar 
controllers support NCQ and if so, what queue depth these controllers 
support? I think that information is available in sysfs. Here is an 
example for a VM:

# (cd /sys/class/scsi_device && for a in */device/*/*/ncq_prio_enable; do p=${a%/ncq_prio_enable}; grep -qi ata $p/inquiry || continue; grep -aH . $p/{queue_depth,ncq*}; done)
2:0:0:0/device/driver/2:0:0:0/queue_depth:32
2:0:0:0/device/driver/2:0:0:0/ncq_prio_enable:0
2:0:0:0/device/driver/2:0:0:0/ncq_prio_supported:0
2:0:0:0/device/generic/device/queue_depth:32
2:0:0:0/device/generic/device/ncq_prio_enable:0
2:0:0:0/device/generic/device/ncq_prio_supported:0
6:0:0:1/device/driver/2:0:0:0/queue_depth:32
6:0:0:1/device/driver/2:0:0:0/ncq_prio_enable:0
6:0:0:1/device/driver/2:0:0:0/ncq_prio_supported:0

Thanks,

Bart.


* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-16 20:21                                   ` Bart Van Assche
@ 2022-08-17  8:53                                     ` Sergey Shtylyov
  0 siblings, 0 replies; 10+ messages in thread
From: Sergey Shtylyov @ 2022-08-17  8:53 UTC (permalink / raw)
  To: Bart Van Assche, Geert Uytterhoeven
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide, James Bottomley

Hello!

On 8/16/22 11:21 PM, Bart Van Assche wrote:

>> On Mon, Aug 15, 2022 at 3:49 PM Bart Van Assche <bvanassche@acm.org> wrote:
>>> Would it be possible to share the output of the command below? That
>>> should reveal which ATA driver is active on the test setup.
>>>
>>> find /sys -name proc_name | xargs grep -aH .
>>
>> /sys/devices/platform/soc/ee300000.sata/ata1/host0/scsi_host/host0/proc_name:sata_rcar
> 
> Thanks Geert for the help. Although I already posted a revert, I'm still trying to
> root-cause this issue. Do you perhaps know whether sata_rcar controllers support NCQ

   They don't. :-)

> and if so, what queue depth these controllers support? I think that information is available in sysfs. Here is an example for a VM:
> 
> # (cd /sys/class/scsi_device && for a in */device/*/*/ncq_prio_enable; do p=${a%/ncq_prio_enable}; grep -qi ata $p/inquiry || continue; grep -aH . $p/{queue_depth,ncq*}; done)
> 2:0:0:0/device/driver/2:0:0:0/queue_depth:32
> 2:0:0:0/device/driver/2:0:0:0/ncq_prio_enable:0
> 2:0:0:0/device/driver/2:0:0:0/ncq_prio_supported:0
> 2:0:0:0/device/generic/device/queue_depth:32
> 2:0:0:0/device/generic/device/ncq_prio_enable:0
> 2:0:0:0/device/generic/device/ncq_prio_supported:0
> 6:0:0:1/device/driver/2:0:0:0/queue_depth:32
> 6:0:0:1/device/driver/2:0:0:0/ncq_prio_enable:0
> 6:0:0:1/device/driver/2:0:0:0/ncq_prio_supported:0
> 
> Thanks,
> 
> Bart.

MBR, Sergey


* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-15 13:49                               ` Bart Van Assche
  2022-08-15 18:26                                 ` Geert Uytterhoeven
@ 2022-08-17 19:07                                 ` Vlastimil Babka
  2022-08-17 19:28                                   ` Bart Van Assche
  2022-08-28 11:52                                   ` [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support #forregzbot Thorsten Leemhuis
  1 sibling, 2 replies; 10+ messages in thread
From: Vlastimil Babka @ 2022-08-17 19:07 UTC (permalink / raw)
  To: Bart Van Assche, Geert Uytterhoeven
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide, James Bottomley,
	regressions

Hi, I have a T460 hanging on resume from suspend to ram in 6.0-rc1 that
I bisected to this commit.

> Unfortunately the above does not teach us anything new. The code
> modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
> support") is only called if sdev->manage_start_stop != 0. Only the SATA
> code, the Firewire code and the manage_start_stop sysfs attribute store
> method set that member variable:
> 
> $ git grep -nH 'manage_start_stop = '
> drivers/ata/libata-scsi.c:1083:		sdev->manage_start_stop = 1;
> drivers/firewire/sbp2.c:1521:		sdev->manage_start_stop = 1;
> drivers/scsi/sd.c:240:	sdp->manage_start_stop = v;
> 
> Would it be possible to share the output of the command below? That 
> should reveal which ATA driver is active on the test setup.
> 
> find /sys -name proc_name | xargs grep -aH .

In my case it's
/sys/devices/pci0000:00/0000:00:17.0/ata1/host0/scsi_host/host0/proc_name:ahci
/sys/devices/pci0000:00/0000:00:17.0/ata2/host1/scsi_host/host1/proc_name:ahci

Some more details from dmesg

[    0.849373] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[    0.852849] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
[    0.854671] ata2.00: supports DRM functions and may not be fully accessible
[    0.856181] ata2.00: ATA-9: SAMSUNG MZ7LN512HMJP-000L7, MAV01L6Q, max UDMA/133
[    0.858115] ata2.00: 1000215216 sectors, multi 1: LBA48 NCQ (depth 32), AA
[    0.861584] ata2.00: Features: Trust Dev-Sleep NCQ-sndrcv
[    0.863749] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
[    0.865481] ata2.00: supports DRM functions and may not be fully accessible
[    0.870043] ata2.00: configured for UDMA/133
[    0.871871] scsi 1:0:0:0: Direct-Access     ATA      SAMSUNG MZ7LN512 1L6Q PQ: 0 ANSI: 5

Please Cc me on further questions/steps to try/patches to test.

#regzbot introduced: 88f1669019bd62b3
#regzbot monitor: https://lore.kernel.org/all/20220816172638.538734-1-bvanassche@acm.org/

> Thanks,
> 
> Bart.



* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support
  2022-08-17 19:07                                 ` Vlastimil Babka
@ 2022-08-17 19:28                                   ` Bart Van Assche
  2022-08-28 11:52                                   ` [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support #forregzbot Thorsten Leemhuis
  1 sibling, 0 replies; 10+ messages in thread
From: Bart Van Assche @ 2022-08-17 19:28 UTC (permalink / raw)
  To: Vlastimil Babka, Geert Uytterhoeven
  Cc: Martin K . Petersen, Jaegeuk Kim, scsi, Ming Lei, Hannes Reinecke,
	John Garry, ericspero, jason600.groome, Linux-Renesas,
	Linux Kernel Mailing List, linux-ide, James Bottomley,
	regressions

On 8/17/22 12:07, Vlastimil Babka wrote:
> In my case it's
> /sys/devices/pci0000:00/0000:00:17.0/ata1/host0/scsi_host/host0/proc_name:ahci
> /sys/devices/pci0000:00/0000:00:17.0/ata2/host1/scsi_host/host1/proc_name:ahci
> 
> Some more details from dmesg
> 
> [    0.849373] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
> [    0.852849] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
> [    0.854671] ata2.00: supports DRM functions and may not be fully accessible
> [    0.856181] ata2.00: ATA-9: SAMSUNG MZ7LN512HMJP-000L7, MAV01L6Q, max UDMA/133
> [    0.858115] ata2.00: 1000215216 sectors, multi 1: LBA48 NCQ (depth 32), AA
> [    0.861584] ata2.00: Features: Trust Dev-Sleep NCQ-sndrcv
> [    0.863749] ata2.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out
> [    0.865481] ata2.00: supports DRM functions and may not be fully accessible
> [    0.870043] ata2.00: configured for UDMA/133
> [    0.871871] scsi 1:0:0:0: Direct-Access     ATA      SAMSUNG MZ7LN512 1L6Q PQ: 0 ANSI: 5
> 
> Please Cc me on further questions/steps to try/patches to test.

Hi Vlastimil,

Thank you for having provided the above information. The root cause of 
the hang is not yet clear to me. I was wondering whether the hang 
perhaps would be triggered by controllers that only support queue depth 
1. However, in the above output I see "depth 32".

As already reported in this email thread a revert for commit 
88f1669019bd62b3 has been posted on the linux-scsi mailing list. 
Additionally, Greg KH has been asked to drop that patch from the stable 
trees.

Thanks,

Bart.


* Re: [PATCH v2 2/2] scsi: sd: Rework asynchronous resume support #forregzbot
  2022-08-17 19:07                                 ` Vlastimil Babka
  2022-08-17 19:28                                   ` Bart Van Assche
@ 2022-08-28 11:52                                   ` Thorsten Leemhuis
  1 sibling, 0 replies; 10+ messages in thread
From: Thorsten Leemhuis @ 2022-08-28 11:52 UTC (permalink / raw)
  To: regressions; +Cc: scsi, Linux-Renesas, Linux Kernel Mailing List, linux-ide

TWIMC: this mail is primarily sent for documentation purposes and for
regzbot, my Linux kernel regression tracking bot. These mails usually
contain '#forregzbot' in the subject, to make them easy to spot and filter.

On 17.08.22 21:07, Vlastimil Babka wrote:
> Hi, I have a T460 hanging on resume from suspend to ram in 6.0-rc1 that
> I bisected to this commit.
> 
>> Unfortunately the above does not teach us anything new. The code
>> modified by commit 88f1669019bd ("scsi: sd: Rework asynchronous resume
>> support") is only called if sdev->manage_start_stop != 0. Only the SATA
>> code, the Firewire code and the manage_start_stop sysfs attribute store
>> method set that member variable:
> [...]
> #regzbot introduced: 88f1669019bd62b3
> #regzbot monitor: https://lore.kernel.org/all/20220816172638.538734-1-bvanassche@acm.org/

#regzbot fixed-by: 785538bfdd682c8e962341d585f9b88262a0475

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)

P.S.: As the Linux kernel's regression tracker I deal with a lot of
reports and sometimes miss something important when writing mails like
this. If that's the case here, don't hesitate to tell me in a public
reply, it's in everyone's interest to set the public record straight.

