* Re: [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
@ 2023-09-15 20:42 ` Bart Van Assche
2023-09-15 20:55 ` Laurence Oberman
2023-09-15 22:01 ` Niklas Cassel
2023-09-15 20:42 ` [Bug 217914] " bugzilla-daemon
` (11 subsequent siblings)
12 siblings, 2 replies; 19+ messages in thread
From: Bart Van Assche @ 2023-09-15 20:42 UTC (permalink / raw)
To: bugzilla-daemon, linux-scsi, Damien Le Moal
On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
> The users loqs and leonshaw helped to narrow it down to this commit:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
Damien, can you please take a look?
Thanks,
Bart.
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 20:42 ` Bart Van Assche
@ 2023-09-15 20:55 ` Laurence Oberman
2023-09-15 22:01 ` Niklas Cassel
1 sibling, 0 replies; 19+ messages in thread
From: Laurence Oberman @ 2023-09-15 20:55 UTC (permalink / raw)
To: Bart Van Assche, bugzilla-daemon, linux-scsi, Damien Le Moal
On Fri, 2023-09-15 at 13:42 -0700, Bart Van Assche wrote:
> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
> > The users loqs and leonshaw helped to narrow it down to this
> > commit:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>
> Damien, can you please take a look?
>
> Thanks,
>
> Bart.
>
I had a quick look at this and its not making sense. The only calls I
see are in the scan when the devices is added and in the re-scan.
It should not be consuming the scsi_eh thread unless some type of udev
events keeps happening.
Would be good to get some
cat /proc/<PID>/stack of the scsi_eh threads if they are constantly
consuming CPUY
I will try reproduce and try figure out what is going on here.
Thanks
Laurence
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 20:42 ` Bart Van Assche
2023-09-15 20:55 ` Laurence Oberman
@ 2023-09-15 22:01 ` Niklas Cassel
2023-09-15 22:10 ` Damien Le Moal
2023-09-15 22:46 ` Damien Le Moal
1 sibling, 2 replies; 19+ messages in thread
From: Niklas Cassel @ 2023-09-15 22:01 UTC (permalink / raw)
To: Bart Van Assche
Cc: bugzilla-daemon@kernel.org, linux-scsi@vger.kernel.org,
Damien Le Moal
On Fri, Sep 15, 2023 at 01:42:18PM -0700, Bart Van Assche wrote:
> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
> > The users loqs and leonshaw helped to narrow it down to this commit:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>
> Damien, can you please take a look?
>
Hello Bart,
It seems like:
https://lore.kernel.org/linux-scsi/20230915022034.678121-1-dlemoal@kernel.org/
Solves the problem.
From a quick look at the logs with extra log leves enabled:
https://pastebin.com/f2LQ8kQD
it appears that the MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
command with a non-zero service action issued by scsi_cdl_check() fails,
and will be added to SCSI EH over and over.
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 22:01 ` Niklas Cassel
@ 2023-09-15 22:10 ` Damien Le Moal
2023-09-15 22:46 ` Damien Le Moal
1 sibling, 0 replies; 19+ messages in thread
From: Damien Le Moal @ 2023-09-15 22:10 UTC (permalink / raw)
To: Niklas Cassel, Bart Van Assche
Cc: bugzilla-daemon@kernel.org, linux-scsi@vger.kernel.org
On 9/16/23 07:01, Niklas Cassel wrote:
> On Fri, Sep 15, 2023 at 01:42:18PM -0700, Bart Van Assche wrote:
>> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
>>> The users loqs and leonshaw helped to narrow it down to this commit:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>>
>> Damien, can you please take a look?
>>
>
> Hello Bart,
>
> It seems like:
> https://lore.kernel.org/linux-scsi/20230915022034.678121-1-dlemoal@kernel.org/
>
> Solves the problem.
>
> From a quick look at the logs with extra log leves enabled:
> https://pastebin.com/f2LQ8kQD
> it appears that the MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
> command with a non-zero service action issued by scsi_cdl_check() fails,
> and will be added to SCSI EH over and over.
The failure is due to the drive not liking this command. My patch avoids sending
that command, thus solves the issue with drives that choke on it. However, the
constant retry sound to me like a different bug... We should not retry that
command at all I think. Or maybe limit it to 3 retries.
>
>
> Kind regards,
> Niklas
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 22:01 ` Niklas Cassel
2023-09-15 22:10 ` Damien Le Moal
@ 2023-09-15 22:46 ` Damien Le Moal
1 sibling, 0 replies; 19+ messages in thread
From: Damien Le Moal @ 2023-09-15 22:46 UTC (permalink / raw)
To: Niklas Cassel, Bart Van Assche
Cc: bugzilla-daemon@kernel.org, linux-scsi@vger.kernel.org
On 9/16/23 07:01, Niklas Cassel wrote:
> On Fri, Sep 15, 2023 at 01:42:18PM -0700, Bart Van Assche wrote:
>> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
>>> The users loqs and leonshaw helped to narrow it down to this commit:
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>>
>> Damien, can you please take a look?
>>
>
> Hello Bart,
>
> It seems like:
> https://lore.kernel.org/linux-scsi/20230915022034.678121-1-dlemoal@kernel.org/
>
> Solves the problem.
>
> From a quick look at the logs with extra log leves enabled:
> https://pastebin.com/f2LQ8kQD
> it appears that the MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
> command with a non-zero service action issued by scsi_cdl_check() fails,
> and will be added to SCSI EH over and over.
Looks like the vmware emulated scsi cdrom (sr) does not like this command...
While SPC would allow cdroms to support CDL, I do not think we will ever see
that. So we could restrict CDL probe to block devices only. That still does not
explain why the constant retry. The MAINTENANCE_IN /
MI_REPORT_SUPPORTED_OPERATION_CODES failing is expected in most cases so it
should silently move on with cdl probe returning false. My patch is still needed
as some drives seem to hang on that command.
>
>
> Kind regards,
> Niklas
--
Damien Le Moal
Western Digital Research
^ permalink raw reply [flat|nested] 19+ messages in thread
* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
2023-09-15 20:42 ` Bart Van Assche
@ 2023-09-15 20:42 ` bugzilla-daemon
2023-09-15 20:55 ` bugzilla-daemon
` (10 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-09-15 20:42 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #1 from Bart Van Assche (bvanassche@acm.org) ---
On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
> The users loqs and leonshaw helped to narrow it down to this commit:
>
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
Damien, can you please take a look?
Thanks,
Bart.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
2023-09-15 20:42 ` Bart Van Assche
2023-09-15 20:42 ` [Bug 217914] " bugzilla-daemon
@ 2023-09-15 20:55 ` bugzilla-daemon
2023-09-15 21:06 ` bugzilla-daemon
` (9 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-09-15 20:55 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #2 from loberman@redhat.com ---
On Fri, 2023-09-15 at 13:42 -0700, Bart Van Assche wrote:
> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
> > The users loqs and leonshaw helped to narrow it down to this
> > commit:
> >
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>
> Damien, can you please take a look?
>
> Thanks,
>
> Bart.
>
I had a quick look at this and its not making sense. The only calls I
see are in the scan when the devices is added and in the re-scan.
It should not be consuming the scsi_eh thread unless some type of udev
events keeps happening.
Would be good to get some
cat /proc/<PID>/stack of the scsi_eh threads if they are constantly
consuming CPUY
I will try reproduce and try figure out what is going on here.
Thanks
Laurence
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (2 preceding siblings ...)
2023-09-15 20:55 ` bugzilla-daemon
@ 2023-09-15 21:06 ` bugzilla-daemon
2023-09-15 22:01 ` bugzilla-daemon
` (8 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-09-15 21:06 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
loberman@redhat.com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |loberman@redhat.com
--- Comment #3 from loberman@redhat.com ---
Not reproducible generically for me
[root@penguin8 ~]# uname -a
Linux penguin8 6.5.0+ #2 SMP PREEMPT_DYNAMIC
[root@penguin8 ~]# lsscsi
[0:0:0:0] disk ATA Samsung SSD 850 3B6Q /dev/sdb
[1:0:0:0] disk ATA Samsung SSD 850 3B6Q /dev/sda
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1649 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_0]
root 1651 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_1]
root 1653 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_2]
root 1655 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_3]
root 1668 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_4]
root 1670 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_5]
root 1672 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_6]
root 1674 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_7]
root 1866 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_8]
root 1887 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_9]
root 1649 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_0]
root 1651 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_1]
root 1653 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_2]
root 1655 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_3]
root 1668 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_4]
root 1670 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_5]
root 1672 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_6]
root 1674 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_7]
root 1866 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_8]
root 1887 0.0 0.0 0 0 ? S 16:58 0:00 [scsi_eh_9]
I Have no CDROm so I think its the virtual cdrom.
In VMware the CDROM will continuously get probed and log errors due to no media
and every time that happens it will call the cdl stuff.
I will bring up a Virtual guest now, will take time as I will have to build
upstream kernels.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (3 preceding siblings ...)
2023-09-15 21:06 ` bugzilla-daemon
@ 2023-09-15 22:01 ` bugzilla-daemon
2023-09-15 22:10 ` bugzilla-daemon
` (7 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-09-15 22:01 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #4 from Niklas.Cassel@wdc.com ---
On Fri, Sep 15, 2023 at 01:42:18PM -0700, Bart Van Assche wrote:
> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
> > The users loqs and leonshaw helped to narrow it down to this commit:
> >
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>
> Damien, can you please take a look?
>
Hello Bart,
It seems like:
https://lore.kernel.org/linux-scsi/20230915022034.678121-1-dlemoal@kernel.org/
Solves the problem.
From a quick look at the logs with extra log leves enabled:
https://pastebin.com/f2LQ8kQD
it appears that the MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
command with a non-zero service action issued by scsi_cdl_check() fails,
and will be added to SCSI EH over and over.
Kind regards,
Niklas
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (4 preceding siblings ...)
2023-09-15 22:01 ` bugzilla-daemon
@ 2023-09-15 22:10 ` bugzilla-daemon
2023-09-15 22:46 ` bugzilla-daemon
` (6 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-09-15 22:10 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #5 from dlemoal@kernel.org ---
On 9/16/23 07:01, Niklas Cassel wrote:
> On Fri, Sep 15, 2023 at 01:42:18PM -0700, Bart Van Assche wrote:
>> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
>>> The users loqs and leonshaw helped to narrow it down to this commit:
>>>
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>>
>> Damien, can you please take a look?
>>
>
> Hello Bart,
>
> It seems like:
>
> https://lore.kernel.org/linux-scsi/20230915022034.678121-1-dlemoal@kernel.org/
>
> Solves the problem.
>
> From a quick look at the logs with extra log leves enabled:
> https://pastebin.com/f2LQ8kQD
> it appears that the MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
> command with a non-zero service action issued by scsi_cdl_check() fails,
> and will be added to SCSI EH over and over.
The failure is due to the drive not liking this command. My patch avoids
sending
that command, thus solves the issue with drives that choke on it. However, the
constant retry sound to me like a different bug... We should not retry that
command at all I think. Or maybe limit it to 3 retries.
>
>
> Kind regards,
> Niklas
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (5 preceding siblings ...)
2023-09-15 22:10 ` bugzilla-daemon
@ 2023-09-15 22:46 ` bugzilla-daemon
2023-11-13 3:30 ` bugzilla-daemon
` (5 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-09-15 22:46 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #6 from dlemoal@kernel.org ---
On 9/16/23 07:01, Niklas Cassel wrote:
> On Fri, Sep 15, 2023 at 01:42:18PM -0700, Bart Van Assche wrote:
>> On 9/15/23 12:33, bugzilla-daemon@kernel.org wrote:
>>> The users loqs and leonshaw helped to narrow it down to this commit:
>>>
>>>
>>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=624885209f31eb9985bf51abe204ecbffe2fdeea
>>
>> Damien, can you please take a look?
>>
>
> Hello Bart,
>
> It seems like:
>
> https://lore.kernel.org/linux-scsi/20230915022034.678121-1-dlemoal@kernel.org/
>
> Solves the problem.
>
> From a quick look at the logs with extra log leves enabled:
> https://pastebin.com/f2LQ8kQD
> it appears that the MAINTENANCE_IN / MI_REPORT_SUPPORTED_OPERATION_CODES
> command with a non-zero service action issued by scsi_cdl_check() fails,
> and will be added to SCSI EH over and over.
Looks like the vmware emulated scsi cdrom (sr) does not like this command...
While SPC would allow cdroms to support CDL, I do not think we will ever see
that. So we could restrict CDL probe to block devices only. That still does not
explain why the constant retry. The MAINTENANCE_IN /
MI_REPORT_SUPPORTED_OPERATION_CODES failing is expected in most cases so it
should silently move on with cdl probe returning false. My patch is still
needed
as some drives seem to hang on that command.
>
>
> Kind regards,
> Niklas
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (6 preceding siblings ...)
2023-09-15 22:46 ` bugzilla-daemon
@ 2023-11-13 3:30 ` bugzilla-daemon
2023-11-13 11:53 ` Niklas Cassel
2023-11-13 11:53 ` bugzilla-daemon
` (4 subsequent siblings)
12 siblings, 1 reply; 19+ messages in thread
From: bugzilla-daemon @ 2023-11-13 3:30 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
Christian Kujau (kernel@nerdbynature.de) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |kernel@nerdbynature.de
--- Comment #7 from Christian Kujau (kernel@nerdbynature.de) ---
Noticed the same here when upgrading from 6.1.0-13-amd64 to
6.5.0-0.deb12.1-amd64 (both Debian kernels) earlier this month:
============================================================================
$ sar -f /var/log/sysstat/sa20231108
CPU %user %nice %system %iowait %steal %idle
[...]
18:30:03 all 1.03 0.00 0.45 0.04 0.00 98.47
18:40:01 all 1.07 0.00 0.52 0.05 0.00 98.36
18:50:01 all 1.07 0.00 0.53 0.04 0.00 98.37
19:00:01 all 1.35 0.00 0.69 0.08 0.00 97.88
19:10:04 all 1.09 0.00 0.52 0.07 0.00 98.31
19:20:02 all 1.14 0.00 0.51 0.05 0.00 98.30
19:30:04 all 1.62 0.00 0.65 0.08 0.00 97.65
Average: all 1.06 0.00 0.50 0.06 0.00 98.38
19:32:27 LINUX RESTART (2 CPU)
19:40:03 CPU %user %nice %system %iowait %steal %idle
19:50:00 all 2.27 0.00 3.23 57.40 0.00 37.11
20:00:02 all 1.29 0.00 2.70 59.27 0.00 36.75
20:10:03 all 1.48 0.00 2.93 58.38 0.00 37.21
20:20:03 all 1.40 0.00 2.94 58.93 0.00 36.73
20:30:02 all 1.39 0.00 2.87 59.99 0.00 35.74
20:40:03 all 1.48 0.00 3.44 59.83 0.00 35.26
20:50:00 all 1.29 0.00 2.88 60.84 0.00 34.98
21:00:03 all 1.31 0.00 2.63 59.81 0.00 36.25
21:10:03 all 1.33 0.00 2.72 59.85 0.00 36.09
21:20:01 all 1.31 0.00 2.82 59.28 0.00 36.59
21:30:01 all 1.39 0.00 2.92 60.51 0.00 35.18
21:40:01 all 1.34 0.00 3.04 60.04 0.00 35.57
21:50:03 all 1.29 0.00 2.51 59.79 0.00 36.41
22:00:03 all 1.36 0.00 3.23 59.81 0.00 35.59
22:10:03 all 1.37 0.00 2.56 59.13 0.00 36.93
22:20:03 all 1.36 0.00 2.88 58.46 0.00 37.29
22:30:03 all 1.31 0.00 2.65 59.07 0.00 36.97
22:40:00 all 1.32 0.00 2.72 59.61 0.00 36.35
22:50:01 all 1.32 0.00 2.72 59.35 0.00 36.61
23:00:03 all 1.29 0.00 2.68 59.30 0.00 36.72
23:10:03 all 1.35 0.00 2.62 60.11 0.00 35.91
23:20:02 all 1.29 0.00 2.91 59.55 0.00 36.25
23:30:03 all 1.32 0.00 2.72 58.37 0.00 37.59
23:40:01 all 1.34 0.00 2.97 57.74 0.00 37.95
23:50:00 all 1.33 0.00 2.54 59.90 0.00 36.24
Average: all 1.38 0.00 2.83 59.37 0.00 36.41
$ last -n 3 reboot
reboot system boot 6.5.0-0.deb12.1- Wed Nov 8 19:32 still running
reboot system boot 6.1.0-13-amd64 Mon Oct 9 23:14 - 19:32 (29+21:17)
reboot system boot 6.1.0-13-amd64 Mon Oct 9 22:39 - 23:14 (00:35)
============================================================================
And top only shows a single [scsi_eh_2] thread using ~50% of CPU time,
sometimes there's an events thread too.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
336 root 20 0 0.0m 0.0m 0.0m D 50.0 0.0 53,37
[scsi_eh_2]
3794126 root 20 0 0.0m 0.0m 0.0m I 25.7 0.0 0:06.27
[kworker/0:0-events]
This is a Debian/amd64 VM running on a VMware ESX host. There's a virtual CDROM
drive, but nothing is attached here and I'm not using it, all:
============================================================================
$ lsblk -d --scsi
NAME HCTL TYPE VENDOR MODEL REV SERIAL
TRAN
sda 0:0:0:0 disk VMware Virtual disk 2.0
sr0 2:0:0:0 rom NECVMWar VMware Virtual IDE CDROM Drive 1.00
10000000000000000001 ata
$ dmesg -t | grep -i scsi
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
SCSI subsystem initialized
VMware PVSCSI driver - version 1.0.7.0-k
vmw_pvscsi: using 64bit dma
vmw_pvscsi: max_id: 65
vmw_pvscsi: setting ring_pages to 32
vmw_pvscsi: enabling reqCallThreshold
vmw_pvscsi: driver-based request coalescing enabled
vmw_pvscsi: using MSI-X
scsi host0: VMware PVSCSI storage adapter rev 2, req/cmp/msg rings: 32/32/1
pages, cmd_per_lun=254
vmw_pvscsi 0000:03:00.0: VMware PVSCSI rev 2 host #0
scsi 0:0:0:0: Direct-Access VMware Virtual disk 2.0 PQ: 0 ANSI: 6
sd 0:0:0:0: [sda] Attached SCSI disk
sd 0:0:0:0: Attached scsi generic sg0 type 0
scsi host1: ata_piix
scsi host2: ata_piix
scsi 2:0:0:0: CD-ROM NECVMWar VMware IDE CDR10 1.00 PQ: 0 ANSI: 5
scsi 2:0:0:0: Attached scsi generic sg1 type 5
sr 2:0:0:0: [sr0] scsi3-mmc drive: 1x/1x writer dvd-ram cd/rw xa/form2 cdda
tray
sr 2:0:0:0: Attached scsi CD-ROM sr0
============================================================================
Will that "scsi: Do no try to probe for CDL on old drives" patch land in
mainline or is this still under discussion?
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-11-13 3:30 ` bugzilla-daemon
@ 2023-11-13 11:53 ` Niklas Cassel
0 siblings, 0 replies; 19+ messages in thread
From: Niklas Cassel @ 2023-11-13 11:53 UTC (permalink / raw)
To: bugzilla-daemon@kernel.org; +Cc: linux-scsi@vger.kernel.org
On Mon, Nov 13, 2023 at 03:30:57AM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217914
>
> Christian Kujau (kernel@nerdbynature.de) changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> CC| |kernel@nerdbynature.de
>
> --- Comment #7 from Christian Kujau (kernel@nerdbynature.de) ---
> Noticed the same here when upgrading from 6.1.0-13-amd64 to
> 6.5.0-0.deb12.1-amd64 (both Debian kernels) earlier this month:
>
> ============================================================================
> $ sar -f /var/log/sysstat/sa20231108
> CPU %user %nice %system %iowait %steal %idle
> [...]
> 18:30:03 all 1.03 0.00 0.45 0.04 0.00 98.47
> 18:40:01 all 1.07 0.00 0.52 0.05 0.00 98.36
> 18:50:01 all 1.07 0.00 0.53 0.04 0.00 98.37
> 19:00:01 all 1.35 0.00 0.69 0.08 0.00 97.88
> 19:10:04 all 1.09 0.00 0.52 0.07 0.00 98.31
> 19:20:02 all 1.14 0.00 0.51 0.05 0.00 98.30
> 19:30:04 all 1.62 0.00 0.65 0.08 0.00 97.65
> Average: all 1.06 0.00 0.50 0.06 0.00 98.38
>
> 19:32:27 LINUX RESTART (2 CPU)
>
> 19:40:03 CPU %user %nice %system %iowait %steal %idle
> 19:50:00 all 2.27 0.00 3.23 57.40 0.00 37.11
> 20:00:02 all 1.29 0.00 2.70 59.27 0.00 36.75
> 20:10:03 all 1.48 0.00 2.93 58.38 0.00 37.21
> 20:20:03 all 1.40 0.00 2.94 58.93 0.00 36.73
> 20:30:02 all 1.39 0.00 2.87 59.99 0.00 35.74
> 20:40:03 all 1.48 0.00 3.44 59.83 0.00 35.26
> 20:50:00 all 1.29 0.00 2.88 60.84 0.00 34.98
> 21:00:03 all 1.31 0.00 2.63 59.81 0.00 36.25
> 21:10:03 all 1.33 0.00 2.72 59.85 0.00 36.09
> 21:20:01 all 1.31 0.00 2.82 59.28 0.00 36.59
> 21:30:01 all 1.39 0.00 2.92 60.51 0.00 35.18
> 21:40:01 all 1.34 0.00 3.04 60.04 0.00 35.57
> 21:50:03 all 1.29 0.00 2.51 59.79 0.00 36.41
> 22:00:03 all 1.36 0.00 3.23 59.81 0.00 35.59
> 22:10:03 all 1.37 0.00 2.56 59.13 0.00 36.93
> 22:20:03 all 1.36 0.00 2.88 58.46 0.00 37.29
> 22:30:03 all 1.31 0.00 2.65 59.07 0.00 36.97
> 22:40:00 all 1.32 0.00 2.72 59.61 0.00 36.35
> 22:50:01 all 1.32 0.00 2.72 59.35 0.00 36.61
> 23:00:03 all 1.29 0.00 2.68 59.30 0.00 36.72
> 23:10:03 all 1.35 0.00 2.62 60.11 0.00 35.91
> 23:20:02 all 1.29 0.00 2.91 59.55 0.00 36.25
> 23:30:03 all 1.32 0.00 2.72 58.37 0.00 37.59
> 23:40:01 all 1.34 0.00 2.97 57.74 0.00 37.95
> 23:50:00 all 1.33 0.00 2.54 59.90 0.00 36.24
> Average: all 1.38 0.00 2.83 59.37 0.00 36.41
>
> $ last -n 3 reboot
> reboot system boot 6.5.0-0.deb12.1- Wed Nov 8 19:32 still running
> reboot system boot 6.1.0-13-amd64 Mon Oct 9 23:14 - 19:32 (29+21:17)
> reboot system boot 6.1.0-13-amd64 Mon Oct 9 22:39 - 23:14 (00:35)
> ============================================================================
>
> And top only shows a single [scsi_eh_2] thread using ~50% of CPU time,
> sometimes there's an events thread too.
>
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 336 root 20 0 0.0m 0.0m 0.0m D 50.0 0.0 53,37
> [scsi_eh_2]
> 3794126 root 20 0 0.0m 0.0m 0.0m I 25.7 0.0 0:06.27
> [kworker/0:0-events]
>
> This is a Debian/amd64 VM running on a VMware ESX host. There's a virtual CDROM
> drive, but nothing is attached here and I'm not using it, all:
>
>
> ============================================================================
> $ lsblk -d --scsi
> NAME HCTL TYPE VENDOR MODEL REV SERIAL
> TRAN
> sda 0:0:0:0 disk VMware Virtual disk 2.0
> sr0 2:0:0:0 rom NECVMWar VMware Virtual IDE CDROM Drive 1.00
> 10000000000000000001 ata
>
>
> $ dmesg -t | grep -i scsi
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
> SCSI subsystem initialized
> VMware PVSCSI driver - version 1.0.7.0-k
> vmw_pvscsi: using 64bit dma
> vmw_pvscsi: max_id: 65
> vmw_pvscsi: setting ring_pages to 32
> vmw_pvscsi: enabling reqCallThreshold
> vmw_pvscsi: driver-based request coalescing enabled
> vmw_pvscsi: using MSI-X
> scsi host0: VMware PVSCSI storage adapter rev 2, req/cmp/msg rings: 32/32/1
> pages, cmd_per_lun=254
> vmw_pvscsi 0000:03:00.0: VMware PVSCSI rev 2 host #0
> scsi 0:0:0:0: Direct-Access VMware Virtual disk 2.0 PQ: 0 ANSI: 6
> sd 0:0:0:0: [sda] Attached SCSI disk
> sd 0:0:0:0: Attached scsi generic sg0 type 0
> scsi host1: ata_piix
> scsi host2: ata_piix
> scsi 2:0:0:0: CD-ROM NECVMWar VMware IDE CDR10 1.00 PQ: 0 ANSI: 5
> scsi 2:0:0:0: Attached scsi generic sg1 type 5
> sr 2:0:0:0: [sr0] scsi3-mmc drive: 1x/1x writer dvd-ram cd/rw xa/form2 cdda
> tray
> sr 2:0:0:0: Attached scsi CD-ROM sr0
> ============================================================================
>
> Will that "scsi: Do no try to probe for CDL on old drives" patch land in
> mainline or is this still under discussion?
It has been in mainline for a while:
2132df16f53b ("scsi: core: ata: Do no try to probe for CDL on old drives")
$ git tag --contains 2132df16f53b
v6.6
v6.6-rc4
v6.6-rc5
v6.6-rc6
v6.6-rc7
v6.6.1
v6.7-rc1
Kind regards,
Niklas
^ permalink raw reply [flat|nested] 19+ messages in thread
* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (7 preceding siblings ...)
2023-11-13 3:30 ` bugzilla-daemon
@ 2023-11-13 11:53 ` bugzilla-daemon
2023-11-13 12:02 ` bugzilla-daemon
` (3 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-11-13 11:53 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #8 from Niklas.Cassel@wdc.com ---
On Mon, Nov 13, 2023 at 03:30:57AM +0000, bugzilla-daemon@kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=217914
>
> Christian Kujau (kernel@nerdbynature.de) changed:
>
> What |Removed |Added
> ----------------------------------------------------------------------------
> CC| |kernel@nerdbynature.de
>
> --- Comment #7 from Christian Kujau (kernel@nerdbynature.de) ---
> Noticed the same here when upgrading from 6.1.0-13-amd64 to
> 6.5.0-0.deb12.1-amd64 (both Debian kernels) earlier this month:
>
> ============================================================================
> $ sar -f /var/log/sysstat/sa20231108
> CPU %user %nice %system %iowait %steal
> %idle
> [...]
> 18:30:03 all 1.03 0.00 0.45 0.04 0.00
> 98.47
> 18:40:01 all 1.07 0.00 0.52 0.05 0.00
> 98.36
> 18:50:01 all 1.07 0.00 0.53 0.04 0.00
> 98.37
> 19:00:01 all 1.35 0.00 0.69 0.08 0.00
> 97.88
> 19:10:04 all 1.09 0.00 0.52 0.07 0.00
> 98.31
> 19:20:02 all 1.14 0.00 0.51 0.05 0.00
> 98.30
> 19:30:04 all 1.62 0.00 0.65 0.08 0.00
> 97.65
> Average: all 1.06 0.00 0.50 0.06 0.00
> 98.38
>
> 19:32:27 LINUX RESTART (2 CPU)
>
> 19:40:03 CPU %user %nice %system %iowait %steal
> %idle
> 19:50:00 all 2.27 0.00 3.23 57.40 0.00
> 37.11
> 20:00:02 all 1.29 0.00 2.70 59.27 0.00
> 36.75
> 20:10:03 all 1.48 0.00 2.93 58.38 0.00
> 37.21
> 20:20:03 all 1.40 0.00 2.94 58.93 0.00
> 36.73
> 20:30:02 all 1.39 0.00 2.87 59.99 0.00
> 35.74
> 20:40:03 all 1.48 0.00 3.44 59.83 0.00
> 35.26
> 20:50:00 all 1.29 0.00 2.88 60.84 0.00
> 34.98
> 21:00:03 all 1.31 0.00 2.63 59.81 0.00
> 36.25
> 21:10:03 all 1.33 0.00 2.72 59.85 0.00
> 36.09
> 21:20:01 all 1.31 0.00 2.82 59.28 0.00
> 36.59
> 21:30:01 all 1.39 0.00 2.92 60.51 0.00
> 35.18
> 21:40:01 all 1.34 0.00 3.04 60.04 0.00
> 35.57
> 21:50:03 all 1.29 0.00 2.51 59.79 0.00
> 36.41
> 22:00:03 all 1.36 0.00 3.23 59.81 0.00
> 35.59
> 22:10:03 all 1.37 0.00 2.56 59.13 0.00
> 36.93
> 22:20:03 all 1.36 0.00 2.88 58.46 0.00
> 37.29
> 22:30:03 all 1.31 0.00 2.65 59.07 0.00
> 36.97
> 22:40:00 all 1.32 0.00 2.72 59.61 0.00
> 36.35
> 22:50:01 all 1.32 0.00 2.72 59.35 0.00
> 36.61
> 23:00:03 all 1.29 0.00 2.68 59.30 0.00
> 36.72
> 23:10:03 all 1.35 0.00 2.62 60.11 0.00
> 35.91
> 23:20:02 all 1.29 0.00 2.91 59.55 0.00
> 36.25
> 23:30:03 all 1.32 0.00 2.72 58.37 0.00
> 37.59
> 23:40:01 all 1.34 0.00 2.97 57.74 0.00
> 37.95
> 23:50:00 all 1.33 0.00 2.54 59.90 0.00
> 36.24
> Average: all 1.38 0.00 2.83 59.37 0.00
> 36.41
>
> $ last -n 3 reboot
> reboot system boot 6.5.0-0.deb12.1- Wed Nov 8 19:32 still running
> reboot system boot 6.1.0-13-amd64 Mon Oct 9 23:14 - 19:32 (29+21:17)
> reboot system boot 6.1.0-13-amd64 Mon Oct 9 22:39 - 23:14 (00:35)
> ============================================================================
>
> And top only shows a single [scsi_eh_2] thread using ~50% of CPU time,
> sometimes there's an events thread too.
>
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 336 root 20 0 0.0m 0.0m 0.0m D 50.0 0.0 53,37
> [scsi_eh_2]
> 3794126 root 20 0 0.0m 0.0m 0.0m I 25.7 0.0 0:06.27
> [kworker/0:0-events]
>
> This is a Debian/amd64 VM running on a VMware ESX host. There's a virtual
> CDROM
> drive, but nothing is attached here and I'm not using it, all:
>
>
> ============================================================================
> $ lsblk -d --scsi
> NAME HCTL TYPE VENDOR MODEL REV SERIAL
> TRAN
> sda 0:0:0:0 disk VMware Virtual disk 2.0
> sr0 2:0:0:0 rom NECVMWar VMware Virtual IDE CDROM Drive 1.00
> 10000000000000000001 ata
>
>
> $ dmesg -t | grep -i scsi
> Block layer SCSI generic (bsg) driver version 0.4 loaded (major 247)
> SCSI subsystem initialized
> VMware PVSCSI driver - version 1.0.7.0-k
> vmw_pvscsi: using 64bit dma
> vmw_pvscsi: max_id: 65
> vmw_pvscsi: setting ring_pages to 32
> vmw_pvscsi: enabling reqCallThreshold
> vmw_pvscsi: driver-based request coalescing enabled
> vmw_pvscsi: using MSI-X
> scsi host0: VMware PVSCSI storage adapter rev 2, req/cmp/msg rings: 32/32/1
> pages, cmd_per_lun=254
> vmw_pvscsi 0000:03:00.0: VMware PVSCSI rev 2 host #0
> scsi 0:0:0:0: Direct-Access VMware Virtual disk 2.0 PQ: 0 ANSI: 6
> sd 0:0:0:0: [sda] Attached SCSI disk
> sd 0:0:0:0: Attached scsi generic sg0 type 0
> scsi host1: ata_piix
> scsi host2: ata_piix
> scsi 2:0:0:0: CD-ROM NECVMWar VMware IDE CDR10 1.00 PQ: 0 ANSI: 5
> scsi 2:0:0:0: Attached scsi generic sg1 type 5
> sr 2:0:0:0: [sr0] scsi3-mmc drive: 1x/1x writer dvd-ram cd/rw xa/form2 cdda
> tray
> sr 2:0:0:0: Attached scsi CD-ROM sr0
> ============================================================================
>
> Will that "scsi: Do no try to probe for CDL on old drives" patch land in
> mainline or is this still under discussion?
It has been in mainline for a while:
2132df16f53b ("scsi: core: ata: Do no try to probe for CDL on old drives")
$ git tag --contains 2132df16f53b
v6.6
v6.6-rc4
v6.6-rc5
v6.6-rc6
v6.6-rc7
v6.6.1
v6.7-rc1
Kind regards,
Niklas
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (8 preceding siblings ...)
2023-11-13 11:53 ` bugzilla-daemon
@ 2023-11-13 12:02 ` bugzilla-daemon
2023-11-27 9:50 ` bugzilla-daemon
` (2 subsequent siblings)
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-11-13 12:02 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #9 from Christian Kujau (kernel@nerdbynature.de) ---
Oh, great! I was only searching for the subject line, not the actual changes.
So, I just have to wait for Debian (Backports) to move to v6.6 then :-) Thanks!
This Bugzilla entry can be closed then, I assume.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (9 preceding siblings ...)
2023-11-13 12:02 ` bugzilla-daemon
@ 2023-11-27 9:50 ` bugzilla-daemon
2024-11-21 9:32 ` bugzilla-daemon
2024-11-21 19:30 ` bugzilla-daemon
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2023-11-27 9:50 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
Fabio Pedretti (pedretti.fabio@gmail.com) changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |pedretti.fabio@gmail.com
--- Comment #10 from Fabio Pedretti (pedretti.fabio@gmail.com) ---
"scsi: core: ata: Do no try to probe for CDL on old drives" -> it's also on
6.5.6.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (10 preceding siblings ...)
2023-11-27 9:50 ` bugzilla-daemon
@ 2024-11-21 9:32 ` bugzilla-daemon
2024-11-21 19:30 ` bugzilla-daemon
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2024-11-21 9:32 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
--- Comment #11 from Fabio Pedretti (pedretti.fabio@gmail.com) ---
This should be closed.
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread* [Bug 217914] scsi_eh_1 process high cpu after upgrading to 6.5
2023-09-15 19:33 [Bug 217914] New: scsi_eh_1 process high cpu after upgrading to 6.5 bugzilla-daemon
` (11 preceding siblings ...)
2024-11-21 9:32 ` bugzilla-daemon
@ 2024-11-21 19:30 ` bugzilla-daemon
12 siblings, 0 replies; 19+ messages in thread
From: bugzilla-daemon @ 2024-11-21 19:30 UTC (permalink / raw)
To: linux-scsi
https://bugzilla.kernel.org/show_bug.cgi?id=217914
Christian Zangl (laktak@cdak.net) changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |CODE_FIX
--
You may reply to this email to add a comment.
You are receiving this mail because:
You are the assignee for the bug.
^ permalink raw reply [flat|nested] 19+ messages in thread