* CIFS lockup regression on SMB1 in 6.10
@ 2024-08-15 17:53 matoro
2024-08-15 19:37 ` Steve French
0 siblings, 1 reply; 13+ messages in thread
From: matoro @ 2024-08-15 17:53 UTC (permalink / raw)
To: Linux Cifs; +Cc: Bruno Haible
Hi all, I run a service where user home directories are mounted over SMB1
with unix extensions. After upgrading to kernel 6.10 it was reported to me
that users were observing lockups when performing compilations in their home
directories. I investigated and confirmed this to be the case. It would
cause the build processes to get stuck in I/O. After the lockup triggered,
all further reads/writes to the CIFS-mounted directory would get stuck.
Even the df(1) command would block indefinitely. Shutdown was also prevented
as the directory could no longer be unmounted.
Triggering the issue is a little bit tricky. I used compiling cpython as a
test case. Parallel compilation does not seem to be required to trigger it,
because in some tests the hang would occur during the ./configure phase, but
parallelism does seem to provoke it more easily: the most common point where
the lockup was observed was immediately after "make -j4". However, sometimes it
would take 10+ minutes of ongoing compilation before the lockup would
trigger. I never observed a complete successful compilation on kernel 6.10.
The oldest commit where I was able to confirm the lockup occurs is
v6.10-rc3. The newest commit I was able to confirm good is v6.9.9 in
the stable tree. Unfortunately, between those two tags there seems to be a
wide range of commits where the CIFS functionality is completely broken, and
reads/writes return total nonsense results. For example, any git command
returns "git error: bad signature 0x00000000". So I cannot execute a
compilation on commits in this range in order to test whether they observe
the lockup issue. Therefore I wasn't able to test most of the range, and
wasn't able to complete a traditional bisect. I tried adjusting the
read/write buffers down to 8192 from the defaults, but this did not help. I
also tried toggling several options that might be related, namely
CONFIG_FSCACHE, to no effect. There are no logs emitted to dmesg when the
lockup occurs.
Thanks - please let me know if there is any further information I can
provide. For now I am rolling all hosts back to kernel 6.9.
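Even with dmesg silent, stack traces of the blocked tasks should be
recoverable; a debugging sketch (run as root; assumes the standard cifs
procfs debug interfaces are available, and the pgrep target is just an
example):

```shell
# Dump stack traces of all uninterruptible (D-state) tasks to the kernel log
echo w > /proc/sysrq-trigger

# Inspect the kernel stack of one stuck build process
cat /proc/"$(pgrep -o make)"/stack

# Enable verbose cifs tracing before reproducing, then capture session state
echo 1 > /proc/fs/cifs/cifsFYI
cat /proc/fs/cifs/DebugData
```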
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: CIFS lockup regression on SMB1 in 6.10
2024-08-15 17:53 CIFS lockup regression on SMB1 in 6.10 matoro
@ 2024-08-15 19:37 ` Steve French
2024-08-15 21:22 ` matoro
0 siblings, 1 reply; 13+ messages in thread
From: Steve French @ 2024-08-15 19:37 UTC (permalink / raw)
To: matoro; +Cc: Linux Cifs, Bruno Haible
Do you have any data on whether this still fails with current Linux
kernel (6.11-rc3 e.g.)?
--
Thanks,
Steve
* Re: CIFS lockup regression on SMB1 in 6.10
2024-08-15 19:37 ` Steve French
@ 2024-08-15 21:22 ` matoro
2024-08-16 3:31 ` Steve French
0 siblings, 1 reply; 13+ messages in thread
From: matoro @ 2024-08-15 21:22 UTC (permalink / raw)
To: Steve French; +Cc: Linux Cifs, Bruno Haible
On 2024-08-15 15:37, Steve French wrote:
> Do you have any data on whether this still fails with current Linux
> kernel (6.11-rc3 e.g.)?
Hi Steve, just tested. Not only is it still there in 6.11-rc3, but it's much
worse: I got an immediate lockup just from ./configure.
Thank you for looking at this.
* Re: CIFS lockup regression on SMB1 in 6.10
2024-08-15 21:22 ` matoro
@ 2024-08-16 3:31 ` Steve French
2024-08-16 3:47 ` matoro
2024-08-20 19:33 ` Kris Karas (Bug Reporting)
0 siblings, 2 replies; 13+ messages in thread
From: Steve French @ 2024-08-16 3:31 UTC (permalink / raw)
To: matoro; +Cc: Linux Cifs, Bruno Haible
What is the simplest repro you have seen - e.g. is there a git tree
with very small source that fails with configure that you could share?
On Thu, Aug 15, 2024 at 4:22 PM matoro
<matoro_mailinglist_kernel@matoro.tk> wrote:
> Hi Steve, just tested. Not only is it still there in 6.11-rc3, but it's much
> worse - I got an immediate lockup just from ./configure
>
> Thank you for looking at this.
--
Thanks,
Steve
* Re: CIFS lockup regression on SMB1 in 6.10
2024-08-16 3:31 ` Steve French
@ 2024-08-16 3:47 ` matoro
2024-08-20 19:33 ` Kris Karas (Bug Reporting)
1 sibling, 0 replies; 13+ messages in thread
From: matoro @ 2024-08-16 3:47 UTC (permalink / raw)
To: Steve French; +Cc: Linux Cifs, Bruno Haible
On 2024-08-15 23:31, Steve French wrote:
> What is the simplest repro you have seen - e.g. is there a git tree
> with very small source that fails with configure that you could share?
I've been using the cpython source to test
(https://github.com/python/cpython), just a plain ./configure and make -j4.
But it seems to affect any substantial build process; I was also able to
trigger it with a coreutils build, really anything that generates I/O load.
Here are my effective mount options:
type cifs
(rw,nosuid,relatime,vers=1.0,cache=strict,username=nobody,uid=30000,forceuid,gid=30000,forcegid,addr=fd05:0000:0000:0000:0000:0000:0000:0001,soft,unix,posixpaths,serverino,mapposix,acl,reparse=nfs,rsize=1048576,wsize=65536,bsize=1048576,retrans=1,echo_interval=60,actimeo=1,closetimeo=1)
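A reconstruction of the setup as a reproducer, with hypothetical server,
share, and mount-point names (only the relevant subset of the effective
options above is spelled out; the rest are defaults):

```shell
# Hypothetical //fileserver/homes share; uid/gid values taken from the options above
mount -t cifs -o vers=1.0,unix,posixpaths,soft,actimeo=1,closetimeo=1,uid=30000,forceuid,gid=30000,forcegid \
    //fileserver/homes /home/user

# Any I/O-heavy build works; cpython is what I used
cd /home/user
git clone https://github.com/python/cpython && cd cpython
./configure && make -j4    # hang typically occurs at or shortly after make -j4
```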
* Re: CIFS lockup regression on SMB1 in 6.10
2024-08-16 3:31 ` Steve French
2024-08-16 3:47 ` matoro
@ 2024-08-20 19:33 ` Kris Karas (Bug Reporting)
[not found] ` <CAH2r5mugVqy=jd_9x1xKYym6id1F2O-QuSX8C0HKbZPHybgCDQ@mail.gmail.com>
1 sibling, 1 reply; 13+ messages in thread
From: Kris Karas (Bug Reporting) @ 2024-08-20 19:33 UTC (permalink / raw)
To: Steve French, matoro; +Cc: Linux Cifs, Bruno Haible
Steve French wrote:
> What is the simplest repro you have seen - e.g. is there a git tree
> with very small source that fails with configure that you could share?
Simplest and easiest way to reproduce is:
1. Put a bunch of photographs on the server
2. rm -rf $HOME/.cache/thumbnails
3. mount -t cifs -o vers=1.0 //Server/Photos /mnt
4. { geeqie | gwenview | digikam | ...} /mnt
Just the process of generating dozens of thumbnail files in parallel
will cause a lockup (for me) in short order.
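The same steps as a single script (share name and viewer choice are
whatever you have on hand):

```shell
#!/bin/sh
rm -rf "$HOME/.cache/thumbnails"            # force all thumbnails to regenerate
mount -t cifs -o vers=1.0 //Server/Photos /mnt
geeqie /mnt &                               # any thumbnailing image viewer works
# the burst of parallel thumbnail generation is what triggers the lockup
```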
I'm new to this thread, just found it because I was curious if anybody
else has reported this, or whether I needed to start a new thread. Glad
it's already being worked on. Don't remember just when this started,
maybe around 6.10.3 or 6.10.4? Can bisect if need be.
Kris
PS I'm not on linux-cifs, so CC me if you want me to see it.
PPS Looking for UNIX Extensions in SMB/CIFS vers=2.0+ that are
supported by Samba, but I'm starting to lose hope.
* Re: CIFS lockup regression on SMB1 in 6.10
[not found] ` <CAH2r5mugVqy=jd_9x1xKYym6id1F2O-QuSX8C0HKbZPHybgCDQ@mail.gmail.com>
@ 2024-08-23 20:51 ` Kris Karas (Bug Reporting)
2024-09-03 4:55 ` matoro
0 siblings, 1 reply; 13+ messages in thread
From: Kris Karas (Bug Reporting) @ 2024-08-23 20:51 UTC (permalink / raw)
To: Steve French, Linux Cifs; +Cc: matoro, Bruno Haible
Steve French wrote:
> On Aug 20 Kris Karas wrote:
>> Don't remember just when this started, maybe around
>> 6.10.3 or 6.10.4? Can bisect if need be.
I neglected to ask if any of the devs on Linux-CIFS know the culprit and
thus what to fix, or whether somebody would like me to bisect? Happy to
do so. Let me know.
> Smb311 Linux extensions work to ksmbd but for those extensions to samba
> there is a server bug with qfsinfo but patch is available for that
Super! I'm glad to hear it. I've been stubbornly stuck using vers=1.0
because I know of no other alternative. I have heard of unofficial
patches to Samba going back at least a couple years, and have been
patiently awaiting official blessing; I'm sadly ignorant of the reasons
for rebuff.
Kris
* Re: CIFS lockup regression on SMB1 in 6.10
2024-08-23 20:51 ` Kris Karas (Bug Reporting)
@ 2024-09-03 4:55 ` matoro
[not found] ` <CAH2r5mtDbD2uNdodE5WsOmzSoswn67eHAqGVjZJPHbX1ipkhSw@mail.gmail.com>
[not found] ` <2322699.1725442054@warthog.procyon.org.uk>
0 siblings, 2 replies; 13+ messages in thread
From: matoro @ 2024-09-03 4:55 UTC (permalink / raw)
To: Kris Karas (Bug Reporting); +Cc: Steve French, Linux Cifs, Bruno Haible
On 2024-08-23 16:51, Kris Karas (Bug Reporting) wrote:
> Steve French wrote:
>> On Aug 20 Kris Karas wrote:
>>> Don't remember just when this started, maybe around
>>> 6.10.3 or 6.10.4? Can bisect if need be.
>
> I neglected to ask if any of the devs on Linux-CIFS know the culprit and
> thus what to fix, or whether somebody would like me to bisect? Happy to do
> so. Let me know.
Kris, a bisect attempt would be immensely helpful. My attempt failed as
there were other unrelated problems in the commit range which caused my test
reproducer (compiling python) to fail, but your reproducer seems much more
reliable (reading images). Could you please take a crack at it and see what
turns up? I think that's probably the only way to get upstream to take up
our case.
* Re: CIFS lockup regression on SMB1 in 6.10
[not found] ` <CAH2r5mtDbD2uNdodE5WsOmzSoswn67eHAqGVjZJPHbX1ipkhSw@mail.gmail.com>
@ 2024-09-05 14:40 ` Kris Karas (Bug Reporting)
0 siblings, 0 replies; 13+ messages in thread
From: Kris Karas (Bug Reporting) @ 2024-09-05 14:40 UTC (permalink / raw)
To: Steve French, matoro; +Cc: David Howells, Linux Cifs, Bruno Haible
Sorry it's taken me a few days to get back to this; vacation weekend delays.
The bisect was not as helpful as I would have thought, due to being
stuck with too many "git bisect skip". Seems there was some other bug
causing OOPSen in the CIFS code, which happened to overlap the sequence
of commits that were near our bug. The bisect results, for what they're
worth, are:
There are only 'skip'ped commits left to test.
The first bad commit could be any of:
1a5b4edd97cee40922ca8bfb91008338d3a1de60
dc5939de82f149633d6ec1c403003538442ec9ef
3758c485f6c9124d8ad76b88382004cbc28a0892
56257334e8e0075515aedc44044a5585dcf7f465
ab58fbdeebc7f9fe8b9bc202660eae3a10e5e678
edea94a69730b74a8867bbafe742c3fc4e580722
a975a2f22cdce7ec0c678ce8d73d2f6616cb281c
c20c0d7325abd9a8bf985a934591d75d514a3d4d
69c3c023af25edb5433a2db824d3e7cc328f0183
753b67eb630db34e36ec4ae1e86c75e243ea4fc9
3ee1a1fc39819906f04d6c62c180e760cd3a689d
We cannot bisect more!
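For reference, the workflow that produced the output above, with the
endpoints reported earlier in the thread (v6.9 mainline stands in for the
v6.9.9 stable tag):

```shell
git bisect start
git bisect bad  v6.10-rc3     # earliest point the lockup was confirmed
git bisect good v6.9          # last mainline tag believed good
# at each step: build the kernel, boot it, run the thumbnail reproducer, then
git bisect good               # ...or "git bisect bad" if it locked up, or
git bisect skip               # when the unrelated readahead oops prevents testing
```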
The OOPS messages from the (unrelated?) bug were:
refcount_t: underflow; use-after-free.
...
? refcount_warn_saturate+0xd9/0xe0
? report_bug+0x11d/0x160
? handle_bug+0x36/0x70
? exc_invalid_op+0x1f/0x90
? asm_exc_invalid_op+0x16/0x20
? refcount_warn_saturate+0xd9/0xe0
? refcount_warn_saturate+0xd9/0xe0
cifs_readahead_complete+0x2db/0x300 [cifs]
process_one_work+0x13e/0x240
worker_thread+0x31a/0x460
? rescuer_thread+0x480/0x480
kthread+0xc6/0xf0
? kthread_complete_and_exit+0x20/0x20
ret_from_fork+0x44/0x50
? kthread_complete_and_exit+0x20/0x20
ret_from_fork_asm+0x11/0x20
I have not yet tried David Howells' patch. Will give that a whirl next.
Kris
Steve French wrote:
> Let me know if any luck narrowing down the culprit
* Re: CIFS lockup regression on SMB1 in 6.10
[not found] ` <c8027078-bd61-449e-8199-908af20b1f10@moonlit-rail.com>
@ 2024-09-05 15:40 ` Kris Karas (Bug Reporting)
2024-09-14 22:15 ` matoro
0 siblings, 1 reply; 13+ messages in thread
From: Kris Karas (Bug Reporting) @ 2024-09-05 15:40 UTC (permalink / raw)
To: David Howells, Steve French; +Cc: matoro, Bruno Haible, Linux Cifs
Kris Karas wrote:
> David Howells wrote:
>> The attached may help.
>
> Thanks. Gave it a whirl. Alas, FTBFS against 6.10.8:
OK, I tried this against git master, compiles fine there.
Success! The lockup with vers=1.0/unix is gone for me.
Just need a backport for 6.10.x to fix the missing NETFS_SREQ_HIT_EOF
and rdata->actual_len.
Thanks!
Kris
* Re: CIFS lockup regression on SMB1 in 6.10
2024-09-05 15:40 ` Kris Karas (Bug Reporting)
@ 2024-09-14 22:15 ` matoro
2024-09-15 0:51 ` Kris Karas (Bug Reporting)
0 siblings, 1 reply; 13+ messages in thread
From: matoro @ 2024-09-14 22:15 UTC (permalink / raw)
To: Kris Karas (Bug Reporting)
Cc: David Howells, Steve French, Bruno Haible, Linux Cifs
On 2024-09-05 11:40, Kris Karas (Bug Reporting) wrote:
> Kris Karas wrote:
>> David Howells wrote:
>>> The attached may help.
>>
>> Thanks. Gave it a whirl. Alas, FTBFS against 6.10.8:
>
> OK, I tried this against git master, compiles fine there.
> Success! The lockup with vers=1.0/unix is gone for me.
>
> Just need a backport for 6.10.x to fix the missing NETFS_SREQ_HIT_EOF and
> rdata->actual_len.
>
> Thanks!
> Kris
Hey, I haven't tested this myself but if it fixes the issue for others, is
there any way this can go into tip so that it lands in 6.11?
* Re: CIFS lockup regression on SMB1 in 6.10
2024-09-14 22:15 ` matoro
@ 2024-09-15 0:51 ` Kris Karas (Bug Reporting)
[not found] ` <CAH2r5mtEdn5tWBn3cs6chxxRdWNT1VFjYwYcsWU7sZkAqsW8rw@mail.gmail.com>
0 siblings, 1 reply; 13+ messages in thread
From: Kris Karas (Bug Reporting) @ 2024-09-15 0:51 UTC (permalink / raw)
To: matoro; +Cc: David Howells, Steve French, Bruno Haible, Linux Cifs
Matoro wrote:
> Kris Karas wrote:
>> Just need a backport for 6.10.x to fix the missing NETFS_SREQ_HIT_EOF
>> and rdata->actual_len.
>
> Hey, I haven't tested this myself but if it fixes the issue for others,
> is there any way this can go into tip so that it lands in 6.11?
The fix has already landed in 6.10.10. Big thanks to Greg KH, David
Howells, and Steve French for pushing this through the queue.
Given 6.10.10, I assume the fix is upstream already, or should land with
6.11-rc8. And if for some reason not, the patch that David Howells
emailed earlier (Message-ID:
<2322699.1725442054@warthog.procyon.org.uk>) applies cleanly against
6.11-rc should you wish to remediate manually.
Well, let's hope this email makes it to matoro.tk via IPv4, as it was
bouncing emails a while earlier unless I was using a backup IPv6 MTA. :-)
Kris
* Fwd: CIFS lockup regression on SMB1 in 6.10
[not found] ` <CAH2r5mtEdn5tWBn3cs6chxxRdWNT1VFjYwYcsWU7sZkAqsW8rw@mail.gmail.com>
@ 2024-09-15 0:57 ` Steve French
0 siblings, 0 replies; 13+ messages in thread
From: Steve French @ 2024-09-15 0:57 UTC (permalink / raw)
To: CIFS; +Cc: Kris Karas (Bug Reporting), matoro, Bruno Haible
On Sat, Sep 14, 2024 at 7:51 PM Kris Karas (Bug Reporting)
<bugs-a21@moonlit-rail.com> wrote:
> The fix has already landed in 6.10.10. Big thanks to Greg KH, David
> Howells, and Steve French for pushing this through the queue.
>
> Given 6.10.10, I assume the fix is upstream already, or should land with
> 6.11-rc8. And if for some reason not, the patch that David Howells
> emailed earlier (Message-ID:
> <2322699.1725442054@warthog.procyon.org.uk>) applies cleanly against
> 6.11-rc should you wish to remediate manually.
>
The fix went into mainline Linux 11 days ago. See this:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/fs/smb/client?id=a68c74865f517e26728735aba0ae05055eaff76c
Let us know if you see other problems. Thx for the report and testing.
--
Thanks,
Steve
end of thread, other threads:[~2024-09-15 1:09 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-08-15 17:53 CIFS lockup regression on SMB1 in 6.10 matoro
2024-08-15 19:37 ` Steve French
2024-08-15 21:22 ` matoro
2024-08-16 3:31 ` Steve French
2024-08-16 3:47 ` matoro
2024-08-20 19:33 ` Kris Karas (Bug Reporting)
[not found] ` <CAH2r5mugVqy=jd_9x1xKYym6id1F2O-QuSX8C0HKbZPHybgCDQ@mail.gmail.com>
2024-08-23 20:51 ` Kris Karas (Bug Reporting)
2024-09-03 4:55 ` matoro
[not found] ` <CAH2r5mtDbD2uNdodE5WsOmzSoswn67eHAqGVjZJPHbX1ipkhSw@mail.gmail.com>
2024-09-05 14:40 ` Kris Karas (Bug Reporting)
[not found] ` <2322699.1725442054@warthog.procyon.org.uk>
[not found] ` <c8027078-bd61-449e-8199-908af20b1f10@moonlit-rail.com>
2024-09-05 15:40 ` Kris Karas (Bug Reporting)
2024-09-14 22:15 ` matoro
2024-09-15 0:51 ` Kris Karas (Bug Reporting)
[not found] ` <CAH2r5mtEdn5tWBn3cs6chxxRdWNT1VFjYwYcsWU7sZkAqsW8rw@mail.gmail.com>
2024-09-15 0:57 ` Fwd: " Steve French