* [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic
@ 2023-10-03 17:00 Lee Jones
2023-10-03 18:14 ` Greg Kroah-Hartman
0 siblings, 1 reply; 29+ messages in thread
From: Lee Jones @ 2023-10-03 17:00 UTC (permalink / raw)
To: lee
Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby,
Greg Kroah-Hartman, linux-serial, syzbot+5f47a8cea6a12b77a876
The fundamental issue here is that gsmld_write() uses spin_lock_irqsave()
to ensure mutual exclusion. spin_lock_irqsave() disables IRQs,
essentially placing the thread of execution into atomic mode and
therefore must not sleep at any point. Unfortunately, the call chain
eventually ends up in __might_resched() which complains loudly.
The BUG() splat looks like this:
BUG: sleeping function called from invalid context at kernel/printk/printk.c:2627
in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 13029, name: (agetty)
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
3 locks held by (agetty)/13029:
#0: ffff888112fc70a0 (&tty->ldisc_sem){++++}-{0:0}, at: tty_ldisc_ref_wait+0x21/0x70
#1: ffff888112fc7130 (&tty->atomic_write_lock){+.+.}-{3:3}, at: file_tty_write+0x1e4/0x9d0
#2: ffff8881053c73e0 (&gsm->tx_lock){....}-{2:2}, at: gsmld_write+0x5b/0x120
irq event stamp: 2506
hardirqs last enabled at (2505): [<ffffffff8b007b0e>] syscall_enter_from_user_mode+0x2e/0x1a0
hardirqs last disabled at (2506): [<ffffffff8b0ab5cc>] _raw_spin_lock_irqsave+0xac/0x120
softirqs last enabled at (514): [<ffffffff81561c9c>] __irq_exit_rcu+0xec/0x160
softirqs last disabled at (509): [<ffffffff81561c9c>] __irq_exit_rcu+0xec/0x160
Preemption disabled at:
[<0000000000000000>] 0x0
CPU: 1 PID: 13029 Comm: (agetty) Not tainted 6.6.0-rc3-next-20230929-00002-gcdf323998d5b #17
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<TASK>
dump_stack_lvl+0x1e3/0x2d0
? nf_tcp_handle_invalid+0x650/0x650
? panic+0x8a0/0x8a0
__might_resched+0x5ce/0x790
? __might_sleep+0xc0/0xc0
? reacquire_held_locks+0x680/0x680
console_lock+0x1c/0x1b0
do_con_write+0x118/0x7410
? stack_trace_snprint+0xf0/0xf0
? lockdep_unlock+0x163/0x300
? lockdep_unlock+0x163/0x300
? mark_lock+0x2a0/0x350
? __lock_acquire+0x1302/0x2050
? vt_resize+0xc0/0xc0
? do_raw_spin_lock+0x147/0x3a0
? __rwlock_init+0x140/0x140
? _raw_spin_lock_irqsave+0xdd/0x120
? _raw_spin_lock+0x40/0x40
con_write+0x29/0x640
? check_heap_object+0x249/0x870
gsmld_write+0xf9/0x120
? gsmld_read+0x10/0x10
file_tty_write+0x57c/0x9d0
vfs_write+0x77d/0xaf0
? file_end_write+0x250/0x250
? tty_ioctl+0x8fa/0xbc0
? __fdget_pos+0x1d2/0x330
ksys_write+0x19b/0x2c0
? print_irqtrace_events+0x220/0x220
? __ia32_sys_read+0x80/0x80
? syscall_enter_from_user_mode+0x2e/0x1a0
? syscall_enter_from_user_mode+0x2e/0x1a0
do_syscall_64+0x2b/0x70
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f76546101b0
The important part of the call stack being:
gsmld_write() # Takes a lock and disables IRQs
con_write()
console_lock()
__might_sleep()
__might_resched() # Complains that IRQs are disabled
To fix this, let's ensure mutual exclusion by using a protected shared
variable (busy) instead. We'll use the current locking mechanism to
protect it, but ensure that the locks are released and IRQs re-enabled
by the time we transit further down the call chain which may sleep.
Cc: Daniel Starke <daniel.starke@siemens.com>
Cc: Fedor Pchelkin <pchelkin@ispras.ru>
Cc: Jiri Slaby <jirislaby@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-serial@vger.kernel.org
Reported-by: syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com
Signed-off-by: Lee Jones <lee@kernel.org>
---
drivers/tty/n_gsm.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c
index 1f3aba607cd51..b83a97d58381f 100644
--- a/drivers/tty/n_gsm.c
+++ b/drivers/tty/n_gsm.c
@@ -270,6 +270,7 @@ struct gsm_mux {
struct tty_struct *tty; /* The tty our ldisc is bound to */
spinlock_t lock;
struct mutex mutex;
+ bool busy;
unsigned int num;
struct kref ref;
@@ -3253,6 +3254,7 @@ static struct gsm_mux *gsm_alloc_mux(void)
gsm->dead = true; /* Avoid early tty opens */
gsm->wait_config = false; /* Disabled */
gsm->keep_alive = 0; /* Disabled */
+ gsm->busy = false;
/* Store the instance to the mux array or abort if no space is
* available.
@@ -3718,11 +3720,21 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file,
ret = -ENOBUFS;
spin_lock_irqsave(&gsm->tx_lock, flags);
+ if (gsm->busy) {
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+ return -EBUSY;
+ }
+ gsm->busy = true;
+ spin_unlock_irqrestore(&gsm->tx_lock, flags);
+
space = tty_write_room(tty);
if (space >= nr)
ret = tty->ops->write(tty, buf, nr);
else
set_bit(TTY_DO_WRITE_WAKEUP, &tty->flags);
+
+ spin_lock_irqsave(&gsm->tx_lock, flags);
+ gsm->busy = false;
spin_unlock_irqrestore(&gsm->tx_lock, flags);
return ret;
--
2.42.0.582.g8ccd20d70d-goog
^ permalink raw reply related [flat|nested] 29+ messages in thread* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-03 17:00 [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic Lee Jones @ 2023-10-03 18:14 ` Greg Kroah-Hartman 2023-10-03 18:55 ` Lee Jones 2023-10-04 5:59 ` Starke, Daniel 0 siblings, 2 replies; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-03 18:14 UTC (permalink / raw) To: Lee Jones Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > The important part of the call stack being: > > gsmld_write() # Takes a lock and disables IRQs > con_write() > console_lock() Wait, why is the n_gsm line discipline being used for a console? What hardware/protocol wants this to happen? gsm I thought was for a very specific type of device, not a console. As per: https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html this is a specific modem protocol, why is con_write() being called? > __might_sleep() > __might_resched() # Complains that IRQs are disabled > > To fix this, let's ensure mutual exclusion by using a protected shared > variable (busy) instead. We'll use the current locking mechanism to > protect it, but ensure that the locks are released and IRQs re-enabled > by the time we transit further down the call chain which may sleep. > > Cc: Daniel Starke <daniel.starke@siemens.com> > Cc: Fedor Pchelkin <pchelkin@ispras.ru> > Cc: Jiri Slaby <jirislaby@kernel.org> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > Cc: linux-serial@vger.kernel.org > Reported-by: syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com > Signed-off-by: Lee Jones <lee@kernel.org> > --- > drivers/tty/n_gsm.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c > index 1f3aba607cd51..b83a97d58381f 100644 > --- a/drivers/tty/n_gsm.c > +++ b/drivers/tty/n_gsm.c > @@ -270,6 +270,7 @@ struct gsm_mux { > struct tty_struct *tty; /* The tty our ldisc is bound to */ > spinlock_t lock; > struct mutex mutex; > + bool busy; > unsigned int num; > struct kref ref; > > @@ -3253,6 +3254,7 @@ static struct gsm_mux *gsm_alloc_mux(void) > gsm->dead = true; /* Avoid early tty opens */ > gsm->wait_config = false; /* Disabled */ > gsm->keep_alive = 0; /* Disabled */ > + gsm->busy = false; > > /* Store the instance to the mux array or abort if no space is > * available. > @@ -3718,11 +3720,21 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > > ret = -ENOBUFS; > spin_lock_irqsave(&gsm->tx_lock, flags); > + if (gsm->busy) { > + spin_unlock_irqrestore(&gsm->tx_lock, flags); > + return -EBUSY; So you just "busted" the re-entrant call chain here, are you sure this is ok for this protocl? Can it handle -EBUSY? Daniel, any thoughts? And Lee, you really don't have this hardware, right? So why are you dealing with bug reports for it? :) thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-03 18:14 ` Greg Kroah-Hartman @ 2023-10-03 18:55 ` Lee Jones 2023-10-04 6:04 ` Greg Kroah-Hartman 2023-10-04 5:59 ` Starke, Daniel 1 sibling, 1 reply; 29+ messages in thread From: Lee Jones @ 2023-10-03 18:55 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Tue, 03 Oct 2023, Greg Kroah-Hartman wrote: > On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > > The important part of the call stack being: > > > > gsmld_write() # Takes a lock and disables IRQs > > con_write() > > console_lock() > > Wait, why is the n_gsm line discipline being used for a console? > > What hardware/protocol wants this to happen? > > gsm I thought was for a very specific type of device, not a console. > > As per: > https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html > this is a specific modem protocol, why is con_write() being called? What it's meant for and what random users can make it do are likely to be quite separate questions. This scenario is user driven and can be replicated simply by issuing a few syscalls (open, ioctl, write). > > __might_sleep() > > __might_resched() # Complains that IRQs are disabled > > > > To fix this, let's ensure mutual exclusion by using a protected shared > > variable (busy) instead. We'll use the current locking mechanism to > > protect it, but ensure that the locks are released and IRQs re-enabled > > by the time we transit further down the call chain which may sleep. > > > > Cc: Daniel Starke <daniel.starke@siemens.com> > > Cc: Fedor Pchelkin <pchelkin@ispras.ru> > > Cc: Jiri Slaby <jirislaby@kernel.org> > > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> > > Cc: linux-serial@vger.kernel.org > > Reported-by: syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com > > Signed-off-by: Lee Jones <lee@kernel.org> > > --- > > drivers/tty/n_gsm.c | 12 ++++++++++++ > > 1 file changed, 12 insertions(+) > > > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c > > index 1f3aba607cd51..b83a97d58381f 100644 > > --- a/drivers/tty/n_gsm.c > > +++ b/drivers/tty/n_gsm.c > > @@ -270,6 +270,7 @@ struct gsm_mux { > > struct tty_struct *tty; /* The tty our ldisc is bound to */ > > spinlock_t lock; > > struct mutex mutex; > > + bool busy; > > unsigned int num; > > struct kref ref; > > > > @@ -3253,6 +3254,7 @@ static struct gsm_mux *gsm_alloc_mux(void) > > gsm->dead = true; /* Avoid early tty opens */ > > gsm->wait_config = false; /* Disabled */ > > gsm->keep_alive = 0; /* Disabled */ > > + gsm->busy = false; > > > > /* Store the instance to the mux array or abort if no space is > > * available. > > @@ -3718,11 +3720,21 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > > > > ret = -ENOBUFS; > > spin_lock_irqsave(&gsm->tx_lock, flags); > > + if (gsm->busy) { > > + spin_unlock_irqrestore(&gsm->tx_lock, flags); > > + return -EBUSY; > > So you just "busted" the re-entrant call chain here, are you sure this > is ok for this protocl? Can it handle -EBUSY? I should have marked this submission as RFC. Mea culpa. Please consider it as such going forward. Feedback like this is highly valuable. > Daniel, any thoughts? > > And Lee, you really don't have this hardware, right? So why are you > dealing with bug reports for it? :) 'cos Syzkaller. And no, as per the splat above, this was reproduced in qemu. -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-03 18:55 ` Lee Jones @ 2023-10-04 6:04 ` Greg Kroah-Hartman 2023-10-04 9:09 ` Lee Jones 0 siblings, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-04 6:04 UTC (permalink / raw) To: Lee Jones Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Tue, Oct 03, 2023 at 07:55:00PM +0100, Lee Jones wrote: > On Tue, 03 Oct 2023, Greg Kroah-Hartman wrote: > > > On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > > > The important part of the call stack being: > > > > > > gsmld_write() # Takes a lock and disables IRQs > > > con_write() > > > console_lock() > > > > Wait, why is the n_gsm line discipline being used for a console? > > > > What hardware/protocol wants this to happen? > > > > gsm I thought was for a very specific type of device, not a console. > > > > As per: > > https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html > > this is a specific modem protocol, why is con_write() being called? > > What it's meant for and what random users can make it do are likely to > be quite separate questions. This scenario is user driven and can be > replicated simply by issuing a few syscalls (open, ioctl, write). I would recommend that any distro/system that does not want to support this specific hardware protocol, just disable it for now (it's marked as experimental too), if they don't want to deal with the potential sleep-while-atomic issue. > > And Lee, you really don't have this hardware, right? So why are you > > dealing with bug reports for it? :) > > 'cos Syzkaller. Ah, yeah, fake report, no real issue here then :) thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 6:04 ` Greg Kroah-Hartman @ 2023-10-04 9:09 ` Lee Jones 2023-10-04 9:55 ` Greg Kroah-Hartman 2023-10-04 12:19 ` Greg Kroah-Hartman 0 siblings, 2 replies; 29+ messages in thread From: Lee Jones @ 2023-10-04 9:09 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > On Tue, Oct 03, 2023 at 07:55:00PM +0100, Lee Jones wrote: > > On Tue, 03 Oct 2023, Greg Kroah-Hartman wrote: > > > > > On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > > > > The important part of the call stack being: > > > > > > > > gsmld_write() # Takes a lock and disables IRQs > > > > con_write() > > > > console_lock() > > > > > > Wait, why is the n_gsm line discipline being used for a console? > > > > > > What hardware/protocol wants this to happen? > > > > > > gsm I thought was for a very specific type of device, not a console. > > > > > > As per: > > > https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html > > > this is a specific modem protocol, why is con_write() being called? > > > > What it's meant for and what random users can make it do are likely to > > be quite separate questions. This scenario is user driven and can be > > replicated simply by issuing a few syscalls (open, ioctl, write). > > I would recommend that any distro/system that does not want to support > this specific hardware protocol, just disable it for now (it's marked as > experimental too), if they don't want to deal with the potential > sleep-while-atomic issue. n_gsm is available on all the systems I have available. The mention of 'EXPERIMENTAL' in the module description appears to have zero effect on whether distros choose to make it available or not. If you're saying that we know this module is BROKEN however, then perhaps we should mark it as such. > > > And Lee, you really don't have this hardware, right? So why are you > > > dealing with bug reports for it? :) > > > > 'cos Syzkaller. > > Ah, yeah, fake report, no real issue here then :) Ouch! The way I see it, the present issue with Syzkaller is that we do not have the resources to remedy all of the issues it flags. Passing off all reports as 'not real issues' is going to make engineers who decide to work on them feel undervalued and is likely have a detrimental effect overall. As an ambassador for young and new people trying to get into Kernel Engineering in general, is that really the outcome you're after? -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 9:09 ` Lee Jones @ 2023-10-04 9:55 ` Greg Kroah-Hartman 2023-10-04 12:19 ` Greg Kroah-Hartman 1 sibling, 0 replies; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-04 9:55 UTC (permalink / raw) To: Lee Jones Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Wed, Oct 04, 2023 at 10:09:18AM +0100, Lee Jones wrote: > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > On Tue, Oct 03, 2023 at 07:55:00PM +0100, Lee Jones wrote: > > > On Tue, 03 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > > > > > The important part of the call stack being: > > > > > > > > > > gsmld_write() # Takes a lock and disables IRQs > > > > > con_write() > > > > > console_lock() > > > > > > > > Wait, why is the n_gsm line discipline being used for a console? > > > > > > > > What hardware/protocol wants this to happen? > > > > > > > > gsm I thought was for a very specific type of device, not a console. > > > > > > > > As per: > > > > https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html > > > > this is a specific modem protocol, why is con_write() being called? > > > > > > What it's meant for and what random users can make it do are likely to > > > be quite separate questions. This scenario is user driven and can be > > > replicated simply by issuing a few syscalls (open, ioctl, write). > > > > I would recommend that any distro/system that does not want to support > > this specific hardware protocol, just disable it for now (it's marked as > > experimental too), if they don't want to deal with the potential > > sleep-while-atomic issue. > > n_gsm is available on all the systems I have available. Then file a bug with your distro to disable it? No real general purpose distro should enable it from what I can tell. > The mention of > 'EXPERIMENTAL' in the module description appears to have zero effect on > whether distros choose to make it available or not. If you're saying > that we know this module is BROKEN however, then perhaps we should mark > it as such. Or we just prevent it from being bound to a console as that's not something that should be happening. And again, the "worst" that can happen is the calling process locks up, due to a lock inversion, right? > > > > And Lee, you really don't have this hardware, right? So why are you > > > > dealing with bug reports for it? :) > > > > > > 'cos Syzkaller. > > > > Ah, yeah, fake report, no real issue here then :) > > Ouch! The way I see it, the present issue with Syzkaller is that we do > not have the resources to remedy all of the issues it flags. Passing > off all reports as 'not real issues' is going to make engineers who > decide to work on them feel undervalued and is likely have a detrimental > effect overall. As an ambassador for young and new people trying to get > into Kernel Engineering in general, is that really the outcome you're > after? That's not what I'm saying here at all, what I'm saying is "pick issues that are real". syzbot does not always make it obvious what is, and is not, a real issue. There have been long threads and discussions about this and some developers are now just ignoring all syzbot reports (see the filesystem thread on the ksummit discuss list for more details.) For this specific issue, it's been much-reported, and is not trivial, and I would argue, not a "real" problem in the grand scheme of things for normal users to worry about. thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 9:09 ` Lee Jones 2023-10-04 9:55 ` Greg Kroah-Hartman @ 2023-10-04 12:19 ` Greg Kroah-Hartman 2023-10-04 12:57 ` Lee Jones 1 sibling, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-04 12:19 UTC (permalink / raw) To: Lee Jones Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Wed, Oct 04, 2023 at 10:09:18AM +0100, Lee Jones wrote: > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > On Tue, Oct 03, 2023 at 07:55:00PM +0100, Lee Jones wrote: > > > On Tue, 03 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > > > > > The important part of the call stack being: > > > > > > > > > > gsmld_write() # Takes a lock and disables IRQs > > > > > con_write() > > > > > console_lock() > > > > > > > > Wait, why is the n_gsm line discipline being used for a console? > > > > > > > > What hardware/protocol wants this to happen? > > > > > > > > gsm I thought was for a very specific type of device, not a console. > > > > > > > > As per: > > > > https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html > > > > this is a specific modem protocol, why is con_write() being called? > > > > > > What it's meant for and what random users can make it do are likely to > > > be quite separate questions. This scenario is user driven and can be > > > replicated simply by issuing a few syscalls (open, ioctl, write). > > > > I would recommend that any distro/system that does not want to support > > this specific hardware protocol, just disable it for now (it's marked as > > experimental too), if they don't want to deal with the potential > > sleep-while-atomic issue. > > n_gsm is available on all the systems I have available. The mention of > 'EXPERIMENTAL' in the module description appears to have zero effect on > whether distros choose to make it available or not. If you're saying > that we know this module is BROKEN however, then perhaps we should mark > it as such. Also, I think this requires root to set this line discipline to the console, right? A normal user can't do that, or am I missing a code path here? Is there a reproducer somewhere for this issue that runs as a normal user? I couldn't find one in the syzbot listings but I might have been not looking deep enough. thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 12:19 ` Greg Kroah-Hartman @ 2023-10-04 12:57 ` Lee Jones 2023-10-04 13:13 ` Greg Kroah-Hartman 2023-10-05 4:59 ` Jiri Slaby 0 siblings, 2 replies; 29+ messages in thread From: Lee Jones @ 2023-10-04 12:57 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > On Wed, Oct 04, 2023 at 10:09:18AM +0100, Lee Jones wrote: > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > On Tue, Oct 03, 2023 at 07:55:00PM +0100, Lee Jones wrote: > > > > On Tue, 03 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > > > > > > The important part of the call stack being: > > > > > > > > > > > > gsmld_write() # Takes a lock and disables IRQs > > > > > > con_write() > > > > > > console_lock() > > > > > > > > > > Wait, why is the n_gsm line discipline being used for a console? > > > > > > > > > > What hardware/protocol wants this to happen? > > > > > > > > > > gsm I thought was for a very specific type of device, not a console. > > > > > > > > > > As per: > > > > > https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html > > > > > this is a specific modem protocol, why is con_write() being called? > > > > > > > > What it's meant for and what random users can make it do are likely to > > > > be quite separate questions. This scenario is user driven and can be > > > > replicated simply by issuing a few syscalls (open, ioctl, write). > > > > > > I would recommend that any distro/system that does not want to support > > > this specific hardware protocol, just disable it for now (it's marked as > > > experimental too), if they don't want to deal with the potential > > > sleep-while-atomic issue. > > > > n_gsm is available on all the systems I have available. The mention of > > 'EXPERIMENTAL' in the module description appears to have zero effect on > > whether distros choose to make it available or not. If you're saying > > that we know this module is BROKEN however, then perhaps we should mark > > it as such. > > Also, I think this requires root to set this line discipline to the > console, right? A normal user can't do that, or am I missing a code > path here? I haven't been testing long, but yes, early indications show that root is required. > Is there a reproducer somewhere for this issue that runs as a normal > user? I couldn't find one in the syzbot listings but I might have been > not looking deep enough. https://syzkaller.appspot.com/text?tag=ReproC&x=15578d8fa80000 -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 12:57 ` Lee Jones @ 2023-10-04 13:13 ` Greg Kroah-Hartman 2023-10-05 4:59 ` Jiri Slaby 1 sibling, 0 replies; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-04 13:13 UTC (permalink / raw) To: Lee Jones Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, Jiri Slaby, linux-serial, syzbot+5f47a8cea6a12b77a876 On Wed, Oct 04, 2023 at 01:57:58PM +0100, Lee Jones wrote: > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > On Wed, Oct 04, 2023 at 10:09:18AM +0100, Lee Jones wrote: > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > On Tue, Oct 03, 2023 at 07:55:00PM +0100, Lee Jones wrote: > > > > > On Tue, 03 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > > > On Tue, Oct 03, 2023 at 06:00:20PM +0100, Lee Jones wrote: > > > > > > > The important part of the call stack being: > > > > > > > > > > > > > > gsmld_write() # Takes a lock and disables IRQs > > > > > > > con_write() > > > > > > > console_lock() > > > > > > > > > > > > Wait, why is the n_gsm line discipline being used for a console? > > > > > > > > > > > > What hardware/protocol wants this to happen? > > > > > > > > > > > > gsm I thought was for a very specific type of device, not a console. > > > > > > > > > > > > As per: > > > > > > https://www.kernel.org/doc/html/v5.9/driver-api/serial/n_gsm.html > > > > > > this is a specific modem protocol, why is con_write() being called? > > > > > > > > > > What it's meant for and what random users can make it do are likely to > > > > > be quite separate questions. This scenario is user driven and can be > > > > > replicated simply by issuing a few syscalls (open, ioctl, write). > > > > > > > > I would recommend that any distro/system that does not want to support > > > > this specific hardware protocol, just disable it for now (it's marked as > > > > experimental too), if they don't want to deal with the potential > > > > sleep-while-atomic issue. > > > > > > n_gsm is available on all the systems I have available. The mention of > > > 'EXPERIMENTAL' in the module description appears to have zero effect on > > > whether distros choose to make it available or not. If you're saying > > > that we know this module is BROKEN however, then perhaps we should mark > > > it as such. > > > > Also, I think this requires root to set this line discipline to the > > console, right? A normal user can't do that, or am I missing a code > > path here? > > I haven't been testing long, but yes, early indications show that root > is required. Oh good, then this really isn't that high of a priority to get fixed as root can do much worse things :) thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 12:57 ` Lee Jones 2023-10-04 13:13 ` Greg Kroah-Hartman @ 2023-10-05 4:59 ` Jiri Slaby 2023-10-05 6:05 ` Greg Kroah-Hartman 1 sibling, 1 reply; 29+ messages in thread From: Jiri Slaby @ 2023-10-05 4:59 UTC (permalink / raw) To: Lee Jones, Greg Kroah-Hartman Cc: linux-kernel, Daniel Starke, Fedor Pchelkin, linux-serial, syzbot+5f47a8cea6a12b77a876 On 04. 10. 23, 14:57, Lee Jones wrote: >>> n_gsm is available on all the systems I have available. The mention of >>> 'EXPERIMENTAL' in the module description appears to have zero effect on >>> whether distros choose to make it available or not. If you're saying >>> that we know this module is BROKEN however, then perhaps we should mark >>> it as such. >> >> Also, I think this requires root to set this line discipline to the >> console, right? A normal user can't do that, or am I missing a code >> path here? > > I haven't been testing long, but yes, early indications show that root > is required. FWIW I concluded to the same yesterday, so I disputed the connected CVE-2023-31082. Waiting for mitre's ack/nack. thanks, -- js suse labs ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 4:59 ` Jiri Slaby @ 2023-10-05 6:05 ` Greg Kroah-Hartman 2023-10-05 7:26 ` Lee Jones 0 siblings, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-05 6:05 UTC (permalink / raw) To: Jiri Slaby Cc: Lee Jones, linux-kernel, Daniel Starke, Fedor Pchelkin, linux-serial, syzbot+5f47a8cea6a12b77a876 On Thu, Oct 05, 2023 at 06:59:50AM +0200, Jiri Slaby wrote: > On 04. 10. 23, 14:57, Lee Jones wrote: > > > > n_gsm is available on all the systems I have available. The mention of > > > > 'EXPERIMENTAL' in the module description appears to have zero effect on > > > > whether distros choose to make it available or not. If you're saying > > > > that we know this module is BROKEN however, then perhaps we should mark > > > > it as such. > > > > > > Also, I think this requires root to set this line discipline to the > > > console, right? A normal user can't do that, or am I missing a code > > > path here? > > > > I haven't been testing long, but yes, early indications show that root > > is required. > > FWIW I concluded to the same yesterday, so I disputed the connected > CVE-2023-31082. Waiting for mitre's ack/nack. Trying to dispute obviously-wrong CVEs is a never-ending task. Personally, it's fun to see who keeps popping up to attempt to resolve this issue showing what companies have their security policies dictated by a random government authority that can be modified by anyone :) thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 6:05 ` Greg Kroah-Hartman @ 2023-10-05 7:26 ` Lee Jones 0 siblings, 0 replies; 29+ messages in thread From: Lee Jones @ 2023-10-05 7:26 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Jiri Slaby, linux-kernel, Daniel Starke, Fedor Pchelkin, linux-serial, syzbot+5f47a8cea6a12b77a876 On Thu, 05 Oct 2023, Greg Kroah-Hartman wrote: > On Thu, Oct 05, 2023 at 06:59:50AM +0200, Jiri Slaby wrote: > > On 04. 10. 23, 14:57, Lee Jones wrote: > > > > > n_gsm is available on all the systems I have available. The mention of > > > > > 'EXPERIMENTAL' in the module description appears to have zero effect on > > > > > whether distros choose to make it available or not. If you're saying > > > > > that we know this module is BROKEN however, then perhaps we should mark > > > > > it as such. > > > > > > > > Also, I think this requires root to set this line discipline to the > > > > console, right? A normal user can't do that, or am I missing a code > > > > path here? > > > > > > I haven't been testing long, but yes, early indications show that root > > > is required. > > > > FWIW I concluded to the same yesterday, so I disputed the connected > > CVE-2023-31082. Waiting for mitre's ack/nack. > > Trying to dispute obviously-wrong CVEs is a never-ending task. > > Personally, it's fun to see who keeps popping up to attempt to resolve > this issue showing what companies have their security policies dictated > by a random government authority that can be modified by anyone :) It's a struggle! :) -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-03 18:14 ` Greg Kroah-Hartman 2023-10-03 18:55 ` Lee Jones @ 2023-10-04 5:59 ` Starke, Daniel 2023-10-04 6:05 ` Greg Kroah-Hartman 1 sibling, 1 reply; 29+ messages in thread From: Starke, Daniel @ 2023-10-04 5:59 UTC (permalink / raw) To: Greg Kroah-Hartman, Lee Jones Cc: linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com > Daniel, any thoughts? Our application of this protocol is only with specific modems to enable circuit switched operation (handling calls, selecting/querying networks, etc.) while doing packet switched communication (i.e. IP traffic over PPP). The protocol was developed for such use cases. Regarding the issue itself: There was already an attempt to fix all this by switching from spinlocks to mutexes resulting in ~20% performance loss. However, the patch was reverted as it did not handle the T1 timer leading into sleep during atomic within gsm_dlci_t1() on every mutex lock there. There was also a suggestion to fix this in do_con_write() as tty_operations::write() appears to be documented as "not allowed to sleep". The patch for this was rejected. It did not fix the issue within n_gsm. Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ Best regards, Daniel Starke ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 5:59 ` Starke, Daniel @ 2023-10-04 6:05 ` Greg Kroah-Hartman 2023-10-04 8:40 ` Jiri Slaby 2023-10-04 8:57 ` Lee Jones 0 siblings, 2 replies; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-04 6:05 UTC (permalink / raw) To: Starke, Daniel Cc: Lee Jones, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > Daniel, any thoughts? > > Our application of this protocol is only with specific modems to enable > circuit switched operation (handling calls, selecting/querying networks, > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > The protocol was developed for such use cases. > > Regarding the issue itself: > There was already an attempt to fix all this by switching from spinlocks to > mutexes resulting in ~20% performance loss. However, the patch was reverted > as it did not handle the T1 timer leading into sleep during atomic within > gsm_dlci_t1() on every mutex lock there. > There was also a suggestion to fix this in do_con_write() as > tty_operations::write() appears to be documented as "not allowed to sleep". > The patch for this was rejected. It did not fix the issue within n_gsm. > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ Ok, I thought I remembered this, I'll just drop this patch from my review queue and wait for a better solution if it ever comes up as this isn't a real issue that people are seeing on actual systems, but just a syzbot report. thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 6:05 ` Greg Kroah-Hartman @ 2023-10-04 8:40 ` Jiri Slaby 2023-10-04 8:58 ` Starke, Daniel 2023-10-04 8:57 ` Lee Jones 1 sibling, 1 reply; 29+ messages in thread From: Jiri Slaby @ 2023-10-04 8:40 UTC (permalink / raw) To: Greg Kroah-Hartman, Starke, Daniel Cc: Lee Jones, linux-kernel@vger.kernel.org, Fedor Pchelkin, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On 04. 10. 23, 8:05, Greg Kroah-Hartman wrote: > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: >>> Daniel, any thoughts? >> >> Our application of this protocol is only with specific modems to enable >> circuit switched operation (handling calls, selecting/querying networks, >> etc.) while doing packet switched communication (i.e. IP traffic over PPP). >> The protocol was developed for such use cases. >> >> Regarding the issue itself: >> There was already an attempt to fix all this by switching from spinlocks to >> mutexes resulting in ~20% performance loss. However, the patch was reverted >> as it did not handle the T1 timer leading into sleep during atomic within >> gsm_dlci_t1() on every mutex lock there. >> There was also a suggestion to fix this in do_con_write() as >> tty_operations::write() appears to be documented as "not allowed to sleep". >> The patch for this was rejected. It did not fix the issue within n_gsm. >> >> Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ >> Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ >> Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > Ok, I thought I remembered this, I'll just drop this patch from my > review queue and wait for a better solution if it ever comes up as this > isn't a real issue that people are seeing on actual systems, but just a > syzbot report. I remember too and it is not easy. So without actually looking into the code, can we just somehow disallow attaching this line discipline to a console (ban such a TIOCSETD) and be done with this issue? IOW disallow root to shoot their foot. regards, -- js suse labs ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 8:40 ` Jiri Slaby @ 2023-10-04 8:58 ` Starke, Daniel 0 siblings, 0 replies; 29+ messages in thread From: Starke, Daniel @ 2023-10-04 8:58 UTC (permalink / raw) To: Jiri Slaby, Greg Kroah-Hartman Cc: Lee Jones, linux-kernel@vger.kernel.org, Fedor Pchelkin, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com > So without actually looking into the code, can we just somehow disallow > attaching this line discipline to a console (ban such a TIOCSETD) and be > done with this issue? IOW disallow root to shoot their foot. Probably possible by patching gsmld_open() if there is a function which can check this based on a struct tty_struct * handle. That may, however, introduce some cross references that are hard to maintain. Best regards, Daniel Starke ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 6:05 ` Greg Kroah-Hartman 2023-10-04 8:40 ` Jiri Slaby @ 2023-10-04 8:57 ` Lee Jones 2023-10-04 12:21 ` Greg Kroah-Hartman 1 sibling, 1 reply; 29+ messages in thread From: Lee Jones @ 2023-10-04 8:57 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > Daniel, any thoughts? > > > > Our application of this protocol is only with specific modems to enable > > circuit switched operation (handling calls, selecting/querying networks, > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > The protocol was developed for such use cases. > > > > Regarding the issue itself: > > There was already an attempt to fix all this by switching from spinlocks to > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > as it did not handle the T1 timer leading into sleep during atomic within > > gsm_dlci_t1() on every mutex lock there. That's correct. When I initially saw this report, my initial thought was to replace the spinlocks with mutexts, but having read the previous accepted attempt and it's subsequent reversion I started to think of other ways to solve this issue. This solution, unlike the last, does not involve adding sleep inducing locks into atomic contexts, nor should it negatively affect performance. > > There was also a suggestion to fix this in do_con_write() as > > tty_operations::write() appears to be documented as "not allowed to sleep". > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > Ok, I thought I remembered this, I'll just drop this patch from my > review queue and wait for a better solution if it ever comes up as this > isn't a real issue that people are seeing on actual systems, but just a > syzbot report. What does the "better solution" look like? -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 8:57 ` Lee Jones @ 2023-10-04 12:21 ` Greg Kroah-Hartman 2023-10-04 12:57 ` Lee Jones 0 siblings, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-04 12:21 UTC (permalink / raw) To: Lee Jones Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Wed, Oct 04, 2023 at 09:57:20AM +0100, Lee Jones wrote: > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > > Daniel, any thoughts? > > > > > > Our application of this protocol is only with specific modems to enable > > > circuit switched operation (handling calls, selecting/querying networks, > > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > > The protocol was developed for such use cases. > > > > > > Regarding the issue itself: > > > There was already an attempt to fix all this by switching from spinlocks to > > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > > as it did not handle the T1 timer leading into sleep during atomic within > > > gsm_dlci_t1() on every mutex lock there. > > That's correct. When I initially saw this report, my initial thought > was to replace the spinlocks with mutexts, but having read the previous > accepted attempt and it's subsequent reversion I started to think of > other ways to solve this issue. This solution, unlike the last, does > not involve adding sleep inducing locks into atomic contexts, nor > should it negatively affect performance. > > > > There was also a suggestion to fix this in do_con_write() as > > > tty_operations::write() appears to be documented as "not allowed to sleep". > > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > > > Ok, I thought I remembered this, I'll just drop this patch from my > > review queue and wait for a better solution if it ever comes up as this > > isn't a real issue that people are seeing on actual systems, but just a > > syzbot report. > > What does the "better solution" look like? One that actually fixes the root problem here (i.e. does not break the recursion loop, or cause a performance decrease for normal users, or prevent this from being bound to the console). thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 12:21 ` Greg Kroah-Hartman @ 2023-10-04 12:57 ` Lee Jones 2023-10-04 13:14 ` Greg Kroah-Hartman 0 siblings, 1 reply; 29+ messages in thread From: Lee Jones @ 2023-10-04 12:57 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > On Wed, Oct 04, 2023 at 09:57:20AM +0100, Lee Jones wrote: > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > > > Daniel, any thoughts? > > > > > > > > Our application of this protocol is only with specific modems to enable > > > > circuit switched operation (handling calls, selecting/querying networks, > > > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > > > The protocol was developed for such use cases. > > > > > > > > Regarding the issue itself: > > > > There was already an attempt to fix all this by switching from spinlocks to > > > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > > > as it did not handle the T1 timer leading into sleep during atomic within > > > > gsm_dlci_t1() on every mutex lock there. > > > > That's correct. When I initially saw this report, my initial thought > > was to replace the spinlocks with mutexts, but having read the previous > > accepted attempt and it's subsequent reversion I started to think of > > other ways to solve this issue. This solution, unlike the last, does > > not involve adding sleep inducing locks into atomic contexts, nor > > should it negatively affect performance. > > > > > > There was also a suggestion to fix this in do_con_write() as > > > > tty_operations::write() appears to be documented as "not allowed to sleep". > > > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > > > > > Ok, I thought I remembered this, I'll just drop this patch from my > > > review queue and wait for a better solution if it ever comes up as this > > > isn't a real issue that people are seeing on actual systems, but just a > > > syzbot report. > > > > What does the "better solution" look like? > > One that actually fixes the root problem here (i.e. does not break the > recursion loop, or cause a performance decrease for normal users, or > prevent this from being bound to the console). Does this solution break the recursion loop or affect performance? The last suggestion was recently made (after mine was posted). -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 12:57 ` Lee Jones @ 2023-10-04 13:14 ` Greg Kroah-Hartman 2023-10-05 9:03 ` Lee Jones 0 siblings, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-04 13:14 UTC (permalink / raw) To: Lee Jones Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Wed, Oct 04, 2023 at 01:57:04PM +0100, Lee Jones wrote: > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > On Wed, Oct 04, 2023 at 09:57:20AM +0100, Lee Jones wrote: > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > > > > Daniel, any thoughts? > > > > > > > > > > Our application of this protocol is only with specific modems to enable > > > > > circuit switched operation (handling calls, selecting/querying networks, > > > > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > > > > The protocol was developed for such use cases. > > > > > > > > > > Regarding the issue itself: > > > > > There was already an attempt to fix all this by switching from spinlocks to > > > > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > > > > as it did not handle the T1 timer leading into sleep during atomic within > > > > > gsm_dlci_t1() on every mutex lock there. > > > > > > That's correct. When I initially saw this report, my initial thought > > > was to replace the spinlocks with mutexts, but having read the previous > > > accepted attempt and it's subsequent reversion I started to think of > > > other ways to solve this issue. This solution, unlike the last, does > > > not involve adding sleep inducing locks into atomic contexts, nor > > > should it negatively affect performance. > > > > > > > > There was also a suggestion to fix this in do_con_write() as > > > > > tty_operations::write() appears to be documented as "not allowed to sleep". > > > > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > > > > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > > > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > > > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > > > > > > > Ok, I thought I remembered this, I'll just drop this patch from my > > > > review queue and wait for a better solution if it ever comes up as this > > > > isn't a real issue that people are seeing on actual systems, but just a > > > > syzbot report. > > > > > > What does the "better solution" look like? > > > > One that actually fixes the root problem here (i.e. does not break the > > recursion loop, or cause a performance decrease for normal users, or > > prevent this from being bound to the console). > > Does this solution break the recursion loop or affect performance? This solution broke the recursion by returning an error, right? The performance one was by using mutexes as in previous attempts. thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-04 13:14 ` Greg Kroah-Hartman @ 2023-10-05 9:03 ` Lee Jones 2023-10-05 9:18 ` Greg Kroah-Hartman 0 siblings, 1 reply; 29+ messages in thread From: Lee Jones @ 2023-10-05 9:03 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > On Wed, Oct 04, 2023 at 01:57:04PM +0100, Lee Jones wrote: > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > On Wed, Oct 04, 2023 at 09:57:20AM +0100, Lee Jones wrote: > > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > > > > > Daniel, any thoughts? > > > > > > > > > > > > Our application of this protocol is only with specific modems to enable > > > > > > circuit switched operation (handling calls, selecting/querying networks, > > > > > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > > > > > The protocol was developed for such use cases. > > > > > > > > > > > > Regarding the issue itself: > > > > > > There was already an attempt to fix all this by switching from spinlocks to > > > > > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > > > > > as it did not handle the T1 timer leading into sleep during atomic within > > > > > > gsm_dlci_t1() on every mutex lock there. > > > > > > > > That's correct. When I initially saw this report, my initial thought > > > > was to replace the spinlocks with mutexts, but having read the previous > > > > accepted attempt and it's subsequent reversion I started to think of > > > > other ways to solve this issue. This solution, unlike the last, does > > > > not involve adding sleep inducing locks into atomic contexts, nor > > > > should it negatively affect performance. > > > > > > > > > > There was also a suggestion to fix this in do_con_write() as > > > > > > tty_operations::write() appears to be documented as "not allowed to sleep". > > > > > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > > > > > > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > > > > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > > > > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > > > > > > > > > Ok, I thought I remembered this, I'll just drop this patch from my > > > > > review queue and wait for a better solution if it ever comes up as this > > > > > isn't a real issue that people are seeing on actual systems, but just a > > > > > syzbot report. > > > > > > > > What does the "better solution" look like? > > > > > > One that actually fixes the root problem here (i.e. does not break the > > > recursion loop, or cause a performance decrease for normal users, or > > > prevent this from being bound to the console). > > > > Does this solution break the recursion loop or affect performance? > > This solution broke the recursion by returning an error, right? This is the part I was least sure about. If this was considered valid and we were to go forward with a solution like this, what would a quality improvement look like? Should we have stayed in this function and waited for the previous occupant to leave before continuing through ->write()? > The performance one was by using mutexes as in previous attempts. Right. This solution was designed to avoid that. -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 9:03 ` Lee Jones @ 2023-10-05 9:18 ` Greg Kroah-Hartman 2023-10-05 10:43 ` Lee Jones 0 siblings, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-05 9:18 UTC (permalink / raw) To: Lee Jones Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Thu, Oct 05, 2023 at 10:03:11AM +0100, Lee Jones wrote: > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > On Wed, Oct 04, 2023 at 01:57:04PM +0100, Lee Jones wrote: > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > On Wed, Oct 04, 2023 at 09:57:20AM +0100, Lee Jones wrote: > > > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > > > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > > > > > > Daniel, any thoughts? > > > > > > > > > > > > > > Our application of this protocol is only with specific modems to enable > > > > > > > circuit switched operation (handling calls, selecting/querying networks, > > > > > > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > > > > > > The protocol was developed for such use cases. > > > > > > > > > > > > > > Regarding the issue itself: > > > > > > > There was already an attempt to fix all this by switching from spinlocks to > > > > > > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > > > > > > as it did not handle the T1 timer leading into sleep during atomic within > > > > > > > gsm_dlci_t1() on every mutex lock there. > > > > > > > > > > That's correct. When I initially saw this report, my initial thought > > > > > was to replace the spinlocks with mutexts, but having read the previous > > > > > accepted attempt and it's subsequent reversion I started to think of > > > > > other ways to solve this issue. This solution, unlike the last, does > > > > > not involve adding sleep inducing locks into atomic contexts, nor > > > > > should it negatively affect performance. > > > > > > > > > > > > There was also a suggestion to fix this in do_con_write() as > > > > > > > tty_operations::write() appears to be documented as "not allowed to sleep". > > > > > > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > > > > > > > > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > > > > > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > > > > > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > > > > > > > > > > > Ok, I thought I remembered this, I'll just drop this patch from my > > > > > > review queue and wait for a better solution if it ever comes up as this > > > > > > isn't a real issue that people are seeing on actual systems, but just a > > > > > > syzbot report. > > > > > > > > > > What does the "better solution" look like? > > > > > > > > One that actually fixes the root problem here (i.e. does not break the > > > > recursion loop, or cause a performance decrease for normal users, or > > > > prevent this from being bound to the console). > > > > > > Does this solution break the recursion loop or affect performance? > > > > This solution broke the recursion by returning an error, right? > > This is the part I was least sure about. > > If this was considered valid and we were to go forward with a solution > like this, what would a quality improvement look like? Should we have > stayed in this function and waited for the previous occupant to leave > before continuing through ->write()? This isn't valid, as it obviously never shows up in real use. The real solution should be to prevent binding a console to this line discipline as it can not handle the recursion that consoles require for the write path. Then, if consoles are really needed, the code can be fixed up to handle such recursion. That's not a trivial thing to do, as can be seen by the crazy gyrations that the n_tty line discipline does in its write path... thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 9:18 ` Greg Kroah-Hartman @ 2023-10-05 10:43 ` Lee Jones 2023-10-05 11:33 ` Greg Kroah-Hartman 0 siblings, 1 reply; 29+ messages in thread From: Lee Jones @ 2023-10-05 10:43 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Thu, 05 Oct 2023, Greg Kroah-Hartman wrote: > On Thu, Oct 05, 2023 at 10:03:11AM +0100, Lee Jones wrote: > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > On Wed, Oct 04, 2023 at 01:57:04PM +0100, Lee Jones wrote: > > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > On Wed, Oct 04, 2023 at 09:57:20AM +0100, Lee Jones wrote: > > > > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > > > > > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > > > > > > > Daniel, any thoughts? > > > > > > > > > > > > > > > > Our application of this protocol is only with specific modems to enable > > > > > > > > circuit switched operation (handling calls, selecting/querying networks, > > > > > > > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > > > > > > > The protocol was developed for such use cases. > > > > > > > > > > > > > > > > Regarding the issue itself: > > > > > > > > There was already an attempt to fix all this by switching from spinlocks to > > > > > > > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > > > > > > > as it did not handle the T1 timer leading into sleep during atomic within > > > > > > > > gsm_dlci_t1() on every mutex lock there. > > > > > > > > > > > > That's correct. When I initially saw this report, my initial thought > > > > > > was to replace the spinlocks with mutexts, but having read the previous > > > > > > accepted attempt and it's subsequent reversion I started to think of > > > > > > other ways to solve this issue. This solution, unlike the last, does > > > > > > not involve adding sleep inducing locks into atomic contexts, nor > > > > > > should it negatively affect performance. > > > > > > > > > > > > > > There was also a suggestion to fix this in do_con_write() as > > > > > > > > tty_operations::write() appears to be documented as "not allowed to sleep". > > > > > > > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > > > > > > > > > > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > > > > > > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > > > > > > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > > > > > > > > > > > > > Ok, I thought I remembered this, I'll just drop this patch from my > > > > > > > review queue and wait for a better solution if it ever comes up as this > > > > > > > isn't a real issue that people are seeing on actual systems, but just a > > > > > > > syzbot report. > > > > > > > > > > > > What does the "better solution" look like? > > > > > > > > > > One that actually fixes the root problem here (i.e. does not break the > > > > > recursion loop, or cause a performance decrease for normal users, or > > > > > prevent this from being bound to the console). > > > > > > > > Does this solution break the recursion loop or affect performance? > > > > > > This solution broke the recursion by returning an error, right? > > > > This is the part I was least sure about. > > > > If this was considered valid and we were to go forward with a solution > > like this, what would a quality improvement look like? Should we have > > stayed in this function and waited for the previous occupant to leave > > before continuing through ->write()? > > This isn't valid, as it obviously never shows up in real use. > > The real solution should be to prevent binding a console to this line > discipline as it can not handle the recursion that consoles require for > the write path. Would something like this tick that box? diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index 1f3aba607cd51..5c1d2fcd5d9e2 100644 --- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -3716,6 +3716,10 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, if (!gsm) return -ENODEV; + /* The GSM line discipline does not support binding to console */ + if (strncmp(tty->name, "tty", 3)) + return -EINVAL; + ret = -ENOBUFS; spin_lock_irqsave(&gsm->tx_lock, flags); space = tty_write_room(tty); > Then, if consoles are really needed, the code can be fixed up to handle > such recursion. That's not a trivial thing to do, as can be seen by the > crazy gyrations that the n_tty line discipline does in its write path... -- Lee Jones [李琼斯] ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 10:43 ` Lee Jones @ 2023-10-05 11:33 ` Greg Kroah-Hartman 2023-10-05 11:38 ` Starke, Daniel 0 siblings, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-05 11:33 UTC (permalink / raw) To: Lee Jones Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Thu, Oct 05, 2023 at 11:43:11AM +0100, Lee Jones wrote: > On Thu, 05 Oct 2023, Greg Kroah-Hartman wrote: > > > On Thu, Oct 05, 2023 at 10:03:11AM +0100, Lee Jones wrote: > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > On Wed, Oct 04, 2023 at 01:57:04PM +0100, Lee Jones wrote: > > > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > > > On Wed, Oct 04, 2023 at 09:57:20AM +0100, Lee Jones wrote: > > > > > > > On Wed, 04 Oct 2023, Greg Kroah-Hartman wrote: > > > > > > > > > > > > > > > On Wed, Oct 04, 2023 at 05:59:09AM +0000, Starke, Daniel wrote: > > > > > > > > > > Daniel, any thoughts? > > > > > > > > > > > > > > > > > > Our application of this protocol is only with specific modems to enable > > > > > > > > > circuit switched operation (handling calls, selecting/querying networks, > > > > > > > > > etc.) while doing packet switched communication (i.e. IP traffic over PPP). > > > > > > > > > The protocol was developed for such use cases. > > > > > > > > > > > > > > > > > > Regarding the issue itself: > > > > > > > > > There was already an attempt to fix all this by switching from spinlocks to > > > > > > > > > mutexes resulting in ~20% performance loss. However, the patch was reverted > > > > > > > > > as it did not handle the T1 timer leading into sleep during atomic within > > > > > > > > > gsm_dlci_t1() on every mutex lock there. > > > > > > > > > > > > > > That's correct. When I initially saw this report, my initial thought > > > > > > > was to replace the spinlocks with mutexts, but having read the previous > > > > > > > accepted attempt and it's subsequent reversion I started to think of > > > > > > > other ways to solve this issue. This solution, unlike the last, does > > > > > > > not involve adding sleep inducing locks into atomic contexts, nor > > > > > > > should it negatively affect performance. > > > > > > > > > > > > > > > > There was also a suggestion to fix this in do_con_write() as > > > > > > > > > tty_operations::write() appears to be documented as "not allowed to sleep". > > > > > > > > > The patch for this was rejected. It did not fix the issue within n_gsm. > > > > > > > > > > > > > > > > > > Link: https://lore.kernel.org/all/20221203215518.8150-1-pchelkin@ispras.ru/ > > > > > > > > > Link: https://lore.kernel.org/all/20221212023530.2498025-1-zengheng4@huawei.com/ > > > > > > > > > Link: https://lore.kernel.org/all/5a994a13-d1f2-87a8-09e4-a877e65ed166@kernel.org/ > > > > > > > > > > > > > > > > Ok, I thought I remembered this, I'll just drop this patch from my > > > > > > > > review queue and wait for a better solution if it ever comes up as this > > > > > > > > isn't a real issue that people are seeing on actual systems, but just a > > > > > > > > syzbot report. > > > > > > > > > > > > > > What does the "better solution" look like? > > > > > > > > > > > > One that actually fixes the root problem here (i.e. does not break the > > > > > > recursion loop, or cause a performance decrease for normal users, or > > > > > > prevent this from being bound to the console). > > > > > > > > > > Does this solution break the recursion loop or affect performance? > > > > > > > > This solution broke the recursion by returning an error, right? > > > > > > This is the part I was least sure about. > > > > > > If this was considered valid and we were to go forward with a solution > > > like this, what would a quality improvement look like? Should we have > > > stayed in this function and waited for the previous occupant to leave > > > before continuing through ->write()? > > > > This isn't valid, as it obviously never shows up in real use. > > > > The real solution should be to prevent binding a console to this line > > discipline as it can not handle the recursion that consoles require for > > the write path. > > Would something like this tick that box? > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c > index 1f3aba607cd51..5c1d2fcd5d9e2 100644 > --- a/drivers/tty/n_gsm.c > +++ b/drivers/tty/n_gsm.c > @@ -3716,6 +3716,10 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > if (!gsm) > return -ENODEV; > > + /* The GSM line discipline does not support binding to console */ > + if (strncmp(tty->name, "tty", 3)) > + return -EINVAL; No, that's not going to work, some consoles do not start with "tty" :( thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* RE: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 11:33 ` Greg Kroah-Hartman @ 2023-10-05 11:38 ` Starke, Daniel 2023-10-05 11:46 ` Lee Jones 0 siblings, 1 reply; 29+ messages in thread From: Starke, Daniel @ 2023-10-05 11:38 UTC (permalink / raw) To: Greg Kroah-Hartman, Lee Jones Cc: linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com > > Would something like this tick that box? > > > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index > > 1f3aba607cd51..5c1d2fcd5d9e2 100644 > > --- a/drivers/tty/n_gsm.c > > +++ b/drivers/tty/n_gsm.c > > @@ -3716,6 +3716,10 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > > if (!gsm) > > return -ENODEV; > > > > + /* The GSM line discipline does not support binding to console */ > > + if (strncmp(tty->name, "tty", 3)) > > + return -EINVAL; > > No, that's not going to work, some consoles do not start with "tty" :( Also, I would recommend exiting as early as possible. E.g. in gsmld_open(). And please retain support for real serial devices (e.g. ttyS, ttyUSB, ttyACM, ...). Best regards, Daniel Starke ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 11:38 ` Starke, Daniel @ 2023-10-05 11:46 ` Lee Jones 2023-10-05 11:55 ` Greg Kroah-Hartman 0 siblings, 1 reply; 29+ messages in thread From: Lee Jones @ 2023-10-05 11:46 UTC (permalink / raw) To: Starke, Daniel Cc: Greg Kroah-Hartman, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Thu, 05 Oct 2023, Starke, Daniel wrote: > > > Would something like this tick that box? > > > > > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index > > > 1f3aba607cd51..5c1d2fcd5d9e2 100644 > > > --- a/drivers/tty/n_gsm.c > > > +++ b/drivers/tty/n_gsm.c > > > @@ -3716,6 +3716,10 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > > > if (!gsm) > > > return -ENODEV; > > > > > > + /* The GSM line discipline does not support binding to console */ > > > + if (strncmp(tty->name, "tty", 3)) > > > + return -EINVAL; > > > > No, that's not going to work, some consoles do not start with "tty" :( Ah, you mean there are others that we need to consider? I was just covering off con_write() from drivers/tty/vt/vt.c. Does anyone have a counter proposal? > Also, I would recommend exiting as early as possible. E.g. in gsmld_open(). Good suggestion. > And please retain support for real serial devices (e.g. ttyS, ttyUSB, > ttyACM, ...). Okay, so it's "tty{0-9}*$" that should be excluded? Plus others that Greg alluded to? Is there a definitive accept / reject list? -- Lee Jones [李琼斯] ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 11:46 ` Lee Jones @ 2023-10-05 11:55 ` Greg Kroah-Hartman 2023-10-05 12:17 ` Lee Jones 0 siblings, 1 reply; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-05 11:55 UTC (permalink / raw) To: Lee Jones Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Thu, Oct 05, 2023 at 12:46:32PM +0100, Lee Jones wrote: > On Thu, 05 Oct 2023, Starke, Daniel wrote: > > > > > Would something like this tick that box? > > > > > > > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index > > > > 1f3aba607cd51..5c1d2fcd5d9e2 100644 > > > > --- a/drivers/tty/n_gsm.c > > > > +++ b/drivers/tty/n_gsm.c > > > > @@ -3716,6 +3716,10 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > > > > if (!gsm) > > > > return -ENODEV; > > > > > > > > + /* The GSM line discipline does not support binding to console */ > > > > + if (strncmp(tty->name, "tty", 3)) > > > > + return -EINVAL; > > > > > > No, that's not going to work, some consoles do not start with "tty" :( > > Ah, you mean there are others that we need to consider? > > I was just covering off con_write() from drivers/tty/vt/vt.c. > > Does anyone have a counter proposal? consoles are dynamically assigned, the device knows if it is a console or not, so there is a way to determine this at runtime. It's not a device naming thing at all. thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 11:55 ` Greg Kroah-Hartman @ 2023-10-05 12:17 ` Lee Jones 2023-10-05 12:37 ` Greg Kroah-Hartman 0 siblings, 1 reply; 29+ messages in thread From: Lee Jones @ 2023-10-05 12:17 UTC (permalink / raw) To: Greg Kroah-Hartman Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Thu, 05 Oct 2023, Greg Kroah-Hartman wrote: > On Thu, Oct 05, 2023 at 12:46:32PM +0100, Lee Jones wrote: > > On Thu, 05 Oct 2023, Starke, Daniel wrote: > > > > > > > Would something like this tick that box? > > > > > > > > > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index > > > > > 1f3aba607cd51..5c1d2fcd5d9e2 100644 > > > > > --- a/drivers/tty/n_gsm.c > > > > > +++ b/drivers/tty/n_gsm.c > > > > > @@ -3716,6 +3716,10 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > > > > > if (!gsm) > > > > > return -ENODEV; > > > > > > > > > > + /* The GSM line discipline does not support binding to console */ > > > > > + if (strncmp(tty->name, "tty", 3)) > > > > > + return -EINVAL; > > > > > > > > No, that's not going to work, some consoles do not start with "tty" :( > > > > Ah, you mean there are others that we need to consider? > > > > I was just covering off con_write() from drivers/tty/vt/vt.c. > > > > Does anyone have a counter proposal? > > consoles are dynamically assigned, the device knows if it is a console > or not, so there is a way to determine this at runtime. It's not a > device naming thing at all. It's not a device naming thing, but it is a ... :) Okay, here's another uninformed stab in the dark: diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index 1f3aba607cd51..ddf00f6a4141d 100644 --- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -3629,6 +3629,10 @@ static int gsmld_open(struct tty_struct *tty) if (tty->ops->write == NULL) return -EINVAL; + /* The GSM line discipline does not support binding to console */ + if (tty->driver->type == TTY_DRIVER_TYPE_CONSOLE) + return -EINVAL; + /* Attach our ldisc data */ gsm = gsm_alloc_mux(); if (gsm == NULL) This seems to allow for the real serial devices (TTY_DRIVER_TYPE_SERIAL) suggested by Daniel. -- Lee Jones [李琼斯] ^ permalink raw reply related [flat|nested] 29+ messages in thread
* Re: [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic 2023-10-05 12:17 ` Lee Jones @ 2023-10-05 12:37 ` Greg Kroah-Hartman 0 siblings, 0 replies; 29+ messages in thread From: Greg Kroah-Hartman @ 2023-10-05 12:37 UTC (permalink / raw) To: Lee Jones Cc: Starke, Daniel, linux-kernel@vger.kernel.org, Fedor Pchelkin, Jiri Slaby, linux-serial@vger.kernel.org, syzbot+5f47a8cea6a12b77a876@syzkaller.appspotmail.com On Thu, Oct 05, 2023 at 01:17:56PM +0100, Lee Jones wrote: > On Thu, 05 Oct 2023, Greg Kroah-Hartman wrote: > > > On Thu, Oct 05, 2023 at 12:46:32PM +0100, Lee Jones wrote: > > > On Thu, 05 Oct 2023, Starke, Daniel wrote: > > > > > > > > > Would something like this tick that box? > > > > > > > > > > > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index > > > > > > 1f3aba607cd51..5c1d2fcd5d9e2 100644 > > > > > > --- a/drivers/tty/n_gsm.c > > > > > > +++ b/drivers/tty/n_gsm.c > > > > > > @@ -3716,6 +3716,10 @@ static ssize_t gsmld_write(struct tty_struct *tty, struct file *file, > > > > > > if (!gsm) > > > > > > return -ENODEV; > > > > > > > > > > > > + /* The GSM line discipline does not support binding to console */ > > > > > > + if (strncmp(tty->name, "tty", 3)) > > > > > > + return -EINVAL; > > > > > > > > > > No, that's not going to work, some consoles do not start with "tty" :( > > > > > > Ah, you mean there are others that we need to consider? > > > > > > I was just covering off con_write() from drivers/tty/vt/vt.c. > > > > > > Does anyone have a counter proposal? > > > > consoles are dynamically assigned, the device knows if it is a console > > or not, so there is a way to determine this at runtime. It's not a > > device naming thing at all. > > It's not a device naming thing, but it is a ... :) > > Okay, here's another uninformed stab in the dark: > > diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c > index 1f3aba607cd51..ddf00f6a4141d 100644 > --- a/drivers/tty/n_gsm.c > +++ b/drivers/tty/n_gsm.c > @@ -3629,6 +3629,10 @@ static int gsmld_open(struct tty_struct *tty) > if (tty->ops->write == NULL) > return -EINVAL; > > + /* The GSM line discipline does not support binding to console */ > + if (tty->driver->type == TTY_DRIVER_TYPE_CONSOLE) > + return -EINVAL; > + > /* Attach our ldisc data */ > gsm = gsm_alloc_mux(); > if (gsm == NULL) > > This seems to allow for the real serial devices (TTY_DRIVER_TYPE_SERIAL) > suggested by Daniel. Close, but not quite. Driver "types" can say if they are a console or not, but that doesn't mean you can't run the console over a serial port as well, right? Sorry, I don't have a real solution right now, and wouldn't wish the phrase "just dig through the tty console code!" on anyone, but that's what it is going to take to figure it out somehow, I can't remember the exact way consoles are hooked up at the moment (I conviently skipped that portion of the tty layer in my Embedded Recipies talk last week, saying "it's magic...") thanks, greg k-h ^ permalink raw reply [flat|nested] 29+ messages in thread
end of thread, other threads:[~2023-10-05 16:18 UTC | newest] Thread overview: 29+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-10-03 17:00 [PATCH 1/1] tty: n_gsm: Avoid sleeping during .write() whilst atomic Lee Jones 2023-10-03 18:14 ` Greg Kroah-Hartman 2023-10-03 18:55 ` Lee Jones 2023-10-04 6:04 ` Greg Kroah-Hartman 2023-10-04 9:09 ` Lee Jones 2023-10-04 9:55 ` Greg Kroah-Hartman 2023-10-04 12:19 ` Greg Kroah-Hartman 2023-10-04 12:57 ` Lee Jones 2023-10-04 13:13 ` Greg Kroah-Hartman 2023-10-05 4:59 ` Jiri Slaby 2023-10-05 6:05 ` Greg Kroah-Hartman 2023-10-05 7:26 ` Lee Jones 2023-10-04 5:59 ` Starke, Daniel 2023-10-04 6:05 ` Greg Kroah-Hartman 2023-10-04 8:40 ` Jiri Slaby 2023-10-04 8:58 ` Starke, Daniel 2023-10-04 8:57 ` Lee Jones 2023-10-04 12:21 ` Greg Kroah-Hartman 2023-10-04 12:57 ` Lee Jones 2023-10-04 13:14 ` Greg Kroah-Hartman 2023-10-05 9:03 ` Lee Jones 2023-10-05 9:18 ` Greg Kroah-Hartman 2023-10-05 10:43 ` Lee Jones 2023-10-05 11:33 ` Greg Kroah-Hartman 2023-10-05 11:38 ` Starke, Daniel 2023-10-05 11:46 ` Lee Jones 2023-10-05 11:55 ` Greg Kroah-Hartman 2023-10-05 12:17 ` Lee Jones 2023-10-05 12:37 ` Greg Kroah-Hartman
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).