* Race between release_tty() and vt_disallocate()
@ 2017-08-10 15:55 Arnd Bergmann
2017-08-14 12:39 ` Alan Cox
0 siblings, 1 reply; 4+ messages in thread
From: Arnd Bergmann @ 2017-08-10 15:55 UTC (permalink / raw)
To: Greg Kroah-Hartman, Jiri Slaby
Cc: Linux Kernel Mailing List, Adam Borowski, linux-serial,
Peter Hurley, jun.he, graeme.gregory, gema.gomez-solano
Hi tty people,
I tracked down a bug report to what I think is a race between a tty_struct
and the vt_data going away at the same time. See
https://bugs.linaro.org/show_bug.cgi?id=3174 for the long story.
The short version is that a backtrace shows
[ 1138.433484] [<ffff0000080e78f0>] __cancel_work_timer+0x80/0x1c8
[ 1138.433486] [<ffff0000080e7a5c>] cancel_work_sync+0x24/0x30
[ 1138.433491] [<ffff0000084e9dd0>] tty_buffer_cancel_work+0x20/0x30
[ 1138.433493] [<ffff0000084de828>] release_tty+0xc8/0x138
[ 1138.433495] [<ffff0000084e0dc8>] tty_release+0x428/0x650
[ 1138.433499] [<ffff000008265a3c>] __fput+0xa4/0x220
[ 1138.433501] [<ffff000008265c58>] ____fput+0x20/0x30
[ 1138.433503] [<ffff0000080eb3a4>] task_work_run+0xcc/0xe8
[ 1138.433506] [<ffff0000080cf334>] do_exit+0x30c/0x9f0
[ 1138.433507] [<ffff0000080cfaa8>] do_group_exit+0x40/0xb0
[ 1138.433510] [<ffff0000080dbff8>] get_signal+0x2d0/0x588
[ 1138.433513] [<ffff0000080893f4>] do_signal+0x8c/0x550
[ 1138.433515] [<ffff000008089b28>] do_notify_resume+0x98/0xb8
[ 1138.433516] [<ffff0000080835dc>] work_pending+0x8/0x10
get_work_pool_id() crashes while dereferencing tty->port.buf.work.data
as a pointer, after that has apparently been overwritten with the
non-pointer value 0x00000028fecaedff. The tty_port belongs to
a vc_data structure, which gets freed after we find that
console_driver->ttys[i]->count is zero in the VT_DISALLOCATE
ioctl. Apparently at the same time, the agetty process owning
the tty closes and that leads to tty->count dropping to zero
before we call tty_buffer_cancel_work() on the tty_port that
has now been freed.
Apparently the locking and/or reference counting between the
two code paths is insufficient, but I don't understand enough
about tty locking to come up with a fix that doesn't break other
things. Please have a look.
Arnd
^ permalink raw reply [flat|nested] 4+ messages in thread* Re: Race between release_tty() and vt_disallocate() 2017-08-10 15:55 Race between release_tty() and vt_disallocate() Arnd Bergmann @ 2017-08-14 12:39 ` Alan Cox 2017-08-14 14:25 ` Arnd Bergmann 0 siblings, 1 reply; 4+ messages in thread From: Alan Cox @ 2017-08-14 12:39 UTC (permalink / raw) To: Arnd Bergmann Cc: Greg Kroah-Hartman, Jiri Slaby, Linux Kernel Mailing List, Adam Borowski, linux-serial, Peter Hurley, jun.he, graeme.gregory, gema.gomez-solano > non-pointer value 0x00000028fecaedff. The tty_port belongs to > a vc_data structure, which gets freed after we find that > console_driver->ttys[i]->count is zero in the VT_DISALLOCATE > ioctl. Apparently at the same time, the agetty process owning That wouldn't actually be a safe check. tty->count isn't a simple reference count even if the locking were right. > the tty closes and that leads to tty->count dropping to zero > before we call tty_buffer_cancel_work() on the tty_port that > has now been freed. > > Apparently the locking and/or reference counting between the > two code paths is insufficient, but I don't understand enough > about tty locking to come up with a fix that doesn't break other > things. Please have a look. I'm actually not sure how we can fix this within the current API. The tty port is refcounted (see tty_port_put() and tty_port_tty_get()) so any ioctl would end up returning but the console port resources would not disappear until that tty finally closed down. Calling tty_hangup on the tty for the port will close the tty down, but that in itself is also asynchronous. The only easy way I can think to keep the current semantics would instead be to keep the tty port resources around and indexed somewhere but blackhole input to/output from that port or switching to it and also call tty_hangup if the port has a tty. Alan ^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: Race between release_tty() and vt_disallocate() 2017-08-14 12:39 ` Alan Cox @ 2017-08-14 14:25 ` Arnd Bergmann 2017-08-17 12:11 ` Alan Cox 0 siblings, 1 reply; 4+ messages in thread From: Arnd Bergmann @ 2017-08-14 14:25 UTC (permalink / raw) To: Alan Cox Cc: Greg Kroah-Hartman, Jiri Slaby, Linux Kernel Mailing List, Adam Borowski, linux-serial, Peter Hurley, jun.he, graeme.gregory, gema.gomez-solano On Monday, August 14, 2017 1:39:47 PM CEST Alan Cox wrote: > > the tty closes and that leads to tty->count dropping to zero > > before we call tty_buffer_cancel_work() on the tty_port that > > has now been freed. > > > > Apparently the locking and/or reference counting between the > > two code paths is insufficient, but I don't understand enough > > about tty locking to come up with a fix that doesn't break other > > things. Please have a look. > > I'm actually not sure how we can fix this within the current API. The tty > port is refcounted (see tty_port_put() and tty_port_tty_get()) so > any ioctl would end up returning but the console port resources would not > disappear until that tty finally closed down. It seems that part of the problem is the lack of tty_port_put/tty_port_get calls in the VT code. > The only easy way I can think to keep the current semantics would instead > be to keep the tty port resources around and indexed somewhere but > blackhole input to/output from that port or switching to it and also call > tty_hangup if the port has a tty. What would still be missing if we just add that reference counting and delay the freeing of the vc_data/tty_port? I probably missed part of your analysis, so just throwing this out for discussion. (not tested, probably wrong as I said) Signed-off-by: Arnd Bergmann <arnd@arndb.de> diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c index 2ebaba16f785..9ab3df49d988 100644 --- a/drivers/tty/vt/vt.c +++ b/drivers/tty/vt/vt.c @@ -750,6 +750,16 @@ static void visual_init(struct vc_data *vc, int num, int init) vc->vc_screenbuf_size = vc->vc_rows * vc->vc_size_row; } +static void vt_destruct(struct tty_port *port) +{ + struct vc_data *vc = container_of(port, struct vc_data, port); + kfree(vc); +} + +static const struct tty_port_operations vt_port_operations = { + .destruct = vt_destruct, +}; + int vc_allocate(unsigned int currcons) /* return 0 on success */ { struct vt_notifier_param param; @@ -775,6 +785,7 @@ int vc_allocate(unsigned int currcons) /* return 0 on success */ vc_cons[currcons].d = vc; tty_port_init(&vc->port); + vc->port.ops = &vt_port_operations; INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK); visual_init(vc, currcons, 1); @@ -2880,14 +2891,16 @@ static int con_install(struct tty_driver *driver, struct tty_struct *tty) vc = vc_cons[currcons].d; /* Still being freed */ - if (vc->port.tty) { + if (vc->port.tty || !tty_port_get(&vc->port)) { ret = -ERESTARTSYS; goto unlock; } ret = tty_port_install(&vc->port, driver, tty); - if (ret) + if (ret) { + tty_port_put(&vc->port); goto unlock; + } tty->driver_data = vc; vc->port.tty = tty; @@ -2926,6 +2939,11 @@ static void con_shutdown(struct tty_struct *tty) console_unlock(); } +static void con_cleanup(struct tty_struct *tty) +{ + tty_port_put(tty->port); +} + static int default_color = 7; /* white */ static int default_italic_color = 2; // green (ASCII) static int default_underline_color = 3; // cyan (ASCII) @@ -3050,7 +3068,8 @@ static const struct tty_operations con_ops = { .throttle = con_throttle, .unthrottle = con_unthrottle, .resize = vt_resize, - .shutdown = con_shutdown + .shutdown = con_shutdown, + .cleanup = con_cleanup, }; static struct cdev vc0_cdev; diff --git a/drivers/tty/vt/vt_ioctl.c b/drivers/tty/vt/vt_ioctl.c index 96d389cb506c..25aa37a93f58 100644 --- a/drivers/tty/vt/vt_ioctl.c +++ b/drivers/tty/vt/vt_ioctl.c @@ -292,10 +292,8 @@ static int vt_disallocate(unsigned int vc_num) vc = vc_deallocate(vc_num); console_unlock(); - if (vc && vc_num >= MIN_NR_CONSOLES) { - tty_port_destroy(&vc->port); - kfree(vc); - } + if (vc && vc_num >= MIN_NR_CONSOLES) + tty_port_put(&vc->port); return ret; } @@ -315,10 +313,8 @@ static void vt_disallocate_all(void) console_unlock(); for (i = 1; i < MAX_NR_CONSOLES; i++) { - if (vc[i] && i >= MIN_NR_CONSOLES) { - tty_port_destroy(&vc[i]->port); - kfree(vc[i]); - } + if (vc[i] && i >= MIN_NR_CONSOLES) + tty_port_put(&vc[i]->port); } } ^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: Race between release_tty() and vt_disallocate() 2017-08-14 14:25 ` Arnd Bergmann @ 2017-08-17 12:11 ` Alan Cox 0 siblings, 0 replies; 4+ messages in thread From: Alan Cox @ 2017-08-17 12:11 UTC (permalink / raw) To: Arnd Bergmann Cc: Greg Kroah-Hartman, Jiri Slaby, Linux Kernel Mailing List, Adam Borowski, linux-serial, Peter Hurley, jun.he, graeme.gregory, gema.gomez-solano > It seems that part of the problem is the lack of tty_port_put/tty_port_get > calls in the VT code. Yes > > The only easy way I can think to keep the current semantics would instead > > be to keep the tty port resources around and indexed somewhere but > > blackhole input to/output from that port or switching to it and also call > > tty_hangup if the port has a tty. > > What would still be missing if we just add that reference counting and > delay the freeing of the vc_data/tty_port? I probably missed part of your > analysis, so just throwing this out for discussion. Is the expected behaviour that the disallocate also shuts down anything using the port. If so I think you also need to do a hangup on it. Otherwise, assuming the change in behaviour is ok this seems only part of the picture. Possibly we should also hangup any proces on the now destructed port, and right now I don't see that being done (tty_port_tty_hangup(port, 0);) I think the rest might also need fixing up. con_install sets vc->port.tty rather than using tty_port_tty_set() so looks like it doesn't end up with the needed refcount, and likewise con_sbutdown touches it wrongly as far as I can see. (That might actually explain a really strange tty ref counting race bug I've seen reported very rarely for some years and never found!) In addition there are other places that reference port->tty directly without the right locks against hangup that probably need to use tty_port_tty_get() instead (eg a vc_resize at the exact moment of a hangup looks like it will crash) > > (not tested, probably wrong as I said) > > Signed-off-by: Arnd Bergmann <arnd@arndb.de> > > diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c > index 2ebaba16f785..9ab3df49d988 100644 > --- a/drivers/tty/vt/vt.c > +++ b/drivers/tty/vt/vt.c > @@ -750,6 +750,16 @@ static void visual_init(struct vc_data *vc, int num, int init) > vc->vc_screenbuf_size = vc->vc_rows * vc->vc_size_row; > } > > +static void vt_destruct(struct tty_port *port) > +{ > + struct vc_data *vc = container_of(port, struct vc_data, port); > + kfree(vc); > +} > + > +static const struct tty_port_operations vt_port_operations = { > + .destruct = vt_destruct, > +}; > + > int vc_allocate(unsigned int currcons) /* return 0 on success */ > { > struct vt_notifier_param param; > @@ -775,6 +785,7 @@ int vc_allocate(unsigned int currcons) /* return 0 on success */ > > vc_cons[currcons].d = vc; > tty_port_init(&vc->port); > + vc->port.ops = &vt_port_operations; > INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK); > > visual_init(vc, currcons, 1); > @@ -2880,14 +2891,16 @@ static int con_install(struct tty_driver *driver, struct tty_struct *tty) > vc = vc_cons[currcons].d; > > /* Still being freed */ > - if (vc->port.tty) { > + if (vc->port.tty || !tty_port_get(&vc->port)) { Do we still need to check vc->port.tty as we should have a reference to the port if the tty is open ? Also on a hangup port->tty changes under tty_lock and port->lock not console lock. BTW if you need an example that handles every case of hotplugging at once the drivers/mmc/core/sdio_uart.c driver pretty much uses every API feature to handle the sd and tty refcounting. Alan ^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2017-08-17 12:11 UTC | newest] Thread overview: 4+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2017-08-10 15:55 Race between release_tty() and vt_disallocate() Arnd Bergmann 2017-08-14 12:39 ` Alan Cox 2017-08-14 14:25 ` Arnd Bergmann 2017-08-17 12:11 ` Alan Cox
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).