* tty's use of file_list_lock and file_move @ 2006-07-10 15:10 Jon Smirl 2006-07-10 17:33 ` Alan Cox 0 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-10 15:10 UTC (permalink / raw) To: Alan Cox; +Cc: lkml Since you want a new subject can you explain tty's use of file_lock to me? Is there some non-obvious global coordination happening here or is it work breaking down the big kernel lock that never got finished? -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 15:10 tty's use of file_list_lock and file_move Jon Smirl @ 2006-07-10 17:33 ` Alan Cox 2006-07-10 17:27 ` Jon Smirl 0 siblings, 1 reply; 30+ messages in thread From: Alan Cox @ 2006-07-10 17:33 UTC (permalink / raw) To: Jon Smirl; +Cc: lkml Ar Llu, 2006-07-10 am 11:10 -0400, ysgrifennodd Jon Smirl: > Since you want a new subject can you explain tty's use of file_lock to > me? Is there some non-obvious global coordination happening here or is > it work breaking down the big kernel lock that never got finished? Its explained in the comment in do_SAK. Alan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 17:33 ` Alan Cox @ 2006-07-10 17:27 ` Jon Smirl 2006-07-10 18:05 ` Alan Cox 0 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-10 17:27 UTC (permalink / raw) To: Alan Cox; +Cc: lkml On 7/10/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > Ar Llu, 2006-07-10 am 11:10 -0400, ysgrifennodd Jon Smirl: > > Since you want a new subject can you explain tty's use of file_lock to > > me? Is there some non-obvious global coordination happening here or is > > it work breaking down the big kernel lock that never got finished? > > Its explained in the comment in do_SAK. This problem seems to be aggravated by reusing the tty_struct for that tty. With the refcount patch it is now easy to disassociate an existing tty (and the processes attached to it) from the array tracking tty minors. A new clean tty_struct can be allocated and made visible. Init will grab this new struct. Now there is no way to get to the old one and it can be cleaned up and deleted. Is this strategy worth coding up? It would really simplify the locking issues. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 17:27 ` Jon Smirl @ 2006-07-10 18:05 ` Alan Cox 2006-07-10 18:09 ` Jon Smirl 2006-07-10 22:35 ` Jon Smirl 0 siblings, 2 replies; 30+ messages in thread From: Alan Cox @ 2006-07-10 18:05 UTC (permalink / raw) To: Jon Smirl; +Cc: lkml Ar Llu, 2006-07-10 am 13:27 -0400, ysgrifennodd Jon Smirl: > > Its explained in the comment in do_SAK. > > This problem seems to be aggravated by reusing the tty_struct for that > tty. With the refcount patch it is now easy to disassociate an > existing tty (and the processes attached to it) from the array > tracking tty minors. The real problem is the rather deeper one - the lack of revoke(). Its possible to paper over that with SELinux but really we need revoke() and when you get revoke() you get the handle stuff cleaned up We hold file_list_lock because we have to find everyone using that tty and hang up their instance of it, then flip the file operations not because we need to protect against tty structs going away. It's needed in order to walk the file list and protects against the file list itself changing rather than the tty structs. It may well be possible to move that to a tty layer private lock with care, but it would need care to deal with VFS operations. We hold the ->files->file_lock because we have to walk other processes file tables in a safe fashion in SAK. That one is fairly clear. Alan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 18:05 ` Alan Cox @ 2006-07-10 18:09 ` Jon Smirl 2006-07-10 23:18 ` Alan Cox 2006-07-10 22:35 ` Jon Smirl 1 sibling, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-10 18:09 UTC (permalink / raw) To: Alan Cox; +Cc: lkml On 7/10/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > Ar Llu, 2006-07-10 am 13:27 -0400, ysgrifennodd Jon Smirl: > > > Its explained in the comment in do_SAK. > > > > This problem seems to be aggravated by reusing the tty_struct for that > > tty. With the refcount patch it is now easy to disassociate an > > existing tty (and the processes attached to it) from the array > > tracking tty minors. > > The real problem is the rather deeper one - the lack of revoke(). Its > possible to paper over that with SELinux but really we need revoke() and > when you get revoke() you get the handle stuff cleaned up > > We hold file_list_lock because we have to find everyone using that tty > and hang up their instance of it, then flip the file operations not > because we need to protect against tty structs going away. It's needed > in order to walk the file list and protects against the file list itself > changing rather than the tty structs. It may well be possible to move > that to a tty layer private lock with care, but it would need care to > deal with VFS operations. > > We hold the ->files->file_lock because we have to walk other processes > file tables in a safe fashion in SAK. That one is fairly clear. What if do_SAK did something like this? copy the tty struct to new_tty NULL out the file list in new_tty insert new_tty into the array tracking tty minors so that open will find the new one in old_tty stub out all of it's routines that do read/write/etc Now start a kernel thread doing the HUP/INT/KILL sequence, it has a reference to old_tty so it can find everything. When everything is clean, delete old_tty The processes and file handles attached to old_tty won't be able to do anything, all of their IO calls have been stubbed out. The file list can't be changed since there there is no way to get to it anymore. When init respawns it picks up new_tty which is guaranteed clean of processes and open handles because it was just created. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 18:09 ` Jon Smirl @ 2006-07-10 23:18 ` Alan Cox 0 siblings, 0 replies; 30+ messages in thread From: Alan Cox @ 2006-07-10 23:18 UTC (permalink / raw) To: Jon Smirl; +Cc: lkml Ar Llu, 2006-07-10 am 14:09 -0400, ysgrifennodd Jon Smirl: > > We hold the ->files->file_lock because we have to walk other processes > > file tables in a safe fashion in SAK. That one is fairly clear. > > What if do_SAK did something like this? > > copy the tty struct to new_tty > NULL out the file list in new_tty > insert new_tty into the array tracking tty minors so that open will > find the new one SAK isn't locking against the tty layer ,its locking against the VFS when it does htis ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 18:05 ` Alan Cox 2006-07-10 18:09 ` Jon Smirl @ 2006-07-10 22:35 ` Jon Smirl 2006-07-10 23:15 ` Alan Cox 2006-07-10 23:39 ` Theodore Tso 1 sibling, 2 replies; 30+ messages in thread From: Jon Smirl @ 2006-07-10 22:35 UTC (permalink / raw) To: Alan Cox; +Cc: lkml On 7/10/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > We hold file_list_lock because we have to find everyone using that tty > and hang up their instance of it, then flip the file operations not > because we need to protect against tty structs going away. It's needed > in order to walk the file list and protects against the file list itself > changing rather than the tty structs. It may well be possible to move > that to a tty layer private lock with care, but it would need care to > deal with VFS operations. Assuming do_SAK has blocked anyone's ability to newly open the tty, why does it need to search every file handle in the system instead of just using tty->tty_files? tty->tty_files should contain a list of everyone who has the tty open. Is this global search needed because of duplicated handles? -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 22:35 ` Jon Smirl @ 2006-07-10 23:15 ` Alan Cox 2006-07-10 23:04 ` Jon Smirl 2006-07-10 23:39 ` Theodore Tso 1 sibling, 1 reply; 30+ messages in thread From: Alan Cox @ 2006-07-10 23:15 UTC (permalink / raw) To: Jon Smirl; +Cc: lkml Ar Llu, 2006-07-10 am 18:35 -0400, ysgrifennodd Jon Smirl: > Assuming do_SAK has blocked anyone's ability to newly open the tty, > why does it need to search every file handle in the system instead of > just using tty->tty_files? tty->tty_files should contain a list of > everyone who has the tty open. Is this global search needed because of > duplicated handles? I don't actually know. Thats an area I've not dug too deeply into at all. ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 23:15 ` Alan Cox @ 2006-07-10 23:04 ` Jon Smirl 2006-07-10 23:49 ` Jon Smirl 0 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-10 23:04 UTC (permalink / raw) To: Alan Cox; +Cc: lkml On 7/10/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > Ar Llu, 2006-07-10 am 18:35 -0400, ysgrifennodd Jon Smirl: > > Assuming do_SAK has blocked anyone's ability to newly open the tty, > > why does it need to search every file handle in the system instead of > > just using tty->tty_files? tty->tty_files should contain a list of > > everyone who has the tty open. Is this global search needed because of > > duplicated handles? > > I don't actually know. Thats an area I've not dug too deeply into at > all. I wonder what the impact of banging on the SAK key on a box with 100K processes is like. You may be able to stall the whole system. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 23:04 ` Jon Smirl @ 2006-07-10 23:49 ` Jon Smirl 2006-07-11 1:29 ` Theodore Tso 0 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-10 23:49 UTC (permalink / raw) To: Alan Cox; +Cc: lkml How about the use of lock/unlock_kernel(). Is there some hidden global synchronization going on? Every time lock/unlock_kernel() is used there is a tty_struct available. My first thought would be to turn this into a per tty spinlock. Looking at where it is used it looks like it was added to protect all of the VFS calls. I see no obvious coordination with other ttys that isn't handled by other locks. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 23:49 ` Jon Smirl @ 2006-07-11 1:29 ` Theodore Tso 2006-07-11 2:16 ` Jon Smirl ` (2 more replies) 0 siblings, 3 replies; 30+ messages in thread From: Theodore Tso @ 2006-07-11 1:29 UTC (permalink / raw) To: Jon Smirl; +Cc: Alan Cox, lkml On Mon, Jul 10, 2006 at 07:49:31PM -0400, Jon Smirl wrote: > How about the use of lock/unlock_kernel(). Is there some hidden global > synchronization going on? Every time lock/unlock_kernel() is used > there is a tty_struct available. My first thought would be to turn > this into a per tty spinlock. Looking at where it is used it looks > like it was added to protect all of the VFS calls. I see no obvious > coordination with other ttys that isn't handled by other locks. No, it was just a case of not being worth it to get rid of the BKL for the tty subsystem, since opening and closing tty's isn't exactly a common event. Switching it to use a per-tty spinlock makes sense if we're going to rototill the code, but to be honest it's probably not going to make a noticeable difference on any benchmark and most workloads. - Ted ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 1:29 ` Theodore Tso @ 2006-07-11 2:16 ` Jon Smirl 2006-07-11 10:12 ` Alan Cox 2006-07-11 3:33 ` Jon Smirl 2006-07-11 19:44 ` Russell King 2 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-11 2:16 UTC (permalink / raw) To: Theodore Tso, Jon Smirl, Alan Cox, lkml On 7/10/06, Theodore Tso <tytso@mit.edu> wrote: > On Mon, Jul 10, 2006 at 07:49:31PM -0400, Jon Smirl wrote: > > How about the use of lock/unlock_kernel(). Is there some hidden global > > synchronization going on? Every time lock/unlock_kernel() is used > > there is a tty_struct available. My first thought would be to turn > > this into a per tty spinlock. Looking at where it is used it looks > > like it was added to protect all of the VFS calls. I see no obvious > > coordination with other ttys that isn't handled by other locks. > > No, it was just a case of not being worth it to get rid of the BKL for > the tty subsystem, since opening and closing tty's isn't exactly a > common event. Switching it to use a per-tty spinlock makes sense if > we're going to rototill the code, but to be honest it's probably not > going to make a noticeable difference on any benchmark and most > workloads. I'm not looking for performance gains, I'm looking more to isolate the tty code down to a minimal set of interactions with the rest of the kernel. RIght now it is all intertwined. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 2:16 ` Jon Smirl @ 2006-07-11 10:12 ` Alan Cox 2006-07-11 12:28 ` Jon Smirl 0 siblings, 1 reply; 30+ messages in thread From: Alan Cox @ 2006-07-11 10:12 UTC (permalink / raw) To: Jon Smirl; +Cc: Theodore Tso, lkml Ar Llu, 2006-07-10 am 22:16 -0400, ysgrifennodd Jon Smirl: > I'm not looking for performance gains, I'm looking more to isolate the > tty code down to a minimal set of interactions with the rest of the > kernel. RIght now it is all intertwined. That might be tricky given that hangup and SAK have to directly interact with the VFS locking. Alan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 10:12 ` Alan Cox @ 2006-07-11 12:28 ` Jon Smirl 2006-07-11 13:15 ` Paulo Marques 0 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-11 12:28 UTC (permalink / raw) To: Alan Cox; +Cc: Theodore Tso, lkml [-- Attachment #1: Type: text/plain, Size: 1554 bytes --] On 7/11/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > Ar Llu, 2006-07-10 am 22:16 -0400, ysgrifennodd Jon Smirl: > > I'm not looking for performance gains, I'm looking more to isolate the > > tty code down to a minimal set of interactions with the rest of the > > kernel. RIght now it is all intertwined. > > That might be tricky given that hangup and SAK have to directly interact > with the VFS locking. I haven't got a working patch yet but I think there are a couple of functions that should live over in the kernel code and have tty call them. For example, make the walk every thread loop into a kernel function named revoke (hopefully it will get fully fleshed out at some point) and move it over to /kernel. The code that manipulates job control belongs in /kernel too with tty calling it. Right now the job control code is spread over three directories. It could be collected into a single file which would make it more understandable. The general idea is that the current tty code is not really a device driver, it has some kernel functions smeared into it. I'm looking at extracting those kernel functions and making an API for them. Doing that should make the locking needs more obvious and understandable. Of course I am going to need help decoding all of the current locking. Every time one of those global locks is taken I have worry about spooky quantum interaction at a distance. Have either of you looked at the refcount patch I posted earlier? I'll attach it again so you don't need to search. -- Jon Smirl jonsmirl@gmail.com [-- Attachment #2: tty-refcount.patch --] [-- Type: text/x-patch, Size: 21398 bytes --] diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c index bfdb902..5c673e0 100644 --- a/drivers/char/tty_io.c +++ b/drivers/char/tty_io.c @@ -126,9 +126,8 @@ EXPORT_SYMBOL(tty_std_termios); LIST_HEAD(tty_drivers); /* linked list of tty drivers */ -/* Semaphore to protect creating and releasing a tty. This is shared with - vt.c for deeply disgusting hack reasons */ -DEFINE_MUTEX(tty_mutex); +/* Spinlock to protect creating and releasing a tty. */ +static DEFINE_SPINLOCK(tty_lock); #ifdef CONFIG_UNIX98_PTYS extern struct tty_driver *ptm_driver; /* Unix98 pty masters; for /dev/ptmx */ @@ -158,21 +157,21 @@ static struct tty_struct *alloc_tty_stru { struct tty_struct *tty; - tty = kmalloc(sizeof(struct tty_struct), GFP_KERNEL); - if (tty) - memset(tty, 0, sizeof(struct tty_struct)); + tty = kmalloc(sizeof(*tty), GFP_KERNEL); return tty; } static void tty_buffer_free_all(struct tty_struct *); -static inline void free_tty_struct(struct tty_struct *tty) +void free_tty_struct(struct tty_struct *tty) { kfree(tty->write_buf); tty_buffer_free_all(tty); kfree(tty); } +EXPORT_SYMBOL(free_tty_struct); + #define TTY_NUMBER(tty) ((tty)->index + (tty)->driver->name_base) char *tty_name(struct tty_struct *tty, char *buf) @@ -1090,7 +1089,7 @@ static void do_tty_hangup(void *data) if (tty->session > 0) { do_each_task_pid(tty->session, PIDTYPE_SID, p) { if (p->signal->tty == tty) - p->signal->tty = NULL; + tty_put(&p->signal->tty, 0, __FILE__, __LINE__); if (!p->signal->leader) continue; group_send_sig_info(SIGHUP, SEND_SIG_PRIV, p); @@ -1184,11 +1183,9 @@ void disassociate_ctty(int on_exit) lock_kernel(); - mutex_lock(&tty_mutex); tty = current->signal->tty; if (tty) { tty_pgrp = tty->pgrp; - mutex_unlock(&tty_mutex); if (on_exit && tty->driver->type != TTY_DRIVER_TYPE_PTY) tty_vhangup(tty); } else { @@ -1196,7 +1193,6 @@ void disassociate_ctty(int on_exit) kill_pg(current->signal->tty_old_pgrp, SIGHUP, on_exit); kill_pg(current->signal->tty_old_pgrp, SIGCONT, on_exit); } - mutex_unlock(&tty_mutex); unlock_kernel(); return; } @@ -1206,8 +1202,6 @@ void disassociate_ctty(int on_exit) kill_pg(tty_pgrp, SIGCONT, on_exit); } - /* Must lock changes to tty_old_pgrp */ - mutex_lock(&tty_mutex); current->signal->tty_old_pgrp = 0; tty->session = 0; tty->pgrp = -1; @@ -1215,10 +1209,9 @@ void disassociate_ctty(int on_exit) /* Now clear signal->tty under the lock */ read_lock(&tasklist_lock); do_each_task_pid(current->signal->session, PIDTYPE_SID, p) { - p->signal->tty = NULL; + tty_put(&p->signal->tty, 0, __FILE__, __LINE__); } while_each_task_pid(current->signal->session, PIDTYPE_SID, p); read_unlock(&tasklist_lock); - mutex_unlock(&tty_mutex); unlock_kernel(); } @@ -1546,7 +1539,8 @@ static int init_dev(struct tty_driver *d * Everything allocated ... set up the o_tty structure. */ if (!(driver->other->flags & TTY_DRIVER_DEVPTS_MEM)) { - driver->other->ttys[idx] = o_tty; + driver->other->ttys[idx] = tty_get(o_tty, + __FILE__, __LINE__); } if (!*o_tp_loc) *o_tp_loc = o_tp; @@ -1559,8 +1553,8 @@ static int init_dev(struct tty_driver *d o_tty->count++; /* Establish the links in both directions */ - tty->link = o_tty; - o_tty->link = tty; + tty->link = tty_get(o_tty, __FILE__, __LINE__); + o_tty->link = tty_get(tty, __FILE__, __LINE__); } /* @@ -1569,7 +1563,7 @@ static int init_dev(struct tty_driver *d * there's no need to null out the local pointers. */ if (!(driver->flags & TTY_DRIVER_DEVPTS_MEM)) { - driver->ttys[idx] = tty; + driver->ttys[idx] = tty_get(tty, __FILE__, __LINE__); } if (!*tp_loc) @@ -1636,7 +1630,6 @@ fast_track: success: *ret_tty = tty; - /* All paths come through here to release the mutex */ end_init: return retval; @@ -1644,10 +1637,10 @@ end_init: free_mem_out: kfree(o_tp); if (o_tty) - free_tty_struct(o_tty); + tty_put(&o_tty, 1, __FILE__, __LINE__); kfree(ltp); kfree(tp); - free_tty_struct(tty); + tty_put(&tty, 1, __FILE__, __LINE__); fail_no_mem: module_put(driver->owner); @@ -1674,7 +1667,8 @@ static void release_mem(struct tty_struc if ((o_tty = tty->link) != NULL) { if (!devpts) - o_tty->driver->ttys[idx] = NULL; + tty_put(&o_tty->driver->ttys[idx], 0, + __FILE__, __LINE__); if (o_tty->driver->flags & TTY_DRIVER_RESET_TERMIOS) { tp = o_tty->termios; if (!devpts) @@ -1691,11 +1685,14 @@ static void release_mem(struct tty_struc file_list_lock(); list_del_init(&o_tty->tty_files); file_list_unlock(); - free_tty_struct(o_tty); + tty_put(&o_tty->link, 0, __FILE__, __LINE__); + tty_put(&tty->link, 0, __FILE__, __LINE__); + tty_put(&o_tty, 1, __FILE__, __LINE__); } if (!devpts) - tty->driver->ttys[idx] = NULL; + tty_put(&tty->driver->ttys[idx], 0, + __FILE__, __LINE__); if (tty->driver->flags & TTY_DRIVER_RESET_TERMIOS) { tp = tty->termios; if (!devpts) @@ -1714,7 +1711,7 @@ static void release_mem(struct tty_struc list_del_init(&tty->tty_files); file_list_unlock(); module_put(tty->driver->owner); - free_tty_struct(tty); + tty_put(&tty, 1, __FILE__, __LINE__); } /* @@ -1832,7 +1829,7 @@ #endif /* Guard against races with tty->count changes elsewhere and opens on /dev/tty */ - mutex_lock(&tty_mutex); + spin_lock(&tty_lock); tty_closing = tty->count <= 1; o_tty_closing = o_tty && (o_tty->count <= (pty_master ? 1 : 0)); @@ -1861,9 +1858,9 @@ #endif if (!do_sleep) break; + spin_unlock(&tty_lock); printk(KERN_WARNING "release_dev: %s: read/write wait queue " "active!\n", tty_name(tty, buf)); - mutex_unlock(&tty_mutex); schedule(); } @@ -1896,7 +1893,8 @@ #endif * something that needs to be handled for hangups. */ file_kill(filp); - filp->private_data = NULL; + tty_put((struct tty_struct **)(&filp->private_data), + 0, __FILE__, __LINE__); /* * Perform some housekeeping before deciding whether to return. @@ -1920,16 +1918,16 @@ #endif read_lock(&tasklist_lock); do_each_task_pid(tty->session, PIDTYPE_SID, p) { - p->signal->tty = NULL; + tty_put(&p->signal->tty, 0, __FILE__, __LINE__); } while_each_task_pid(tty->session, PIDTYPE_SID, p); if (o_tty) do_each_task_pid(o_tty->session, PIDTYPE_SID, p) { - p->signal->tty = NULL; + tty_put(&p->signal->tty, 0, __FILE__, __LINE__); } while_each_task_pid(o_tty->session, PIDTYPE_SID, p); read_unlock(&tasklist_lock); } - mutex_unlock(&tty_mutex); + spin_unlock(&tty_lock); /* check whether both sides are closing ... */ if (!tty_closing || (o_tty && !o_tty_closing)) @@ -2034,11 +2032,11 @@ retry_open: index = -1; retval = 0; - mutex_lock(&tty_mutex); - + spin_lock(&tty_lock); + if (device == MKDEV(TTYAUX_MAJOR,0)) { if (!current->signal->tty) { - mutex_unlock(&tty_mutex); + spin_unlock(&tty_lock); return -ENXIO; } driver = current->signal->tty->driver; @@ -2064,22 +2062,22 @@ #endif noctty = 1; goto got_driver; } - mutex_unlock(&tty_mutex); + spin_unlock(&tty_lock); return -ENODEV; } driver = get_tty_driver(device, &index); if (!driver) { - mutex_unlock(&tty_mutex); + spin_unlock(&tty_lock); return -ENODEV; } got_driver: retval = init_dev(driver, index, &tty); - mutex_unlock(&tty_mutex); + spin_unlock(&tty_lock); if (retval) return retval; - filp->private_data = tty; + filp->private_data = tty_get(tty, __FILE__, __LINE__); file_move(filp, &tty->tty_files); check_tty_count(tty, "tty_open"); if (tty->driver->type == TTY_DRIVER_TYPE_PTY && @@ -2122,7 +2120,7 @@ #endif !current->signal->tty && tty->session == 0) { task_lock(current); - current->signal->tty = tty; + current->signal->tty = tty_get(tty, __FILE__, __LINE__); task_unlock(current); current->signal->tty_old_pgrp = 0; tty->session = current->signal->session; @@ -2161,15 +2159,15 @@ static int ptmx_open(struct inode * inod } up(&allocated_ptys_lock); - mutex_lock(&tty_mutex); + spin_lock(&tty_lock); retval = init_dev(ptm_driver, index, &tty); - mutex_unlock(&tty_mutex); + spin_unlock(&tty_lock); if (retval) goto out; set_bit(TTY_PTY_LOCK, &tty->flags); /* LOCK THE SLAVE */ - filp->private_data = tty; + filp->private_data = tty_get(tty, __FILE__, __LINE__); file_move(filp, &tty->tty_files); retval = -ENOMEM; @@ -2359,14 +2357,14 @@ static int tiocsctty(struct tty_struct * read_lock(&tasklist_lock); do_each_task_pid(tty->session, PIDTYPE_SID, p) { - p->signal->tty = NULL; + tty_put(&p->signal->tty, 0, __FILE__, __LINE__); } while_each_task_pid(tty->session, PIDTYPE_SID, p); read_unlock(&tasklist_lock); } else return -EPERM; } task_lock(current); - current->signal->tty = tty; + current->signal->tty = tty_get(tty, __FILE__, __LINE__); task_unlock(current); current->signal->tty_old_pgrp = 0; tty->session = current->signal->session; @@ -2578,7 +2576,7 @@ int tty_ioctl(struct inode * inode, stru if (current->signal->leader) disassociate_ctty(0); task_lock(current); - current->signal->tty = NULL; + tty_put(¤t->signal->tty, 0, __FILE__, __LINE__); task_unlock(current); return 0; case TIOCSCTTY: @@ -2670,7 +2668,7 @@ #ifdef TTY_SOFT_SAK #else struct tty_struct *tty = arg; struct task_struct *g, *p; - int session; + pid_t session; int i; struct file *filp; struct tty_ldisc *disc; @@ -2914,6 +2912,11 @@ static void initialize_tty_struct(struct { memset(tty, 0, sizeof(struct tty_struct)); tty->magic = TTY_MAGIC; + atomic_set(&tty->ref_count, 1); +#ifdef DEBUG_TTY_REFCOUNT + printk("TTY ref count alloc: %p count %d\n", + tty, atomic_read(&tty->ref_count)); +#endif tty_ldisc_assign(tty, tty_ldisc_get(N_TTY)); tty->pgrp = -1; tty->overrun_time = jiffies; diff --git a/drivers/char/vc_screen.c b/drivers/char/vc_screen.c index a9247b5..97c36fc 100644 --- a/drivers/char/vc_screen.c +++ b/drivers/char/vc_screen.c @@ -474,14 +474,14 @@ static const struct file_operations vcs_ static struct class *vc_class; -void vcs_make_devfs(struct tty_struct *tty) +void vcs_make_sysfs(struct tty_struct *tty) { class_device_create(vc_class, NULL, MKDEV(VCS_MAJOR, tty->index + 1), NULL, "vcs%u", tty->index + 1); class_device_create(vc_class, NULL, MKDEV(VCS_MAJOR, tty->index + 129), NULL, "vcsa%u", tty->index + 1); } -void vcs_remove_devfs(struct tty_struct *tty) +void vcs_remove_sysfs(struct tty_struct *tty) { class_device_destroy(vc_class, MKDEV(VCS_MAJOR, tty->index + 1)); class_device_destroy(vc_class, MKDEV(VCS_MAJOR, tty->index + 129)); diff --git a/drivers/char/vt.c b/drivers/char/vt.c index da7e66a..dd39020 100644 --- a/drivers/char/vt.c +++ b/drivers/char/vt.c @@ -128,8 +128,8 @@ #define CTRL_ALWAYS 0x0800f501 /* Cannot #define DEFAULT_BELL_PITCH 750 #define DEFAULT_BELL_DURATION (HZ/8) -extern void vcs_make_devfs(struct tty_struct *tty); -extern void vcs_remove_devfs(struct tty_struct *tty); +extern void vcs_make_sysfs(struct tty_struct *tty); +extern void vcs_remove_sysfs(struct tty_struct *tty); extern void console_map_init(void); #ifdef CONFIG_PROM_CONSOLE @@ -2491,14 +2491,14 @@ static int con_open(struct tty_struct *t if (ret == 0) { struct vc_data *vc = vc_cons[currcons].d; tty->driver_data = vc; - vc->vc_tty = tty; + vc->vc_tty = tty_get(tty, __FILE__, __LINE__); if (!tty->winsize.ws_row && !tty->winsize.ws_col) { tty->winsize.ws_row = vc_cons[currcons].d->vc_rows; tty->winsize.ws_col = vc_cons[currcons].d->vc_cols; } release_console_sem(); - vcs_make_devfs(tty); + vcs_make_sysfs(tty); return ret; } } @@ -2510,30 +2510,25 @@ static int con_open(struct tty_struct *t * We take tty_mutex in here to prevent another thread from coming in via init_dev * and taking a ref against the tty while we're in the process of forgetting * about it and cleaning things up. - * - * This is because vcs_remove_devfs() can sleep and will drop the BKL. */ static void con_close(struct tty_struct *tty, struct file *filp) { - mutex_lock(&tty_mutex); acquire_console_sem(); - if (tty && tty->count == 1) { + if (tty && (tty->count == 1)) { struct vc_data *vc = tty->driver_data; if (vc) - vc->vc_tty = NULL; + tty_put(&vc->vc_tty, 0, __FILE__, __LINE__); tty->driver_data = NULL; release_console_sem(); - vcs_remove_devfs(tty); - mutex_unlock(&tty_mutex); + vcs_remove_sysfs(tty); /* - * tty_mutex is released, but we still hold BKL, so there is + * We still hold BKL, so there is * still exclusion against init_dev() */ return; } release_console_sem(); - mutex_unlock(&tty_mutex); } static void vc_init(struct vc_data *vc, unsigned int rows, diff --git a/drivers/net/ppp_async.c b/drivers/net/ppp_async.c index 23659fd..d0e804c 100644 --- a/drivers/net/ppp_async.c +++ b/drivers/net/ppp_async.c @@ -165,7 +165,7 @@ ppp_asynctty_open(struct tty_struct *tty /* initialize the asyncppp structure */ memset(ap, 0, sizeof(*ap)); - ap->tty = tty; + ap->tty = tty_get(tty, __FILE__, __LINE__); ap->mru = PPP_MRU; spin_lock_init(&ap->xmit_lock); spin_lock_init(&ap->recv_lock); @@ -218,6 +218,7 @@ ppp_asynctty_close(struct tty_struct *tt write_unlock_irq(&disc_data_lock); if (ap == 0) return; + tty_put(&ap->tty, 0, __FILE__, __LINE__); /* * We have now ensured that nobody can start using ap from now diff --git a/drivers/serial/serial_core.c b/drivers/serial/serial_core.c index d5f636f..0b6e4e4 100644 --- a/drivers/serial/serial_core.c +++ b/drivers/serial/serial_core.c @@ -1279,7 +1279,7 @@ static void uart_close(struct tty_struct tty_ldisc_flush(tty); tty->closing = 0; - state->info->tty = NULL; + tty_put(&state->info->tty, 0, __FILE__, __LINE__); if (state->info->blocked_open) { if (state->close_delay) @@ -1375,6 +1375,7 @@ static void uart_hangup(struct tty_struc uart_shutdown(state); state->count = 0; state->info->flags &= ~UIF_NORMAL_ACTIVE; + tty_put(&state->info->tty, 0, __FILE__, __LINE__); state->info->tty = NULL; wake_up_interruptible(&state->info->open_wait); wake_up_interruptible(&state->info->delta_msr_wait); @@ -1598,7 +1599,7 @@ static int uart_open(struct tty_struct * tty->driver_data = state; tty->low_latency = (state->port->flags & UPF_LOW_LATENCY) ? 1 : 0; tty->alt_speed = 0; - state->info->tty = tty; + state->info->tty = tty_get(tty, __FILE__, __LINE__); /* * If the port is in the middle of closing, bail out now. diff --git a/drivers/usb/class/cdc-acm.c b/drivers/usb/class/cdc-acm.c index 3670d77..06e96df 100644 --- a/drivers/usb/class/cdc-acm.c +++ b/drivers/usb/class/cdc-acm.c @@ -451,7 +451,7 @@ static int acm_tty_open(struct tty_struc rv = 0; tty->driver_data = acm; - acm->tty = tty; + acm->tty = tty_get(tty, __FILE__, __LINE__); /* force low_latency on so that our tty_push actually forces the data through, otherwise it is scheduled, and with high data rates data can get lost. */ @@ -519,6 +519,7 @@ static void acm_tty_close(struct tty_str return; nr = acm->rx_buflimit; + tty_put(&acm->tty, 0, __FILE__, __LINE__); mutex_lock(&open_mutex); if (!--acm->used) { if (acm->dev) { diff --git a/fs/devpts/inode.c b/fs/devpts/inode.c index f7aef5b..74eb9de 100644 --- a/fs/devpts/inode.c +++ b/fs/devpts/inode.c @@ -177,7 +177,7 @@ int devpts_pty_new(struct tty_struct *tt inode->i_gid = config.setgid ? config.gid : current->fsgid; inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME; init_special_inode(inode, S_IFCHR|config.mode, device); - inode->u.generic_ip = tty; + inode->u.generic_ip = tty_get(tty, __FILE__, __LINE__); dentry = get_node(number); if (!IS_ERR(dentry) && !dentry->d_inode) @@ -212,6 +212,8 @@ void devpts_pty_kill(int number) if (!IS_ERR(dentry)) { struct inode *inode = dentry->d_inode; if (inode) { + tty_put((struct tty_struct **)(&inode->u.generic_ip), + 0, __FILE__, __LINE__); inode->i_nlink--; d_delete(dentry); dput(dentry); diff --git a/include/linux/tty.h b/include/linux/tty.h index b3b807e..f159a12 100644 --- a/include/linux/tty.h +++ b/include/linux/tty.h @@ -181,14 +181,15 @@ struct device; */ struct tty_struct { int magic; + atomic_t ref_count; struct tty_driver *driver; int index; struct tty_ldisc ldisc; struct semaphore termios_sem; struct termios *termios, *termios_locked; char name[64]; - int pgrp; - int session; + pid_t pgrp; + pid_t session; unsigned long flags; int count; struct winsize winsize; @@ -267,6 +268,44 @@ #define TTY_HUPPED 18 /* Post driver-> #define TTY_WRITE_FLUSH(tty) tty_write_flush((tty)) +#undef DEBUG_TTY_REFCOUNT +static inline struct tty_struct * +tty_get(struct tty_struct * tty, char *file, int line) +{ + if (tty) { + WARN_ON(!atomic_read(&tty->ref_count)); + atomic_inc(&tty->ref_count); +#ifdef DEBUG_TTY_REFCOUNT + printk("TTY ref count get: %s:%d %s %p count %d\n", file, + line, tty->name, tty, atomic_read(&tty->ref_count)); +#endif + } + return tty; +} + +extern void free_tty_struct(struct tty_struct *tty); +static inline void +tty_put(struct tty_struct ** tty, int free_me, char *file, int line) +{ + int count; + + if (*tty) { + count = atomic_dec_return(&(*tty)->ref_count); + if (count == 0) { + WARN_ON(!free_me); + free_tty_struct(*tty); +#ifdef DEBUG_TTY_REFCOUNT + printk("TTY ref count put freed: %s:%d %s %p\n", + file, line, (*tty)->name, *tty); + } else { + printk("TTY ref count put: %s:%d %s %p count %d\n", + file, line, (*tty)->name, *tty, count); +#endif + } + *tty = NULL; + } +} + extern void tty_write_flush(struct tty_struct *); extern struct termios tty_std_termios; @@ -319,8 +358,6 @@ extern void tty_ldisc_put(int); extern void tty_wakeup(struct tty_struct *tty); extern void tty_ldisc_flush(struct tty_struct *tty); -extern struct mutex tty_mutex; - /* n_tty.c */ extern struct tty_ldisc tty_ldisc_N_TTY; diff --git a/kernel/exit.c b/kernel/exit.c index 6664c08..a6cc755 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -108,6 +108,7 @@ static void __exit_signal(struct task_st sig->nvcsw += tsk->nvcsw; sig->nivcsw += tsk->nivcsw; sig->sched_time += tsk->sched_time; + tty_put(&sig->tty, 0, __FILE__, __LINE__); sig = NULL; /* Marker for below. */ } @@ -391,9 +392,7 @@ void daemonize(const char *name, ...) exit_mm(current); set_special_pids(1, 1); - mutex_lock(&tty_mutex); - current->signal->tty = NULL; - mutex_unlock(&tty_mutex); + tty_put(¤t->signal->tty, 0, __FILE__, __LINE__); /* Block and flush all signals */ sigfillset(&blocked); diff --git a/kernel/fork.c b/kernel/fork.c index 56e4e07..90cc4e3 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -43,6 +43,7 @@ #include <linux/profile.h> #include <linux/rmap.h> #include <linux/acct.h> #include <linux/cn_proc.h> +#include <linux/tty.h> #include <asm/pgtable.h> #include <asm/pgalloc.h> @@ -856,6 +857,7 @@ static inline int copy_signal(unsigned l sig->leader = 0; /* session leadership doesn't inherit */ sig->tty_old_pgrp = 0; + sig->tty = NULL; sig->utime = sig->stime = sig->cutime = sig->cstime = cputime_zero; sig->nvcsw = sig->nivcsw = sig->cnvcsw = sig->cnivcsw = 0; @@ -885,6 +887,7 @@ static inline int copy_signal(unsigned l void __cleanup_signal(struct signal_struct *sig) { exit_thread_group_keys(sig); + tty_put(&sig->tty, 0, __FILE__, __LINE__); kmem_cache_free(signal_cachep, sig); } @@ -1227,7 +1230,8 @@ #endif __ptrace_link(p, current->parent); if (thread_group_leader(p)) { - p->signal->tty = current->signal->tty; + p->signal->tty = tty_get(current->signal->tty, + __FILE__, __LINE__); p->signal->pgrp = process_group(current); p->signal->session = current->signal->session; attach_pid(p, PIDTYPE_PGID, process_group(p)); diff --git a/kernel/sys.c b/kernel/sys.c index dbb3b9c..31f486a 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -1379,7 +1379,6 @@ asmlinkage long sys_setsid(void) pid_t session; int err = -EPERM; - mutex_lock(&tty_mutex); write_lock_irq(&tasklist_lock); /* Fail if I am already a session leader */ @@ -1399,12 +1398,11 @@ asmlinkage long sys_setsid(void) group_leader->signal->leader = 1; __set_special_pids(session, session); - group_leader->signal->tty = NULL; + tty_put(&group_leader->signal->tty, 0, __FILE__, __LINE__); group_leader->signal->tty_old_pgrp = 0; err = process_group(group_leader); out: write_unlock_irq(&tasklist_lock); - mutex_unlock(&tty_mutex); return err; } diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c index 24caaee..db53c4e 100644 --- a/security/selinux/hooks.c +++ b/security/selinux/hooks.c @@ -1638,7 +1638,8 @@ static inline void flush_unauthorized_fi if (inode_has_perm(current, inode, FILE__READ | FILE__WRITE, NULL)) { /* Reset controlling tty. */ - current->signal->tty = NULL; + tty_put(¤t->signal->tty, 0, + __FILE__, __LINE__); current->signal->tty_old_pgrp = 0; } } ^ permalink raw reply related [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 12:28 ` Jon Smirl @ 2006-07-11 13:15 ` Paulo Marques 2006-07-11 13:42 ` Jon Smirl 0 siblings, 1 reply; 30+ messages in thread From: Paulo Marques @ 2006-07-11 13:15 UTC (permalink / raw) To: Jon Smirl; +Cc: Alan Cox, Theodore Tso, lkml Jon Smirl wrote: >[...] > extern struct tty_driver *ptm_driver; /* Unix98 pty masters; for /dev/ptmx */ > @@ -158,21 +157,21 @@ static struct tty_struct *alloc_tty_stru > { > struct tty_struct *tty; > > - tty = kmalloc(sizeof(struct tty_struct), GFP_KERNEL); > - if (tty) > - memset(tty, 0, sizeof(struct tty_struct)); > + tty = kmalloc(sizeof(*tty), GFP_KERNEL); ^^^^^^^ kzalloc? I don't know this code very well, so you might be actually changing the initialization here. On the other hand, this might be just a simple bug... -- Paulo Marques - www.grupopie.com Pointy-Haired Boss: I don't see anything that could stand in our way. Dilbert: Sanity? Reality? The laws of physics? ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 13:15 ` Paulo Marques @ 2006-07-11 13:42 ` Jon Smirl 0 siblings, 0 replies; 30+ messages in thread From: Jon Smirl @ 2006-07-11 13:42 UTC (permalink / raw) To: Paulo Marques; +Cc: Alan Cox, Theodore Tso, lkml On 7/11/06, Paulo Marques <pmarques@grupopie.com> wrote: > Jon Smirl wrote: > >[...] > > extern struct tty_driver *ptm_driver; /* Unix98 pty masters; for /dev/ptmx */ > > @@ -158,21 +157,21 @@ static struct tty_struct *alloc_tty_stru > > { > > struct tty_struct *tty; > > > > - tty = kmalloc(sizeof(struct tty_struct), GFP_KERNEL); > > - if (tty) > > - memset(tty, 0, sizeof(struct tty_struct)); > > + tty = kmalloc(sizeof(*tty), GFP_KERNEL); > ^^^^^^^ > kzalloc? It needs to be cleaned up. Right now it clears the struct twice before using it, once in tty_alloc and another in tty_init. > I don't know this code very well, so you might be actually changing the > initialization here. On the other hand, this might be just a simple bug... -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 1:29 ` Theodore Tso 2006-07-11 2:16 ` Jon Smirl @ 2006-07-11 3:33 ` Jon Smirl 2006-07-11 19:52 ` Russell King 2006-07-11 19:44 ` Russell King 2 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-11 3:33 UTC (permalink / raw) To: Theodore Tso, Alan Cox, lkml On 7/10/06, Theodore Tso <tytso@mit.edu> wrote: > On Mon, Jul 10, 2006 at 07:49:31PM -0400, Jon Smirl wrote: > > How about the use of lock/unlock_kernel(). Is there some hidden global > > synchronization going on? Every time lock/unlock_kernel() is used > > there is a tty_struct available. My first thought would be to turn > > this into a per tty spinlock. Looking at where it is used it looks > > like it was added to protect all of the VFS calls. I see no obvious > > coordination with other ttys that isn't handled by other locks. > > No, it was just a case of not being worth it to get rid of the BKL for > the tty subsystem, since opening and closing tty's isn't exactly a > common event. Switching it to use a per-tty spinlock makes sense if > we're going to rototill the code, but to be honest it's probably not > going to make a noticeable difference on any benchmark and most > workloads. I tried changing it to a per tty spinlock. Now I know that the code relies on the BKL being broken when a task sleeps. That's a part of the BKL I don't like, unlocks are happening implicitly instead of explicitly. I always wonder if the code should have reacquired the BKL when it woke up. In this case the sleep happens inside the line discipline read functions. Now I'm wondering if read and write should have taken BKL to begin with. Read sleeps in a loop so after the first pass it has lost the BKL. Read and write also have the own locking code internally. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 3:33 ` Jon Smirl @ 2006-07-11 19:52 ` Russell King 0 siblings, 0 replies; 30+ messages in thread From: Russell King @ 2006-07-11 19:52 UTC (permalink / raw) To: Jon Smirl; +Cc: Theodore Tso, Alan Cox, lkml On Mon, Jul 10, 2006 at 11:33:18PM -0400, Jon Smirl wrote: > I tried changing it to a per tty spinlock. Now I know that the code > relies on the BKL being broken when a task sleeps. Not broken - it's unlocked only for the duration of the sleep. To put it another way, the BKL is dropped by the scheduler prior to switching away, and is reacquired by the scheduler when resuming a thread which previously was BKL-locked. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 1:29 ` Theodore Tso 2006-07-11 2:16 ` Jon Smirl 2006-07-11 3:33 ` Jon Smirl @ 2006-07-11 19:44 ` Russell King 2006-07-11 22:08 ` Jon Smirl 2 siblings, 1 reply; 30+ messages in thread From: Russell King @ 2006-07-11 19:44 UTC (permalink / raw) To: Theodore Tso, Jon Smirl, Alan Cox, lkml On Mon, Jul 10, 2006 at 09:29:04PM -0400, Theodore Tso wrote: > On Mon, Jul 10, 2006 at 07:49:31PM -0400, Jon Smirl wrote: > > How about the use of lock/unlock_kernel(). Is there some hidden global > > synchronization going on? Every time lock/unlock_kernel() is used > > there is a tty_struct available. My first thought would be to turn > > this into a per tty spinlock. Looking at where it is used it looks > > like it was added to protect all of the VFS calls. I see no obvious > > coordination with other ttys that isn't handled by other locks. > > No, it was just a case of not being worth it to get rid of the BKL for > the tty subsystem, since opening and closing tty's isn't exactly a > common event. Switching it to use a per-tty spinlock makes sense if > we're going to rototill the code, but to be honest it's probably not > going to make a noticeable difference on any benchmark and most > workloads. It's not that simple - remember that you must be able to open a tty in non-blocking mode while someone else is opening the same tty in blocking mode, _and_ succeed. (iow, the getty waiting for call-in and you want to dial out case.) If we go for merely replacing BKL with some other lock, each tty driver has to be able to drop that lock when it decides to sleep due to no carrier in its open method... which is kind'a yuck. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: 2.6 Serial core ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 19:44 ` Russell King @ 2006-07-11 22:08 ` Jon Smirl 2006-07-11 22:37 ` Alan Cox 0 siblings, 1 reply; 30+ messages in thread From: Jon Smirl @ 2006-07-11 22:08 UTC (permalink / raw) To: Theodore Tso, Jon Smirl, Alan Cox, lkml On 7/11/06, Russell King <rmk+lkml@arm.linux.org.uk> wrote: > On Mon, Jul 10, 2006 at 09:29:04PM -0400, Theodore Tso wrote: > > On Mon, Jul 10, 2006 at 07:49:31PM -0400, Jon Smirl wrote: > > > How about the use of lock/unlock_kernel(). Is there some hidden global > > > synchronization going on? Every time lock/unlock_kernel() is used > > > there is a tty_struct available. My first thought would be to turn > > > this into a per tty spinlock. Looking at where it is used it looks > > > like it was added to protect all of the VFS calls. I see no obvious > > > coordination with other ttys that isn't handled by other locks. > > > > No, it was just a case of not being worth it to get rid of the BKL for > > the tty subsystem, since opening and closing tty's isn't exactly a > > common event. Switching it to use a per-tty spinlock makes sense if > > we're going to rototill the code, but to be honest it's probably not > > going to make a noticeable difference on any benchmark and most > > workloads. > > It's not that simple - remember that you must be able to open a tty > in non-blocking mode while someone else is opening the same tty in > blocking mode, _and_ succeed. (iow, the getty waiting for call-in > and you want to dial out case.) > > If we go for merely replacing BKL with some other lock, each tty > driver has to be able to drop that lock when it decides to sleep due > to no carrier in its open method... which is kind'a yuck. Directly replacing BKL was my first strategy but that looks to be hard. Some of the sleeps are in the line discipline code which means I have to find all of them. After replacing BKL I'd try to simplify the locking. What about adjusting things so the BKL isn't required? I tried completely removing it and died in release_dev. tty_mutex is already locks a lot of stuff, maybe it can be adjusted to allow removal of the BKL. I'm also trying to decide if read/write really need BKL when they are called. Read/write already uses the atomic_read/write_lock mutexes. read_chan has this comment: * Perform reads for the line discipline. We are guaranteed that the * line discipline will not be closed under us but we may get multiple * parallel readers and must handle this ourselves. We may also get * a hangup. Always called in user context, may sleep. But read_chan is always called from tty_io.c with BKL held. So is atomic_read_lock really doing anything or is it masked by BKL? I see why no one works on this code, it is very intertwined with the rest of the kernel and a lot of the reasons for locking are non-obvious. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 22:08 ` Jon Smirl @ 2006-07-11 22:37 ` Alan Cox 2006-07-11 23:28 ` Paul Fulghum 2006-07-11 23:50 ` Jon Smirl 0 siblings, 2 replies; 30+ messages in thread From: Alan Cox @ 2006-07-11 22:37 UTC (permalink / raw) To: Jon Smirl; +Cc: Theodore Tso, lkml Ar Maw, 2006-07-11 am 18:08 -0400, ysgrifennodd Jon Smirl: > What about adjusting things so the BKL isn't required? I tried > completely removing it and died in release_dev. tty_mutex is already > locks a lot of stuff, maybe it can be adjusted to allow removal of the > BKL. Thats what is happening currently. However it is being done piece by piece, slowly and carefully. > I see why no one works on this code, it is very intertwined with the > rest of the kernel and a lot of the reasons for locking are > non-obvious. You should follow l/k more closely. Since 2.6.15 Paul Fulghum and I have completely rewritten the entire buffering logic. In 2.6.14 or so I rewrote the line discipline locking and support code. One hint by the way - stop looking at locks and code, look at locks and data structures. There is an old saying "lock data not code" and it really is true if you want to follow the locking and get it right. The open/close/hangup logic is last on the list to fix, because as you've noticed its the most horrible. Once the other locking is sane that bit should become more managable even with the strict and bizarre rules POSIX and SuS enforce on us in this area. Alan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 22:37 ` Alan Cox @ 2006-07-11 23:28 ` Paul Fulghum 2006-07-12 0:00 ` Jon Smirl 2006-07-11 23:50 ` Jon Smirl 1 sibling, 1 reply; 30+ messages in thread From: Paul Fulghum @ 2006-07-11 23:28 UTC (permalink / raw) To: Alan Cox; +Cc: Jon Smirl, Theodore Tso, lkml Alan Cox wrote: > Ar Maw, 2006-07-11 am 18:08 -0400, ysgrifennodd Jon Smirl: > >>What about adjusting things so the BKL isn't required? I tried >>completely removing it and died in release_dev. tty_mutex is already >>locks a lot of stuff, maybe it can be adjusted to allow removal of the >>BKL. > > > Thats what is happening currently. However it is being done piece by > piece, slowly and carefully. I hate to chime in since I don't have time in the near term to contribute to the subject, but I do like the idea of removing the BKL dependence as a first step. I find its semantics akward to keep track of, and error prone. More explicit locking, even global, would clear things up for a later push to finer grained (per tty?) locking (where appropriate). Making the necessary changes to all the individual drivers, as Russel's comment about explicitly dropping the new lock when sleeping pointed out, would be a time consuming (and probably tedious) task. -- Paul ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 23:28 ` Paul Fulghum @ 2006-07-12 0:00 ` Jon Smirl 0 siblings, 0 replies; 30+ messages in thread From: Jon Smirl @ 2006-07-12 0:00 UTC (permalink / raw) To: Paul Fulghum; +Cc: Alan Cox, Theodore Tso, lkml On 7/11/06, Paul Fulghum <paulkf@microgate.com> wrote: > Alan Cox wrote: > > Ar Maw, 2006-07-11 am 18:08 -0400, ysgrifennodd Jon Smirl: > > > >>What about adjusting things so the BKL isn't required? I tried > >>completely removing it and died in release_dev. tty_mutex is already > >>locks a lot of stuff, maybe it can be adjusted to allow removal of the > >>BKL. > > > > > > Thats what is happening currently. However it is being done piece by > > piece, slowly and carefully. > > I hate to chime in since I don't have time in the near term > to contribute to the subject, but I do like the idea of removing > the BKL dependence as a first step. I find its semantics akward to keep > track of, and error prone. More explicit locking, even global, would clear things > up for a later push to finer grained (per tty?) locking (where appropriate). > > Making the necessary changes to all the individual drivers, > as Russel's comment about explicitly dropping the new lock when > sleeping pointed out, would be a time consuming (and probably > tedious) task. I'm still looking at doing work in the tty layer but it is turning out to be more complex that I initially thought. I may give removing the BLK another attempt when I understand things better. I have all of the lock debugging turned on in my kernel. I did get nice messages when I did dumb things like sleeping with locks held or deadlocking that led to quickly fixing the code. Much better than the old system of freezing the box up. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 22:37 ` Alan Cox 2006-07-11 23:28 ` Paul Fulghum @ 2006-07-11 23:50 ` Jon Smirl 2006-07-12 3:55 ` Jon Smirl 2006-07-12 11:37 ` Alan Cox 1 sibling, 2 replies; 30+ messages in thread From: Jon Smirl @ 2006-07-11 23:50 UTC (permalink / raw) To: Alan Cox; +Cc: Theodore Tso, lkml On 7/11/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > Ar Maw, 2006-07-11 am 18:08 -0400, ysgrifennodd Jon Smirl: > > What about adjusting things so the BKL isn't required? I tried > > completely removing it and died in release_dev. tty_mutex is already > > locks a lot of stuff, maybe it can be adjusted to allow removal of the > > BKL. > > Thats what is happening currently. However it is being done piece by > piece, slowly and carefully. > > > I see why no one works on this code, it is very intertwined with the > > rest of the kernel and a lot of the reasons for locking are > > non-obvious. > > You should follow l/k more closely. Since 2.6.15 Paul Fulghum and I have > completely rewritten the entire buffering logic. In 2.6.14 or so I > rewrote the line discipline locking and support code. I had noticed that code looked new and quite clean, I have not seen any problems with it. > One hint by the way - stop looking at locks and code, look at locks and > data structures. There is an old saying "lock data not code" and it > really is true if you want to follow the locking and get it right. > > The open/close/hangup logic is last on the list to fix, because as > you've noticed its the most horrible. Once the other locking is sane > that bit should become more managable even with the strict and bizarre > rules POSIX and SuS enforce on us in this area. My original goal was to do some work on the VT layer but I got sucked into the TTY code because of VT/TTY interactions. I think I understand enough now that I can make changes in the VT code without breaking everything. I also see now that the VT code wasn't as closely intertwined into the TTY code as much as I initially thought it was. So getting back to my VT problem, I want to fully decouple the VT code from the TTY code. By decoupling I mean make the VT code use APIs and not have #ifdef CONFIG_VT sprinkled all over the place. There are fifteen #ifdef CONFIG_VT's is general kernel code and about twenty more in arch specific code. One #ifdef CONFIG_VT is in tty_init(). The tty layer is being initialized via module_init(tty_init) which is the same as device_initcall(fn). Link order is pci, video, acpi, char, serial, base. This doesn't look right to me since the video console drivers need the tty code to function. This may also explain why the init functions are all chained together. tty_init() -> vty_init() -> vcs_init(), kbd_init(), prom_con_init(), etc... Since the link order is wrong the chained init functions are compensating. The fix for this is to get things linking in the right order, add module_init() where needed then all of the chained init()'s can be removed. -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 23:50 ` Jon Smirl @ 2006-07-12 3:55 ` Jon Smirl 2006-07-12 11:37 ` Alan Cox 1 sibling, 0 replies; 30+ messages in thread From: Jon Smirl @ 2006-07-12 3:55 UTC (permalink / raw) To: Alan Cox; +Cc: Theodore Tso, lkml How exactly are /dev/tty0 and /dev/console supposed to work? I thought that /dev/tty0 was the currently active tty. And /dev/console is where the printk output is going. /dev/tty0 appears to be snapshoting the currently active tty when it is opened. It does not track changes to the foreground console. It looks like TIOCCONS effects both /dev/tty0 and /dev/console. tty0 uses console_fops with redirected_tty_write(). Am I reading the code right, and are these the correct behaviors? What is the equivalent to the setconsole command for Fedora? -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-11 23:50 ` Jon Smirl 2006-07-12 3:55 ` Jon Smirl @ 2006-07-12 11:37 ` Alan Cox 1 sibling, 0 replies; 30+ messages in thread From: Alan Cox @ 2006-07-12 11:37 UTC (permalink / raw) To: Jon Smirl; +Cc: Theodore Tso, lkml Ar Maw, 2006-07-11 am 19:50 -0400, ysgrifennodd Jon Smirl: > My original goal was to do some work on the VT layer but I got sucked > into the TTY code because of VT/TTY interactions. I think I understand > enough now that I can make changes in the VT code without breaking > everything. I also see now that the VT code wasn't as closely > intertwined into the TTY code as much as I initially thought it was. VT is just an instance of a tty driver, at least from the tty layer viewpoint. > This may also explain why the init functions are all chained together. > tty_init() -> vty_init() -> vcs_init(), kbd_init(), prom_con_init(), > etc... Since the link order is wrong the chained init functions are > compensating. Its a bit more fundamental than that, there are various video side init functions that are done before the module_init calls are made. The VT is both a tty driver and a console. The only real "magic" hooks in there are the resize one and the ioctl hook. The resize one could easily be moved to be a tty driver method that is usually NULL and would be a nice cleanup. Alan ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 22:35 ` Jon Smirl 2006-07-10 23:15 ` Alan Cox @ 2006-07-10 23:39 ` Theodore Tso 2006-07-11 0:25 ` Jon Smirl 2006-07-12 6:27 ` Pekka Enberg 1 sibling, 2 replies; 30+ messages in thread From: Theodore Tso @ 2006-07-10 23:39 UTC (permalink / raw) To: Jon Smirl; +Cc: Alan Cox, lkml On Mon, Jul 10, 2006 at 06:35:50PM -0400, Jon Smirl wrote: > On 7/10/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > >We hold file_list_lock because we have to find everyone using that tty > >and hang up their instance of it, then flip the file operations not > >because we need to protect against tty structs going away. It's needed > >in order to walk the file list and protects against the file list itself > >changing rather than the tty structs. It may well be possible to move > >that to a tty layer private lock with care, but it would need care to > >deal with VFS operations. > > Assuming do_SAK has blocked anyone's ability to newly open the tty, > why does it need to search every file handle in the system instead of > just using tty->tty_files? tty->tty_files should contain a list of > everyone who has the tty open. Is this global search needed because of > duplicated handles? When I wrote the do_SAK code about 12-13 years ago, tty->tty_files didn't exist. It should be safe to do this, but I'll echo Alan's comment. We really ought to implement revoke(2) at the VFS layer, and then utilize to implement SAK and vhangup() functionality. - Ted ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 23:39 ` Theodore Tso @ 2006-07-11 0:25 ` Jon Smirl 2006-07-12 6:27 ` Pekka Enberg 1 sibling, 0 replies; 30+ messages in thread From: Jon Smirl @ 2006-07-11 0:25 UTC (permalink / raw) To: Theodore Tso; +Cc: Alan Cox, lkml On 7/10/06, Theodore Tso <tytso@mit.edu> wrote: > On Mon, Jul 10, 2006 at 06:35:50PM -0400, Jon Smirl wrote: > > On 7/10/06, Alan Cox <alan@lxorguk.ukuu.org.uk> wrote: > > >We hold file_list_lock because we have to find everyone using that tty > > >and hang up their instance of it, then flip the file operations not > > >because we need to protect against tty structs going away. It's needed > > >in order to walk the file list and protects against the file list itself > > >changing rather than the tty structs. It may well be possible to move > > >that to a tty layer private lock with care, but it would need care to > > >deal with VFS operations. > > > > Assuming do_SAK has blocked anyone's ability to newly open the tty, > > why does it need to search every file handle in the system instead of > > just using tty->tty_files? tty->tty_files should contain a list of > > everyone who has the tty open. Is this global search needed because of > > duplicated handles? > > When I wrote the do_SAK code about 12-13 years ago, tty->tty_files > didn't exist. It should be safe to do this, but I'll echo Alan's > comment. We really ought to implement revoke(2) at the VFS layer, and > then utilize to implement SAK and vhangup() functionality. I'll buy you lunch if you do it. My understanding of the subtleties is barely enough for me to work on the tty layer. > > - Ted > -- Jon Smirl jonsmirl@gmail.com ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-10 23:39 ` Theodore Tso 2006-07-11 0:25 ` Jon Smirl @ 2006-07-12 6:27 ` Pekka Enberg 2006-07-12 11:19 ` Alan Cox 1 sibling, 1 reply; 30+ messages in thread From: Pekka Enberg @ 2006-07-12 6:27 UTC (permalink / raw) To: Theodore Tso, Jon Smirl, Alan Cox, lkml Hi, On 7/11/06, Theodore Tso <tytso@mit.edu> wrote: > When I wrote the do_SAK code about 12-13 years ago, tty->tty_files > didn't exist. It should be safe to do this, but I'll echo Alan's > comment. We really ought to implement revoke(2) at the VFS layer, and > then utilize to implement SAK and vhangup() functionality. How is this supposed to work? What's stopping a process from re-opening the file after revoke(2) has been called? Or am I missing something here? Pekka ^ permalink raw reply [flat|nested] 30+ messages in thread
* Re: tty's use of file_list_lock and file_move 2006-07-12 6:27 ` Pekka Enberg @ 2006-07-12 11:19 ` Alan Cox 0 siblings, 0 replies; 30+ messages in thread From: Alan Cox @ 2006-07-12 11:19 UTC (permalink / raw) To: Pekka Enberg; +Cc: Theodore Tso, Jon Smirl, lkml Ar Mer, 2006-07-12 am 09:27 +0300, ysgrifennodd Pekka Enberg: > How is this supposed to work? What's stopping a process from > re-opening the file after revoke(2) has been called? File permissions. ^ permalink raw reply [flat|nested] 30+ messages in thread
end of thread, other threads:[~2006-07-12 11:19 UTC | newest] Thread overview: 30+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2006-07-10 15:10 tty's use of file_list_lock and file_move Jon Smirl 2006-07-10 17:33 ` Alan Cox 2006-07-10 17:27 ` Jon Smirl 2006-07-10 18:05 ` Alan Cox 2006-07-10 18:09 ` Jon Smirl 2006-07-10 23:18 ` Alan Cox 2006-07-10 22:35 ` Jon Smirl 2006-07-10 23:15 ` Alan Cox 2006-07-10 23:04 ` Jon Smirl 2006-07-10 23:49 ` Jon Smirl 2006-07-11 1:29 ` Theodore Tso 2006-07-11 2:16 ` Jon Smirl 2006-07-11 10:12 ` Alan Cox 2006-07-11 12:28 ` Jon Smirl 2006-07-11 13:15 ` Paulo Marques 2006-07-11 13:42 ` Jon Smirl 2006-07-11 3:33 ` Jon Smirl 2006-07-11 19:52 ` Russell King 2006-07-11 19:44 ` Russell King 2006-07-11 22:08 ` Jon Smirl 2006-07-11 22:37 ` Alan Cox 2006-07-11 23:28 ` Paul Fulghum 2006-07-12 0:00 ` Jon Smirl 2006-07-11 23:50 ` Jon Smirl 2006-07-12 3:55 ` Jon Smirl 2006-07-12 11:37 ` Alan Cox 2006-07-10 23:39 ` Theodore Tso 2006-07-11 0:25 ` Jon Smirl 2006-07-12 6:27 ` Pekka Enberg 2006-07-12 11:19 ` Alan Cox
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox