linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 2/2] make both atomic_write_lock and BTM lock acquirement sleepable at tty_write_message()
@ 2012-06-30  9:40 Jeff Liu
  2012-06-30 12:44 ` Alan Cox
  0 siblings, 1 reply; 3+ messages in thread
From: Jeff Liu @ 2012-06-30  9:40 UTC (permalink / raw)
  To: linux-serial
  Cc: linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, gregkh,
	Jan Kara, Ted Ts'o

Avoid lockdep warn by having both lock acquirements sleepable.
BTM also have to go to sleep if lock failed at the first time, otherwise it
will race with tty_open()->tty_lock().

Signed-off-by: Jie Liu <jeff.liu@oracle.com>

---
 drivers/tty/tty_io.c |   35 +++++++++++++++++++++++++++++++----
 1 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index b425c79..2b4664d 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1098,12 +1098,40 @@ out:
  *
  * We must still hold the BTM and test the CLOSING flag for the moment.
  */
-
 void tty_write_message(struct tty_struct *tty, char *msg)
 {
+retry:
 	if (tty) {
-		mutex_lock(&tty->atomic_write_lock);
-		tty_lock();
+		/*
+		 * tty_write_message() will invoked by print_warning()
+		 * at fs/quota/dquot.c if CONFIG_PRINT_QUOTA_WARNING
+		 * is enabled when a user running out of disk quota limits.
+		 * It will end up call tty_write().  Here is a potential race
+		 * situation since tty_write() call copy_from_user() which
+		 * might produce page faults and turn to invoke inodes dirty
+		 * process on the underlying file systems if needed, and
+		 * it will try to acquire JBD/JBD2 lock accordingly.  This
+		 * might make lockdep unhappy to print dead lock warning.
+		 * To solve this issue, we have to go to sleep and relinquish
+		 * the CPU power to another process until the atomic_write_lock
+		 * became available.
+		 */
+		if (!mutex_trylock(&tty->atomic_write_lock)) {
+			DEFINE_WAIT(wait);
+			prepare_to_wait_exclusive(&tty->write_wait, &wait,
+						  TASK_INTERRUPTIBLE);
+			schedule();
+			finish_wait(&tty->write_wait, &wait);
+			goto retry;
+		}
+
+		/*
+		 * Call tty_lock() directly might race with tty_open() even
+		 * if we have already got the atomic_write_lock. However, we
+		 * must perform TTY_CLOSING check up with the BTM hold, so call
+		 * tty_lock_wait() which is sleepable instead.
+		 */
+		tty_lock_wait();
 		if (tty->ops->write && !test_bit(TTY_CLOSING, &tty->flags)) {
 			tty_unlock();
 			tty->ops->write(tty, msg, strlen(msg));
@@ -1114,7 +1142,6 @@ void tty_write_message(struct tty_struct *tty, char *msg)
 	return;
 }
 
-
 /**
  *	tty_write		-	write method for tty device file
  *	@file: tty file pointer
-- 
1.7.9

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH 2/2] make both atomic_write_lock and BTM lock acquirement sleepable at tty_write_message()
  2012-06-30  9:40 [PATCH 2/2] make both atomic_write_lock and BTM lock acquirement sleepable at tty_write_message() Jeff Liu
@ 2012-06-30 12:44 ` Alan Cox
  2012-06-30 13:35   ` Jeff Liu
  0 siblings, 1 reply; 3+ messages in thread
From: Alan Cox @ 2012-06-30 12:44 UTC (permalink / raw)
  To: jeff.liu
  Cc: linux-serial, linux-fsdevel@vger.kernel.org,
	linux-ext4@vger.kernel.org, gregkh, Jan Kara, Ted Ts'o

> +		 * tty_write_message() will invoked by print_warning()
> +		 * at fs/quota/dquot.c if CONFIG_PRINT_QUOTA_WARNING
> +		 * is enabled when a user running out of disk quota limits.
> +		 * It will end up call tty_write().  Here is a potential race

tty->ops->write is the low level write method, not tty_write.

This appears to be even more wrong than the other one in other ways too -
it uses interruptible sleeps but doesn't handle the signal case so will
spin on a signal and kill the box.

NAK

Looking gat the traces I suspect what you've actually got is a much more
complicated deadlock where a process doing perfectly normal I/O to the
tty has faulted and there is a chain of dependancies through the file
system code to the thread which is doing the dquot_alloc_inode.

If that is the case then dquot_alloc_inode shouldn't be making blocking
calls to tty_write_message and probably the right thing to do is to queue
work for it so the tty_write_message is done asynchronously.

There are a very limited number of events that need reporting so probably
something like a per mount flags and workqueue would allow you to do

	set_bit(DQUOT_INODEOVER, &foo->events);
	schedule_work()

and the work queue can just xchg the events long for 0 and spew any
messages required.

Alan

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH 2/2] make both atomic_write_lock and BTM lock acquirement sleepable at tty_write_message()
  2012-06-30 12:44 ` Alan Cox
@ 2012-06-30 13:35   ` Jeff Liu
  0 siblings, 0 replies; 3+ messages in thread
From: Jeff Liu @ 2012-06-30 13:35 UTC (permalink / raw)
  To: Alan Cox
  Cc: linux-serial, linux-fsdevel@vger.kernel.org,
	linux-ext4@vger.kernel.org, gregkh, Jan Kara, Ted Ts'o

Hey Alan,

On 06/30/2012 08:44 PM, Alan Cox wrote:

>> +		 * tty_write_message() will invoked by print_warning()
>> +		 * at fs/quota/dquot.c if CONFIG_PRINT_QUOTA_WARNING
>> +		 * is enabled when a user running out of disk quota limits.
>> +		 * It will end up call tty_write().  Here is a potential race
> 
> tty->ops->write is the low level write method, not tty_write.

I was wondering if below call trace is come from tty_write_message()->tty->ops->write()?
[ 2739.802106] -> #1 (&mm->mmap_sem){++++++}:
[ 2739.802120]        [<c10fa825>] lock_acquire+0x14e/0x189
[ 2739.802133]        [<c11e3e83>] might_fault+0xbf/0xf8
[ 2739.802154]        [<c13bcbdf>] _copy_from_user+0x40/0x8a
[ 2739.802175]        [<c14addc3>] copy_from_user+0x16/0x26
[ 2739.802195]        [<c14b00b4>] tty_write+0x282/0x3c7
[ 2739.802212]        [<c14b02dd>] redirected_tty_write+0xe4/0xfd
[ 2739.802226]        [<c1231879>] vfs_write+0xf5/0x1a3
[ 2739.802239]        [<c1231bdc>] sys_write+0x6c/0xa9
[ 2739.802253]        [<c186281f>] sysenter_do_call+0x12/0x38

> 
> This appears to be even more wrong than the other one in other ways too -
> it uses interruptible sleeps but doesn't handle the signal case so will
> spin on a signal and kill the box.
> 
> NAK
> 
> Looking gat the traces I suspect what you've actually got is a much more
> complicated deadlock where a process doing perfectly normal I/O to the
> tty has faulted and there is a chain of dependancies through the file
> system code to the thread which is doing the dquot_alloc_inode.
> 
> If that is the case then dquot_alloc_inode shouldn't be making blocking
> calls to tty_write_message and probably the right thing to do is to queue
> work for it so the tty_write_message is done asynchronously.
> 
> There are a very limited number of events that need reporting so probably
> something like a per mount flags and workqueue would allow you to do
> 
> 	set_bit(DQUOT_INODEOVER, &foo->events);
> 	schedule_work()
> 
> and the work queue can just xchg the events long for 0 and spew any
> messages required.

Thanks for the teaching, I'll give a try. 

-Jeff

> 
> Alan
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-06-30 13:35 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-06-30  9:40 [PATCH 2/2] make both atomic_write_lock and BTM lock acquirement sleepable at tty_write_message() Jeff Liu
2012-06-30 12:44 ` Alan Cox
2012-06-30 13:35   ` Jeff Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).