From: Stanislav Kozina <skozina@redhat.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: linux-serial@vger.kernel.org
Subject: Patch for panic in n_tty_read()
Date: Mon, 25 Jun 2012 17:41:58 +0200
Message-ID: <4FE886C6.7090606@redhat.com>

[-- Attachment #1: Type: text/plain, Size: 2781 bytes --]

Greg,

(This is my first suggested patch for the Linux kernel, so please bear
with me if I don't follow the process correctly. Thank you.)

We had a few customers who hit panics in n_tty_read() with the following
backtrace:

      #8 [ffff880018b8dcd0] page_fault at ffffffff814ddfe5
         [exception RIP: n_tty_read+0x2c9]
<register output removed>
      #9 [ffff880018b8dea0] tty_read at ffffffff81300b16
     #10 [ffff880018b8def0] vfs_read at ffffffff81172f85
     #11 [ffff880018b8df30] sys_read at ffffffff811730c1
     #12 [ffff880018b8df80] system_call_fastpath at ffffffff8100b172

My patch for this panic is attached (tty_panic.patch). In short, I
believe that we need to hold &tty->read_lock while checking
tty->read_cnt in the while-loop condition in n_tty_read() here:

                        while (nr && tty->read_cnt) {
                                int eol;

                                eol = test_and_clear_bit(tty->read_tail,
                                                tty->read_flags);
                                c = tty->read_buf[tty->read_tail];
                                spin_lock_irqsave(&tty->read_lock, flags);

We gave this patch to the customers, who tested it for a month on
several tens of machines and were not able to reproduce the problem.

Could you please integrate the patch into the kernel?

My testing
====
The patch pulls the spinlock out of the while loop. This makes
reset_buffer_flags() and others wait before changing either read_buf or
read_cnt. This should solve the issue; the question is whether it can
cause a deadlock.

I inserted an msleep(2000) while the lock is held. This should trigger
any deadlock, if one is possible.

I found out that:
1) n_tty_read() runs when I, for example, log in on the serial console
2) reset_buffer_flags() runs when I press CTRL+C on any terminal

That's why my plan was to log in on the serial console, and then (in less
than 2 seconds) press CTRL+C on any terminal. I wrote a stap script
to verify that both functions were running; the stap script is attached.
This is part of the output from the attached stap script:

1336070859652 -> reset_buffer_flags (PID: 225)
1336070859652 <- reset_buffer_flags (PID: 225)
1336070859652 <- n_tty_read (PID: 7527, retval: -512)
1336070859652 -> n_tty_read (PID: 7527)
1336070859654 -> n_tty_read (PID: 7502)
1336070859654 <- n_tty_read (PID: 7502, retval: 53)
1336070867135 -> n_tty_read (PID: 7498)
1336070867135 <- n_tty_read (PID: 7498, retval: 1)
1336070868260 -> reset_buffer_flags (PID: 237)
1336070868260 <- reset_buffer_flags (PID: 237)

It is easy to see where we were in the msleep(2000): it forced
n_tty_read to be delayed by 2 seconds in PID 7498. That is also why
reset_buffer_flags was not running during that time.

[-- Attachment #2: tty_panic.patch --]
[-- Type: text/plain, Size: 987 bytes --]

diff --git a/drivers/char/n_tty.c b/drivers/char/n_tty.c
index 2e50f4d..ace0c19 100644
--- a/drivers/char/n_tty.c
+++ b/drivers/char/n_tty.c
@@ -1813,13 +1813,13 @@ do_it_again:
 
 		if (tty->icanon) {
 			/* N.B. avoid overrun if nr == 0 */
+			spin_lock_irqsave(&tty->read_lock, flags);
 			while (nr && tty->read_cnt) {
 				int eol;
 
 				eol = test_and_clear_bit(tty->read_tail,
 						tty->read_flags);
 				c = tty->read_buf[tty->read_tail];
-				spin_lock_irqsave(&tty->read_lock, flags);
 				tty->read_tail = ((tty->read_tail+1) &
 						  (N_TTY_BUF_SIZE-1));
 				tty->read_cnt--;
@@ -1831,7 +1831,6 @@ do_it_again:
 					if (--tty->canon_data < 0)
 						tty->canon_data = 0;
 				}
-				spin_unlock_irqrestore(&tty->read_lock, flags);
 
 				if (!eol || (c != __DISABLED_CHAR)) {
 					if (tty_put_user(tty, c, b++)) {
@@ -1846,6 +1845,7 @@ do_it_again:
 					break;
 				}
 			}
+			spin_unlock_irqrestore(&tty->read_lock, flags);
 			if (retval)
 				break;
 		} else {

[-- Attachment #3: trace.stp --]
[-- Type: text/plain, Size: 511 bytes --]

probe kernel.function("reset_buffer_flags").call
{
	printf("%d -> %s (PID: %d)\n", gettimeofday_ms(), probefunc(), pid());
}

probe kernel.function("reset_buffer_flags").return
{
	printf("%d <- %s (PID: %d)\n", gettimeofday_ms(), probefunc(), pid());
}

probe kernel.function("n_tty_read").call
{
	printf("%d -> %s (PID: %d)\n", gettimeofday_ms(), probefunc(), pid());
}

probe kernel.function("n_tty_read").return
{
	printf("%d <- %s (PID: %d, retval: %d)\n", gettimeofday_ms(), probefunc(), pid(), $return)
}
