linuxppc-dev.lists.ozlabs.org archive mirror
* RE: Problem of concurrency in arch/ppc/8260_io/uart.c
@ 2003-09-15 16:26 Jean-Denis Boyer
  2003-09-15 18:37 ` Joakim Tjernlund
  2003-09-15 18:53 ` Dan Malek
From: Jean-Denis Boyer @ 2003-09-15 16:26 UTC
  To: Steffen Rumler; +Cc: linuxppc


I have just tested your patch on the 8260, and it fixes the problem that occurs when printk is called from interrupt context while a user-mode application is printing data. GREAT!

I also ported the patch to the 860 (as said earlier, the code is almost the same), but it hung at boot, just after printing "Calibrating delay loop..." :-(

I remember having such problems in the past (on the 860) when calling "spin_lock_irqsave/restore" from interrupt context. If I use "in_interrupt()" to avoid calling them from interrupt context, everything works fine, and the bug is effectively fixed.

I just don't know where the problem is. It looks like there is a bug on the 860 when using "local_irq_save/restore" in interrupt context?!?

I'm working with 2.4.19.
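
The workaround described above can be sketched in user space as follows. This is only an illustration of the control flow, not the driver code: `in_interrupt()` and `spin_lock_irqsave()`/`spin_unlock_irqrestore()` are real kernel interfaces, but the stub bodies, `irq_depth`, `irqs_enabled`, `lock_calls`, `fake_irqsave()`, `fake_irqrestore()`, and `guarded_put_char()` are hypothetical stand-ins so the logic can be compiled outside the kernel.

```c
#include <assert.h>

/* Hypothetical stand-ins for kernel state; not the real implementation. */
static int irq_depth;          /* > 0 while "inside" an interrupt handler */
static int irqs_enabled = 1;   /* models the CPU interrupt-enable state */
static int lock_calls;         /* counts how often the lock path was taken */

/* Stub with the same meaning as the kernel helper of that name. */
static int in_interrupt(void) { return irq_depth > 0; }

/* Stand-in for spin_lock_irqsave(): save the state, then disable. */
static void fake_irqsave(unsigned long *flags)
{
    *flags = (unsigned long)irqs_enabled;
    irqs_enabled = 0;
    lock_calls++;
}

/* Stand-in for spin_unlock_irqrestore(): put the saved state back. */
static void fake_irqrestore(unsigned long flags)
{
    irqs_enabled = (int)flags;
}

/* Output routine that takes the lock only outside interrupt context. */
static void guarded_put_char(char *buf, int *pos, char ch)
{
    unsigned long flags = 0;
    int locked = !in_interrupt();

    assert(buf && pos);

    if (locked)
        fake_irqsave(&flags);

    buf[(*pos)++] = ch;   /* critical section: touches shared output state */

    if (locked)
        fake_irqrestore(flags);
}
```

If a guard like this makes the hang go away, one plausible reading is that only the process-context path needs protection on a uniprocessor, since an interrupt handler cannot be preempted by the code it interrupted. That is a guess about the mechanism, not a diagnosis of the 860 hang.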

--------------------------------------------
 Jean-Denis Boyer, Eng.
 Software Designer
 M5T Centre d'Excellence en Télécom Inc.
 4283 Garlock Street
 Sherbrooke (Québec)
 J1L 2C8  CANADA
 (819)829-3972 x241
--------------------------------------------



** Sent via the linuxppc-embedded mail list. See http://lists.linuxppc.org/

[parent not found: <3F8E8817.5ACD7ACB@siemens.com>]
* Re: Problem of concurrency in arch/ppc/8260_io/uart.c
@ 2003-09-15 10:18 Steffen Rumler
  2003-09-15 13:30 ` Joakim Tjernlund
From: Steffen Rumler @ 2003-09-15 10:18 UTC
  To: linuxppc

[-- Attachment #1: Type: text/plain, Size: 2619 bytes --]

Hi,

I have seen a similar problem with 2.4.20.

When I force a lot of console output via the following command:

   while true; do cd /; ls -R; done

and type some letters in parallel, the console
goes crazy.

I have added some instrumentation in order to dump the
TX Buffer Descriptor Table. I have found that the
hardware pointer (TBPTR) and the software pointer (tx_cur)
are no longer synchronized:

 >> make new rlogin session <<
/root# cd /proc/driver

/root# cat uart-bdtables; cat mpc82xx/smc1_pram | grep SMC1_PRAM_TBPTR

TX BD table
   (000 at 0xfff005f0) status: 0x1000 len: 0001 addr: 0x001bb084
   (001 at 0xfff005f8) status: 0x1000 len: 0001 addr: 0x001bb0a4
* (002 at 0xfff00600) status: 0x1000 len: 0001 addr: 0x001bb0c4
   (003 at 0xfff00608) status: 0x3000 len: 0004 addr: 0x001bb0e4
    SMC1_PRAM_TBPTR            0x20         2     0600

    --> hardware and software pointer still synchronized

    >> force console to become crazy (see above) <<

/root# cat uart-bdtables; cat mpc82xx/smc1_pram | grep SMC1_PRAM_TBPTR

TX BD table (tbptr: 0x00000088)
   (000 at 0xfff005f0) status: 0x1000 len: 0003 addr: 0x001bb084
* (001 at 0xfff005f8) status: 0x1000 len: 0021 addr: 0x001bb0a4
   (002 at 0xfff00600) status: 0x1000 len: 0001 addr: 0x001bb0c4
   (003 at 0xfff00608) status: 0x3000 len: 0001 addr: 0x001bb0e4
    SMC1_PRAM_TBPTR            0x20         2     0600

    --> hardware and software pointer NO LONGER synchronized

    >> make additional console output: echo foo >/dev/console <<

/root# cat uart-bdtables; cat mpc82xx/smc1_pram | grep SMC1_PRAM_TBPTR

* (000 at 0xfff005f0) status: 0x1000 len: 0003 addr: 0x001bb084
   (001 at 0xfff005f8) status: 0x9000 len: 0004 addr: 0x001bb0a4
   (002 at 0xfff00600) status: 0x1000 len: 0001 addr: 0x001bb0c4
   (003 at 0xfff00608) status: 0x3000 len: 0001 addr: 0x001bb0e4
    SMC1_PRAM_TBPTR            0x20         2     05f0

    --> hardware pointer hangs at 0x5f0 because the R-bit is not set there,
        but at 0x5f8 instead
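
To make the status words in the dumps above easier to read, here is a small decoding sketch. The bit values are assumptions based on the usual CPM buffer-descriptor definitions (`BD_SC_READY`, `BD_SC_WRAP`, `BD_SC_INTRPT`) and should be checked against the header in your own tree; the helper names `bd_is_ready()`, `bd_is_last()`, and `decode_bd_status()` are hypothetical.

```c
#include <stdio.h>

/* Assumed values; verify against the CPM header in your kernel tree. */
#define BD_SC_READY  0x8000u  /* R-bit: buffer handed to the controller */
#define BD_SC_WRAP   0x2000u  /* last descriptor in the table */
#define BD_SC_INTRPT 0x1000u  /* raise an interrupt when processed */

/* Nonzero if the descriptor is waiting to be transmitted. */
static int bd_is_ready(unsigned sc) { return (sc & BD_SC_READY) != 0; }

/* Nonzero if the descriptor closes the ring. */
static int bd_is_last(unsigned sc) { return (sc & BD_SC_WRAP) != 0; }

/* Write the names of the set bits into out and return out. */
static const char *decode_bd_status(unsigned sc, char *out, size_t n)
{
    snprintf(out, n, "%s%s%s",
             bd_is_ready(sc)        ? "READY "  : "",
             bd_is_last(sc)         ? "WRAP "   : "",
             (sc & BD_SC_INTRPT)    ? "INTRPT " : "");
    return out;
}
```

Under these assumptions, the 0x9000 at 0x5f8 in the last dump is a descriptor marked ready, while the 0x1000 at 0x5f0, where TBPTR points, is not; 0x3000 marks the final descriptor of the ring.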

Inside uart.c, there are the following output routines:

   rs_8xx_put_char()
   rs_8xx_write()
   rs_8xx_send_xchar()
   my_console_write()

I think access to the TX BD table must be synchronized.
I suggest the attached patch.
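
The kind of interleaving that could produce the dumps above can be modelled in a few lines of user-space C. This is a deliberately simplified sketch, not the driver: the ring uses a modulo index instead of the wrap bit, there is no busy-wait on the R-bit, and `struct bd`, `struct port`, `hw_run()`, `put_begin()`, `put_finish()`, `racy_in_sync()`, and `locked_in_sync()` are hypothetical names. `BD_READY` is an assumed value for the R-bit.

```c
#include <assert.h>
#include <string.h>

#define NUM_BD   4
#define BD_READY 0x8000u   /* assumed R-bit value */

struct bd   { unsigned status; char data; };
struct port {
    struct bd ring[NUM_BD];
    int tx_cur;  /* software index: next descriptor the driver fills */
    int tbptr;   /* hardware index: next descriptor the controller sends */
};

/* Hardware model: send READY descriptors in order, stop at the first idle one. */
static void hw_run(struct port *p)
{
    while (p->ring[p->tbptr].status & BD_READY) {
        p->ring[p->tbptr].status &= ~BD_READY;
        p->tbptr = (p->tbptr + 1) % NUM_BD;
    }
}

/* First half of a put_char-style routine: snapshot the software pointer. */
static int put_begin(const struct port *p) { return p->tx_cur; }

/* Second half: fill the snapshotted descriptor and publish the next index. */
static void put_finish(struct port *p, int bdp, char ch)
{
    assert(bdp >= 0 && bdp < NUM_BD);
    p->ring[bdp].data = ch;
    p->ring[bdp].status |= BD_READY;
    p->tx_cur = (bdp + 1) % NUM_BD;
}

/* Unsynchronized: an "interrupt" writes two characters between the halves. */
static int racy_in_sync(void)
{
    struct port p;
    int user, irq;

    memset(&p, 0, sizeof p);
    user = put_begin(&p);          /* process context snapshots tx_cur */

    irq = put_begin(&p); put_finish(&p, irq, 'A'); hw_run(&p);
    irq = put_begin(&p); put_finish(&p, irq, 'B'); hw_run(&p);

    put_finish(&p, user, 'u');     /* resumes with a stale index */
    hw_run(&p);
    return p.tx_cur == p.tbptr;
}

/* Serialized, as a lock around both halves would enforce. */
static int locked_in_sync(void)
{
    struct port p;
    const char *s = "uAB";
    int i;

    memset(&p, 0, sizeof p);
    for (i = 0; s[i]; i++) {
        put_finish(&p, put_begin(&p), s[i]);
        hw_run(&p);
    }
    return p.tx_cur == p.tbptr;
}
```

In the racy run, the process-context writer marks a descriptor ready behind the hardware pointer and rewinds `tx_cur`, so the two indices disagree and the marked descriptor is never reached; in the serialized run they stay equal. Holding a lock with interrupts disabled across both halves is one way to rule out the first interleaving.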


Best Regards
Steffen
--


--------------------------------------------------------------

Steffen Rumler
ICN CP D NT SW 7
Siemens AG
Hofmannstr. 51                 Email: Steffen.Rumler@siemens.com
D-81359 Munich                 Phone: +49 89 722-44061
Germany                        Fax  : +49 89 722-36703

--------------------------------------------------------------



[-- Attachment #2: uart.c.patch --]
[-- Type: text/plain, Size: 3270 bytes --]

diff -Naur old/uart.c new/uart.c
--- old/uart.c	Mon Sep 15 11:52:02 2003
+++ new/uart.c	Mon Sep 15 11:51:32 2003
@@ -44,6 +44,7 @@
 #include <linux/slab.h>
 #include <linux/init.h>
 #include <linux/delay.h>
+#include <linux/spinlock.h>
 #include <asm/uaccess.h>
 #include <asm/immap_8260.h>
 #include <asm/mpc8260.h>
@@ -253,6 +254,11 @@
 	cbd_t			*rx_cur;
 	cbd_t			*tx_bd_base;
 	cbd_t			*tx_cur;
+
+        /*  for output synchronization
+         */
+        spinlock_t output_lock;
+        
 } ser_info_t;
 
 static void change_speed(ser_info_t *info);
@@ -1010,6 +1016,7 @@
 {
 	ser_info_t *info = (ser_info_t *)tty->driver_data;
 	volatile cbd_t	*bdp;
+        unsigned long flags;
 
 	if (serial_paranoia_check(info, tty->device, "rs_put_char"))
 		return;
@@ -1017,6 +1024,8 @@
 	if (!tty)
 		return;
 
+        spin_lock_irqsave(&(info->output_lock), flags);
+
 	bdp = info->tx_cur;
 	while (bdp->cbd_sc & BD_SC_READY);
 
@@ -1033,6 +1042,8 @@
 
 	info->tx_cur = (cbd_t *)bdp;
 
+        spin_unlock_irqrestore(&(info->output_lock), flags);
+
 }
 
 static int rs_8xx_write(struct tty_struct * tty, int from_user,
@@ -1041,6 +1052,7 @@
 	int	c, ret = 0;
 	ser_info_t *info = (ser_info_t *)tty->driver_data;
 	volatile cbd_t *bdp;
+        unsigned long flags;
 
 	if (serial_paranoia_check(info, tty->device, "rs_write"))
 		return 0;
@@ -1048,6 +1060,8 @@
 	if (!tty) 
 		return 0;
 
+        spin_lock_irqsave(&(info->output_lock), flags);
+
 	bdp = info->tx_cur;
 
 	while (1) {
@@ -1086,6 +1100,9 @@
 			bdp++;
 		info->tx_cur = (cbd_t *)bdp;
 	}
+
+        spin_unlock_irqrestore(&(info->output_lock), flags);
+
 	return ret;
 }
 
@@ -1143,12 +1160,15 @@
 static void rs_8xx_send_xchar(struct tty_struct *tty, char ch)
 {
 	volatile cbd_t	*bdp;
+        unsigned long flags;
 
 	ser_info_t *info = (ser_info_t *)tty->driver_data;
 
 	if (serial_paranoia_check(info, tty->device, "rs_send_char"))
 		return;
 
+        spin_lock_irqsave(&(info->output_lock), flags);
+
 	bdp = info->tx_cur;
 	while (bdp->cbd_sc & BD_SC_READY);
 
@@ -1164,6 +1184,8 @@
 		bdp++;
 
 	info->tx_cur = (cbd_t *)bdp;
+
+        spin_unlock_irqrestore(&(info->output_lock), flags);
 }
 
 /*
@@ -2227,9 +2249,11 @@
 	volatile	cbd_t		*bdp, *bdbase;
 	volatile	smc_uart_t	*up;
 	volatile	u_char		*cp;
+        unsigned long   flags;
 
 	ser = rs_table + idx;
 
+
 	/* If the port has been initialized for general use, we have
 	 * to use the buffer descriptors allocated there.  Otherwise,
 	 * we simply use the single buffer allocated.
@@ -2237,6 +2261,7 @@
 	if ((info = (ser_info_t *)ser->info) != NULL) {
+                spin_lock_irqsave(&(info->output_lock), flags);
 		bdp = info->tx_cur;
 		bdbase = info->tx_bd_base;
 	}
 	else {
 		/* Pointer to UART in parameter ram.
@@ -2309,6 +2334,9 @@
 
 	if (info)
 		info->tx_cur = (cbd_t *)bdp;
+
+        if (info)
+            spin_unlock_irqrestore(&(info->output_lock), flags);
 }
 
 static void serial_console_write(struct console *c, const char *s,
@@ -2764,6 +2792,7 @@
 			info->tqueue_hangup.data = info;
 			info->line = i;
 			info->state = state;
+                        spin_lock_init(&(info->output_lock));
 			state->info = (struct async_struct *)info;
 
 			/* We need to allocate a transmit and receive buffer

* RE: Problem of concurrency in arch/ppc/8260_io/uart.c
@ 2003-03-07 15:18 Jean-Denis Boyer
From: Jean-Denis Boyer @ 2003-03-07 15:18 UTC
  To: Dayton, Dean; +Cc: Linux PPC embedded

[-- Attachment #1: Type: text/plain, Size: 1184 bytes --]


> Oh, I almost forgot. Yes I would like to see your test 
> program. One of the
> reasons I haven't chased the problem is because it keeps 
> disappearing;-)
> 
> Dean Dayton

Attached is the test program.
There are two files in the tar:

  -- testprintk.c --
    A module that adds a device /dev/tpk.
    Once a file descriptor is obtained on this device,
    an internal timer starts, which calls "printk"
    every 10 ticks (100 ms in my setup). When the file descriptor
    is released, the timer stops.

  -- main.c --
    A simple user-mode program that opens /dev/tpk
    and then loops forever, printing to stdout.
    When the program stops (Ctrl-C), the descriptor is
    released by Linux, stopping the internal timer.

With that test, the problem appears almost immediately.

BTW, the same problem also exists on the 860, where it can be diagnosed the same way.
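
Since the attachment is not reproduced in this archive, here is a rough sketch of what the user-mode half described above might look like. It is a reconstruction from the description only, not the contents of main.c: the function name `flood_stdout()`, the bounded `iterations` parameter, the `O_RDONLY` open mode, and the output text are all assumptions.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Open the trigger device, then print to stdout in a loop.
 * A negative iteration count loops forever, as the described program does;
 * a non-negative count is only there to make the sketch testable.
 * Returns -1 if the device cannot be opened, 0 otherwise.
 */
static int flood_stdout(const char *dev, long iterations)
{
    int fd = open(dev, O_RDONLY);   /* open mode is an assumption */

    if (fd < 0)
        return -1;

    while (iterations != 0) {
        puts("user-mode output line");
        if (iterations > 0)
            iterations--;
    }

    close(fd);   /* releasing the descriptor is what stops the timer */
    return 0;
}
```

Calling `flood_stdout("/dev/tpk", -1)` from `main()` would match the described behaviour: the timer in the module runs for as long as the descriptor stays open, and Ctrl-C ends both the loop and the timer.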

Regards,
--------------------------------------------
 Jean-Denis Boyer, B.Eng., Technical Leader
 Mediatrix Telecom Inc.
 4229 Garlock Street
 Sherbrooke (Québec)
 J1L 2C8  CANADA
 (819)829-8749 x241
--------------------------------------------

[-- Attachment #2: testprintk.tar.gz --]
[-- Type: application/x-gzip, Size: 1230 bytes --]

* RE: Problem of concurrency in arch/ppc/8260_io/uart.c
@ 2003-03-07 12:52 Dayton, Dean
From: Dayton, Dean @ 2003-03-07 12:52 UTC
  To: 'Jean-Denis Boyer', Linux PPC embedded


Yes, I have seen this problem when I have a lot of printks from interrupt
context on my 8255 board using an SCC as the console. It's been on my list
for a while, but I haven't done anything about it yet. I think the easy
approach would be to disable interrupts when manipulating the buffer
descriptors (in fact, there are comments in the driver that make me suspect
this used to be done). But there would be a performance hit for this.
Anyone got a more elegant suggestion?

Dean Dayton


* Problem of concurrency in arch/ppc/8260_io/uart.c
@ 2003-03-06 21:39 Jean-Denis Boyer
From: Jean-Denis Boyer @ 2003-03-06 21:39 UTC
  To: Linux PPC embedded


I have a concurrency problem on my 8250-based custom board.
I'm using kernel 2.4.19, and the output on the serial port gets garbled.

It seems to appear when printk is called from interrupt context
while a user-mode process is writing to stdout.

I wrote a user-mode program and a kernel module to reproduce the problem.
The kernel module starts a timer that calls printk (34 chars) every 100 ms.
The user-mode program performs a while(1) printf(...);
That combination almost immediately messes up the serial driver,
which begins to output characters in an unpredictable order until we reboot.
It never crashes, however, but it is annoying.

Has anyone encountered this problem?
Does anyone have a suggestion before I go further,
or want the test program to give it a try?

Thanks,
--------------------------------------------
 Jean-Denis Boyer, B.Eng., Technical Leader
 Mediatrix Telecom Inc.
 4229 Garlock Street
 Sherbrooke (Québec)
 J1L 2C8  CANADA
 (819)829-8749 x241
--------------------------------------------


Thread overview: 42+ messages
2003-09-15 16:26 Problem of concurrency in arch/ppc/8260_io/uart.c Jean-Denis Boyer
2003-09-15 18:37 ` Joakim Tjernlund
2003-09-15 18:53 ` Dan Malek
2003-09-15 20:02   ` Joakim Tjernlund
2003-09-16 12:25     ` Joakim Tjernlund
2003-09-17  9:33       ` Joakim Tjernlund
2003-09-17 13:38         ` Joakim Tjernlund
2003-09-17 14:58         ` Dan Malek
2003-09-17 15:22           ` Joakim Tjernlund
2003-09-17 16:33             ` Joakim Tjernlund
2003-09-23  8:28               ` Joakim Tjernlund
2003-09-26 16:31               ` Tom Rini
2003-09-26 18:34                 ` Joakim Tjernlund
2003-09-26 18:38                   ` Tom Rini
2003-09-26 21:24               ` Tom Rini
2003-09-26 22:20                 ` Dan Malek
2003-09-26 22:39                   ` Joakim Tjernlund
2003-09-26 23:12                     ` Dan Malek
2003-09-27  8:07                       ` Joakim Tjernlund
2003-09-27 13:43                         ` Joakim Tjernlund
2003-09-29 15:33                           ` Tom Rini
2003-09-30 15:28                             ` Dan Malek
2003-10-01 14:26                               ` Kumar Gala
2003-10-01 14:32                                 ` Tom Rini
2003-10-01 14:48                                   ` Gary Thomas
2003-10-01 21:20                                     ` Joakim Tjernlund
2003-10-01 21:32                                       ` Tom Rini
2003-10-01 21:51                                         ` Joakim Tjernlund
2003-10-01 22:00                                           ` Tom Rini
2003-10-01 22:17                                             ` Joakim Tjernlund
2003-10-01 22:31                                               ` Tom Rini
2003-10-01 23:23                                                 ` Robin Gilks
2003-10-01 23:51                                                   ` Tom Rini
2003-10-02  0:47                                                     ` Wolfgang Denk
2003-10-02  6:03                                         ` Dan Kegel
2003-10-02 19:15                                           ` Tom Rini
     [not found] <3F8E8817.5ACD7ACB@siemens.com>
2003-10-16 14:11 ` Joakim Tjernlund
  -- strict thread matches above, loose matches on Subject: below --
2003-09-15 10:18 Steffen Rumler
2003-09-15 13:30 ` Joakim Tjernlund
2003-03-07 15:18 Jean-Denis Boyer
2003-03-07 12:52 Dayton, Dean
2003-03-06 21:39 Jean-Denis Boyer
