From: Bill Pringlemeir <bpringlemeir@nbsps.com>
To: "Wiedemer, Thorsten (Lawo AG)" <Thorsten.Wiedemer@lawo.com>
Cc: Richard Weinberger <richard@nod.at>,
	"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: UBI leb_write_unlock NULL pointer Oops (continuation)
Date: Thu, 20 Feb 2014 12:26:42 -0500
Message-ID: <8738jdofu5.fsf@nbsps.com>
In-Reply-To: D7B1B5F4F3F27A4CB073BF422331203F2A250B53F4@Exchange1.lawo.de


>> Bill Pringlemeir wrote:

>> Disassembly of section .data:

>> 00000000 <.data>:
>> 0:   e48a7004        str     r7, [sl], #4
>> 4:   e5985004        ldr     r5, [r8, #4]
>> 8:   e15a0005        cmp     sl, r5
>> c:   0a000029        beq     0xb8
>> 10:   e595300c        ldr     r3, [r5, #12]

>> 'r5' is NULL.  It seems to be the same symptom.  If you run your ARM objdump
>> 	with -S on either vmlinux or '__up_write', it will help confirm that
>> 	the list is corrupted again.  The assembler above should match.

On 20 Feb 2014, Thorsten.Wiedemer@lawo.com wrote:

> I don't have objdump running on my ARM system at the moment, but with
> 	rwsem-spinlock.c compiled with debug info, objdump -S -D gives for
> 	__up_write():
> ...
> 	sem->activity = 0;
> 29c:	e3a07000 	mov	r7, #0
> 2a0:	e1a0a008 	mov	sl, r8

> 2a4:	e48a7004 	str	r7, [sl], #4
> 2a8:	e5985004 	ldr	r5, [r8, #4]
> 	if (!list_empty(&sem->wait_list))
> 2ac:	e15a0005 	cmp	sl, r5
> 2b0:	0a000029 	beq	35c <__up_write+0xe0>
> 	/* if we are allowed to wake writers try to grant a single write lock
> 	 * if there's a writer at the front of the queue
> 	 * - we leave the 'waiting count' incremented to signify potential
> 	 *   contention
> 	 */
> 	if (waiter->flags & RWSEM_WAITING_FOR_WRITE) {
> 2b4:	e595300c 	ldr	r3, [r5, #12]
> {
> ...

> Seems to match ...

It doesn't matter where it runs.  I just want to make sure it is always
the 'waiter' variable.

>> What is 'RAVENNA_streame'?  Is this your standard test and not the
>> '8k binary' copy test or are you doing the copy test with this
>> process also running?

> This is an application which runs in parallel with our copy test. Over
> the last few days, Emanuel set up another test environment which seems
> to reproduce the error more reliably (at least on some hardware, not on
> all).  At the moment, proprietary applications are running in parallel,
> but I'll try to strip it down to a sequence which I can provide you,
> if you like.

I think scheduling is important to this issue, that is why I asked.

> We could reproduce the error now with function tracing enabled, so we
> have two hopefully valuable traces. But they are rather big (around
> 4MB each). Shall I use pastebin and cut them into several pieces to
> provide them? Or off-list as email attachment?  The trace Emanuel
> posted Wednesday may not be valuable. Perhaps there is a (different)
> error triggered due to memory pressure caused by the function tracing.

After looking, the allocation is not due to memory pressure.  It is due
to different tasks waiting on the rwsem with 'waiter' allocated on the
stack; I guess the task is gone, handling a signal or something
else. However, the function traces are great.  As you note they are
rather big, so it will take anyone some time to analyze them.

You could alter '__rwsem_do_wake',

static inline struct rw_semaphore *
__rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
{
	struct rwsem_waiter *waiter;
	struct task_struct *tsk;
	int woken;

	waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
+       /* 'list' is the first member, so a NULL 'next' gives a NULL waiter. */
+       if (!waiter) {
+          printk(KERN_ERR "Bad rwsem\n");
+          printk(KERN_ERR "activity is %d.\n", sem->activity);
+          BUG();
+       }
	if (waiter->type == RWSEM_WAITING_FOR_WRITE) {
		if (wakewrite)

... or something like that.

 * the rw-semaphore definition
 * - if activity is 0 then there are no active readers or writers
 * - if activity is +ve then that is the number of active readers
 * - if activity is -1 then there is one active writer
 * - if wait_list is not empty, then there are processes waiting...

It seems inconsistent to have a non-empty list with activity as 0 as
well?  The above is trying to trace when we find a 'NULL' in the
'wait_list', which always seems to be the issue, but probably not the
root cause.
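
To show why a NULL 'next' gets past the list_empty() test (the
'cmp sl, r5' above) and then faults on 'waiter->flags', here is a rough
userspace sketch.  The struct layouts are simplified stand-ins for the
kernel ones, and the helper name 'rwsem_would_oops' is made up for this
illustration; it is not kernel code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel list primitives (simplified). */
struct list_head {
	struct list_head *next, *prev;
};

/* Same idea as the kernel's list_entry(): subtract the member offset. */
#define list_entry(ptr, type, member) \
	((type *)((uintptr_t)(ptr) - offsetof(type, member)))

static int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Assumed layout: 'list' first, so 'flags' sits at offset 12 on 32-bit ARM. */
struct rwsem_waiter {
	struct list_head list;
	void *task;
	unsigned int flags;
};

struct rw_semaphore {
	int activity;
	struct list_head wait_list;
};

/*
 * Hypothetical helper: returns 1 if this semaphore state would make the
 * wake-up path dereference a NULL waiter, 0 otherwise.
 */
static int rwsem_would_oops(const struct rw_semaphore *sem)
{
	struct rwsem_waiter *waiter;

	assert(sem != NULL);

	/* A NULL 'next' is not equal to the head, so the list looks non-empty. */
	if (list_empty(&sem->wait_list))
		return 0;

	/* With 'list' at offset 0, a NULL 'next' yields a NULL waiter. */
	waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
	return waiter == NULL;
}
```

With 'wait_list.next' set to NULL the helper reports the fault case; with
a properly initialised empty list (next and prev pointing at the head) it
does not.  That is the same condition the '!waiter' check above is meant
to catch.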

You can also put similar code in '__rwsem_wake_one_writer' if you
instead get the 'up_read()' fault.

Fwiw,
Bill Pringlemeir.
