From: Bill Pringlemeir <bpringlemeir@nbsps.com>
To: "Wiedemer, Thorsten (Lawo AG)" <Thorsten.Wiedemer@lawo.com>
Cc: Richard Weinberger <richard@nod.at>,
"linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: UBI leb_write_unlock NULL pointer Oops (continuation)
Date: Thu, 20 Feb 2014 12:26:42 -0500
Message-ID: <8738jdofu5.fsf@nbsps.com>
In-Reply-To: D7B1B5F4F3F27A4CB073BF422331203F2A250B53F4@Exchange1.lawo.de
>> Bill Pringlemeir wrote:
>> Disassembly of section .data:
>> 00000000 <.data>:
>> 0: e48a7004 str r7, [sl], #4
>> 4: e5985004 ldr r5, [r8, #4]
>> 8: e15a0005 cmp sl, r5
>> c: 0a000029 beq 0xb8
>> 10: e595300c ldr r3, [r5, #12]
>> 'r5' is NULL. It seems to be the same symptom. If you run your ARM objdump
>> with -S on either vmlinux or '__up_write', it will help confirm that
>> it is the list corrupted again. The assembler above should match.
On 20 Feb 2014, Thorsten.Wiedemer@lawo.com wrote:
> I don't have objdump running on my ARM system at the moment, but with
> rwsem-spinlock.c compiled with debug info, objdump -S -D gives for
> __up_write():
> ...
> sem->activity = 0;
> 29c: e3a07000 mov r7, #0
> 2a0: e1a0a008 mov sl, r8
> 2a4: e48a7004 str r7, [sl], #4
> 2a8: e5985004 ldr r5, [r8, #4]
> if (!list_empty(&sem->wait_list))
> 2ac: e15a0005 cmp sl, r5
> 2b0: 0a000029 beq 35c <__up_write+0xe0>
> /* if we are allowed to wake writers try to grant a single write lock
>  * if there's a writer at the front of the queue
>  * - we leave the 'waiting count' incremented to signify potential
>  *   contention */
> if (waiter->flags & RWSEM_WAITING_FOR_WRITE) {
> 2b4: e595300c ldr r3, [r5, #12]
> {
> ...
> Seems to match ...
It doesn't matter where it runs. I just want to make sure it is always
the 'waiter' variable.
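To make the register-to-variable mapping explicit, here is a minimal
user-space sketch of the structure layout the disassembly implies. It is
an assumption, not taken from the kernel headers of this build: it
presumes a uniprocessor configuration where 'wait_lock' occupies no
space, and it models 32-bit ARM pointers as uint32_t so the offsets come
out the same on a 64-bit host. The '32'-suffixed type names and
check_layout() are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* A 32-bit list_head: two pointers, modelled as uint32_t. */
struct list_head32 {
    uint32_t next;
    uint32_t prev;
};

/* Assumed layout of the semaphore (empty spinlock on a UP build). */
struct rw_semaphore32 {
    int32_t activity;              /* offset 0: str r7, [sl], #4       */
    struct list_head32 wait_list;  /* offset 4: ldr r5, [r8, #4] reads */
                                   /* wait_list.next into r5           */
};

/* Assumed layout of the waiter that sits on the blocked task's stack. */
struct rwsem_waiter32 {
    struct list_head32 list;       /* offset 0, so waiter == list.next */
    uint32_t task;                 /* offset 8                         */
    uint32_t flags;                /* offset 12: ldr r3, [r5, #12]     */
};

/* Verify that the modelled offsets match the ones in the oops. */
static void check_layout(void)
{
    assert(offsetof(struct rw_semaphore32, wait_list) == 4);
    assert(offsetof(struct rwsem_waiter32, list) == 0);
    assert(offsetof(struct rwsem_waiter32, flags) == 12);
}
```

If that layout holds, 'r5' is both 'sem->wait_list.next' and 'waiter',
so a NULL 'r5' faults on the load of 'waiter->flags'.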
>> What is 'RAVENNA_streame'? Is this your standard test and not the
>> '8k binary' copy test or are you doing the copy test with this
>> process also running?
> This is an application which runs in parallel with our copy test. Over
> the last few days, Emanuel set up another test environment which seems
> to reproduce the error more reliably (at least on some hardware, not
> on all). At the moment, there are proprietary applications running in
> parallel, but I'll try to strip it down to a sequence which I can
> provide to you, if you like.
I think scheduling is important to this issue, that is why I asked.
> We could reproduce the error now with function tracing enabled, so we
> have two hopefully valuable traces. But they are rather big (around
> 4MB each). Shall I use pastebin and cut them into several pieces to
> provide them? Or off-list as an email attachment? The trace Emanuel
> posted Wednesday may not be valuable. Perhaps there is a (different)
> error triggered due to memory pressure caused by the function tracing.
After looking, the allocation is not due to memory pressure. It is due
to different tasks waiting on the rwsem with 'waiter' allocated on the
stack; I guess the task is gone, handling a signal or something
else. However, the function traces are great. As you note they are
rather big, so it will take anyone some time to analyze them.
You could alter '__rwsem_do_wake',
 static inline struct rw_semaphore *
 __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
 {
         struct rwsem_waiter *waiter;
         struct task_struct *tsk;
         int woken;

         waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
+        if (!waiter) {
+                printk("Bad rwsem\n");
+                printk("activity is %d.\n", sem->activity);
+                BUG();
+        }
         if (waiter->type == RWSEM_WAITING_FOR_WRITE) {
                 if (wakewrite)
... or something like that.
/*
 * the rw-semaphore definition
 * - if activity is 0 then there are no active readers or writers
 * - if activity is +ve then that is the number of active readers
 * - if activity is -1 then there is one active writer
 * - if wait_list is not empty, then there are processes waiting...
 */
Isn't it inconsistent to have a non-empty list while activity is 0? The
check above tries to catch the moment we find a 'NULL' in the
'wait_list', which always seems to be the symptom, but is probably not
the root cause.
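As a rough user-space illustration of why a '!waiter' test can catch
this at all: list_entry() only subtracts the offset of the 'list'
member, so the result is NULL exactly when 'wait_list.next' is NULL and
'list' is the first member. The sketch below assumes that member order
and re-implements list_entry() and list_empty() locally; it is not the
kernel's code, and first_waiter_is_null() is a made-up helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Local stand-in for the kernel macro: step back from the embedded
 * list_head to the start of the containing structure. */
#define list_entry(ptr, type, member) \
    ((type *)((uintptr_t)(ptr) - offsetof(type, member)))

/* A list is empty only when the head points back at itself. */
static int list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Assumed member order: 'list' first, as the oops offsets suggest. */
struct rwsem_waiter {
    struct list_head list;
    void *task;
    unsigned int flags;
};

struct rw_semaphore {
    int activity;
    struct list_head wait_list;
};

/* Hypothetical helper: returns 1 when the added check would fire. */
static int first_waiter_is_null(const struct rw_semaphore *sem)
{
    struct rwsem_waiter *waiter =
        list_entry(sem->wait_list.next, struct rwsem_waiter, list);

    return waiter == NULL;
}
```

Note that a head whose 'next' is NULL does not compare equal to the
head itself, so list_empty() reports "not empty" and the wake path goes
on to dereference the NULL waiter. That matches the faulting load in
the oops.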
You can also put similar code in '__rwsem_wake_one_writer' if you
instead get the 'up_read()' fault.
Fwiw,
Bill Pringlemeir.