From: Richard Weinberger <richard@nod.at>
To: Bill Pringlemeir <bpringlemeir@nbsps.com>,
"Wiedemer, Thorsten (Lawo AG)" <Thorsten.Wiedemer@lawo.com>
Cc: "linux-mtd@lists.infradead.org" <linux-mtd@lists.infradead.org>
Subject: Re: UBI leb_write_unlock NULL pointer Oops (continuation)
Date: Fri, 21 Feb 2014 19:12:53 +0100 [thread overview]
Message-ID: <53079725.6080105@nod.at> (raw)
In-Reply-To: <877g8ojqsn.fsf@nbsps.com>
Am 21.02.2014 18:53, schrieb Bill Pringlemeir:
> On 21 Feb 2014, Thorsten.Wiedemer@lawo.com wrote:
>
>> Here I have the last parts of the two hopefully "valuable" traces. The
>> pieces show what happens between the leb_write_lock() und the
>> leb_write unlock() of the process which triggers the oops. If this is
>> not enough, I can provide also more ... Take care of the PIDs, there
>> are several processes with the same name running. Boths traces did
>> not result from identical test cases, so there are some different
>> processes running.
>
>> The part of the first trace fits in one paste:
>
>> http://pastebin.com/TL3yNVcw
>
> I am not sure if this is a complete trace? I don't understand why we
> would start off with 'leb_write_unlock'... However,
>
> $ grep -E 'leb|ltree' ubi.crash.raw # slight edits
> sync-7493 0.... 1348758715us : leb_write_unlock <-ubi_eba_write_leb
> sync-7493 0.... 1348758719us : add_preempt_count <-leb_write_unlock
> sync-7493 0...1 1348758724us : up_write <-leb_write_unlock
> sync-7493 0...1 1348758739us : kfree <-leb_write_unlock
> sync-7493 0...1 1348758746us : sub_preempt_count <-leb_write_unlock
> sync-7493 0.... 1348758954us : ubifs_leb_write <-ubifs_wbuf_sync_nolock
> sync-7493 0.... 1348758960us : ubi_leb_write <-ubifs_leb_write
> sync-7493 0.... 1348758965us : ubi_eba_write_leb <-ubi_leb_write
> sync-7493 0.... 1348758969us : leb_write_lock <-ubi_eba_write_leb
> sync-7493 0.... 1348758973us : ltree_add_entry <-leb_write_lock
> sync-7493 0.... 1348758977us : kmem_cache_alloc_trace <-ltree_add_entry
> sync-7493 0.... 1348758983us : add_preempt_count <-ltree_add_entry
> sync-7493 0...1 1348758989us : sub_preempt_count <-ltree_add_entry
> sync-7493 0.... 1348758994us : kfree <-ltree_add_entry
> sync-7493 0.... 1348758998us : down_write <-leb_write_lock
> sync-7493 0.... 1348759010us : ubi_io_write <-ubi_eba_write_leb
> < many reschedules, but sync-7493 still active.>
> sync-7492 0.... 1348761397us : leb_write_unlock <-ubi_eba_write_leb
> sync-7492 0.... 1348761400us : add_preempt_count <-leb_write_unlock
> sync-7492 0...1 1348761406us : up_write <-leb_write_unlock
> sync-7492 0...1 1348761419us : kfree <-leb_write_unlock
> sync-7492 0...1 1348761425us : sub_preempt_count <-leb_write_unlock
> sync-7492 0.... 1348761580us : ubifs_leb_write <-ubifs_write_node
> sync-7492 0.... 1348761585us : ubi_leb_write <-ubifs_leb_write
> sync-7492 0.... 1348761591us : ubi_eba_write_leb <-ubi_leb_write
> sync-7492 0.... 1348761595us : leb_write_lock <-ubi_eba_write_leb
> sync-7492 0.... 1348761599us : ltree_add_entry <-leb_write_lock
> sync-7492 0.... 1348761603us : kmem_cache_alloc_trace <-ltree_add_entry
> sync-7492 0.... 1348761609us : add_preempt_count <-ltree_add_entry
> sync-7492 0...1 1348761615us : sub_preempt_count <-ltree_add_entry
> sync-7492 0.... 1348761619us : kfree <-ltree_add_entry
> sync-7492 0.... 1348761622us : down_write <-leb_write_lock
> sync-7492 0.... 1348761635us : ubi_io_write <-ubi_eba_write_leb
> sync-7492 0.... 1348771081us : leb_write_unlock <-ubi_eba_write_leb
> sync-7492 0.... 1348771084us : add_preempt_count <-leb_write_unlock
> sync-7492 0...1 1348771090us : up_write <-leb_write_unlock
> sync-7492 0...1 1348771102us : kfree <-leb_write_unlock
> sync-7492 0...1 1348771109us : sub_preempt_count <-leb_write_unlock
> sync-7492 0.... 1348771269us : ubifs_leb_write <-ubifs_write_node
> sync-7492 0.... 1348771275us : ubi_leb_write <-ubifs_leb_write
> sync-7492 0.... 1348771280us : ubi_eba_write_leb <-ubi_leb_write
> sync-7492 0.... 1348771285us : leb_write_lock <-ubi_eba_write_leb
> sync-7492 0.... 1348771289us : ltree_add_entry <-leb_write_lock
> sync-7492 0.... 1348771292us : kmem_cache_alloc_trace <-ltree_add_entry
> sync-7492 0.... 1348771299us : add_preempt_count <-ltree_add_entry
> sync-7492 0...1 1348771304us : sub_preempt_count <-ltree_add_entry
> sync-7492 0.... 1348771308us : kfree <-ltree_add_entry
> sync-7492 0.... 1348771311us : down_write <-leb_write_lock
> sync-7492 0.... 1348771324us : ubi_io_write <-ubi_eba_write_leb
> < many reschedules and sync-7493 still in ubi_eba_write_leb>
> sync-7493 0.... 1348781410us : leb_write_unlock <-ubi_eba_write_leb
> sync-7493 0.... 1348781413us : add_preempt_count <-leb_write_unlock
> sync-7493 0...1 1348781418us : up_write <-leb_write_unlock
> ...
>
> I am not sure that we can know which LEBs are involved. However, I see
> cases for a double free and other issues. I think we need
> 'atomic_dec_and_test()' on the leb->users and to check this in the tree
> lookup. For instance, in 'leb_write_unlock()', the call to 'up_write()'
> can cause a reschedule. Say we enter 'leb_write_unlock()' with 'users =
> 2'. The count will decrement to one and then we may reschedule in
> 'up_write'. The 2nd UBI task may decrement users and inspect the count
> and call 'kfree'. Then we return to the original 'leb_write_unlock()',
> we will inspect 'le->users' and it will be zero.
>
> static void leb_write_unlock(struct ubi_device *ubi, int vol_id, int lnum)
> {
> struct ubi_ltree_entry *le;
>
> spin_lock(&ubi->ltree_lock);
> le = ltree_lookup(ubi, vol_id, lnum);
> le->users -= 1;
> ubi_assert(le->users >= 0);
> up_write(&le->mutex); /* can cause reschedule */
> if (le->users == 0) {
> rb_erase(&le->rb, &ubi->ltree);
> kfree(le);
> }
> spin_unlock(&ubi->ltree_lock);
> }
>
> Maybe there are better kernel design's/APIs that can wrap this.
> However, I don't think the 'ubi_ltree_entry' is perfectly race free.
>
> On the IMX series, the MTD driver does a 'read-modify-write' to support
> sub-pages. The driver can not write to a subpage at a time. The MTD
> driver is also interrupt driven and several reschedules happened during
> ubi_io_write().
>
> I don't think the spin_lock() will do anything on a UP system like the
> ARM926's that have encountered this issue.
Also on UP a spin_lock disables preemption.
So, le->users is protected.
Thanks,
//richard
next prev parent reply other threads:[~2014-02-21 18:13 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-03 8:51 UBI leb_write_unlock NULL pointer Oops (continuation) Wiedemer, Thorsten (Lawo AG)
2014-02-03 9:38 ` Richard Weinberger
2014-02-03 10:31 ` AW: " Wiedemer, Thorsten (Lawo AG)
2014-02-03 11:02 ` Richard Weinberger
2014-02-03 12:51 ` AW: " Wiedemer, Thorsten (Lawo AG)
2014-02-03 13:56 ` Richard Weinberger
2014-02-04 7:22 ` Artem Bityutskiy
2014-02-04 7:46 ` Richard Weinberger
2014-02-04 7:54 ` Artem Bityutskiy
2014-02-04 15:45 ` UBI leb_write_unlock NULL pointer Oops (continuation) on ARM926 Bill Pringlemeir
2014-02-04 15:45 ` Bill Pringlemeir
2014-02-04 17:05 ` Bill Pringlemeir
2014-02-04 17:05 ` Bill Pringlemeir
2014-02-04 19:57 ` Bill Pringlemeir
2014-02-04 19:57 ` Bill Pringlemeir
2014-02-04 20:07 ` Richard Weinberger
2014-02-04 20:07 ` Richard Weinberger
2014-02-04 17:01 ` AW: UBI leb_write_unlock NULL pointer Oops (continuation) Wiedemer, Thorsten (Lawo AG)
2014-02-04 17:52 ` Wiedemer, Thorsten (Lawo AG)
2014-02-05 8:29 ` Richard Weinberger
2014-02-05 21:45 ` Bill Pringlemeir
2014-02-05 22:13 ` Richard Weinberger
2014-02-05 22:23 ` Bill Pringlemeir
2014-02-06 13:05 ` AW: " Wiedemer, Thorsten (Lawo AG)
2014-02-06 16:00 ` Bill Pringlemeir
2014-02-11 8:01 ` Wiedemer, Thorsten (Lawo AG)
2014-02-11 15:25 ` Bill Pringlemeir
2014-02-12 15:18 ` AW: " Wiedemer, Thorsten (Lawo AG)
2014-02-12 17:46 ` Richard Weinberger
2014-02-12 18:11 ` AW: AW: " Bill Pringlemeir
2014-02-12 18:21 ` Bill Pringlemeir
2014-02-12 20:48 ` Richard Weinberger
2014-02-14 17:11 ` Bill Pringlemeir
2014-02-18 8:25 ` Ziegler, Emanuel (Lawo AG)
2014-02-19 11:09 ` Ziegler, Emanuel (Lawo AG)
2014-02-20 15:21 ` AW: AW: AW: " Wiedemer, Thorsten (Lawo AG)
2014-02-20 17:26 ` Bill Pringlemeir
2014-02-20 17:38 ` Bill Pringlemeir
2014-02-21 8:55 ` AW: AW: AW: " Wiedemer, Thorsten (Lawo AG)
2014-02-21 9:28 ` Quiniou, Benoit (Lawo AG)
2014-02-21 17:53 ` AW: " Bill Pringlemeir
2014-02-21 18:12 ` Richard Weinberger [this message]
2014-02-21 19:45 ` Bill Pringlemeir
2014-02-22 0:49 ` Bill Pringlemeir
2014-02-22 8:32 ` Richard Weinberger
2014-02-24 15:09 ` Bill Pringlemeir
2014-02-24 15:36 ` Richard Weinberger
2014-02-24 15:45 ` Bill Pringlemeir
2014-02-24 15:48 ` Bill Pringlemeir
2014-03-05 20:57 ` Richard Weinberger
2014-03-05 21:30 ` Bill Pringlemeir
2014-03-05 21:42 ` Bill Pringlemeir
2014-03-05 23:11 ` Richard Weinberger
2014-03-05 23:12 ` Richard Weinberger
2014-02-04 19:49 ` Andrew Ruder
2014-02-05 8:39 ` AW: " Wiedemer, Thorsten (Lawo AG)
2014-02-05 20:13 ` Andrew Ruder
2015-10-16 12:17 ` Wojciech Nizinski
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=53079725.6080105@nod.at \
--to=richard@nod.at \
--cc=Thorsten.Wiedemer@lawo.com \
--cc=bpringlemeir@nbsps.com \
--cc=linux-mtd@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.