From: Peter Hurley <peter@hurleysoftware.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Dave Jones <davej@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
Rik van Riel <riel@surriel.com>,
Davidlohr Bueso <davidlohr.bueso@hp.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
hhuang@redhat.com, "Low, Jason" <jason.low2@hp.com>,
Michel Lespinasse <walken@google.com>,
Larry Woodman <lwoodman@redhat.com>,
"Vinod, Chegu" <chegu_vinod@hp.com>,
Stanislav Kinsbursky <skinsbursky@parallels.com>
Subject: Re: ipc,sem: sysv semaphore scalability
Date: Fri, 29 Mar 2013 15:36:14 -0400 [thread overview]
Message-ID: <1364585774.31320.9.camel@thor.lan> (raw)
In-Reply-To: <CA+55aFyj2zfumVUJ7JMhJkQY_LiyQYX+tK4Cu4Jg=g72-amC2g@mail.gmail.com>
On Fri, 2013-03-29 at 12:26 -0700, Linus Torvalds wrote:
> On Fri, Mar 29, 2013 at 12:06 PM, Dave Jones <davej@redhat.com> wrote:
> >
> > Here's an oops I just hit..
> >
> > BUG: unable to handle kernel NULL pointer dereference at 000000000000000f
> > IP: [<ffffffff812c24ca>] testmsg.isra.5+0x1a/0x60
>
> Btw, looking at the code leading up to this, what the f*ck is wrong
> with the IPC stuff?
>
> It's using the generic list stuff for most of the lists, but then it
> open-codes the accesses.
>
> So instead of using
>
> for_each_entry(walk_msg, &msq->q_messages, m_list) {
> ..
> }
>
> the ipc/msg.c code does all that by hand, with
>
> tmp = msq->q_messages.next;
> while (tmp != &msq->q_messages) {
> struct msg_msg *walk_msg;
>
> walk_msg = list_entry(tmp, struct msg_msg, m_list);
> ...
> tmp = tmp->next;
> }
>
> Ugh. The code is near unreadable. And then it has magic memory
> barriers etc, implying that it doesn't lock the data structures, but
> no comments about them. See expunge_all() and pipelined_send().
>
> The code seems entirely random, and it's badly set up (annoyance of
> the day: crazy helper functions in ipc/msgutil.c to make sure that (a)
> you have to spend more effort looking for them, and (b) they won't get
> inlined).
>
> Clearly nobody has cared for the crazy IPC message code in a long time.
Exactly that's what my patch series does; clean this mess up.
This is what I wrote to Andrew a couple of days ago.
On Tue, 2013-03-26 at 22:33 -0400, Peter Hurley wrote:
I just figured out how the queue is being corrupted and why my series
> fixes it.
>
>
> With MSG_COPY set, the queue scan can exit with the local variable
'msg'
> pointing to a real msg if the msg_counter never reaches the
copy_number.
>
> The remaining execution looks like this:
>
> if (!IS_ERR(msg)) {
> ....
> if (msgflg & MSG_COPY)
> goto out_unlock;
> ....
>
> out_unlock:
> msg_unlock(msq);
> break;
> }
> }
> if (IS_ERR(msg))
> ....
>
> bufsz = msg_handler();
> free_msg(msg); <<---- msg never unlinked
>
>
> Since the msg should not have been found (because it failed the match
> criteria), the if (!IS_ERR(msg)) clause should never have executed.
>
> That's why my refactor fixes resolve this; because msg is not
> inadvertently treated as a found msg.
>
> But let's be honest; the real bug here is the poor structure of this
> function that even made this possible. The deepest nesting executes a
> goto to a label in the middle of an if clause. Yuck! No wonder this
> thing's fragile.
>
> So my recommendation still stands. The series that fixes this has been
> getting tested in linux-next for a month. Fixing this some other way
is
> just asking for more trouble.
>
> But why not just revert MSG_COPY altogether for 3.9?
Regards,
Peter Hurley
next prev parent reply other threads:[~2013-03-29 19:36 UTC|newest]
Thread overview: 129+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-03-20 19:55 ipc,sem: sysv semaphore scalability Rik van Riel
2013-03-20 19:55 ` [PATCH 1/7] ipc: remove bogus lock comment for ipc_checkid Rik van Riel
2013-03-20 19:55 ` [PATCH 2/7] ipc: introduce obtaining a lockless ipc object Rik van Riel
2013-03-20 19:55 ` [PATCH 3/7] ipc: introduce lockless pre_down ipcctl Rik van Riel
2013-03-20 19:55 ` [PATCH 4/7] ipc,sem: do not hold ipc lock more than necessary Rik van Riel
2013-03-20 19:55 ` [PATCH 5/7] ipc,sem: open code and rename sem_lock Rik van Riel
2013-03-22 1:14 ` Davidlohr Bueso
2013-03-20 19:55 ` [PATCH 6/7] ipc,sem: have only one list in struct sem_queue Rik van Riel
2013-03-22 1:14 ` Davidlohr Bueso
2013-03-20 19:55 ` [PATCH 7/7] ipc,sem: fine grained locking for semtimedop Rik van Riel
2013-03-22 1:14 ` Davidlohr Bueso
2013-03-22 23:01 ` Michel Lespinasse
2013-03-22 23:38 ` Rik van Riel
2013-03-22 23:42 ` [PATCH 7/7 part3] fix for sem_lock Rik van Riel
2013-03-20 20:49 ` ipc,sem: sysv semaphore scalability Linus Torvalds
2013-03-20 20:56 ` Linus Torvalds
2013-03-20 20:57 ` Davidlohr Bueso
2013-03-21 21:10 ` Andrew Morton
2013-03-21 21:47 ` Peter Hurley
2013-03-21 21:50 ` Peter Hurley
2013-03-21 22:01 ` Andrew Morton
2013-03-22 3:38 ` Rik van Riel
2013-03-26 19:28 ` Dave Jones
2013-03-26 19:43 ` Andrew Morton
2013-03-29 16:17 ` Dave Jones
2013-03-29 18:00 ` Linus Torvalds
2013-03-29 18:04 ` Dave Jones
2013-03-29 18:10 ` Linus Torvalds
2013-03-29 18:43 ` Linus Torvalds
2013-03-29 19:06 ` Dave Jones
2013-03-29 19:13 ` Linus Torvalds
2013-03-29 19:26 ` Linus Torvalds
2013-03-29 19:36 ` Peter Hurley [this message]
2013-04-02 16:08 ` Sasha Levin
2013-04-02 17:24 ` Linus Torvalds
2013-04-02 17:52 ` Linus Torvalds
2013-04-02 19:53 ` Sasha Levin
2013-04-02 20:00 ` Dave Jones
2013-03-29 19:33 ` Peter Hurley
2013-03-29 19:54 ` Linus Torvalds
2013-04-01 7:40 ` Stanislav Kinsbursky
2013-03-29 20:41 ` Linus Torvalds
2013-03-29 21:12 ` Linus Torvalds
2013-03-29 23:16 ` Linus Torvalds
2013-03-30 1:36 ` Emmanuel Benisty
2013-03-30 2:08 ` Davidlohr Bueso
2013-03-30 3:02 ` Emmanuel Benisty
2013-03-30 3:46 ` Linus Torvalds
2013-03-30 4:33 ` Emmanuel Benisty
2013-03-30 5:10 ` Linus Torvalds
2013-03-30 5:57 ` Emmanuel Benisty
2013-03-30 17:22 ` Linus Torvalds
2013-03-31 2:38 ` Emmanuel Benisty
2013-03-31 5:01 ` Davidlohr Bueso
2013-03-31 13:45 ` Rik van Riel
2013-03-31 17:10 ` Linus Torvalds
2013-03-31 17:02 ` Emmanuel Benisty
2013-03-30 2:09 ` Linus Torvalds
2013-03-30 2:55 ` Davidlohr Bueso
2013-03-29 19:01 ` Dave Jones
2013-05-03 15:03 ` Peter Hurley
2013-03-22 1:12 ` Davidlohr Bueso
2013-03-22 1:23 ` Linus Torvalds
2013-03-22 3:40 ` Rik van Riel
2013-03-22 7:30 ` Mike Galbraith
2013-03-22 11:04 ` Emmanuel Benisty
2013-03-22 15:37 ` Linus Torvalds
2013-03-23 3:19 ` Emmanuel Benisty
2013-03-23 19:45 ` Linus Torvalds
2013-03-24 13:46 ` Emmanuel Benisty
2013-03-24 17:10 ` Linus Torvalds
2013-03-25 13:47 ` Emmanuel Benisty
2013-03-25 14:00 ` Rik van Riel
2013-03-25 14:03 ` Rik van Riel
2013-03-25 15:20 ` Emmanuel Benisty
2013-03-25 15:53 ` Rik van Riel
2013-03-25 17:09 ` Emmanuel Benisty
2013-03-25 14:01 ` Rik van Riel
2013-03-25 14:21 ` Emmanuel Benisty
2013-03-26 17:59 ` Davidlohr Bueso
2013-03-26 18:14 ` Rik van Riel
2013-03-26 18:35 ` Andrew Morton
2013-04-16 23:30 ` Andrew Morton
2013-05-04 15:55 ` Jörn Engel
2013-05-04 18:12 ` Borislav Petkov
2013-05-06 14:47 ` Jörn Engel
2013-03-22 17:51 ` Davidlohr Bueso
2013-03-25 20:21 ` Sasha Levin
2013-03-25 20:38 ` [PATCH -mm -next] ipc,sem: fix lockdep false positive Rik van Riel
2013-03-25 21:42 ` Michel Lespinasse
2013-03-25 21:51 ` Michel Lespinasse
2013-03-25 21:56 ` Sasha Levin
2013-03-25 21:52 ` Sasha Levin
2013-03-26 13:19 ` Peter Zijlstra
2013-03-26 13:40 ` Michel Lespinasse
2013-03-26 14:27 ` Peter Zijlstra
2013-03-26 15:19 ` Rik van Riel
2013-03-27 8:40 ` Peter Zijlstra
2013-03-27 8:42 ` Peter Zijlstra
2013-03-27 11:22 ` Michel Lespinasse
2013-03-27 12:02 ` Peter Zijlstra
2013-03-27 20:00 ` Rik van Riel
2013-03-28 20:23 ` [PATCH v2 " Rik van Riel
2013-03-29 2:50 ` Michel Lespinasse
2013-03-29 9:57 ` Peter Zijlstra
2013-03-29 13:21 ` Michel Lespinasse
2013-03-29 12:07 ` Rik van Riel
2013-03-29 13:08 ` Michel Lespinasse
2013-03-29 13:24 ` Rik van Riel
2013-03-29 13:55 ` [PATCH v3 " Rik van Riel
2013-03-29 13:59 ` Michel Lespinasse
2013-03-26 14:25 ` [PATCH " Rik van Riel
2013-03-26 17:33 ` ipc,sem: sysv semaphore scalability Sasha Levin
2013-03-26 17:51 ` Davidlohr Bueso
2013-03-26 18:07 ` Sasha Levin
2013-03-26 18:17 ` Rik van Riel
2013-03-26 20:00 ` [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo Rik van Riel
2013-04-05 4:38 ` Mike Galbraith
2013-04-05 13:21 ` Rik van Riel
2013-04-05 16:26 ` Mike Galbraith
2013-04-16 12:37 ` Mike Galbraith
2013-03-26 17:55 ` ipc,sem: sysv semaphore scalability Paul E. McKenney
2013-03-28 15:32 ` [PATCH -mm -next] ipc,sem: untangle RCU locking with find_alloc_undo Rik van Riel
2013-03-28 21:05 ` Davidlohr Bueso
2013-03-29 1:00 ` Michel Lespinasse
2013-03-29 1:14 ` Sasha Levin
2013-03-30 13:35 ` Sasha Levin
2013-03-31 1:30 ` Rik van Riel
2013-03-31 4:09 ` Davidlohr Bueso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1364585774.31320.9.camel@thor.lan \
--to=peter@hurleysoftware.com \
--cc=akpm@linux-foundation.org \
--cc=chegu_vinod@hp.com \
--cc=davej@redhat.com \
--cc=davidlohr.bueso@hp.com \
--cc=hhuang@redhat.com \
--cc=jason.low2@hp.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lwoodman@redhat.com \
--cc=riel@surriel.com \
--cc=skinsbursky@parallels.com \
--cc=torvalds@linux-foundation.org \
--cc=walken@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.