All of lore.kernel.org
 help / color / mirror / Atom feed
From: Manfred Spraul <manfred@colorfullife.com>
To: Linus Torvalds <torvalds@osdl.org>
Cc: Falk Hueffner <falk@debian.org>, Ingo Molnar <mingo@elte.hu>,
	Andrew Morton <akpm@osdl.org>,
	"Paul E. McKenney" <paulmck@us.ibm.com>,
	linux-kernel@vger.kernel.org
Subject: Re: ipc/msg.c "cleanup" breaks fakeroot on Alpha
Date: Sun, 05 Nov 2006 17:02:39 +0100	[thread overview]
Message-ID: <454E0B1F.7090106@colorfullife.com> (raw)
In-Reply-To: <Pine.LNX.4.64.0611041019180.25218@g5.osdl.org>

Linus Torvalds wrote:

>[ Removed the kernel mailing list ]
>  
>
[kernel mailing list added back]

>As far as I can tell, people hold one or the other, but not both, and 
>happily do strange things to "r_msg". The code seems to _know_ that it is 
>racy, since in addition to the volatile, it does things like:
>
>		...
>                msr->r_msg = NULL;
>                wake_up_process(msr->r_tsk);
>                smp_mb();
>                msr->r_msg = ERR_PTR(res);
>		...
>
>and that memory barrier again doesn't really seem to make a whole lot of 
>sense.
>
>  
>
msr is a msg_receiver structure. The structure is stored on the stack of 
msr->r_tsk.
The smp_mb() guarantees that the wake_up_process is complete before 
ERR_PTR(res) is stored into msr->r_msg.

>But I don't know. It clearly _tries_ to do some smart locking, I just 
>don't see what the rules are. 
>
>  
>
The codes tries to to a lockless receive:
- the mutex is only required to create/destroy queues.
- normal queue operations are protected by msg_lock(msqid), which is a 
spinlock. One spinlock for each queue.
- if a receiving thread doesn't find a message, then it adds a 
msg_receiver structure into msq->q_receivers. This linked list is stored 
in the message queue structure and protected by msg_lock(msqid).
- if a sending thread finds a msg_receiver structure, then it removes 
the structure from the msq->q_receivers linked list, places the message 
into msr->r_msg and wakes up the receiver. All operations happen under 
msg_lock(msqid)
- the receiver notices that there is a message in it's msr->r_msg 
structure and copies it to user space, without acquiring msg_lock(msqid).

ipc/sem.c uses the same idea, I added a longer block with documentation 
(around line 150 in ipc/sem.c)

I'm only aware of one tricky race:
- the sender calls wake_up_process().
- as soon as the receiver finds something in r_msg, it can return to 
user space. user space can call exit(3). do_exit destroys the task 
structure.
- wake_up_process will cause an oops if it's called after do_exit().
This race happened on s390. The solution is this block:

                msr->r_msg = NULL;
                wake_up_process(msr->r_tsk);
                smp_mb();
                msr->r_msg = ERR_PTR(res);

Initially, r_msg is ERR_PTR(-EAGAIN). The sender first sets it to NULL 
("message pending"), then calls wake_up_process(), then a memory 
barrier, then the final value.

Back to the bug report:
"volatile" shouln't be necessary. The critical part is this loop:

                msg = (struct msg_msg*)msr_d.r_msg;
                while (msg == NULL) {
                        cpu_relax();
                        msg = (struct msg_msg *)msr_d.r_msg;
                }
And cpu_relax is a barrier().
On i386, removing the "volatile" has no effect, the .o remains identical.
Falk, could you compare the .o files with/without volatile? Are there 
any differences?

The oops was caused by try_to_wake_up, called by expunge_all.
I.e.:
- either the msq->q_receivers linked list got corrupted
- or the target thread was destroyed before wake_up_process completed.
Theoretically, both things are impossible:
- msq->q_receivers is protected by msq_lock()
- the target thread task_struct is guaranteed to remain in scope due to 
the "msg == NULL" loop.

I'll try to reproduce the oops on i386 - but I don't see a bug right now.

--
    Manfred

  parent reply	other threads:[~2006-11-05 16:03 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-11-04 16:49 ipc/msg.c "cleanup" breaks fakeroot on Alpha Falk Hueffner
2006-11-04 17:29 ` Ingo Molnar
2006-11-04 17:41   ` Linus Torvalds
2006-11-04 18:12     ` Falk Hueffner
     [not found]       ` <Pine.LNX.4.64.0611041019180.25218@g5.osdl.org>
2006-11-05 16:02         ` Manfred Spraul [this message]
2006-11-06  5:57           ` Paul E. McKenney
2006-11-06  6:23             ` Manfred Spraul
2006-11-06 16:18               ` Paul E. McKenney

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=454E0B1F.7090106@colorfullife.com \
    --to=manfred@colorfullife.com \
    --cc=akpm@osdl.org \
    --cc=falk@debian.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@elte.hu \
    --cc=paulmck@us.ibm.com \
    --cc=torvalds@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.