From: Jason Baron <jbaron@akamai.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Sasha Levin <sasha.levin@oracle.com>,
	Ingo Molnar <mingo@kernel.org>,
	"acme@ghostprotocols.net" <acme@ghostprotocols.net>,
	LKML <linux-kernel@vger.kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Dave Jones <davej@redhat.com>,
	"edumazet@google.com" <edumazet@google.com>,
	"viro@zeniv.linux.org.uk" <viro@zeniv.linux.org.uk>
Subject: Re: eventpoll __list_del_entry corruption
Date: Tue, 03 Jun 2014 11:07:11 -0400
Message-ID: <538DE49F.7040904@akamai.com>
In-Reply-To: <20140515181102.GH11096@twins.programming.kicks-ass.net>

On 05/15/2014 02:11 PM, Peter Zijlstra wrote:
> On Mon, May 12, 2014 at 11:42:33AM -0400, Sasha Levin wrote:
>> Hi all,
>>
>> While fuzzing with trinity inside a KVM tools guest running the latest -next
>> kernel I've stumbled on the following spew. Maybe related to the very recent
>> change in freeing on task exit?
>>
> 
> While fuzzing to reproduce; I hit this one, is it a known one or should
> I go poke the right people about it?
> 
> ---
> [ 5823.689985] ------------[ cut here ]------------
> [ 5823.690004] WARNING: CPU: 3 PID: 2508 at /usr/src/linux-2.6/lib/list_debug.c:59 __list_del_entry+0xa1/0xd0()
> [ 5823.690004] list_del corruption. prev->next should be ffff880131111de0, but was 6b6b6b6b6b6b6b6b
> [ 5823.690004] Modules linked in:
> [ 5823.690004] CPU: 3 PID: 2508 Comm: trinity-main Not tainted 3.15.0-rc5-01700-g505011124ad0-dirty #1072
> [ 5823.690004] Hardware name: Supermicro X8DTN/X8DTN, BIOS 4.6.3 01/08/2010
> [ 5823.690004]  0000000000000009 ffff880432709ca8 ffffffff81681aa2 ffff880432709cf0
> [ 5823.690004]  ffff880432709ce0 ffffffff8109807c ffff880131111de0 ffff880131111dc8
> [ 5823.690004]  0000000000000286 ffff8800b9dd5618 ffff88023699b720 ffff880432709d40
> [ 5823.690004] Call Trace:
> [ 5823.690004]  [<ffffffff81681aa2>] dump_stack+0x4e/0x7a
> [ 5823.690004]  [<ffffffff8109807c>] warn_slowpath_common+0x8c/0xc0
> [ 5823.690004]  [<ffffffff8109816c>] warn_slowpath_fmt+0x4c/0x50
> [ 5823.690004]  [<ffffffff810ec8bf>] ? do_raw_spin_lock+0x13f/0x160
> [ 5823.690004]  [<ffffffff8138c661>] __list_del_entry+0xa1/0xd0
> [ 5823.690004]  [<ffffffff8138c69d>] list_del+0xd/0x30
> [ 5823.690004]  [<ffffffff810dfa71>] remove_wait_queue+0x31/0x50
> [ 5823.690004]  [<ffffffff812152aa>] ep_unregister_pollwait.isra.9+0x6a/0xb0
> [ 5823.690004]  [<ffffffff81215268>] ? ep_unregister_pollwait.isra.9+0x28/0xb0
> [ 5823.690004]  [<ffffffff8121531f>] ep_remove+0x2f/0xe0
> [ 5823.690004]  [<ffffffff81215705>] eventpoll_release_file+0x65/0xa0
> [ 5823.690004]  [<ffffffff811cf259>] __fput+0x1d9/0x1e0
> [ 5823.690004]  [<ffffffff811cf2ae>] ____fput+0xe/0x10
> [ 5823.690004]  [<ffffffff810b91f4>] task_work_run+0xc4/0xe0
> [ 5823.690004]  [<ffffffff8109a544>] do_exit+0x2d4/0xa90
> [ 5823.690004]  [<ffffffff813825c4>] ? lockdep_sys_exit_thunk+0x35/0x67
> [ 5823.690004]  [<ffffffff8109ae2c>] do_group_exit+0x4c/0xc0
> [ 5823.690004]  [<ffffffff8109aeb7>] SyS_exit_group+0x17/0x20
> [ 5823.690004]  [<ffffffff8168a2c2>] system_call_fastpath+0x16/0x1b
> [ 5823.690004] ---[ end trace 515b7fa3169c0906 ]---
> 


Hi Peter,

If it's possible to reproduce, maybe we can apply the following debug
patch to at least get a clue about which wait queue has gotten
corrupted. The bug could also be isolated to the epoll core (i.e.
not specific to a particular wait queue), but I think it's worth a
shot...
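
To illustrate what the check in the patch looks for, here is a minimal
userspace sketch. It is not the kernel code: struct list_node and the
helper names are hypothetical stand-ins for struct list_head and
friends. The 0x6b fill mimics the slab poison byte seen in the report
above, on the assumption that the neighbouring entry was freed while
ours was still queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_node_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static void list_node_add_tail(struct list_node *entry, struct list_node *head)
{
	assert(entry != NULL && head != NULL);
	entry->prev = head->prev;
	entry->next = head;
	head->prev->next = entry;
	head->prev = entry;
}

/* Same test as check_pwq(): does the previous node still point at us? */
static bool links_ok(const struct list_node *entry)
{
	return entry->prev->next == entry;
}

/* Returns true if the check catches a neighbour overwritten with 0x6b. */
static bool demo_poisoned_neighbour(void)
{
	struct list_node head, a, b;
	bool ok_before, ok_after;

	list_node_init(&head);
	list_node_add_tail(&a, &head);
	list_node_add_tail(&b, &head);
	ok_before = links_ok(&b);	/* a.next == &b */

	/* Simulate "a" being freed and poisoned while "b" is still queued. */
	memset(&a, 0x6b, sizeof(a));
	ok_after = links_ok(&b);	/* a.next is now 0x6b6b... */

	return ok_before && !ok_after;
}
```

The poisoned pointer is only compared, never dereferenced, so the
check itself is safe to run just before the real list_del().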

Thanks,

-Jason


diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index af90312..e8d5ea7 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -237,6 +237,9 @@ struct eppoll_entry {
 
 	/* The wait queue head that linked the "wait" wait queue item */
 	wait_queue_head_t *whead;
+
+	/* DEBUG: save address of ep_ptable_queue_proc() caller */
+	unsigned long poll_wait_addr;
 };
 
 /* Wrapper struct used by poll queueing */
@@ -513,6 +516,21 @@ static void ep_poll_safewake(wait_queue_head_t *wq)
 	put_cpu();
 }
 
+static void check_pwq(struct eppoll_entry *pwq)
+{
+	unsigned long flags;
+	struct list_head *prev, *entry;
+
+	spin_lock_irqsave(&pwq->whead->lock, flags);
+	entry = &pwq->wait.task_list;
+	prev = entry->prev;
+	if (prev->next != entry)
+		pr_err("epoll: list corruption: queue caller addr: 0x%lx, "
+			"function: %pS\n", pwq->poll_wait_addr,
+			(void *)pwq->poll_wait_addr);
+	spin_unlock_irqrestore(&pwq->whead->lock, flags);
+}
+
 static void ep_remove_wait_queue(struct eppoll_entry *pwq)
 {
 	wait_queue_head_t *whead;
@@ -520,8 +538,10 @@ static void ep_remove_wait_queue(struct eppoll_entry *pwq)
 	rcu_read_lock();
 	/* If it is cleared by POLLFREE, it should be rcu-safe */
 	whead = rcu_dereference(pwq->whead);
-	if (whead)
+	if (whead) {
+		check_pwq(pwq);
 		remove_wait_queue(whead, &pwq->wait);
+	}
 	rcu_read_unlock();
 }
 
@@ -1101,6 +1121,7 @@ static void ep_ptable_queue_proc(struct file *file, wait_queue_head_t *whead,
 		add_wait_queue(whead, &pwq->wait);
 		list_add_tail(&pwq->llink, &epi->pwqlist);
 		epi->nwait++;
+		pwq->poll_wait_addr = (unsigned long)__builtin_return_address(0);
 	} else {
 		/* We have to signal that an error occurred */
 		epi->nwait = -1;
