public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Robert Richter <rric@kernel.org>
To: Weng Meiling <wengmeiling.weng@huawei.com>
Cc: oprofile-list@lists.sf.net, linux-kernel@vger.kernel.org,
	Li Zefan <lizefan@huawei.com>,
	wangnan0@huawei.com, "zhangwei(Jovi)" <jovi.zhangwei@huawei.com>,
	Huang Qiang <h.huangqiang@huawei.com>
Subject: Re: [PATCH] oprofile: check whether oprofile perf enabled in op_overflow_handler()
Date: Tue, 14 Jan 2014 16:05:53 +0100	[thread overview]
Message-ID: <20140114150553.GC20315@rric.localhost> (raw)
In-Reply-To: <52D4984B.9090600@huawei.com>

On 14.01.14 09:52:11, Weng Meiling wrote:
> On 2014/1/13 16:45, Robert Richter wrote:
> > On 20.12.13 15:49:01, Weng Meiling wrote:

> >> The problem was once triggered on kernel 2.6.34, the main information:
> >> <3>BUG: soft lockup - CPU#0 stuck for 60005ms! [opcontrol:8673]
> >>
> >> Pid: 8673, comm:            opcontrol
> >> =====================SOFTLOCKUP INFO BEGIN=======================
> >> [CPU#0] the task [opcontrol] is not waiting for a lock,maybe a delay or deadcricle!
> >> <6>opcontrol     R<c> running  <c>    0  8673   7603 0x00000002
> >> locked:
> >> bf0e1928   mutex            0  [<bf0de0d8>] oprofile_start+0x10/0x68 [oprofile]
> >> bf0e1a24   mutex            0  [<bf0e07f0>] op_arm_start+0x10/0x48 [oprofile]
> >> c0628020   &ctx->mutex      0  [<c00af85c>] perf_event_create_kernel_counter+0xa4/0x14c
> > 
> > I rather suspect the code of perf_install_in_context() of 2.6.34 to
> > cause the locking issue. There was a lot of rework in between there.
> > Can you further explain the locking and why your fix should solve it?
> > 
> Thanks for your answer!
> The locking happens when the event's sample_period is small which leads to cpu
> keeping printing the warning for the triggered unregistered event. So the thread
> context can't be executed and trigger softlockup.
> As you said below, the patch is not appropriate, and the patch just
> prevents printing the warning and thus stays shorter in the interrupt handler,
> it can't solve the problem. The problem was once triggered on kernel 2.6.34, I'll
> try to trigger it in current kernel and resend a correct patch.

Weng,

so an interrupt storm due to warning messages causes the lock.

I was looking further at it and wrote a patch that enables the event
after it was added to the perf_events list. This should fix spurious
overflows and its warning messages. Could you reproduce the issue with
a mainline kernel and then test with the patch below applied?

Thanks,

-Robert


From: Robert Richter <rric@kernel.org>
Date: Tue, 14 Jan 2014 15:19:54 +0100
Subject: [PATCH] oprofile_perf

Signed-off-by: Robert Richter <rric@kernel.org>
---
 drivers/oprofile/oprofile_perf.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/oprofile/oprofile_perf.c b/drivers/oprofile/oprofile_perf.c
index d5b2732..2b07c95 100644
--- a/drivers/oprofile/oprofile_perf.c
+++ b/drivers/oprofile/oprofile_perf.c
@@ -38,6 +38,9 @@ static void op_overflow_handler(struct perf_event *event,
 	int id;
 	u32 cpu = smp_processor_id();
 
+	/* sync perf_events with op_create_counter(): */
+	smp_rmb();
+
 	for (id = 0; id < num_counters; ++id)
 		if (per_cpu(perf_events, cpu)[id] == event)
 			break;
@@ -68,6 +71,7 @@ static void op_perf_setup(void)
 		attr->config		= counter_config[i].event;
 		attr->sample_period	= counter_config[i].count;
 		attr->pinned		= 1;
+		attr->disabled		= 1;
 	}
 }
 
@@ -94,6 +98,11 @@ static int op_create_counter(int cpu, int event)
 
 	per_cpu(perf_events, cpu)[event] = pevent;
 
+	/* sync perf_events with overflow handler: */
+	smp_wmb();
+
+	perf_event_enable(pevent);
+
 	return 0;
 }
 
-- 
1.8.4.2


  reply	other threads:[~2014-01-14 15:06 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-20  7:49 [PATCH] oprofile: check whether oprofile perf enabled in op_overflow_handler() Weng Meiling
2013-12-30  9:06 ` Weng Meiling
2014-01-13  8:45 ` Robert Richter
2014-01-14  1:52   ` Weng Meiling
2014-01-14 15:05     ` Robert Richter [this message]
2014-01-15  2:02       ` Weng Meiling
2014-01-15 10:24         ` Robert Richter
2014-01-16  1:09           ` Weng Meiling
2014-01-16  9:33             ` Weng Meiling
2014-01-16 11:52               ` Robert Richter
2014-01-16 19:36                 ` Will Deacon
2014-01-17  3:37                   ` Weng Meiling
2014-02-11  4:33                     ` Weng Meiling
2014-02-11 15:52                       ` Will Deacon
2014-02-11 18:05                         ` Will Deacon
2014-02-15  2:41                         ` Weng Meiling
2014-02-17 10:08                           ` Will Deacon
2014-02-17 11:39                             ` Weng Meiling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140114150553.GC20315@rric.localhost \
    --to=rric@kernel.org \
    --cc=h.huangqiang@huawei.com \
    --cc=jovi.zhangwei@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=oprofile-list@lists.sf.net \
    --cc=wangnan0@huawei.com \
    --cc=wengmeiling.weng@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox