All of lore.kernel.org
 help / color / mirror / Atom feed
From: Robert Richter <rric@kernel.org>
To: Weng Meiling <wengmeiling.weng@huawei.com>
Cc: oprofile-list@lists.sf.net, linux-kernel@vger.kernel.org,
	Li Zefan <lizefan@huawei.com>,
	wangnan0@huawei.com, "zhangwei(Jovi)" <jovi.zhangwei@huawei.com>,
	Huang Qiang <h.huangqiang@huawei.com>
Subject: Re: [PATCH] oprofile: check whether oprofile perf enabled in op_overflow_handler()
Date: Tue, 14 Jan 2014 16:05:53 +0100	[thread overview]
Message-ID: <20140114150553.GC20315@rric.localhost> (raw)
In-Reply-To: <52D4984B.9090600@huawei.com>

On 14.01.14 09:52:11, Weng Meiling wrote:
> On 2014/1/13 16:45, Robert Richter wrote:
> > On 20.12.13 15:49:01, Weng Meiling wrote:

> >> The problem was once triggered on kernel 2.6.34, the main information:
> >> <3>BUG: soft lockup - CPU#0 stuck for 60005ms! [opcontrol:8673]
> >>
> >> Pid: 8673, comm:            opcontrol
> >> =====================SOFTLOCKUP INFO BEGIN=======================
> >> [CPU#0] the task [opcontrol] is not waiting for a lock,maybe a delay or deadcricle!
> >> <6>opcontrol     R<c> running  <c>    0  8673   7603 0x00000002
> >> locked:
> >> bf0e1928   mutex            0  [<bf0de0d8>] oprofile_start+0x10/0x68 [oprofile]
> >> bf0e1a24   mutex            0  [<bf0e07f0>] op_arm_start+0x10/0x48 [oprofile]
> >> c0628020   &ctx->mutex      0  [<c00af85c>] perf_event_create_kernel_counter+0xa4/0x14c
> > 
> > I rather suspect the code of perf_install_in_context() of 2.6.34 to
> > cause the locking issue. There was a lot of rework in between there.
> > Can you further explain the locking and why your fix should solve it?
> > 
> Thanks for your answer!
> The locking happens when the event's sample_period is small which leads to cpu
> keeping printing the warning for the triggered unregistered event. So the thread
> context can't be executed and trigger softlockup.
> As you said below, the patch is not appropriate, and the patch just
> prevents printing the warning and thus stays shorter in the interrupt handler,
> it can't solve the problem. The problem was once triggered on kernel 2.6.34, I'll
> try to trigger it in current kernel and resend a correct patch.

Weng,

so an interrupt storm due to warning messages causes the lock.

I was looking further at it and wrote a patch that enables the event
after it was added to the perf_events list. This should fix spurious
overflows and its warning messages. Could you reproduce the issue with
a mainline kernel and then test with the patch below applied?

Thanks,

-Robert


From: Robert Richter <rric@kernel.org>
Date: Tue, 14 Jan 2014 15:19:54 +0100
Subject: [PATCH] oprofile_perf

Signed-off-by: Robert Richter <rric@kernel.org>
---
 drivers/oprofile/oprofile_perf.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/drivers/oprofile/oprofile_perf.c b/drivers/oprofile/oprofile_perf.c
index d5b2732..2b07c95 100644
--- a/drivers/oprofile/oprofile_perf.c
+++ b/drivers/oprofile/oprofile_perf.c
@@ -38,6 +38,9 @@ static void op_overflow_handler(struct perf_event *event,
 	int id;
 	u32 cpu = smp_processor_id();
 
+	/* sync perf_events with op_create_counter(): */
+	smp_rmb();
+
 	for (id = 0; id < num_counters; ++id)
 		if (per_cpu(perf_events, cpu)[id] == event)
 			break;
@@ -68,6 +71,7 @@ static void op_perf_setup(void)
 		attr->config		= counter_config[i].event;
 		attr->sample_period	= counter_config[i].count;
 		attr->pinned		= 1;
+		attr->disabled		= 1;
 	}
 }
 
@@ -94,6 +98,11 @@ static int op_create_counter(int cpu, int event)
 
 	per_cpu(perf_events, cpu)[event] = pevent;
 
+	/* sync perf_events with overflow handler: */
+	smp_wmb();
+
+	perf_event_enable(pevent);
+
 	return 0;
 }
 
-- 
1.8.4.2


  reply	other threads:[~2014-01-14 15:06 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-20  7:49 [PATCH] oprofile: check whether oprofile perf enabled in op_overflow_handler() Weng Meiling
2013-12-30  9:06 ` Weng Meiling
2014-01-13  8:45 ` Robert Richter
2014-01-14  1:52   ` Weng Meiling
2014-01-14 15:05     ` Robert Richter [this message]
2014-01-15  2:02       ` Weng Meiling
2014-01-15 10:24         ` Robert Richter
2014-01-16  1:09           ` Weng Meiling
2014-01-16  9:33             ` Weng Meiling
2014-01-16 11:52               ` Robert Richter
2014-01-16 19:36                 ` Will Deacon
2014-01-17  3:37                   ` Weng Meiling
2014-02-11  4:33                     ` Weng Meiling
2014-02-11 15:52                       ` Will Deacon
2014-02-11 18:05                         ` Will Deacon
2014-02-15  2:41                         ` Weng Meiling
2014-02-17 10:08                           ` Will Deacon
2014-02-17 11:39                             ` Weng Meiling

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140114150553.GC20315@rric.localhost \
    --to=rric@kernel.org \
    --cc=h.huangqiang@huawei.com \
    --cc=jovi.zhangwei@huawei.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=lizefan@huawei.com \
    --cc=oprofile-list@lists.sf.net \
    --cc=wangnan0@huawei.com \
    --cc=wengmeiling.weng@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.