All of lore.kernel.org
 help / color / mirror / Atom feed
From: Borislav Petkov <bp@amd64.org>
To: Mauro Carvalho Chehab <mchehab@redhat.com>
Cc: Borislav Petkov <bp@amd64.org>, Ingo Molnar <mingo@kernel.org>,
	Linux Edac Mailing List <linux-edac@vger.kernel.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Aristeu Rozanski <arozansk@redhat.com>,
	Doug Thompson <norsk5@yahoo.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH v24b] RAS: Add a tracepoint for reporting memory controller events
Date: Fri, 18 May 2012 20:52:58 +0200	[thread overview]
Message-ID: <20120518185258.GE20215@aftab.osrc.amd.com> (raw)
In-Reply-To: <4FB6866E.7090503@redhat.com>

On Fri, May 18, 2012 at 02:27:10PM -0300, Mauro Carvalho Chehab wrote:
> Em 18-05-2012 13:40, Borislav Petkov escreveu:
> > On Fri, May 18, 2012 at 11:31:46AM -0300, Mauro Carvalho Chehab wrote:
> >> Ok, but you won't use trace_sched_switch() as a memory tracepoint, as
> >> they represent different things.
> >>
> >> Memory errors are different than CPU errors. So, their tracepoints
> >> will be different.
> > 
> > WTF? A tracepoint is a tracepoint.
> 
> This is the event you've mentioned:
> 
> TRACE_EVENT(sched_switch,
> 
> 	TP_PROTO(struct task_struct *prev,
> 		 struct task_struct *next),
> 
> It doesn't have the same arguments of a memory tracepoint, like:
> 
> TRACE_EVENT(mm_vmscan_wakeup_kswapd,
> 
> 	TP_PROTO(int nid, int zid, int order),
> 
> trace events for different things will have different arguments.

It seems you and I speak two different english languages. Here's what I meant:

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 0533a688ce22..96947fb50328 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -1918,6 +1918,7 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev,
        prepare_lock_switch(rq, next);
        prepare_arch_switch(next);
        trace_sched_switch(prev, next);
+       trace_sched_switch_rq(prev, next, rq);
 }

because for some reason we want to dump something from the runqueue
pointer now. We can't change the current tracepoint because userspace
scripts use it.

Now imagine the bloat levels when people start adding
tracepoints left and right...

> > 
> >>> Right, and this is why I'm asking you to have the following tracepoint proto:
> >>>
> >>> +       TP_PROTO(const unsigned int err_type,
> >>> +                const unsigned int mc_index,
> >>> +                const char *error_msg,
> >>> +                const char *label,
> >>> +                const char *location,
> >>> +                const char *detail)
> >>>
> >>> where detail contains all the crap one driver adds for technical people
> >>> to pinpoint where the error is.
> >>>
> >>> And not have _TWO_ detail arguments!
> >>
> >> And what I'm saying is that this should be, instead:
> >>
> >> + TP_PROTO(const unsigned int err_type,
> >> +          const unsigned int mc_index,
> >> +          const char *error_msg,
> >> +          const char *label,
> >> +          int layer0,
> >> +          int layer1,
> >> +          int layer2,
> >> +          unsigned long pfn,
> >> +          unsigned long offset,
> >> +          unsigned long grain,
> >> +          unsigned long syndrome,
> >> +          const char *driver_detail),
> >>
> >> So, having just one detail argument, filled by the driver, and not
> >> folding "location" and core "details" into strings, but keeping as they
> >> are.
> > 
> > And this way you're enforcing an interface that all drivers will have
> > to adhere to. What if "grain" doesn't mean a thing for a driver,  or
> > "syndrome" or whatever? What if some other entity wants to use that
> > tracepoint?
> 
> This enforcement is already there at the EDAC API/ABI. This patch series
> won't change that.
> 
> Memory controllers that can't get the error address (page/offset/grain)
> or the syndrome fills those information with 0.

Setting something to 0 because it is N/A is the first sign that
something is wrong with your ABI. Having an arbitrary string which each
driver parses as it sees fit is perfectly fine.

> > See what I'm sayin?
> > 
> > Having
> > 
> >        TP_PROTO(const unsigned int err_type,
> >                 const unsigned int mc_index,
> >                 const char *error_msg,
> >                 const char *label,
> >                 const char *location,
> >                 const char *detail)
> > 
> > is a bit more generic and userspace can parse it however it likes.
> > 
> > Actually, I'd slim this up even more:
> > 
> >        TP_PROTO(const unsigned int mc_index,
> >                 const char *error_msg,
> >                 const char *label,
> >                 const char *location,
> >                 const char *detail)
> 
> Well, this even more generic:
> 	TP_PROTO(const char *msg)
> 
> but the more we convert other fields into strings, the harder is for
> userspace to handle it,

For the millionth time, it is not hard for userspace to parse anything!

> as fields may be on different order, string conversion patches might
> change or break the parsing behavior, different drivers will require
> different parsing expressions, etc.

That's why _each_ _driver_ will have its format and the userspace tools
parsing it will know about it!

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

  reply	other threads:[~2012-05-18 18:53 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-05-17 20:41 [PATCH v24b] RAS: Add a tracepoint for reporting memory controller events Mauro Carvalho Chehab
2012-05-17 21:48 ` Borislav Petkov
2012-05-18  7:12   ` Ingo Molnar
2012-05-18  9:56     ` Borislav Petkov
2012-05-18 10:59       ` Mauro Carvalho Chehab
2012-05-18 12:43         ` Borislav Petkov
2012-05-18 13:23           ` Mauro Carvalho Chehab
2012-05-18 14:05             ` Borislav Petkov
2012-05-18 14:31               ` Mauro Carvalho Chehab
2012-05-18 16:40                 ` Borislav Petkov
2012-05-18 17:27                   ` Mauro Carvalho Chehab
2012-05-18 18:52                     ` Borislav Petkov [this message]
2012-05-18 19:10                       ` Luck, Tony
2012-05-18 21:12                         ` Borislav Petkov
2012-05-19  9:26                           ` Borislav Petkov
2012-05-21 15:29                             ` Mauro Carvalho Chehab
2012-05-21 16:00                               ` Borislav Petkov
2012-05-21 16:40                                 ` Mauro Carvalho Chehab
2012-05-21 20:40                                   ` Borislav Petkov
2012-05-22  3:04                                     ` Mauro Carvalho Chehab
2012-05-22  9:28                                       ` Borislav Petkov
2012-05-22 10:18                                         ` Mauro Carvalho Chehab
2012-05-22 13:05                                           ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120518185258.GE20215@aftab.osrc.amd.com \
    --to=bp@amd64.org \
    --cc=arozansk@redhat.com \
    --cc=fweisbec@gmail.com \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@redhat.com \
    --cc=mingo@kernel.org \
    --cc=mingo@redhat.com \
    --cc=norsk5@yahoo.com \
    --cc=rostedt@goodmis.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.