From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Jan Beulich <JBeulich@novell.com>
Cc: xen-devel@lists.xensource.com, crrodriguez@suse.de,
	nhorman@tuxdriver.com, bwalle@suse.de, pbrobinson@gmail.com,
	notting@redhat.com, arjan@linux.intel.com, anibal@debian.org
Subject: Re: irqbalance seg faults with 2.6.38 or later kernels [patch + solution included] if running under Xen hypervisor
Date: Wed, 11 May 2011 09:10:58 -0400
Message-ID: <20110511131058.GA4130@dumpdata.com>
In-Reply-To: <4DCA62150200007800040EEA@vpn.id2.novell.com>

On Wed, May 11, 2011 at 09:16:53AM +0100, Jan Beulich wrote:
> >>> On 11.05.11 at 02:33, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote:
> > The reason behind it is that irqbalance parses the /proc/interrupts
> > and whenever it hits something it can't understand:
> > 
> >  RES:  191614137   73904910    Rescheduling interrupts
> > 
> > It will count the number of interrupts towards the IRQ 0. That IRQ does
> > exist when the kernel boots under baremetal:
> > 
> >   0:         46          0       IO-APIC-edge      timer
> > 
> > but under Xen, the timer interrupts are initialized much later:
> > 
> >  272:   41197188          0        xen-percpu-virq      timer0
> > 
> > and the first IRQ that is used is not zero, but rather one:
> > 
> >    1:      73037          0          0          0          0          0  xen-pirq-ioapic-edge  i8042
> > 
> > so when irqbalance tries to account for the IRQ 'RES' to the IRQ 0
> > it fails and segfaults. The attached patch fixes it for whoever else is
> > hitting this problem.
> 
> In the svn snapshot I have, I see
> 
> 		/* lines with letters in front are special, like NMI count. Ignore */
> 		if (!(line[0]==' ' || (line[0]>='0' && line[0]<='9')))
> 			break;
> 
> which I would think should be taking care of your problem (or
> I mis-read your description), and which was there already before

Not anymore. In 2.6.37 kernels:

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       
.. snip.
NMI:          0          0          0          0   Non-maskable interrupts
LOC:   12413629   12858323   16296183   11098466   Local timer interrupts

In 2.6.38 and later:
            CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       
 TRM:          0          0          0          0          0          0   Thermal event interrupts
 THR:          0          0          0          0          0          0   Threshold APIC interrupts
 MCE:          0          0          0          0          0          0   Machine check exceptions

They added a space before the name, so line[0] is now ' ' for these
rows and the check you mentioned no longer rejects them. That check
could of course be extended to handle this, as an alternative to my
patch.
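
As a rough sketch of that alternative (not tested against the irqbalance
tree; the helper name is_numbered_irq_line is made up for illustration
and does not exist in irqbalance), the check could skip the leading
padding before looking for a digit:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper, not part of irqbalance.
 * Returns 1 if the line, after optional leading blanks, starts with a
 * digit (a numbered IRQ row such as " 272: ..."), and 0 otherwise
 * (rows such as " TRM: ..." or the CPU header line). */
static int is_numbered_irq_line(const char *line)
{
	if (line == NULL)
		return 0;
	/* Skip the left padding that 2.6.38 and later put before the name. */
	while (*line == ' ' || *line == '\t')
		line++;
	return isdigit((unsigned char)*line) ? 1 : 0;
}
```

The parsing loop would then stop (or skip the row) when this returns 0,
instead of testing only line[0].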

> 0.56. Or are you perhaps having the problem because you have
> 1000+ interrupts, thus causing even the non-numeric strings to
> get space padded on their left? In that case I'd rather think above
> check should be either improved or removed (replaced by your
> solution).
> 
> > I am not sure who the upstream maintainer is for this so
> > I am sending this patch to the different distros as well.
> 
> Copying Neil and Arjan.
> 
> Jan
> 
> > 
> > --- irqbalance-0.56.orig/procinterrupts.c	2010-06-10 10:45:55.000000000 -0400
> > +++ irqbalance-0.56/procinterrupts.c	2011-05-10 20:22:06.897465003 -0400
> > @@ -50,7 +50,7 @@ void parse_proc_interrupts(void)
> >  		int cpunr;
> >  		int	 number;
> >  		uint64_t count;
> > -		char *c, *c2;
> > +		char *c, *c2, *err;
> >  
> >  		if (getline(&line, &size, file)==0)
> >  			break;
> > @@ -64,7 +64,11 @@ void parse_proc_interrupts(void)
> >  			continue;
> >  		*c = 0;
> >  		c++;
> > -		number = strtoul(line, NULL, 10);
> > +		number = strtoul(line, &err, 10);
> > +		/* Man page says that if that happens and number == 0, then it
> > +		 * failed to parse. */
> > +		if (err == line && number == 0)
> > +			continue;
> >  		count = 0;
> >  		cpunr = 0;
> >  
> 
> 
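
For anyone wanting to see the strtoul() behaviour the patch relies on in
isolation, here is a small standalone sketch. The helper name
parse_irq_number is hypothetical and only for illustration; it assumes
the field has already been cut at the ':' as parse_proc_interrupts()
does:

```c
#include <stdlib.h>

/* Hypothetical helper, not part of irqbalance.
 * Parses the IRQ number at the start of `field`. Returns 0 and stores
 * the value in *out on success, or -1 if strtoul() consumed no digits
 * (e.g. for " RES" or "NMI"). */
static int parse_irq_number(const char *field, unsigned long *out)
{
	char *end;
	unsigned long val;

	val = strtoul(field, &end, 10);
	/* When no conversion is performed, strtoul() returns 0 and sets
	 * end to the start of the input. */
	if (end == field)
		return -1;
	*out = val;
	return 0;
}
```

With this, "  0" parses as IRQ 0, while " RES" is reported as a failure
rather than being silently counted towards IRQ 0.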

Thread overview: 4+ messages
2011-05-11  0:33 irqbalance seg faults with 2.6.38 or later kernels [patch + solution included] if running under Xen hypervisor Konrad Rzeszutek Wilk
2011-05-11  8:16 ` Jan Beulich
2011-05-11 13:10   ` Konrad Rzeszutek Wilk [this message]
2011-05-11 13:41     ` Jan Beulich
