All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sonny Rao <sonnyrao@us.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: markn@au1.ibm.com, sonnyrao@us.ibm.com, miltonm@bga.com
Subject: [PATCH] powerpc/pseries: Add support for IO Event interrupt drivers
Date: Tue, 18 May 2010 23:27:44 -0500	[thread overview]
Message-ID: <20100519042744.GA26215@us.ibm.com> (raw)
In-Reply-To: <1274189851.26143.63.camel@concordia>

On Tue, 18 May 2010 23:37:31 +1000, Michael Ellerman wrote:
> 
> On Mon, 2010-05-17 at 22:53 +1000, Mark Nelson wrote:
> > This patch adds support for handling IO Event interrupts which come
> > through at the /event-sources/ibm,io-events device tree node.
> 
> Hi Mark,
> 
> You'll have to explain to me offline sometime how it is we ran out of
> interrupts and started needing to multiplex them ..

Firmware has decided to multiplex all i/o error reporting through a single
interrupt for reasons unknown, that is the primary reason for this patch.

One question is, we already register a few RAS interrupts which call
RTAS using check-exception for getting error information.  Those live
in platforms/pseries/ras.c -- should we combine the two into a common
implementation somehow?

> > There is one ibm,io-events interrupt, but this interrupt might be used
> > for multiple I/O devices, each with their own separate driver. So, we
> > create a platform interrupt handler that will do the RTAS check-exception
> > call and then call the appropriate driver's interrupt handler (the one(s)
> > that registered with a scope that matches the scope of the incoming
> > interrupt).
> > 
> > So, a driver for a device that uses IO Event interrupts will register
> > it's interrupt service routine (or interrupt handler) with the platform
> > code using ioei_register_isr(). This register function takes a function
> > pointer to the driver's handler and the scope that the driver is
> > interested in (scopes defined in arch/powerpc/include/asm/io_events.h).
> > The driver's handler must take a pointer to a struct io_events_section
> > and must not return anything.
> > 
> > The platform code registers io_event_interrupt() as the interrupt handler
> > for the ibm,io-events interrupt. Upon receiving an IO Event interrupt, it
> > checks the scope of the incoming interrupt and only calls those drivers'
> > handlers that have registered as being interested in that scope.
> 
> The "checks the scope" requires an RTAS call, which takes a global lock
> (and you add another) - these aren't going to be used for anything
> performance critical I hope?

Nope it shouldn't be performance critical, but it does raise the point
that the current RTAS implementation in Linux *always* uses the global
lock.  There is a set of calls which are not required to be serialized
against each other.  This would be a totally different patchset to fix that
problem.  The "check-exception" call is one that doesn't require the global
RTAS lock.  I might work on that someday :-)

<snip>

> > +	/* check to see if we've already registered this function with
> > +	 * this scope. If we have, don't register it again
> > +	 */
> > +	iter = ioei_isr_list;
> > +	while (iter) {
> > +		if (iter->ioei_isr == isr && iter->scope == scope)
> > +			break;
> > +		iter = iter->next;
> > +	}
> > +
> > +	if (iter) {
> > +		ret = -EEXIST;
> > +		goto out;
> > +	}
> > +
> > +	cons = kmalloc(sizeof(struct ioei_consumer), GFP_KERNEL);
> 
> But you don't want to kmalloc while holding the lock and with interrupts
> off.

Well, he could use GFP_ATOMIC but that's the wrong approach.  You should
allocate the buffer (using GFP_KERNEL) before taking the spin lock.

<snip>

> > +#define EXT_INT_VECTOR_OFFSET	0x500
> > +#define RTAS_TYPE_IO_EVENT	0xE1

These defines should probably go in <asm/rtas.h>

I noticed the code in ras.c has it's own define too RAS_VECTOR_OFFSET
for 0x500 and asm/rtas.h actually has RTAS_TYPE_IO for 0xE1

> > +
> > +static irqreturn_t io_event_interrupt(int irq, void *dev_id)
> > +{
> > +	struct rtas_error_log *rtas_elog;
> > +	struct io_events_section *ioei_sec;
> > +	char *ch_ptr;
> > +	int status;
> > +	u16 *sec_len;
> > +
> > +	spin_lock(&ioei_log_buf_lock);
> > +
> > +	status = rtas_call(ioei_check_exception_token, 6, 1, NULL,
> > +			   EXT_INT_VECTOR_OFFSET,
> > +			   irq_map[irq].hwirq,
> 
> This is going to be  slow anyway, you may as well use virq_to_hw().
> 
> > +			   RTAS_IO_EVENTS, 1 /*Time Critical */,
> 
> Missing space before the T                      ^
> 
> > +			   __pa(&ioei_log_buf),
> 
> Does the buffer need to be aligned, and/or inside the RMO? I'd guess
> yes.

The docs for check-exception don't particularly specify alignment
requirements, but RTAS in generally going to want it in the RMO

Also, if we're going to go ahead and use rtas_call() which locks
it's own buffer which meets the requirements, why do we even need
a separate buffer?  Really, we should make this call, then copy
the content of the buffer before handing it over to the drivers.


> > +				rtas_get_error_log_max());

Here, we're passing back what RTAS told us what it's max is
which doesn't necessarily equal the static buffer size we
allocated which can cause a buffer overflow.  So this
argument should be the static size of the buffer.

> > +
> > +	rtas_elog = (struct rtas_error_log *)ioei_log_buf;
> > +
> > +	if (status != 0)
> > +		goto out;
> > +
> > +	/* We assume that we will only ever get called for io-event
> > +	 * interrupts. But if we get called with something else
> > +	 * make some noise about it.
> > +	 */
> 
> That would mean we'd requested the wrong interrupt wouldn't it? I'd
> almost BUG, but benh always bitches that I do that too often so .. :)
> 
> > +	if (rtas_elog->type != RTAS_TYPE_IO_EVENT) {
> > +		pr_warning("IO Events: We got called with an event type of %d"
> > +			   " rather than %d!\n", rtas_elog->type,
> > +			   RTAS_TYPE_IO_EVENT);
> > +		WARN_ON(1);
> > +		goto out;
> > +	}

Should we try to process this instead of just warning?  
The type we get might be one of the the ones we recognize in
ras.c; so this is an argument for combining ras.c with this code
or at least report this in the same manner we report any other
RTAS error log.

> > +
> > +	/* there are 24 bytes of event log data before the first section
> > +	 * (Main-A) begins
> > +	 */
> > +	ch_ptr = (char *)ioei_log_buf + 24;
> 
> Any reason you're casting from unsigned char to char?
> 
> > +	/* loop through all the sections until we get to the IO Events
> > +	 * Section, with section ID "IE"
> > +	 */
> > +	while (*ch_ptr != 'I' && *(ch_ptr + 1) != 'E') {
> > +		sec_len = (u16 *)(ch_ptr + 2);
> > +		ch_ptr += *sec_len;
> > +	}
> 
> This would be neater if you cast to io_events_section and used the
> fields I think.
> 
> And even better if you know the length will be a multiple of the
> section_header size, you can do the arithmetic in those terms.
> 
> > +	ioei_sec = (struct io_events_section *)ch_ptr;
> > +
> > +	ioei_call_consumers(ioei_sec->scope, ioei_sec);
> 
> Guaranteed to be only one section returned to us per call?
> 
> We /could/ copy the ioei_sec and drop the buf lock, which would allow
> another interrupt to come in and start doing the RTAS call (on another
> cpu, and iff there are actually multiple interrupts). But we probably
> don't care.

I think we *have* to copy it because we don't want our lock held when we
call random handlers.

> > +out:
> > +	spin_unlock(&ioei_log_buf_lock);
> > +
> > +	return IRQ_HANDLED;
> > +}
> > +
> > +static int __init init_ioei_IRQ(void)
> 
> Never understood why IRQ always (sometimes) gets caps ..
> 
> > +{
> > +	struct device_node *np;
> > +
> > +	ioei_check_exception_token = rtas_token("check-exception");
> 
> Meep, need to check if it actually exists.
> 
> > +	np = of_find_node_by_path("/event-sources/ibm,io-events");
> > +	if (np != NULL) {
> 
> if (np) would usually do it
> 
> > +		request_event_sources_irqs(np, io_event_interrupt, "IO_EVENT");
> > +		of_node_put(np);
> > +	}
> > +
> > +	return 0;
> > +}
> > +device_initcall(init_ioei_IRQ);
> 
> Should probably be a machine_initcall().
> 
> > Index: upstream/arch/powerpc/platforms/pseries/Makefile
> > ===================================================================
> > --- upstream.orig/arch/powerpc/platforms/pseries/Makefile
> > +++ upstream/arch/powerpc/platforms/pseries/Makefile
> > @@ -8,7 +8,7 @@ endif
> >  
> >  obj-y			:= lpar.o hvCall.o nvram.o reconfig.o \
> >  			   setup.o iommu.o event_sources.o ras.o \
> > -			   firmware.o power.o dlpar.o
> > +			   firmware.o power.o dlpar.o io_events.o
> 
> The BML guys might appreciate an option to turn it off?

Or, we might subvert it for our own evil purposes ;-)

Sonny

  reply	other threads:[~2010-05-19  4:27 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-05-17 12:53 [PATCH] powerpc/pseries: Add support for IO Event interrupt drivers Mark Nelson
2010-05-18 13:37 ` Michael Ellerman
2010-05-19  4:27   ` Sonny Rao [this message]
2010-05-19 10:44     ` Mark Nelson
2010-05-19 12:10     ` Michael Ellerman
2010-05-19 18:34       ` Sonny Rao
2010-05-19 10:24   ` Mark Nelson
2010-05-19 12:02     ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20100519042744.GA26215@us.ibm.com \
    --to=sonnyrao@us.ibm.com \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=markn@au1.ibm.com \
    --cc=miltonm@bga.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.