From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751253AbWFCUVv (ORCPT ); Sat, 3 Jun 2006 16:21:51 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751797AbWFCUVv (ORCPT ); Sat, 3 Jun 2006 16:21:51 -0400 Received: from ms-smtp-02.nyroc.rr.com ([24.24.2.56]:33684 "EHLO ms-smtp-02.nyroc.rr.com") by vger.kernel.org with ESMTP id S1751253AbWFCUVu (ORCPT ); Sat, 3 Jun 2006 16:21:50 -0400 Subject: Re: [patch 2/5] [PREEMPT_RT] Changing interrupt handlers from running in thread to hardirq and back runtime. From: Steven Rostedt Reply-To: rostedt@goodmis.org To: Esben Nielsen Cc: linux-kernel@vger.kernel.org, Ingo Molnar In-Reply-To: References: <20060602165336.147812000@localhost> Content-Type: text/plain Organization: Kihon Technologies Inc. Date: Sat, 03 Jun 2006 16:21:45 -0400 Message-Id: <1149366105.13993.115.camel@localhost.localdomain> Mime-Version: 1.0 X-Mailer: Evolution 2.4.2.1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Disclaimer: I haven't read all your patches yet. I'm going one at a time to comment, and then I will probably send more emails about the overall response. So now, my comments are not on the big picture, but simple atomic views. On Fri, 2006-06-02 at 23:23 +0100, Esben Nielsen wrote: > This patch makes it possible to change which context the interrupt handlers > for each interrupt run in, hard-irq or threaded if CONFIG_PREEMPT_HARDIRQS is > set. > > The interface is the file > /proc/irq//threaded > You can read it to see what the context is now or write one of > > irq > fifo > rr > normal > batch > > where one of the latter makes the interrupt handler threaded. > > A replacement for request_irq(), called request_irq2(), is added. When a driver request_irq2 ... yuck! Perhaps request_irq_convertible() or something similar? But irq2, no way! > uses this to install it's irq-handler it can also give a change_irq_context > call-back. This call-back is called whenever the irq-context is changed or > going to be changed. The call-back must be of the form > > int driver_change_context(int irq, void *dev_id, enum change_context_cmd cmd) Eeee, looks like a big change to make on drivers, and something that can keep -rt from mainline forever. But I'll see more as I read. This looks optional but still it will make things more complex. > > where > > enum change_context_cmd { > IRQ_TO_HARDIRQ, > IRQ_CAN_THREAD, > IRQ_TO_THREADED > }; > > The call-back is supposed to do the following on > IRQ_TO_HARDIRQ: make sure everything in the interrupt handler is non-blocking > or return a non-zero error code. > IRQ_CAN_THREAD: Return 0 if it is ok for the interrupt handler to be run in > thread context. Return a non-zero error code otherwise. > IRQ_TO_THREAD: Now the interrupt handler does run in thread context. The > driver can now change it's locks to being mutexes. The return > value is ignored as the driver already got a chance to protest > above. > > Index: linux-2.6.16-rt23.spin_mutex/include/linux/interrupt.h > =================================================================== > --- linux-2.6.16-rt23.spin_mutex.orig/include/linux/interrupt.h > +++ linux-2.6.16-rt23.spin_mutex/include/linux/interrupt.h > @@ -34,21 +34,38 @@ typedef int irqreturn_t; > #define IRQ_HANDLED (1) > #define IRQ_RETVAL(x) ((x) != 0) > > +enum change_context_cmd { > + IRQ_TO_HARDIRQ, > + IRQ_CAN_THREAD, > + IRQ_TO_THREADED > +}; > + > struct irqaction { > irqreturn_t (*handler)(int, void *, struct pt_regs *); > unsigned long flags; > cpumask_t mask; > const char *name; > void *dev_id; > +#ifdef CONFIG_CHANGE_IRQ_CONTEXT > + int (*change_context)(int, void *, > + enum change_context_cmd); > +#endif > struct irqaction *next; > int irq; > - struct proc_dir_entry *dir, *threaded; > + struct proc_dir_entry *dir; > + struct rcu_head rcu; > }; > > extern irqreturn_t no_action(int cpl, void *dev_id, struct pt_regs *regs); > extern int request_irq(unsigned int, > irqreturn_t (*handler)(int, void *, struct pt_regs *), > unsigned long, const char *, void *); > +extern int request_irq2(unsigned int irq, > + irqreturn_t (*handler)(int, void *, struct pt_regs *), > + unsigned long irqflags, const char * devname, > + void *dev_id, > + int (*change_context)(int, void *, > + enum change_context_cmd)); > extern void free_irq(unsigned int, void *); > > > Index: linux-2.6.16-rt23.spin_mutex/include/linux/irq.h > =================================================================== > --- linux-2.6.16-rt23.spin_mutex.orig/include/linux/irq.h > +++ linux-2.6.16-rt23.spin_mutex/include/linux/irq.h > @@ -47,6 +47,18 @@ > # define SA_NODELAY 0x01000000 > #endif > > +#ifndef SA_MUST_THREAD > +# define SA_MUST_THREAD 0x02000000 > +#endif > + > +/* Set this flag if the irq handler must thread under preempt-rt otherwise not > + */ > +#ifdef CONFIG_PREEMPT_RT > +# define SA_MUST_THREAD_RT SA_MUST_THREAD > +#else > +# define SA_MUST_THREAD_RT 0 > +#endif > + > /* > * IRQ types > */ > @@ -147,12 +159,13 @@ struct irq_type { > * @chip: low level hardware access functions - comes from type > * @action: the irq action chain > * @status: status information > + * (protected by the spinlock ) > * @depth: disable-depth, for nested irq_disable() calls > * @irq_count: stats field to detect stalled irqs > * @irqs_unhandled: stats field for spurious unhandled interrupts > * @thread: Thread pointer for threaded preemptible irq handling > * @wait_for_handler: Waitqueue to wait for a running preemptible handler > - * @lock: locking for SMP > + * @lock: lock around the action list > * @move_irq: Flag need to re-target interrupt destination > * > * Pad this out to 32 bytes for cache and indexing reasons. > Index: linux-2.6.16-rt23.spin_mutex/kernel/Kconfig.preempt > =================================================================== > --- linux-2.6.16-rt23.spin_mutex.orig/kernel/Kconfig.preempt > +++ linux-2.6.16-rt23.spin_mutex/kernel/Kconfig.preempt > @@ -119,6 +119,17 @@ config PREEMPT_HARDIRQS > > Say N if you are unsure. > > +config CHANGE_IRQ_CONTEXT > + bool "Change the irq context runtime" > + depends on PREEMPT_HARDIRQS && (PREEMPT_RCU || !SPIN_MUTEXES) > + help > + You can change wether the IRQ handler(s) for each IRQ number is > + running in hardirq context or as threaded by writing to > + /proc/irq//threaded > + If PREEMPT_RT is selected the drivers involved must be ready for it, > + though, or the write will fail. Remember to switch on SPIN_MUTEXES as > + well in that case as the drivers which are ready uses spin-mutexes. > + > config SPINLOCK_BKL > bool "Old-Style Big Kernel Lock" > depends on (PREEMPT || SMP) && !PREEMPT_RT > Index: linux-2.6.16-rt23.spin_mutex/kernel/irq/internals.h > =================================================================== > --- linux-2.6.16-rt23.spin_mutex.orig/kernel/irq/internals.h > +++ linux-2.6.16-rt23.spin_mutex/kernel/irq/internals.h > @@ -22,6 +22,10 @@ static inline void end_irq(irq_desc_t *d > } > > #ifdef CONFIG_PROC_FS > +#ifdef CONFIG_CHANGE_IRQ_CONTEXT > +extern int make_irq_nodelay(int irq, struct irq_desc *desc); > +extern int make_irq_threaded(int irq, struct irq_desc *desc); > +#endif > extern void register_irq_proc(unsigned int irq); > extern void register_handler_proc(unsigned int irq, struct irqaction *action); > extern void unregister_handler_proc(unsigned int irq, struct irqaction *action); > Index: linux-2.6.16-rt23.spin_mutex/kernel/irq/manage.c > =================================================================== > --- linux-2.6.16-rt23.spin_mutex.orig/kernel/irq/manage.c > +++ linux-2.6.16-rt23.spin_mutex/kernel/irq/manage.c > @@ -206,6 +206,153 @@ int can_request_irq(unsigned int irq, un > return !action; > } > > + > +#ifdef CONFIG_CHANGE_IRQ_CONTEXT > +int make_action_nodelay(int irq, struct irqaction *act) > +{ > + unsigned long flags; > + > + if(!(act->flags & SA_MUST_THREAD)) { > + return 0; > + } Remove the brackets. Also, let me get this straight. If the action does not have SA_MUST_THREAD, we return? Or does it mean that if SA_MUST_THREAD is not set, then SA_NODELAY must already be set? If that is the case, then why are we not checking for SA_NODELAY here? > + > + if( act->change_context ) { > + int ret = > + act->change_context(irq, act->dev_id, IRQ_TO_HARDIRQ); > + if(ret) { > + printk(KERN_ERR "Failed to change irq handler " > + "for dev %s on IRQ%d to hardirq " > + "context (ret: %d)\n", act->name, irq, ret); > + return ret; > + } > + spin_lock_irqsave(&irq_desc[irq].lock, flags); > + act->flags &=~SA_MUST_THREAD; Both flags are zero here (see below about the WARN_ON) > + act->flags |= SA_NODELAY; > + spin_unlock_irqrestore(&irq_desc[irq].lock, flags); > + return 0; > + } > + else { > + printk(KERN_ERR "Irq handler " > + "for dev %s on IRQ%d can not be changed" > + " to hardirq context\n", act->name, irq); > + return -ENOSYS; > + } > + > +} > + > + > +static int __make_irq_threaded(int irq, struct irq_desc *desc); > + > +int make_irq_nodelay(int irq, struct irq_desc *desc) > +{ > + int ret = 0; > + struct irqaction *act; > + unsigned long flags; > + > + rcu_read_lock(); > + for(act = desc->action; act; act = act->next) { > + WARN_ON(((~act->flags) & (SA_MUST_THREAD|SA_NODELAY)) == 0); This WARN_ON is not protected by the descriptor lock, so if this function is run on two CPUs at the same time, then the act->flags can have both of these zero by the above code. > + if(act->flags & SA_MUST_THREAD) { > + ret = make_action_nodelay(irq, act); > + if(ret) { > + printk(KERN_ERR "Could not make irq %d " > + "nodelay (errno %d)\n", > + irq, ret); We printk on failure from above, and then printk again here? > + goto failed; > + } > + } > + } > + spin_lock_irqsave(&desc->lock, flags); > + desc->status |= IRQ_NODELAY; > + spin_unlock_irqrestore(&desc->lock, flags); > + rcu_read_unlock(); > + > + return 0; > + failed: > + __make_irq_threaded(irq, desc); > + rcu_read_unlock(); > + return ret; > +} > + > +int action_may_thread(int irq, struct irqaction *act) > +{ > + if(!act->change_context) > + return !(act->flags & SA_NODELAY); > + > + return act->change_context(irq, act->dev_id, IRQ_CAN_THREAD) == 0; This IRQ_CAN_THREAD just looks out of place of the other two. It feels very "hacky" to check if it can thread. But what do I know? > +} > + > + > +static int __make_irq_threaded(int irq, struct irq_desc *desc) > +{ > + struct irqaction *act; > + > + for(act = desc->action; act; act = act->next) { > + WARN_ON(((~act->flags) & (SA_MUST_THREAD|SA_NODELAY)) == 0); > + if(!action_may_thread(irq, act)) { > + return -EINVAL; > + } > + } > + > + if (start_irq_thread(irq, desc)) > + return -ENOMEM; I know that currently start_irq_thread can only return -ENOMEM on failure, but it may be more robust to capture the return code and return that anyway. > + > + spin_lock_irq(&desc->lock); > + desc->status &= ~IRQ_NODELAY; > + spin_unlock_irq(&desc->lock); > + > + /* We can't make irq handlers change their context before we > + are sure no CPU is running them in hard irq context */ > + synchronize_irq(irq); > + > + for(act = desc->action; act; act = act->next) { > + WARN_ON(((~act->flags) & (SA_MUST_THREAD|SA_NODELAY)) == 0); > + if(act->change_context) { > + /* This callback can't fail */ Will all device drivers know that the callback can't fail? > + act->change_context(irq, act->dev_id, IRQ_TO_THREADED); > + spin_lock_irq(&desc->lock); > + act->flags &=~SA_NODELAY; > + act->flags |= SA_MUST_THREAD; > + spin_unlock_irq(&desc->lock); > + } > + } > + > + return 0; > +} [snipped the rest] -- Steve