* [PATCH] [genirq] Expose default irq affinity mask (take 2)
@ 2008-05-27 22:01 Max Krasnyansky
2008-05-29 4:37 ` Paul Jackson
0 siblings, 1 reply; 3+ messages in thread
From: Max Krasnyansky @ 2008-05-27 22:01 UTC (permalink / raw)
To: mingo; +Cc: pj, a.p.zijlstra, linux-kernel, tglx, rdunlap, Max Krasnyansky
Current IRQ affinity interface does not provide a way to set affinity
for the IRQs that will be allocated/activated in the future.
This patch creates /proc/irq/default_smp_affinity that lets users set
default affinity mask for the newly allocated IRQs. Changing the default
does not affect affinity masks for the currently active IRQs, they
have to be changed explicitly.
This version addressed comments from Paul and Randy.
Signed-off-by: Max Krasnyansky <maxk@qualcomm.com>
---
Documentation/filesystems/proc.txt | 30 ++++++++++++-------
arch/alpha/kernel/irq.c | 5 +--
include/linux/irq.h | 14 +++------
kernel/irq/manage.c | 28 ++++++++++++++++-
kernel/irq/proc.c | 57 +++++++++++++++++++++++++++++++++++-
5 files changed, 108 insertions(+), 26 deletions(-)
diff --git a/Documentation/filesystems/proc.txt b/Documentation/filesystems/proc.txt
index 518ebe6..231d9db 100644
--- a/Documentation/filesystems/proc.txt
+++ b/Documentation/filesystems/proc.txt
@@ -379,28 +379,36 @@ i386 and x86_64 platforms support the new IRQ vector displays.
Of some interest is the introduction of the /proc/irq directory to 2.4.
It could be used to set IRQ to CPU affinity, this means that you can "hook" an
IRQ to only one CPU, or to exclude a CPU of handling IRQs. The contents of the
-irq subdir is one subdir for each IRQ, and one file; prof_cpu_mask
+irq subdir is one subdir for each IRQ, and two files; default_smp_affinity and
+prof_cpu_mask
For example
> ls /proc/irq/
0 10 12 14 16 18 2 4 6 8 prof_cpu_mask
- 1 11 13 15 17 19 3 5 7 9
+ 1 11 13 15 17 19 3 5 7 9 default_smp_affinity
> ls /proc/irq/0/
smp_affinity
-The contents of the prof_cpu_mask file and each smp_affinity file for each IRQ
-is the same by default:
+smp_affinity is a bitmask, in which you can specify which CPUs can handle the
+IRQ, you can set it by doing:
- > cat /proc/irq/0/smp_affinity
- ffffffff
+ > echo 1 > /proc/irq/5/smp_affinity
+
+This means that only the first CPU will handle the IRQ, but you can also echo
+5 which means that only the first and fourth CPU can handle the IRQ.
-It's a bitmask, in which you can specify which CPUs can handle the IRQ, you can
-set it by doing:
+The contents of each smp_affinity file is the same by default:
+
+ > cat /proc/irq/0/smp_affinity
+ ffffffff
- > echo 1 > /proc/irq/prof_cpu_mask
+default_smp_affinity mask applies to all non-active IRQs. In other words IRQs
+that have not been allocated/activated yet (for which /proc/irq/[0-9]* directory
+does not exist). Once IRQ is activated its affinity mask can be changed as
+described above.
-This means that only the first CPU will handle the IRQ, but you can also echo 5
-which means that only the first and fourth CPU can handle the IRQ.
+prof_cpu_mask specifies which CPUs are to be profiled by the system wide
+profiler.
The way IRQs are routed is handled by the IO-APIC, and it's Round Robin
between all the CPUs which are allowed to handle it. As usual the kernel has
diff --git a/arch/alpha/kernel/irq.c b/arch/alpha/kernel/irq.c
index facf82a..c626a82 100644
--- a/arch/alpha/kernel/irq.c
+++ b/arch/alpha/kernel/irq.c
@@ -42,8 +42,7 @@ void ack_bad_irq(unsigned int irq)
#ifdef CONFIG_SMP
static char irq_user_affinity[NR_IRQS];
-int
-select_smp_affinity(unsigned int irq)
+int irq_select_affinity(unsigned int irq)
{
static int last_cpu;
int cpu = last_cpu + 1;
@@ -51,7 +50,7 @@ select_smp_affinity(unsigned int irq)
if (!irq_desc[irq].chip->set_affinity || irq_user_affinity[irq])
return 1;
- while (!cpu_possible(cpu))
+ while (!cpu_possible(cpu) || !cpu_isset(cpu, irq_default_affinity))
cpu = (cpu < (NR_CPUS-1) ? cpu + 1 : 0);
last_cpu = cpu;
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 176e5e7..8faebf8 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -228,8 +228,11 @@ static inline void set_pending_irq(unsigned int irq, cpumask_t mask)
#endif /* CONFIG_GENERIC_PENDING_IRQ */
+extern cpumask_t irq_default_affinity;
+
extern int irq_set_affinity(unsigned int irq, cpumask_t cpumask);
extern int irq_can_set_affinity(unsigned int irq);
+extern int irq_select_affinity(unsigned int irq);
#else /* CONFIG_SMP */
@@ -243,6 +246,8 @@ static inline int irq_set_affinity(unsigned int irq, cpumask_t cpumask)
static inline int irq_can_set_affinity(unsigned int irq) { return 0; }
+static inline int irq_select_affinity(unsigned int irq) { return 0; }
+
#endif /* CONFIG_SMP */
#ifdef CONFIG_IRQBALANCE
@@ -253,15 +258,6 @@ static inline void set_balance_irq_affinity(unsigned int irq, cpumask_t mask)
}
#endif
-#ifdef CONFIG_AUTO_IRQ_AFFINITY
-extern int select_smp_affinity(unsigned int irq);
-#else
-static inline int select_smp_affinity(unsigned int irq)
-{
- return 1;
-}
-#endif
-
extern int no_irq_affinity;
static inline int irq_balancing_disabled(unsigned int irq)
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 438a014..80c7573 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -16,6 +16,8 @@
#ifdef CONFIG_SMP
+cpumask_t irq_default_affinity = CPU_MASK_ALL;
+
/**
* synchronize_irq - wait for pending IRQ handlers (on other CPUs)
* @irq: interrupt number to wait for
@@ -94,6 +96,27 @@ int irq_set_affinity(unsigned int irq, cpumask_t cpumask)
return 0;
}
+#ifndef CONFIG_AUTO_IRQ_AFFINITY
+/*
+ * Generic version of the affinity autoselector.
+ */
+int irq_select_affinity(unsigned int irq)
+{
+ cpumask_t mask;
+
+ if (!irq_can_set_affinity(irq))
+ return 0;
+
+ cpus_and(mask, cpu_online_map, irq_default_affinity);
+
+ irq_desc[irq].affinity = mask;
+ irq_desc[irq].chip->set_affinity(irq, mask);
+
+ set_balance_irq_affinity(irq, mask);
+ return 0;
+}
+#endif
+
#endif
/**
@@ -376,6 +399,9 @@ int setup_irq(unsigned int irq, struct irqaction *new)
} else
/* Undo nested disables: */
desc->depth = 1;
+
+ /* Set default affinity mask once everything is setup */
+ irq_select_affinity(irq);
}
/* Reset broken irq detection when installing new handler */
desc->irq_count = 0;
@@ -555,8 +581,6 @@ int request_irq(unsigned int irq, irq_handler_t handler,
action->next = NULL;
action->dev_id = dev_id;
- select_smp_affinity(irq);
-
#ifdef CONFIG_DEBUG_SHIRQ
if (irqflags & IRQF_SHARED) {
/*
diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index c2f2ccb..c20eb52 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -66,13 +66,51 @@ static int irq_affinity_write_proc(struct file *file, const char __user *buffer,
if (cpus_empty(tmp))
/* Special case for empty set - allow the architecture
code to set default SMP affinity. */
- return select_smp_affinity(irq) ? -EINVAL : full_count;
+ return irq_select_affinity(irq) ? -EINVAL : full_count;
irq_set_affinity(irq, new_value);
return full_count;
}
+extern cpumask_t irq_default_affinity;
+
+static int default_affinity_read(char *page, char **start, off_t off,
+ int count, int *eof, void *data)
+{
+ int len = cpumask_scnprintf(page, count, irq_default_affinity);
+ if (count - len < 2)
+ return -EINVAL;
+ len += sprintf(page + len, "\n");
+ return len;
+}
+
+static int default_affinity_write(struct file *file, const char __user *buffer,
+ unsigned long count, void *data)
+{
+ unsigned int full_count = count, err;
+ cpumask_t new_value, tmp;
+
+ err = cpumask_parse_user(buffer, count, new_value);
+ if (err)
+ return err;
+
+ if (!is_affinity_mask_valid(new_value))
+ return -EINVAL;
+
+ /*
+ * Do not allow disabling IRQs completely - it's a too easy
+ * way to make the system unusable accidentally :-) At least
+ * one online CPU still has to be targeted.
+ */
+ cpus_and(tmp, new_value, cpu_online_map);
+ if (cpus_empty(tmp))
+ return -EINVAL;
+
+ irq_default_affinity = new_value;
+
+ return full_count;
+}
#endif
static int irq_spurious_read(char *page, char **start, off_t off,
@@ -171,6 +209,21 @@ void unregister_handler_proc(unsigned int irq, struct irqaction *action)
remove_proc_entry(action->dir->name, irq_desc[irq].dir);
}
+void register_default_affinity_proc(void)
+{
+#ifdef CONFIG_SMP
+ struct proc_dir_entry *entry;
+
+ /* create /proc/irq/default_smp_affinity */
+ entry = create_proc_entry("default_smp_affinity", 0600, root_irq_dir);
+ if (entry) {
+ entry->data = NULL;
+ entry->read_proc = default_affinity_read;
+ entry->write_proc = default_affinity_write;
+ }
+#endif
+}
+
void init_irq_proc(void)
{
int i;
@@ -180,6 +233,8 @@ void init_irq_proc(void)
if (!root_irq_dir)
return;
+ register_default_affinity_proc();
+
/*
* Create entries for all existing IRQs.
*/
--
1.5.4.5
^ permalink raw reply related [flat|nested] 3+ messages in thread* Re: [PATCH] [genirq] Expose default irq affinity mask (take 2)
2008-05-27 22:01 [PATCH] [genirq] Expose default irq affinity mask (take 2) Max Krasnyansky
@ 2008-05-29 4:37 ` Paul Jackson
2008-05-29 16:30 ` Max Krasnyanskiy
0 siblings, 1 reply; 3+ messages in thread
From: Paul Jackson @ 2008-05-29 4:37 UTC (permalink / raw)
To: Max Krasnyansky; +Cc: mingo, a.p.zijlstra, linux-kernel, tglx, rdunlap, maxk
Review comments ... just nits ... patch looks good overall.
1. Trailing whitespace on Doc lines:
+irq subdir is one subdir for each IRQ, and two files; default_smp_affinity and
+smp_affinity is a bitmask, in which you can specify which CPUs can handle the
+default_smp_affinity mask applies to all non-active IRQs. In other words IRQs
(Granted, there are already about 300 hundred trailing whitespaces in that file,
so this is like trying not to spit into the Pacific ocean ;).
2. I think that the following line should be removed from kernel/irq/proc.c:
+extern cpumask_t irq_default_affinity;
(You already pick up that extern from <linux/irq.h>, as it should be.)
3. The trailing period was missing after the line:
irq subdir is one subdir for each IRQ, and one file; prof_cpu_mask
Might as well add it, after "prof_cpu_mask", while you're there anyway.
4. In the doc:
+ > echo 1 > /proc/irq/5/smp_affinity
+
+This means that only the first CPU will handle the IRQ, but you can also echo
+5 which means that only the first and fourth CPU can handle the IRQ.
it might be a good idea to not use "5" for both the example CPU
and for the example mask. It increases the chance of a reader
getting confused as to which 5 is which. Perhaps try a different
example CPU, such as:
echo 1 > /proc/irq/2/smp_affinity
5. The original Doc text, before your patch, stated that the
"The contents of the prof_cpu_mask file ... by default."
After your patch, this statement that the default contents
of the prof_cpu_mask file was ffffffff got lost.
6. The grammar of the following got mangled a little:
+default_smp_affinity mask applies to all non-active IRQs. In other words IRQs
+that have not been allocated/activated yet (for which /proc/irq/[0-9]* directory
+does not exist).
Perhaps the following is better:
+The default_smp_affinity mask applies to all non-active IRQs, which are the
+IRQs which have not yet been allocated/activated, and hence which lack a
+/proc/irq/[0-9]* directory.
7. The lines of code (with their definition of cpumask_t tmp):
+ cpus_and(tmp, new_value, cpu_online_map);
+ if (cpus_empty(tmp))
+ return -EINVAL;
can be better written, I'm pretty sure, as:
+ if (!cpus_intersects(new_value, cpu_online_map))
+ return -EINVAL;
Minimizing cpumask_t's as local stack variables is a good thing,
especially on some of the high (thousands) count CPU systems that
SGI is developing.
Nice work. Thanks.
--
I won't rest till it's the best ...
Programmer, Linux Scalability
Paul Jackson <pj@sgi.com> 1.940.382.4214
^ permalink raw reply [flat|nested] 3+ messages in thread* Re: [PATCH] [genirq] Expose default irq affinity mask (take 2)
2008-05-29 4:37 ` Paul Jackson
@ 2008-05-29 16:30 ` Max Krasnyanskiy
0 siblings, 0 replies; 3+ messages in thread
From: Max Krasnyanskiy @ 2008-05-29 16:30 UTC (permalink / raw)
To: Paul Jackson; +Cc: mingo, a.p.zijlstra, linux-kernel, tglx, rdunlap
Paul Jackson wrote:
> Review comments ... just nits ... patch looks good overall.
>
> 1. Trailing whitespace on Doc lines:
> +irq subdir is one subdir for each IRQ, and two files; default_smp_affinity and
> +smp_affinity is a bitmask, in which you can specify which CPUs can handle the
> +default_smp_affinity mask applies to all non-active IRQs. In other words IRQs
>
> (Granted, there are already about 300 hundred trailing whitespaces in that file,
> so this is like trying not to spit into the Pacific ocean ;).
>
> 2. I think that the following line should be removed from kernel/irq/proc.c:
>
> +extern cpumask_t irq_default_affinity;
>
> (You already pick up that extern from <linux/irq.h>, as it should be.)
>
> 3. The trailing period was missing after the line:
>
> irq subdir is one subdir for each IRQ, and one file; prof_cpu_mask
>
> Might as well add it, after "prof_cpu_mask", while you're there anyway.
>
> 4. In the doc:
>
> + > echo 1 > /proc/irq/5/smp_affinity
> +
> +This means that only the first CPU will handle the IRQ, but you can also echo
> +5 which means that only the first and fourth CPU can handle the IRQ.
>
> it might be a good idea to not use "5" for both the example CPU
> and for the example mask. It increases the chance of a reader
> getting confused as to which 5 is which. Perhaps try a different
> example CPU, such as:
>
> echo 1 > /proc/irq/2/smp_affinity
>
> 5. The original Doc text, before your patch, stated that the
> "The contents of the prof_cpu_mask file ... by default."
>
> After your patch, this statement that the default contents
> of the prof_cpu_mask file was ffffffff got lost.
>
> 6. The grammar of the following got mangled a little:
>
> +default_smp_affinity mask applies to all non-active IRQs. In other words IRQs
> +that have not been allocated/activated yet (for which /proc/irq/[0-9]* directory
> +does not exist).
>
> Perhaps the following is better:
>
> +The default_smp_affinity mask applies to all non-active IRQs, which are the
> +IRQs which have not yet been allocated/activated, and hence which lack a
> +/proc/irq/[0-9]* directory.
>
> 7. The lines of code (with their definition of cpumask_t tmp):
> + cpus_and(tmp, new_value, cpu_online_map);
> + if (cpus_empty(tmp))
> + return -EINVAL;
>
> can be better written, I'm pretty sure, as:
> + if (!cpus_intersects(new_value, cpu_online_map))
> + return -EINVAL;
>
> Minimizing cpumask_t's as local stack variables is a good thing,
> especially on some of the high (thousands) count CPU systems that
> SGI is developing.
Nice (thorough) review. Thanx. I'll update the patch based on your comments.
#7 was definitely "monkey see, monkey do". I cut and pasted it from existing
mask update without much thought. Will update both places.
Thanx
Max
^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2008-05-29 16:31 UTC | newest]
Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-05-27 22:01 [PATCH] [genirq] Expose default irq affinity mask (take 2) Max Krasnyansky
2008-05-29 4:37 ` Paul Jackson
2008-05-29 16:30 ` Max Krasnyanskiy
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.