* Re: [PATCH 1/1] sched: provide common cpu_relax_yield definition
From: David Hildenbrand @ 2016-11-16 12:34 UTC (permalink / raw)
To: Christian Borntraeger, Peter Zijlstra
Cc: linux-arch, linux-s390, kvm, Catalin Marinas, x86, Will Deacon,
Russell King, Heiko Carstens, linux-kernel, sparclinux,
Noam Camus, Nicholas Piggin, Martin Schwidefsky, xen-devel,
virtualization, linuxppc-dev, Ingo Molnar
In-Reply-To: <1479298985-191589-1-git-send-email-borntraeger@de.ibm.com>
On 16.11.2016 at 13:23, Christian Borntraeger wrote:
> No need to duplicate the same define everywhere. Since
> the only user is stop-machine and the only provider is
> s390, we can use a default implementation of cpu_relax_yield
> in sched.h.
>
> Suggested-by: Russell King <linux@armlinux.org.uk>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Looks good to me!
Reviewed-by: David Hildenbrand <david@redhat.com>
David
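For readers following along, the override idiom the patch relies on can be sketched in plain C. This is a userspace model only, not the kernel code: the call counters, the spin_until_set() helper and the ARCH_HAS_OWN_YIELD build flag are made up for illustration, while the "#define cpu_relax_yield cpu_relax_yield" and "#ifndef cpu_relax_yield" lines mirror the s390 and sched.h hunks of the patch.

```c
#include <assert.h>

/* Call counters, present only so the effect of the macros is observable. */
int relax_calls;
int yield_calls;

/* Stand-in for an architecture's cpu_relax(). */
static inline void cpu_relax(void)
{
	relax_calls++;
}

/*
 * An architecture with its own cpu_relax_yield() defines the function and a
 * same-named macro, as the s390 hunk does. Build with -DARCH_HAS_OWN_YIELD
 * to model that case (the flag name is hypothetical).
 */
#ifdef ARCH_HAS_OWN_YIELD
#define cpu_relax_yield cpu_relax_yield
static inline void cpu_relax_yield(void)
{
	yield_calls++;
}
#endif

/* Generic fallback, as in the include/linux/sched.h hunk. */
#ifndef cpu_relax_yield
#define cpu_relax_yield() cpu_relax()
#endif

/* Spin until *flag is nonzero or max_spins iterations have passed. */
int spin_until_set(const volatile int *flag, int max_spins)
{
	int spins = 0;

	assert(max_spins >= 0);
	while (!*flag && spins < max_spins) {
		cpu_relax_yield();
		spins++;
	}
	return spins;
}
```

Without the flag, every cpu_relax_yield() call lands in cpu_relax(); with it, the architecture's own function is used and the generic fallback is skipped by the preprocessor.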
* Re: [PATCH 1/1] sched: provide common cpu_relax_yield definition
From: Russell King - ARM Linux @ 2016-11-16 12:42 UTC (permalink / raw)
To: Christian Borntraeger
Cc: linux-arch, linux-s390, kvm, Peter Zijlstra, Catalin Marinas, x86,
Will Deacon, linux-kernel, Heiko Carstens, virtualization,
sparclinux, Noam Camus, Nicholas Piggin, Martin Schwidefsky,
xen-devel, linuxppc-dev, Ingo Molnar
In-Reply-To: <1479298985-191589-1-git-send-email-borntraeger@de.ibm.com>
On Wed, Nov 16, 2016 at 01:23:05PM +0100, Christian Borntraeger wrote:
> No need to duplicate the same define everywhere. Since
> the only user is stop-machine and the only provider is
> s390, we can use a default implementation of cpu_relax_yield
> in sched.h.
>
> Suggested-by: Russell King <linux@armlinux.org.uk>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Thanks for that. (Please change my address above when adding this ack...)
Acked-by: Russell King <rmk+kernel@armlinux.org.uk>
> ---
> arch/alpha/include/asm/processor.h | 1 -
> arch/arc/include/asm/processor.h | 3 ---
> arch/arm/include/asm/processor.h | 2 --
> arch/arm64/include/asm/processor.h | 2 --
> arch/avr32/include/asm/processor.h | 1 -
> arch/blackfin/include/asm/processor.h | 1 -
> arch/c6x/include/asm/processor.h | 1 -
> arch/cris/include/asm/processor.h | 1 -
> arch/frv/include/asm/processor.h | 1 -
> arch/h8300/include/asm/processor.h | 1 -
> arch/hexagon/include/asm/processor.h | 1 -
> arch/ia64/include/asm/processor.h | 1 -
> arch/m32r/include/asm/processor.h | 1 -
> arch/m68k/include/asm/processor.h | 1 -
> arch/metag/include/asm/processor.h | 1 -
> arch/microblaze/include/asm/processor.h | 1 -
> arch/mips/include/asm/processor.h | 1 -
> arch/mn10300/include/asm/processor.h | 1 -
> arch/nios2/include/asm/processor.h | 1 -
> arch/openrisc/include/asm/processor.h | 1 -
> arch/parisc/include/asm/processor.h | 1 -
> arch/powerpc/include/asm/processor.h | 2 --
> arch/s390/include/asm/processor.h | 1 +
> arch/score/include/asm/processor.h | 1 -
> arch/sh/include/asm/processor.h | 1 -
> arch/sparc/include/asm/processor_32.h | 1 -
> arch/sparc/include/asm/processor_64.h | 1 -
> arch/tile/include/asm/processor.h | 2 --
> arch/unicore32/include/asm/processor.h | 1 -
> arch/x86/include/asm/processor.h | 2 --
> arch/x86/um/asm/processor.h | 1 -
> arch/xtensa/include/asm/processor.h | 1 -
> include/linux/sched.h | 4 ++++
> 33 files changed, 5 insertions(+), 38 deletions(-)
>
> diff --git a/arch/alpha/include/asm/processor.h b/arch/alpha/include/asm/processor.h
> index 31e8dbe..2fec2de 100644
> --- a/arch/alpha/include/asm/processor.h
> +++ b/arch/alpha/include/asm/processor.h
> @@ -58,7 +58,6 @@ unsigned long get_wchan(struct task_struct *p);
> ((tsk) == current ? rdusp() : task_thread_info(tsk)->pcb.usp)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #define ARCH_HAS_PREFETCH
> #define ARCH_HAS_PREFETCHW
> diff --git a/arch/arc/include/asm/processor.h b/arch/arc/include/asm/processor.h
> index d102a49..6e1242d 100644
> --- a/arch/arc/include/asm/processor.h
> +++ b/arch/arc/include/asm/processor.h
> @@ -60,15 +60,12 @@ struct task_struct;
> #ifndef CONFIG_EZNPS_MTM_EXT
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #else
>
> #define cpu_relax() \
> __asm__ __volatile__ (".word %0" : : "i"(CTOP_INST_SCHD_RW) : "memory")
>
> -#define cpu_relax_yield() cpu_relax()
> -
> #endif
>
> #define copy_segments(tsk, mm) do { } while (0)
> diff --git a/arch/arm/include/asm/processor.h b/arch/arm/include/asm/processor.h
> index 9e71c58b..c3d5fc1 100644
> --- a/arch/arm/include/asm/processor.h
> +++ b/arch/arm/include/asm/processor.h
> @@ -82,8 +82,6 @@ unsigned long get_wchan(struct task_struct *p);
> #define cpu_relax() barrier()
> #endif
>
> -#define cpu_relax_yield() cpu_relax()
> -
> #define task_pt_regs(p) \
> ((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
>
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 6132f64..747c65a 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -149,8 +149,6 @@ static inline void cpu_relax(void)
> asm volatile("yield" ::: "memory");
> }
>
> -#define cpu_relax_yield() cpu_relax()
> -
> /* Thread switching */
> extern struct task_struct *cpu_switch_to(struct task_struct *prev,
> struct task_struct *next);
> diff --git a/arch/avr32/include/asm/processor.h b/arch/avr32/include/asm/processor.h
> index ee62365..972adcc 100644
> --- a/arch/avr32/include/asm/processor.h
> +++ b/arch/avr32/include/asm/processor.h
> @@ -92,7 +92,6 @@ extern struct avr32_cpuinfo boot_cpu_data;
> #define TASK_UNMAPPED_BASE (PAGE_ALIGN(TASK_SIZE / 3))
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
> #define cpu_sync_pipeline() asm volatile("sub pc, -2" : : : "memory")
>
> struct cpu_context {
> diff --git a/arch/blackfin/include/asm/processor.h b/arch/blackfin/include/asm/processor.h
> index 57acfb1..85d4af9 100644
> --- a/arch/blackfin/include/asm/processor.h
> +++ b/arch/blackfin/include/asm/processor.h
> @@ -92,7 +92,6 @@ unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(tsk) ((tsk) == current ? rdusp() : (tsk)->thread.usp)
>
> #define cpu_relax() smp_mb()
> -#define cpu_relax_yield() cpu_relax()
>
> /* Get the Silicon Revision of the chip */
> static inline uint32_t __pure bfin_revid(void)
> diff --git a/arch/c6x/include/asm/processor.h b/arch/c6x/include/asm/processor.h
> index 1fd22e7..b9eb3da 100644
> --- a/arch/c6x/include/asm/processor.h
> +++ b/arch/c6x/include/asm/processor.h
> @@ -121,7 +121,6 @@ extern unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(task) (task_pt_regs(task)->sp)
>
> #define cpu_relax() do { } while (0)
> -#define cpu_relax_yield() cpu_relax()
>
> extern const struct seq_operations cpuinfo_op;
>
> diff --git a/arch/cris/include/asm/processor.h b/arch/cris/include/asm/processor.h
> index 1a57841..15b815d 100644
> --- a/arch/cris/include/asm/processor.h
> +++ b/arch/cris/include/asm/processor.h
> @@ -63,7 +63,6 @@ static inline void release_thread(struct task_struct *dead_task)
> #define init_stack (init_thread_union.stack)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> void default_idle(void);
>
> diff --git a/arch/frv/include/asm/processor.h b/arch/frv/include/asm/processor.h
> index c1e5f2a..ddaeb9c 100644
> --- a/arch/frv/include/asm/processor.h
> +++ b/arch/frv/include/asm/processor.h
> @@ -107,7 +107,6 @@ unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(tsk) ((tsk)->thread.frame0->sp)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> /* data cache prefetch */
> #define ARCH_HAS_PREFETCH
> diff --git a/arch/h8300/include/asm/processor.h b/arch/h8300/include/asm/processor.h
> index 42d6053..65132d7 100644
> --- a/arch/h8300/include/asm/processor.h
> +++ b/arch/h8300/include/asm/processor.h
> @@ -127,7 +127,6 @@ unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(tsk) ((tsk) == current ? rdusp() : (tsk)->thread.usp)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #define HARD_RESET_NOW() ({ \
> local_irq_disable(); \
> diff --git a/arch/hexagon/include/asm/processor.h b/arch/hexagon/include/asm/processor.h
> index 5d694cc..45a8254 100644
> --- a/arch/hexagon/include/asm/processor.h
> +++ b/arch/hexagon/include/asm/processor.h
> @@ -56,7 +56,6 @@ struct thread_struct {
> }
>
> #define cpu_relax() __vmyield()
> -#define cpu_relax_yield() cpu_relax()
>
> /*
> * Decides where the kernel will search for a free chunk of vm space during
> diff --git a/arch/ia64/include/asm/processor.h b/arch/ia64/include/asm/processor.h
> index 0c2c3b2..03911a3 100644
> --- a/arch/ia64/include/asm/processor.h
> +++ b/arch/ia64/include/asm/processor.h
> @@ -547,7 +547,6 @@ ia64_eoi (void)
> }
>
> #define cpu_relax() ia64_hint(ia64_hint_pause)
> -#define cpu_relax_yield() cpu_relax()
>
> static inline int
> ia64_get_irr(unsigned int vector)
> diff --git a/arch/m32r/include/asm/processor.h b/arch/m32r/include/asm/processor.h
> index 9b83a13..5767367 100644
> --- a/arch/m32r/include/asm/processor.h
> +++ b/arch/m32r/include/asm/processor.h
> @@ -133,6 +133,5 @@ unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(tsk) ((tsk)->thread.sp)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #endif /* _ASM_M32R_PROCESSOR_H */
> diff --git a/arch/m68k/include/asm/processor.h b/arch/m68k/include/asm/processor.h
> index b0d0442..f5f790c 100644
> --- a/arch/m68k/include/asm/processor.h
> +++ b/arch/m68k/include/asm/processor.h
> @@ -156,6 +156,5 @@ unsigned long get_wchan(struct task_struct *p);
> #define task_pt_regs(tsk) ((struct pt_regs *) ((tsk)->thread.esp0))
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #endif
> diff --git a/arch/metag/include/asm/processor.h b/arch/metag/include/asm/processor.h
> index ee302a6..ec6a490 100644
> --- a/arch/metag/include/asm/processor.h
> +++ b/arch/metag/include/asm/processor.h
> @@ -152,7 +152,6 @@ unsigned long get_wchan(struct task_struct *p);
> #define user_stack_pointer(regs) ((regs)->ctx.AX[0].U0)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> extern void setup_priv(void);
>
> diff --git a/arch/microblaze/include/asm/processor.h b/arch/microblaze/include/asm/processor.h
> index 08ec1f7..37ef196 100644
> --- a/arch/microblaze/include/asm/processor.h
> +++ b/arch/microblaze/include/asm/processor.h
> @@ -22,7 +22,6 @@
> extern const struct seq_operations cpuinfo_op;
>
> # define cpu_relax() barrier()
> -# define cpu_relax_yield() cpu_relax()
>
> #define task_pt_regs(tsk) \
> (((struct pt_regs *)(THREAD_SIZE + task_stack_page(tsk))) - 1)
> diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
> index 8ea95e7..95b8c47 100644
> --- a/arch/mips/include/asm/processor.h
> +++ b/arch/mips/include/asm/processor.h
> @@ -389,7 +389,6 @@ unsigned long get_wchan(struct task_struct *p);
> #define KSTK_STATUS(tsk) (task_pt_regs(tsk)->cp0_status)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> /*
> * Return_address is a replacement for __builtin_return_address(count)
> diff --git a/arch/mn10300/include/asm/processor.h b/arch/mn10300/include/asm/processor.h
> index d11397b..18e17ab 100644
> --- a/arch/mn10300/include/asm/processor.h
> +++ b/arch/mn10300/include/asm/processor.h
> @@ -69,7 +69,6 @@ extern void print_cpu_info(struct mn10300_cpuinfo *);
> extern void dodgy_tsc(void);
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> /*
> * User space process size: 1.75GB (default).
> diff --git a/arch/nios2/include/asm/processor.h b/arch/nios2/include/asm/processor.h
> index d32c176..3bbbc3d 100644
> --- a/arch/nios2/include/asm/processor.h
> +++ b/arch/nios2/include/asm/processor.h
> @@ -88,7 +88,6 @@ extern unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(tsk) ((tsk)->thread.kregs->sp)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #endif /* __ASSEMBLY__ */
>
> diff --git a/arch/openrisc/include/asm/processor.h b/arch/openrisc/include/asm/processor.h
> index 7f47fc7..a908e6c 100644
> --- a/arch/openrisc/include/asm/processor.h
> +++ b/arch/openrisc/include/asm/processor.h
> @@ -92,7 +92,6 @@ extern unsigned long thread_saved_pc(struct task_struct *t);
> #define init_stack (init_thread_union.stack)
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #endif /* __ASSEMBLY__ */
> #endif /* __ASM_OPENRISC_PROCESSOR_H */
> diff --git a/arch/parisc/include/asm/processor.h b/arch/parisc/include/asm/processor.h
> index a4a07f4..ca40741 100644
> --- a/arch/parisc/include/asm/processor.h
> +++ b/arch/parisc/include/asm/processor.h
> @@ -309,7 +309,6 @@ extern unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(tsk) ((tsk)->thread.regs.gr[30])
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> /*
> * parisc_requires_coherency() is used to identify the combined VIPT/PIPT
> diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
> index 5684e68..dac83fc 100644
> --- a/arch/powerpc/include/asm/processor.h
> +++ b/arch/powerpc/include/asm/processor.h
> @@ -404,8 +404,6 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
> #define cpu_relax() barrier()
> #endif
>
> -#define cpu_relax_yield() cpu_relax()
> -
> /* Check that a certain kernel stack pointer is valid in task_struct p */
> int validate_sp(unsigned long sp, struct task_struct *p,
> unsigned long nbytes);
> diff --git a/arch/s390/include/asm/processor.h b/arch/s390/include/asm/processor.h
> index 17c001a..9eab1cb 100644
> --- a/arch/s390/include/asm/processor.h
> +++ b/arch/s390/include/asm/processor.h
> @@ -234,6 +234,7 @@ static inline unsigned short stap(void)
> /*
> * Give up the time slice of the virtual PU.
> */
> +#define cpu_relax_yield cpu_relax_yield
> void cpu_relax_yield(void);
>
> #define cpu_relax() barrier()
> diff --git a/arch/score/include/asm/processor.h b/arch/score/include/asm/processor.h
> index a1e97c0..d9a922d 100644
> --- a/arch/score/include/asm/processor.h
> +++ b/arch/score/include/asm/processor.h
> @@ -24,7 +24,6 @@ extern unsigned long get_wchan(struct task_struct *p);
> #define current_text_addr() ({ __label__ _l; _l: &&_l; })
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
> #define release_thread(thread) do {} while (0)
>
> /*
> diff --git a/arch/sh/include/asm/processor.h b/arch/sh/include/asm/processor.h
> index 9454ff1..5addd69 100644
> --- a/arch/sh/include/asm/processor.h
> +++ b/arch/sh/include/asm/processor.h
> @@ -97,7 +97,6 @@ extern struct sh_cpuinfo cpu_data[];
>
> #define cpu_sleep() __asm__ __volatile__ ("sleep" : : : "memory")
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> void default_idle(void);
> void stop_this_cpu(void *);
> diff --git a/arch/sparc/include/asm/processor_32.h b/arch/sparc/include/asm/processor_32.h
> index fc32b73..365d4cb 100644
> --- a/arch/sparc/include/asm/processor_32.h
> +++ b/arch/sparc/include/asm/processor_32.h
> @@ -119,7 +119,6 @@ extern struct task_struct *last_task_used_math;
> int do_mathemu(struct pt_regs *regs, struct task_struct *fpt);
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> extern void (*sparc_idle)(void);
>
> diff --git a/arch/sparc/include/asm/processor_64.h b/arch/sparc/include/asm/processor_64.h
> index 12787df..6448cfc 100644
> --- a/arch/sparc/include/asm/processor_64.h
> +++ b/arch/sparc/include/asm/processor_64.h
> @@ -216,7 +216,6 @@ unsigned long get_wchan(struct task_struct *task);
> "nop\n\t" \
> ".previous" \
> ::: "memory")
> -#define cpu_relax_yield() cpu_relax()
>
> /* Prefetch support. This is tuned for UltraSPARC-III and later.
> * UltraSPARC-I will treat these as nops, and UltraSPARC-II has
> diff --git a/arch/tile/include/asm/processor.h b/arch/tile/include/asm/processor.h
> index c1c228b..0bc9968 100644
> --- a/arch/tile/include/asm/processor.h
> +++ b/arch/tile/include/asm/processor.h
> @@ -264,8 +264,6 @@ static inline void cpu_relax(void)
> barrier();
> }
>
> -#define cpu_relax_yield() cpu_relax()
> -
> /* Info on this processor (see fs/proc/cpuinfo.c) */
> struct seq_operations;
> extern const struct seq_operations cpuinfo_op;
> diff --git a/arch/unicore32/include/asm/processor.h b/arch/unicore32/include/asm/processor.h
> index eeefe7c..4eaa421 100644
> --- a/arch/unicore32/include/asm/processor.h
> +++ b/arch/unicore32/include/asm/processor.h
> @@ -71,7 +71,6 @@ extern void release_thread(struct task_struct *);
> unsigned long get_wchan(struct task_struct *p);
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> #define task_pt_regs(p) \
> ((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
> diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
> index 7513c99..c84605b 100644
> --- a/arch/x86/include/asm/processor.h
> +++ b/arch/x86/include/asm/processor.h
> @@ -588,8 +588,6 @@ static __always_inline void cpu_relax(void)
> rep_nop();
> }
>
> -#define cpu_relax_yield() cpu_relax()
> -
> /* Stop speculative execution and prefetching of modified code. */
> static inline void sync_core(void)
> {
> diff --git a/arch/x86/um/asm/processor.h b/arch/x86/um/asm/processor.h
> index b4bd63b..c77db22 100644
> --- a/arch/x86/um/asm/processor.h
> +++ b/arch/x86/um/asm/processor.h
> @@ -26,7 +26,6 @@ static inline void rep_nop(void)
> }
>
> #define cpu_relax() rep_nop()
> -#define cpu_relax_yield() cpu_relax()
>
> #define task_pt_regs(t) (&(t)->thread.regs)
>
> diff --git a/arch/xtensa/include/asm/processor.h b/arch/xtensa/include/asm/processor.h
> index 7d8d6be..86ffcd6 100644
> --- a/arch/xtensa/include/asm/processor.h
> +++ b/arch/xtensa/include/asm/processor.h
> @@ -206,7 +206,6 @@ extern unsigned long get_wchan(struct task_struct *p);
> #define KSTK_ESP(tsk) (task_pt_regs(tsk)->areg[1])
>
> #define cpu_relax() barrier()
> -#define cpu_relax_yield() cpu_relax()
>
> /* Special register access. */
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 348f51b..c1aa3b0 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2444,6 +2444,10 @@ static inline void calc_load_enter_idle(void) { }
> static inline void calc_load_exit_idle(void) { }
> #endif /* CONFIG_NO_HZ_COMMON */
>
> +#ifndef cpu_relax_yield
> +#define cpu_relax_yield() cpu_relax()
> +#endif
> +
> /*
> * Do not use outside of architecture code which knows its limitations.
> *
> --
> 2.5.5
>
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.
* Re: BUG: 'list_empty(&vgdev->free_vbufs)' is true!
From: Gerd Hoffmann @ 2016-11-16 13:12 UTC (permalink / raw)
To: Jiri Slaby
Cc: David Airlie, virtualization, Linux kernel mailing list,
dri-devel, Michael S. Tsirkin
In-Reply-To: <f006e3a6-51d2-4092-57b6-eea7ff64f58f@suse.cz>
On Fri, 2016-11-11 at 17:28 +0100, Jiri Slaby wrote:
> On 11/09/2016, 09:01 AM, Gerd Hoffmann wrote:
> > On Tue, 2016-11-08 at 22:37 +0200, Michael S. Tsirkin wrote:
> >> On Mon, Nov 07, 2016 at 09:43:24AM +0100, Jiri Slaby wrote:
> >>> Hi,
> >>>
> >>> I can relatively easily reproduce this bug:
> >
> > How?
>
> Run dmesg -w in the qemu window (virtio_gpu) to see a lot of output.
> Run pps [1] without exit(0); on e.g. serial console.
> Wait a bit. The lot of output causes the BUG.
>
> [1] https://github.com/jirislaby/collected_sources/blob/master/pps.c
Doesn't reproduce here.
Running "while true; do dmesg; done" on the virtio-gpu fbcon.
Running the pps fork bomb on the serial console.
Can watch dmesg printing the kernel messages over and over, until the
shell can't spawn dmesg any more due to the fork bomb hitting the
process limit. No BUG() triggered.
Tried spice, gtk and sdl.
Hmm.
Any ideas what else might be needed to reproduce it?
cheers,
Gerd
* Re: [PATCH v7 06/11] x86, paravirt: Add interface to support kvm/xen vcpu preempted check
From: Pan Xinhui @ 2016-11-17 5:16 UTC (permalink / raw)
To: Peter Zijlstra
Cc: kvm, rkrcmar, benh, will.deacon, virtualization, paulus,
kernellwp, linux-s390, dave, mpe, x86, mingo, Pan Xinhui,
xen-devel, paulmck, konrad.wilk, boqun.feng, jgross, linux-kernel,
David.Laight, xen-devel-request, pbonzini, linuxppc-dev
In-Reply-To: <20161116102355.GP3142@twins.programming.kicks-ass.net>
On 2016/11/16 18:23, Peter Zijlstra wrote:
> On Wed, Nov 16, 2016 at 12:19:09PM +0800, Pan Xinhui wrote:
>> Hi, Peter.
>> I think we can avoid a function call in a simpler way. How about below
>>
>> static inline bool vcpu_is_preempted(int cpu)
>> {
>> /* only set in pv case*/
>> if (pv_lock_ops.vcpu_is_preempted)
>> return pv_lock_ops.vcpu_is_preempted(cpu);
>> return false;
>> }
>
> That is still more expensive. It needs to do an actual load and makes it
> hard to predict the branch, you'd have to actually wait for the load to
> complete etc.
>
Yes, one more load in the native case. I think this is acceptable, as vcpu_is_preempted is not a critical function.
However, if we use pv_callee_save_regs_thunk, more registers might be saved/restored unnecessarily in the pv case,
which will introduce a little overhead.
But I think I am okay with your idea. I can make another patch based on this patchset with your Suggested-by.
Thanks,
xinhui
> Also, it generates more code.
>
> Paravirt muck should strive to be as cheap as possible when ran on
> native hardware.
>
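To make the cost being discussed concrete, here is a userspace model of the NULL-checked hook quoted above. It is a sketch under assumptions: struct pv_lock_ops_model is a stand-in for the kernel's pv_lock_ops with only one member, and demo_pv_vcpu_is_preempted() is a hypothetical backend invented for the example. The callee-save thunk variant is not modeled because it depends on paravirt patching.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's pv_lock_ops; only one hook is modeled. */
struct pv_lock_ops_model {
	bool (*vcpu_is_preempted)(int cpu);
};

/* Zero-initialized: the hook is NULL, which models running on native hardware. */
struct pv_lock_ops_model pv_lock_ops;

/*
 * The variant from the mail: the pointer must be loaded and tested on every
 * call, even on native hardware where the answer is always false.
 */
static inline bool vcpu_is_preempted(int cpu)
{
	/* only set in pv case */
	if (pv_lock_ops.vcpu_is_preempted)
		return pv_lock_ops.vcpu_is_preempted(cpu);
	return false;
}

/* Hypothetical hypervisor backend: pretends odd-numbered vCPUs are preempted. */
bool demo_pv_vcpu_is_preempted(int cpu)
{
	assert(cpu >= 0);
	return (cpu & 1) != 0;
}
```

The load and the data-dependent branch in vcpu_is_preempted() are what the reply above objects to for the native case.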
* automatic IRQ affinity for virtio
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
Hi Michael,
this series contains a couple cleanups for the virtio_pci interrupt
handling code, including a switch to the new pci_alloc_irq_vectors
helper, and support for automatic affinity by the PCI layer if the
consumers ask for it. It then converts over virtio_blk to use this
functionality so that its blk-mq queues are aligned to the MSI-X
vector routing. I have a similar patch in the queue for virtio-scsi,
but that would require pulling in the SCSI tree, so I'm not sure if
you'd like it for this window, but if you do I can send it in another
series. The third driver using per-CPU virtqueues is virtio_net,
but that will take some more time as I haven't started work on the
networking infrastructure yet.
Note that these patches require core IRQ changes from a stable
branch in the tip tree to be pulled in first:
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/for-block
Gitweb:
https://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/log/?h=irq/for-block
* [PATCH 01/11] virtio_pci: use pci_alloc_irq_vectors
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
This avoids the separate allocation for the msix_entries structures, and
instead allows us to use pci_irq_vector to find a given IRQ vector.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 42 +++++++++++++++-----------------------
drivers/virtio/virtio_pci_common.h | 1 -
2 files changed, 17 insertions(+), 26 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index d9a9058..f5e2751 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -37,7 +37,7 @@ void vp_synchronize_vectors(struct virtio_device *vdev)
synchronize_irq(vp_dev->pci_dev->irq);
for (i = 0; i < vp_dev->msix_vectors; ++i)
- synchronize_irq(vp_dev->msix_entries[i].vector);
+ synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
}
/* the notify function used when creating a virt queue */
@@ -113,7 +113,7 @@ static void vp_free_vectors(struct virtio_device *vdev)
}
for (i = 0; i < vp_dev->msix_used_vectors; ++i)
- free_irq(vp_dev->msix_entries[i].vector, vp_dev);
+ free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
for (i = 0; i < vp_dev->msix_vectors; i++)
if (vp_dev->msix_affinity_masks[i])
@@ -123,7 +123,7 @@ static void vp_free_vectors(struct virtio_device *vdev)
/* Disable the vector used for configuration */
vp_dev->config_vector(vp_dev, VIRTIO_MSI_NO_VECTOR);
- pci_disable_msix(vp_dev->pci_dev);
+ pci_free_irq_vectors(vp_dev->pci_dev);
vp_dev->msix_enabled = 0;
}
@@ -131,8 +131,6 @@ static void vp_free_vectors(struct virtio_device *vdev)
vp_dev->msix_used_vectors = 0;
kfree(vp_dev->msix_names);
vp_dev->msix_names = NULL;
- kfree(vp_dev->msix_entries);
- vp_dev->msix_entries = NULL;
kfree(vp_dev->msix_affinity_masks);
vp_dev->msix_affinity_masks = NULL;
}
@@ -147,10 +145,6 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
vp_dev->msix_vectors = nvectors;
- vp_dev->msix_entries = kmalloc(nvectors * sizeof *vp_dev->msix_entries,
- GFP_KERNEL);
- if (!vp_dev->msix_entries)
- goto error;
vp_dev->msix_names = kmalloc(nvectors * sizeof *vp_dev->msix_names,
GFP_KERNEL);
if (!vp_dev->msix_names)
@@ -165,12 +159,9 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
GFP_KERNEL))
goto error;
- for (i = 0; i < nvectors; ++i)
- vp_dev->msix_entries[i].entry = i;
-
- err = pci_enable_msix_exact(vp_dev->pci_dev,
- vp_dev->msix_entries, nvectors);
- if (err)
+ err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors,
+ PCI_IRQ_MSIX);
+ if (err < 0)
goto error;
vp_dev->msix_enabled = 1;
@@ -178,7 +169,7 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
v = vp_dev->msix_used_vectors;
snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
"%s-config", name);
- err = request_irq(vp_dev->msix_entries[v].vector,
+ err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
vp_config_changed, 0, vp_dev->msix_names[v],
vp_dev);
if (err)
@@ -197,7 +188,7 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
v = vp_dev->msix_used_vectors;
snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
"%s-virtqueues", name);
- err = request_irq(vp_dev->msix_entries[v].vector,
+ err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
vp_vring_interrupt, 0, vp_dev->msix_names[v],
vp_dev);
if (err)
@@ -276,14 +267,15 @@ void vp_del_vqs(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
struct virtqueue *vq, *n;
- struct virtio_pci_vq_info *info;
list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
- info = vp_dev->vqs[vq->index];
- if (vp_dev->per_vq_vectors &&
- info->msix_vector != VIRTIO_MSI_NO_VECTOR)
- free_irq(vp_dev->msix_entries[info->msix_vector].vector,
- vq);
+ if (vp_dev->per_vq_vectors) {
+ int v = vp_dev->vqs[vq->index]->msix_vector;
+
+ if (v != VIRTIO_MSI_NO_VECTOR)
+ free_irq(pci_irq_vector(vp_dev->pci_dev, v),
+ vq);
+ }
vp_del_vq(vq);
}
vp_dev->per_vq_vectors = false;
@@ -356,7 +348,7 @@ static int vp_try_to_find_vqs(struct virtio_device *vdev, unsigned nvqs,
sizeof *vp_dev->msix_names,
"%s-%s",
dev_name(&vp_dev->vdev.dev), names[i]);
- err = request_irq(vp_dev->msix_entries[msix_vec].vector,
+ err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
vring_interrupt, 0,
vp_dev->msix_names[msix_vec],
vqs[i]);
@@ -419,7 +411,7 @@ int vp_set_vq_affinity(struct virtqueue *vq, int cpu)
if (vp_dev->msix_enabled) {
mask = vp_dev->msix_affinity_masks[info->msix_vector];
- irq = vp_dev->msix_entries[info->msix_vector].vector;
+ irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector);
if (cpu == -1)
irq_set_affinity_hint(irq, NULL);
else {
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index 2826320..b2f6662 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -85,7 +85,6 @@ struct virtio_pci_device {
/* MSI-X support */
int msix_enabled;
int intx_enabled;
- struct msix_entry *msix_entries;
cpumask_var_t *msix_affinity_masks;
/* Name strings for interrupts. This size should be enough,
* and I'm too lazy to allocate each name separately. */
--
2.1.4
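Two details of the conversion above are easy to miss: the error check changes from "if (err)" to "if (err < 0)" because the allocation helper returns the number of vectors on success, and IRQ numbers are now looked up by index instead of being read from a driver-owned msix_entries array. The following is a userspace model of that calling convention, not the kernel API: every model_* name, MODEL_MAX_VECS and MODEL_IRQ_BASE are made up for illustration.

```c
#include <assert.h>
#include <errno.h>

#define MODEL_MAX_VECS 8	/* made-up hardware limit for the model */
#define MODEL_IRQ_BASE 40	/* made-up first IRQ number */

/* Userspace stand-in for struct pci_dev; holds only what the model needs. */
struct model_pci_dev {
	int nvecs;	/* vectors currently allocated, 0 if none */
};

/*
 * Models the return convention behind the "if (err < 0)" check: the number
 * of vectors allocated on success, a negative errno on failure.
 */
int model_alloc_irq_vectors(struct model_pci_dev *dev, int min_vecs, int max_vecs)
{
	assert(min_vecs >= 1 && min_vecs <= max_vecs);
	if (min_vecs > MODEL_MAX_VECS)
		return -ENOSPC;
	dev->nvecs = max_vecs < MODEL_MAX_VECS ? max_vecs : MODEL_MAX_VECS;
	return dev->nvecs;
}

/* Models the index-to-IRQ lookup that replaces msix_entries[i].vector. */
int model_irq_vector(const struct model_pci_dev *dev, int nr)
{
	if (nr < 0 || nr >= dev->nvecs)
		return -EINVAL;
	return MODEL_IRQ_BASE + nr;
}

/* Models releasing all vectors at teardown. */
void model_free_irq_vectors(struct model_pci_dev *dev)
{
	dev->nvecs = 0;
}
```

With this convention a plain "if (err)" would treat every successful allocation as a failure, which is why the patch tests for a negative value.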
* [PATCH 02/11] virtio_pci: remove the call to vp_free_vectors in vp_request_msix_vectors
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
vp_request_msix_vectors is only called by vp_try_to_find_vqs, which already
calls vp_free_vectors through vp_del_vqs in the failure case.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index f5e2751..93700c5 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -197,7 +197,6 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
}
return 0;
error:
- vp_free_vectors(vdev);
return err;
}
--
2.1.4
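The ownership rule this patch depends on (the callee only reports the error, the single caller does the teardown) can be sketched as a small userspace model. All model_* names and the "frees" counter are hypothetical; the roles of the three functions follow the commit message above.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct model_dev {
	char *names;	/* stands in for vp_dev->msix_names */
	int frees;	/* counts teardown calls that released something */
};

/* Teardown owned by the caller, in the role of vp_del_vqs(). */
void model_del_vqs(struct model_dev *dev)
{
	assert(dev);
	if (dev->names)
		dev->frees++;
	free(dev->names);
	dev->names = NULL;
}

/* Callee in the role of vp_request_msix_vectors(): on error it only reports. */
int model_request_vectors(struct model_dev *dev, int fail)
{
	assert(dev);
	dev->names = malloc(16);
	if (!dev->names)
		return -ENOMEM;
	if (fail)
		return -EBUSY;	/* no cleanup here; the caller handles it */
	return 0;
}

/* Caller in the role of vp_try_to_find_vqs(): cleans up once on failure. */
int model_find_vqs(struct model_dev *dev, int fail)
{
	int err = model_request_vectors(dev, fail);

	if (err)
		model_del_vqs(dev);
	return err;
}
```

Because the teardown resets the pointer to NULL, a second cleanup in the callee would be harmless in this model but redundant, which matches the reasoning for dropping the call.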
* [PATCH 03/11] virtio_pci: merge vp_free_vectors into vp_del_vqs
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 61 +++++++++++++++++---------------------
1 file changed, 27 insertions(+), 34 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 93700c5..f6c5499 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -102,39 +102,6 @@ static irqreturn_t vp_interrupt(int irq, void *opaque)
return vp_vring_interrupt(irq, opaque);
}
-static void vp_free_vectors(struct virtio_device *vdev)
-{
- struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- int i;
-
- if (vp_dev->intx_enabled) {
- free_irq(vp_dev->pci_dev->irq, vp_dev);
- vp_dev->intx_enabled = 0;
- }
-
- for (i = 0; i < vp_dev->msix_used_vectors; ++i)
- free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
-
- for (i = 0; i < vp_dev->msix_vectors; i++)
- if (vp_dev->msix_affinity_masks[i])
- free_cpumask_var(vp_dev->msix_affinity_masks[i]);
-
- if (vp_dev->msix_enabled) {
- /* Disable the vector used for configuration */
- vp_dev->config_vector(vp_dev, VIRTIO_MSI_NO_VECTOR);
-
- pci_free_irq_vectors(vp_dev->pci_dev);
- vp_dev->msix_enabled = 0;
- }
-
- vp_dev->msix_vectors = 0;
- vp_dev->msix_used_vectors = 0;
- kfree(vp_dev->msix_names);
- vp_dev->msix_names = NULL;
- kfree(vp_dev->msix_affinity_masks);
- vp_dev->msix_affinity_masks = NULL;
-}
-
static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
bool per_vq_vectors)
{
@@ -266,6 +233,7 @@ void vp_del_vqs(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
struct virtqueue *vq, *n;
+ int i;
list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
if (vp_dev->per_vq_vectors) {
@@ -279,7 +247,32 @@ void vp_del_vqs(struct virtio_device *vdev)
}
vp_dev->per_vq_vectors = false;
- vp_free_vectors(vdev);
+ if (vp_dev->intx_enabled) {
+ free_irq(vp_dev->pci_dev->irq, vp_dev);
+ vp_dev->intx_enabled = 0;
+ }
+
+ for (i = 0; i < vp_dev->msix_used_vectors; ++i)
+ free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
+
+ for (i = 0; i < vp_dev->msix_vectors; i++)
+ if (vp_dev->msix_affinity_masks[i])
+ free_cpumask_var(vp_dev->msix_affinity_masks[i]);
+
+ if (vp_dev->msix_enabled) {
+ /* Disable the vector used for configuration */
+ vp_dev->config_vector(vp_dev, VIRTIO_MSI_NO_VECTOR);
+
+ pci_free_irq_vectors(vp_dev->pci_dev);
+ vp_dev->msix_enabled = 0;
+ }
+
+ vp_dev->msix_vectors = 0;
+ vp_dev->msix_used_vectors = 0;
+ kfree(vp_dev->msix_names);
+ vp_dev->msix_names = NULL;
+ kfree(vp_dev->msix_affinity_masks);
+ vp_dev->msix_affinity_masks = NULL;
kfree(vp_dev->vqs);
vp_dev->vqs = NULL;
}
--
2.1.4
* [PATCH 04/11] virtio_pci: split vp_try_to_find_vqs into INTx and MSI-X variants
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
There is basically no shared logic between the INTx and MSI-X cases in
vp_try_to_find_vqs, so split the function into two and clean them up
a little bit.
Also remove the fairly pointless vp_request_intx wrapper while we're at it.
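For readers following along, the fallback order that vp_find_vqs keeps after this split can be modelled in user space roughly as below. This is only a sketch, not kernel code: struct platform, setup_msix, setup_intx and pick_irq_mode are hypothetical names made up for illustration, and "setup" here just compares against a vector budget instead of talking to hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type: which interrupt scheme ended up being used. */
enum irq_mode { MODE_MSIX_PER_VQ, MODE_MSIX_SHARED, MODE_INTX, MODE_NONE };

/* Hypothetical stand-in for what the device/platform can provide. */
struct platform {
	int msix_vectors;	/* number of MSI-X vectors that can be granted */
	int has_intx;		/* non-zero if a legacy INTx line is usable */
};

/* Succeeds (returns 0) only if exactly nvectors MSI-X vectors are available. */
static int setup_msix(const struct platform *p, int nvectors)
{
	return p->msix_vectors >= nvectors ? 0 : -1;
}

/* Succeeds (returns 0) only if a legacy interrupt line exists. */
static int setup_intx(const struct platform *p)
{
	return p->has_intx ? 0 : -1;
}

/*
 * Try the three schemes in order of preference:
 *   1. MSI-X, one vector for config plus one per queue
 *   2. MSI-X, one vector for config plus one shared by all queues
 *   3. legacy INTx
 */
static enum irq_mode pick_irq_mode(const struct platform *p, int nvqs)
{
	assert(nvqs >= 0);

	if (setup_msix(p, nvqs + 1) == 0)
		return MODE_MSIX_PER_VQ;
	if (setup_msix(p, 2) == 0)
		return MODE_MSIX_SHARED;
	if (setup_intx(p) == 0)
		return MODE_INTX;
	return MODE_NONE;
}
```

The sketch assumes every queue wants a callback; the real code only counts queues with a non-NULL callback when sizing the per-VQ request.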
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 97 ++++++++++++++++++++++----------------
1 file changed, 57 insertions(+), 40 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index f6c5499..9a9826e 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -167,18 +167,6 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
return err;
}
-static int vp_request_intx(struct virtio_device *vdev)
-{
- int err;
- struct virtio_pci_device *vp_dev = to_vp_device(vdev);
-
- err = request_irq(vp_dev->pci_dev->irq, vp_interrupt,
- IRQF_SHARED, dev_name(&vdev->dev), vp_dev);
- if (!err)
- vp_dev->intx_enabled = 1;
- return err;
-}
-
static struct virtqueue *vp_setup_vq(struct virtio_device *vdev, unsigned index,
void (*callback)(struct virtqueue *vq),
const char *name,
@@ -277,50 +265,44 @@ void vp_del_vqs(struct virtio_device *vdev)
vp_dev->vqs = NULL;
}
-static int vp_try_to_find_vqs(struct virtio_device *vdev, unsigned nvqs,
+static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
const char * const names[],
- bool use_msix,
bool per_vq_vectors)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
u16 msix_vec;
int i, err, nvectors, allocated_vectors;
- vp_dev->vqs = kmalloc(nvqs * sizeof *vp_dev->vqs, GFP_KERNEL);
+ vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
if (!vp_dev->vqs)
return -ENOMEM;
- if (!use_msix) {
- /* Old style: one normal interrupt for change and all vqs. */
- err = vp_request_intx(vdev);
- if (err)
- goto error_find;
+ if (per_vq_vectors) {
+ /* Best option: one for change interrupt, one per vq. */
+ nvectors = 1;
+ for (i = 0; i < nvqs; ++i)
+ if (callbacks[i])
+ ++nvectors;
} else {
- if (per_vq_vectors) {
- /* Best option: one for change interrupt, one per vq. */
- nvectors = 1;
- for (i = 0; i < nvqs; ++i)
- if (callbacks[i])
- ++nvectors;
- } else {
- /* Second best: one for change, shared for all vqs. */
- nvectors = 2;
- }
-
- err = vp_request_msix_vectors(vdev, nvectors, per_vq_vectors);
- if (err)
- goto error_find;
+ /* Second best: one for change, shared for all vqs. */
+ nvectors = 2;
}
+ err = vp_request_msix_vectors(vdev, nvectors, per_vq_vectors);
+ if (err)
+ goto error_find;
+
vp_dev->per_vq_vectors = per_vq_vectors;
allocated_vectors = vp_dev->msix_used_vectors;
for (i = 0; i < nvqs; ++i) {
if (!names[i]) {
vqs[i] = NULL;
continue;
- } else if (!callbacks[i] || !vp_dev->msix_enabled)
+ }
+
+ if (!callbacks[i])
msix_vec = VIRTIO_MSI_NO_VECTOR;
else if (vp_dev->per_vq_vectors)
msix_vec = allocated_vectors++;
@@ -356,6 +338,43 @@ static int vp_try_to_find_vqs(struct virtio_device *vdev, unsigned nvqs,
return err;
}
+static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned nvqs,
+ struct virtqueue *vqs[], vq_callback_t *callbacks[],
+ const char * const names[])
+{
+ struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+ int i, err;
+
+ vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
+ if (!vp_dev->vqs)
+ return -ENOMEM;
+
+ err = request_irq(vp_dev->pci_dev->irq, vp_interrupt, IRQF_SHARED,
+ dev_name(&vdev->dev), vp_dev);
+ if (err)
+ goto out_del_vqs;
+
+ vp_dev->intx_enabled = 1;
+ vp_dev->per_vq_vectors = false;
+ for (i = 0; i < nvqs; ++i) {
+ if (!names[i]) {
+ vqs[i] = NULL;
+ continue;
+ }
+ vqs[i] = vp_setup_vq(vdev, i, callbacks[i], names[i],
+ VIRTIO_MSI_NO_VECTOR);
+ if (IS_ERR(vqs[i])) {
+ err = PTR_ERR(vqs[i]);
+ goto out_del_vqs;
+ }
+ }
+
+ return 0;
+out_del_vqs:
+ vp_del_vqs(vdev);
+ return err;
+}
+
/* the config->find_vqs() implementation */
int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[],
@@ -365,17 +384,15 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
int err;
/* Try MSI-X with one vector per queue. */
- err = vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names, true, true);
+ err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, true);
if (!err)
return 0;
/* Fallback: MSI-X with one vector for config, one shared for queues. */
- err = vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
- true, false);
+ err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, false);
if (!err)
return 0;
/* Finally fall back to regular interrupts. */
- return vp_try_to_find_vqs(vdev, nvqs, vqs, callbacks, names,
- false, false);
+ return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names);
}
const char *vp_bus_name(struct virtio_device *vdev)
--
2.1.4
* [PATCH 05/11] virtio_pci: use shared interrupts for virtqueues
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
This lets the IRQ layer handle dispatching IRQs to separate handlers for the
case where we don't have per-VQ MSI-X vectors, and allows us to greatly
simplify the code based on the assumption that we always have interrupt
vector 0 (legacy INTx or config interrupt for MSI-X) available, and
any other interrupt is requested/freed through the VQ, even if the
actual interrupt line might be shared in some cases.
This allows removing a great many variables keeping track of the
interrupt state in struct virtio_pci_device, as we can now simply walk the
list of VQs and deal with per-VQ interrupt handlers there, and only treat
vector 0 specially. As a little caveat this means we have to check whether
VQs have been set up in vp_synchronize_vectors, as otherwise
the initial reset will trip over the uninitialized ->vqs array.
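To illustrate the vector numbering this relies on, here is a small user-space sketch (not kernel code). Vector 0 is reserved for config changes; queues with a callback get vector 1, 2, ... in per-VQ mode, or all share vector 1 otherwise. The names assign_vectors and NO_VECTOR are hypothetical; NO_VECTOR merely mirrors the 0xffff value of VIRTIO_MSI_NO_VECTOR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-in for VIRTIO_MSI_NO_VECTOR (0xffff). */
#define NO_VECTOR 0xffffu

/*
 * Fill vec_out[i] with the MSI-X vector for queue i.
 * Vector 0 is always the config interrupt, so queue vectors start at 1.
 * Queues without a callback get NO_VECTOR and consume no vector.
 */
static void assign_vectors(const bool *has_callback, size_t nvqs,
			   bool per_vq, uint16_t *vec_out)
{
	uint16_t next = 1;	/* vector 0 is the config interrupt */
	size_t i;

	assert(has_callback && vec_out);

	for (i = 0; i < nvqs; i++) {
		if (!has_callback[i]) {
			vec_out[i] = NO_VECTOR;
			continue;
		}
		vec_out[i] = next;
		/* Only advance when each queue gets its own vector. */
		if (per_vq)
			next++;
	}
}
```

In shared mode every queue handler ends up registered on the same vector, which is why the patch requests the per-queue interrupts with IRQF_SHARED. The sketch leaves out the names[i] == NULL case, where the real loop skips the queue entirely.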
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 95 ++++++++++++++------------------------
drivers/virtio/virtio_pci_common.h | 18 +-------
2 files changed, 36 insertions(+), 77 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 9a9826e..9e6c9d8 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -31,13 +31,18 @@ MODULE_PARM_DESC(force_legacy,
void vp_synchronize_vectors(struct virtio_device *vdev)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
- int i;
+ struct virtqueue *vq, *n;
+
+ if (!vp_dev->vqs)
+ return;
- if (vp_dev->intx_enabled)
- synchronize_irq(vp_dev->pci_dev->irq);
+ list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
+ int vec = vp_dev->vqs[vq->index]->msix_vector;
+ if (vec != VIRTIO_MSI_NO_VECTOR)
+ synchronize_irq(pci_irq_vector(vp_dev->pci_dev, vec));
+ }
- for (i = 0; i < vp_dev->msix_vectors; ++i)
- synchronize_irq(pci_irq_vector(vp_dev->pci_dev, i));
+ synchronize_irq(pci_irq_vector(vp_dev->pci_dev, 0));
}
/* the notify function used when creating a virt queue */
@@ -102,16 +107,14 @@ static irqreturn_t vp_interrupt(int irq, void *opaque)
return vp_vring_interrupt(irq, opaque);
}
-static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
- bool per_vq_vectors)
+static int vp_setup_msix_vectors(struct virtio_device *vdev, int nvectors)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
const char *name = dev_name(&vp_dev->vdev.dev);
- unsigned i, v;
+ unsigned i;
int err = -ENOMEM;
vp_dev->msix_vectors = nvectors;
-
vp_dev->msix_names = kmalloc(nvectors * sizeof *vp_dev->msix_names,
GFP_KERNEL);
if (!vp_dev->msix_names)
@@ -133,35 +136,20 @@ static int vp_request_msix_vectors(struct virtio_device *vdev, int nvectors,
vp_dev->msix_enabled = 1;
/* Set the vector used for configuration */
- v = vp_dev->msix_used_vectors;
- snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
+ snprintf(vp_dev->msix_names[0], sizeof(*vp_dev->msix_names),
"%s-config", name);
- err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
- vp_config_changed, 0, vp_dev->msix_names[v],
+ err = request_irq(pci_irq_vector(vp_dev->pci_dev, 0),
+ vp_config_changed, 0, vp_dev->msix_names[0],
vp_dev);
if (err)
goto error;
- ++vp_dev->msix_used_vectors;
- v = vp_dev->config_vector(vp_dev, v);
/* Verify we had enough resources to assign the vector */
- if (v == VIRTIO_MSI_NO_VECTOR) {
+ if (vp_dev->config_vector(vp_dev, 0) == VIRTIO_MSI_NO_VECTOR) {
err = -EBUSY;
goto error;
}
- if (!per_vq_vectors) {
- /* Shared vector for all VQs */
- v = vp_dev->msix_used_vectors;
- snprintf(vp_dev->msix_names[v], sizeof *vp_dev->msix_names,
- "%s-virtqueues", name);
- err = request_irq(pci_irq_vector(vp_dev->pci_dev, v),
- vp_vring_interrupt, 0, vp_dev->msix_names[v],
- vp_dev);
- if (err)
- goto error;
- ++vp_dev->msix_used_vectors;
- }
return 0;
error:
return err;
@@ -224,24 +212,14 @@ void vp_del_vqs(struct virtio_device *vdev)
int i;
list_for_each_entry_safe(vq, n, &vdev->vqs, list) {
- if (vp_dev->per_vq_vectors) {
- int v = vp_dev->vqs[vq->index]->msix_vector;
+ int vec = vp_dev->vqs[vq->index]->msix_vector;
- if (v != VIRTIO_MSI_NO_VECTOR)
- free_irq(pci_irq_vector(vp_dev->pci_dev, v),
- vq);
- }
+ if (vec != VIRTIO_MSI_NO_VECTOR)
+ free_irq(pci_irq_vector(vp_dev->pci_dev, vec), vq);
vp_del_vq(vq);
}
- vp_dev->per_vq_vectors = false;
-
- if (vp_dev->intx_enabled) {
- free_irq(vp_dev->pci_dev->irq, vp_dev);
- vp_dev->intx_enabled = 0;
- }
- for (i = 0; i < vp_dev->msix_used_vectors; ++i)
- free_irq(pci_irq_vector(vp_dev->pci_dev, i), vp_dev);
+ free_irq(pci_irq_vector(vp_dev->pci_dev, 0), vp_dev);
for (i = 0; i < vp_dev->msix_vectors; i++)
if (vp_dev->msix_affinity_masks[i])
@@ -256,7 +234,6 @@ void vp_del_vqs(struct virtio_device *vdev)
}
vp_dev->msix_vectors = 0;
- vp_dev->msix_used_vectors = 0;
kfree(vp_dev->msix_names);
vp_dev->msix_names = NULL;
kfree(vp_dev->msix_affinity_masks);
@@ -273,7 +250,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
u16 msix_vec;
- int i, err, nvectors, allocated_vectors;
+ int i, err, allocated_vectors, nvectors;
vp_dev->vqs = kcalloc(nvqs, sizeof(*vp_dev->vqs), GFP_KERNEL);
if (!vp_dev->vqs)
@@ -290,46 +267,44 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
nvectors = 2;
}
- err = vp_request_msix_vectors(vdev, nvectors, per_vq_vectors);
+ err = vp_setup_msix_vectors(vdev, nvectors);
if (err)
goto error_find;
- vp_dev->per_vq_vectors = per_vq_vectors;
- allocated_vectors = vp_dev->msix_used_vectors;
+ allocated_vectors = 1; /* vector 0 is the config interrupt */
for (i = 0; i < nvqs; ++i) {
if (!names[i]) {
vqs[i] = NULL;
continue;
}
- if (!callbacks[i])
- msix_vec = VIRTIO_MSI_NO_VECTOR;
- else if (vp_dev->per_vq_vectors)
- msix_vec = allocated_vectors++;
+ if (callbacks[i])
+ msix_vec = allocated_vectors;
else
- msix_vec = VP_MSIX_VQ_VECTOR;
+ msix_vec = VIRTIO_MSI_NO_VECTOR;
+
vqs[i] = vp_setup_vq(vdev, i, callbacks[i], names[i], msix_vec);
if (IS_ERR(vqs[i])) {
err = PTR_ERR(vqs[i]);
goto error_find;
}
- if (!vp_dev->per_vq_vectors || msix_vec == VIRTIO_MSI_NO_VECTOR)
+ if (msix_vec == VIRTIO_MSI_NO_VECTOR)
continue;
- /* allocate per-vq irq if available and necessary */
snprintf(vp_dev->msix_names[msix_vec],
- sizeof *vp_dev->msix_names,
- "%s-%s",
+ sizeof(*vp_dev->msix_names), "%s-%s",
dev_name(&vp_dev->vdev.dev), names[i]);
err = request_irq(pci_irq_vector(vp_dev->pci_dev, msix_vec),
- vring_interrupt, 0,
- vp_dev->msix_names[msix_vec],
- vqs[i]);
+ vring_interrupt, IRQF_SHARED,
+ vp_dev->msix_names[msix_vec], vqs[i]);
if (err) {
vp_del_vq(vqs[i]);
goto error_find;
}
+
+ if (per_vq_vectors)
+ allocated_vectors++;
}
return 0;
@@ -354,8 +329,6 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned nvqs,
if (err)
goto out_del_vqs;
- vp_dev->intx_enabled = 1;
- vp_dev->per_vq_vectors = false;
for (i = 0; i < nvqs; ++i) {
if (!names[i]) {
vqs[i] = NULL;
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index b2f6662..93f3e4f 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -84,18 +84,12 @@ struct virtio_pci_device {
/* MSI-X support */
int msix_enabled;
- int intx_enabled;
cpumask_var_t *msix_affinity_masks;
/* Name strings for interrupts. This size should be enough,
* and I'm too lazy to allocate each name separately. */
char (*msix_names)[256];
- /* Number of available vectors */
- unsigned msix_vectors;
- /* Vectors allocated, excluding per-vq vectors if any */
- unsigned msix_used_vectors;
-
- /* Whether we have vector per vq */
- bool per_vq_vectors;
+ /* Total Number of MSI-X vectors (including per-VQ ones). */
+ int msix_vectors;
struct virtqueue *(*setup_vq)(struct virtio_pci_device *vp_dev,
struct virtio_pci_vq_info *info,
@@ -108,14 +102,6 @@ struct virtio_pci_device {
u16 (*config_vector)(struct virtio_pci_device *vp_dev, u16 vector);
};
-/* Constants for MSI-X */
-/* Use first vector for configuration changes, second and the rest for
- * virtqueues Thus, we need at least 2 vectors for MSI. */
-enum {
- VP_MSIX_CONFIG_VECTOR = 0,
- VP_MSIX_VQ_VECTOR = 1,
-};
-
/* Convert a generic virtio device to our structure */
static struct virtio_pci_device *to_vp_device(struct virtio_device *vdev)
{
--
2.1.4
* [PATCH 06/11] virtio_pci: use msix_enable flag in struct pci_dev
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
Instead of duplicating it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 7 ++-----
drivers/virtio/virtio_pci_common.h | 2 --
drivers/virtio/virtio_pci_legacy.c | 2 +-
drivers/virtio/virtio_pci_modern.c | 2 +-
include/uapi/linux/virtio_pci.h | 2 +-
5 files changed, 5 insertions(+), 10 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 9e6c9d8..03d3a3b 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -133,7 +133,6 @@ static int vp_setup_msix_vectors(struct virtio_device *vdev, int nvectors)
PCI_IRQ_MSIX);
if (err < 0)
goto error;
- vp_dev->msix_enabled = 1;
/* Set the vector used for configuration */
snprintf(vp_dev->msix_names[0], sizeof(*vp_dev->msix_names),
@@ -225,12 +224,10 @@ void vp_del_vqs(struct virtio_device *vdev)
if (vp_dev->msix_affinity_masks[i])
free_cpumask_var(vp_dev->msix_affinity_masks[i]);
- if (vp_dev->msix_enabled) {
+ if (vp_dev->pci_dev->msix_enabled) {
/* Disable the vector used for configuration */
vp_dev->config_vector(vp_dev, VIRTIO_MSI_NO_VECTOR);
-
pci_free_irq_vectors(vp_dev->pci_dev);
- vp_dev->msix_enabled = 0;
}
vp_dev->msix_vectors = 0;
@@ -391,7 +388,7 @@ int vp_set_vq_affinity(struct virtqueue *vq, int cpu)
if (!vq->callback)
return -EINVAL;
- if (vp_dev->msix_enabled) {
+ if (vp_dev->pci_dev->msix_enabled) {
mask = vp_dev->msix_affinity_masks[info->msix_vector];
irq = pci_irq_vector(vp_dev->pci_dev, info->msix_vector);
if (cpu == -1)
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index 93f3e4f..d041cce 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -82,8 +82,6 @@ struct virtio_pci_device {
/* array of all queues for house-keeping */
struct virtio_pci_vq_info **vqs;
- /* MSI-X support */
- int msix_enabled;
cpumask_var_t *msix_affinity_masks;
/* Name strings for interrupts. This size should be enough,
* and I'm too lazy to allocate each name separately. */
diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index 6d9e517..f83829f 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -169,7 +169,7 @@ static void del_vq(struct virtio_pci_vq_info *info)
iowrite16(vq->index, vp_dev->ioaddr + VIRTIO_PCI_QUEUE_SEL);
- if (vp_dev->msix_enabled) {
+ if (vp_dev->pci_dev->msix_enabled) {
iowrite16(VIRTIO_MSI_NO_VECTOR,
vp_dev->ioaddr + VIRTIO_MSI_QUEUE_VECTOR);
/* Flush the write out to device */
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index e76bd91..e18d0b0 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -416,7 +416,7 @@ static void del_vq(struct virtio_pci_vq_info *info)
vp_iowrite16(vq->index, &vp_dev->common->queue_select);
- if (vp_dev->msix_enabled) {
+ if (vp_dev->pci_dev->msix_enabled) {
vp_iowrite16(VIRTIO_MSI_NO_VECTOR,
&vp_dev->common->queue_msix_vector);
/* Flush the write out to device */
diff --git a/include/uapi/linux/virtio_pci.h b/include/uapi/linux/virtio_pci.h
index 90007a1..15b4385 100644
--- a/include/uapi/linux/virtio_pci.h
+++ b/include/uapi/linux/virtio_pci.h
@@ -79,7 +79,7 @@
* configuration space */
#define VIRTIO_PCI_CONFIG_OFF(msix_enabled) ((msix_enabled) ? 24 : 20)
/* Deprecated: please use VIRTIO_PCI_CONFIG_OFF instead */
-#define VIRTIO_PCI_CONFIG(dev) VIRTIO_PCI_CONFIG_OFF((dev)->msix_enabled)
+#define VIRTIO_PCI_CONFIG(dev) VIRTIO_PCI_CONFIG_OFF((dev)->pci_dev->msix_enabled)
/* Virtio ABI version, this must match exactly */
#define VIRTIO_PCI_ABI_VERSION 0
--
2.1.4
* [PATCH 07/11] virtio_pci: simplify MSI-X setup
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
Try to grab the MSI-X vectors early and fall back to the shared one
before doing tons of allocations.
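As a rough user-space model of the "ask for everything, else fall back to two" flow (not kernel code): count_wanted_vectors, alloc_exact and alloc_with_fallback are hypothetical helpers, and alloc_exact only imitates an all-or-nothing request such as pci_alloc_irq_vectors() with min == max against a made-up vector budget.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One vector for config changes plus one per queue that has a callback. */
static int count_wanted_vectors(const bool *has_callback, size_t nvqs)
{
	int n = 1;
	size_t i;

	for (i = 0; i < nvqs; i++)
		if (has_callback[i])
			n++;
	return n;
}

/* All-or-nothing request: returns want on success, -1 on failure. */
static int alloc_exact(int available, int want)
{
	assert(want >= 1);
	return available >= want ? want : -1;
}

/*
 * Try the full per-queue count first, then fall back to two vectors
 * (config + one shared by all queues). Returns the number of vectors
 * granted, or a negative value if even the fallback fails.
 */
static int alloc_with_fallback(int available, int want)
{
	int got = alloc_exact(available, want);

	if (got < 0)
		got = alloc_exact(available, 2);
	return got;
}
```

A caller can tell the two outcomes apart by comparing the granted count with the count it asked for. Note that the sketch compares against the requested count, which is an assumption of this illustration; the patch itself tests nvectors == nvqs + 1.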
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 52 +++++++++++++++++++-------------------
1 file changed, 26 insertions(+), 26 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 03d3a3b..90d7975 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -114,6 +114,18 @@ static int vp_setup_msix_vectors(struct virtio_device *vdev, int nvectors)
unsigned i;
int err = -ENOMEM;
+ /* Try one vector per queue first. */
+ err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors,
+ PCI_IRQ_MSIX);
+ if (err < 0) {
+ /* Fallback to one vector for config, one shared for queues. */
+ err = pci_alloc_irq_vectors(vp_dev->pci_dev, 2, 2,
+ PCI_IRQ_MSIX);
+ if (err < 0)
+ goto error;
+ nvectors = 2;
+ }
+
vp_dev->msix_vectors = nvectors;
vp_dev->msix_names = kmalloc(nvectors * sizeof *vp_dev->msix_names,
GFP_KERNEL);
@@ -129,11 +141,6 @@ static int vp_setup_msix_vectors(struct virtio_device *vdev, int nvectors)
GFP_KERNEL))
goto error;
- err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors,
- PCI_IRQ_MSIX);
- if (err < 0)
- goto error;
-
/* Set the vector used for configuration */
snprintf(vp_dev->msix_names[0], sizeof(*vp_dev->msix_names),
"%s-config", name);
@@ -149,7 +156,7 @@ static int vp_setup_msix_vectors(struct virtio_device *vdev, int nvectors)
goto error;
}
- return 0;
+ return nvectors;
error:
return err;
}
@@ -242,8 +249,7 @@ void vp_del_vqs(struct virtio_device *vdev)
static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
- const char * const names[],
- bool per_vq_vectors)
+ const char * const names[])
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
u16 msix_vec;
@@ -253,20 +259,16 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
if (!vp_dev->vqs)
return -ENOMEM;
- if (per_vq_vectors) {
- /* Best option: one for change interrupt, one per vq. */
- nvectors = 1;
- for (i = 0; i < nvqs; ++i)
- if (callbacks[i])
- ++nvectors;
- } else {
- /* Second best: one for change, shared for all vqs. */
- nvectors = 2;
+ nvectors = 1;
+ for (i = 0; i < nvqs; i++) {
+ if (callbacks[i])
+ nvectors++;
}
err = vp_setup_msix_vectors(vdev, nvectors);
- if (err)
+ if (err < 0)
goto error_find;
+ nvectors = err;
allocated_vectors = 1; /* vector 0 is the config interrupt */
for (i = 0; i < nvqs; ++i) {
@@ -300,7 +302,11 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
goto error_find;
}
- if (per_vq_vectors)
+ /*
+ * Use a different vector for each queue if they are available,
+ * else share the same vector for all VQs.
+ */
+ if (nvectors == nvqs + 1)
allocated_vectors++;
}
return 0;
@@ -353,15 +359,9 @@ int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
{
int err;
- /* Try MSI-X with one vector per queue. */
- err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, true);
- if (!err)
- return 0;
- /* Fallback: MSI-X with one vector for config, one shared for queues. */
- err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, false);
+ err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names);
if (!err)
return 0;
- /* Finally fall back to regular interrupts. */
return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names);
}
--
2.1.4
* [PATCH 08/11] virtio: allow drivers to request IRQ affinity when creating VQs
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
Add a struct irq_affinity pointer to the find_vqs methods, which if set
is used to tell the PCI layer to create the MSI-X vectors for our I/O
virtqueues with the proper affinity from the start. Compared to after
the fact affinity hints this gives us an instantly working setup and
allows allocating the irq descriptors node-local, avoiding interconnect
traffic. Last but not least this will allow blk-mq queues to be created
based on the interrupt affinity for storage drivers.
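To make the role of pre_vectors concrete, here is a deliberately simplified user-space sketch (not kernel code). The leading pre_vectors entries, here the virtio config vector, are left without a specific CPU, and the remaining vectors are spread round-robin. struct affinity_desc and spread_vectors are hypothetical names; the kernel's actual spreading is NUMA-aware and considerably more involved than a modulo.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, cut-down stand-in for the affinity descriptor. */
struct affinity_desc {
	int pre_vectors;	/* leading vectors excluded from spreading */
};

/*
 * Fill cpu_of[v] for each of nvectors vectors:
 *   -1  for the first pre_vectors entries (no specific affinity),
 *   otherwise a CPU index chosen round-robin over ncpus.
 */
static void spread_vectors(const struct affinity_desc *desc, int nvectors,
			   int ncpus, int *cpu_of)
{
	int v;

	assert(ncpus > 0);
	assert(desc->pre_vectors >= 0 && desc->pre_vectors <= nvectors);

	for (v = 0; v < nvectors; v++) {
		if (v < desc->pre_vectors)
			cpu_of[v] = -1;
		else
			cpu_of[v] = (v - desc->pre_vectors) % ncpus;
	}
}
```

With one pre-vector, four vectors and two CPUs, the config vector stays unbound and the three queue vectors land on CPUs 0, 1, 0.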
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/virtio_blk.c | 3 ++-
drivers/char/virtio_console.c | 2 +-
drivers/gpu/drm/virtio/virtgpu_kms.c | 2 +-
drivers/misc/mic/vop/vop_main.c | 2 +-
drivers/net/caif/caif_virtio.c | 3 ++-
drivers/net/virtio_net.c | 2 +-
drivers/remoteproc/remoteproc_virtio.c | 3 ++-
drivers/rpmsg/virtio_rpmsg_bus.c | 2 +-
drivers/s390/virtio/kvm_virtio.c | 3 ++-
drivers/s390/virtio/virtio_ccw.c | 3 ++-
drivers/scsi/virtio_scsi.c | 3 ++-
drivers/virtio/virtio_balloon.c | 3 ++-
drivers/virtio/virtio_input.c | 3 ++-
drivers/virtio/virtio_mmio.c | 3 ++-
drivers/virtio/virtio_pci_common.c | 31 +++++++++++++++++--------------
drivers/virtio/virtio_pci_common.h | 5 ++---
drivers/virtio/virtio_pci_modern.c | 7 +++----
include/linux/virtio_config.h | 9 +++++----
net/vmw_vsock/virtio_transport.c | 3 ++-
19 files changed, 52 insertions(+), 40 deletions(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 5545a67..689e790 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -409,7 +409,8 @@ static int init_vq(struct virtio_blk *vblk)
}
/* Discover virtqueues and write information to configuration. */
- err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names);
+ err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names,
+ NULL);
if (err)
goto out;
diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 5649234..18d0681 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -1939,7 +1939,7 @@ static int init_vqs(struct ports_device *portdev)
/* Find the queues. */
err = portdev->vdev->config->find_vqs(portdev->vdev, nr_queues, vqs,
io_callbacks,
- (const char **)io_names);
+ (const char **)io_names, NULL);
if (err)
goto free;
diff --git a/drivers/gpu/drm/virtio/virtgpu_kms.c b/drivers/gpu/drm/virtio/virtgpu_kms.c
index 036b0fb..d62ad53 100644
--- a/drivers/gpu/drm/virtio/virtgpu_kms.c
+++ b/drivers/gpu/drm/virtio/virtgpu_kms.c
@@ -172,7 +172,7 @@ int virtio_gpu_driver_load(struct drm_device *dev, unsigned long flags)
vgdev->has_virgl_3d ? "enabled" : "not available");
ret = vgdev->vdev->config->find_vqs(vgdev->vdev, 2, vqs,
- callbacks, names);
+ callbacks, names, NULL);
if (ret) {
DRM_ERROR("failed to find virt queues\n");
goto err_vqs;
diff --git a/drivers/misc/mic/vop/vop_main.c b/drivers/misc/mic/vop/vop_main.c
index 1a2b67f3..c2e29d7 100644
--- a/drivers/misc/mic/vop/vop_main.c
+++ b/drivers/misc/mic/vop/vop_main.c
@@ -374,7 +374,7 @@ static struct virtqueue *vop_find_vq(struct virtio_device *dev,
static int vop_find_vqs(struct virtio_device *dev, unsigned nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
- const char * const names[])
+ const char * const names[], struct irq_affinity *desc)
{
struct _vop_vdev *vdev = to_vopvdev(dev);
struct vop_device *vpdev = vdev->vpdev;
diff --git a/drivers/net/caif/caif_virtio.c b/drivers/net/caif/caif_virtio.c
index b306210..bc0eb47 100644
--- a/drivers/net/caif/caif_virtio.c
+++ b/drivers/net/caif/caif_virtio.c
@@ -679,7 +679,8 @@ static int cfv_probe(struct virtio_device *vdev)
goto err;
/* Get the TX virtio ring. This is a "guest side vring". */
- err = vdev->config->find_vqs(vdev, 1, &cfv->vq_tx, &vq_cbs, &names);
+ err = vdev->config->find_vqs(vdev, 1, &cfv->vq_tx, &vq_cbs, &names,
+ NULL);
if (err)
goto err;
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index fad84f3..08c9ee5 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1601,7 +1601,7 @@ static int virtnet_find_vqs(struct virtnet_info *vi)
}
ret = vi->vdev->config->find_vqs(vi->vdev, total_vqs, vqs, callbacks,
- names);
+ names, NULL);
if (ret)
goto err_find;
diff --git a/drivers/remoteproc/remoteproc_virtio.c b/drivers/remoteproc/remoteproc_virtio.c
index 01870a1..49c73ce 100644
--- a/drivers/remoteproc/remoteproc_virtio.c
+++ b/drivers/remoteproc/remoteproc_virtio.c
@@ -142,7 +142,8 @@ static void rproc_virtio_del_vqs(struct virtio_device *vdev)
static int rproc_virtio_find_vqs(struct virtio_device *vdev, unsigned int nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
- const char * const names[])
+ const char * const names[],
+ struct irq_affinity *desc)
{
int i, ret;
diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
index 3090b0d..5e66e08 100644
--- a/drivers/rpmsg/virtio_rpmsg_bus.c
+++ b/drivers/rpmsg/virtio_rpmsg_bus.c
@@ -869,7 +869,7 @@ static int rpmsg_probe(struct virtio_device *vdev)
init_waitqueue_head(&vrp->sendq);
/* We expect two virtqueues, rx and tx (and in this order) */
- err = vdev->config->find_vqs(vdev, 2, vqs, vq_cbs, names);
+ err = vdev->config->find_vqs(vdev, 2, vqs, vq_cbs, names, NULL);
if (err)
goto free_vrp;
diff --git a/drivers/s390/virtio/kvm_virtio.c b/drivers/s390/virtio/kvm_virtio.c
index 5e5c11f..2ce0b3e 100644
--- a/drivers/s390/virtio/kvm_virtio.c
+++ b/drivers/s390/virtio/kvm_virtio.c
@@ -255,7 +255,8 @@ static void kvm_del_vqs(struct virtio_device *vdev)
static int kvm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
- const char * const names[])
+ const char * const names[],
+ struct irq_affinity *desc)
{
struct kvm_device *kdev = to_kvmdev(vdev);
int i;
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index 8688ad4..8b9c3a5 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -636,7 +636,8 @@ static int virtio_ccw_register_adapter_ind(struct virtio_ccw_device *vcdev,
static int virtio_ccw_find_vqs(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
- const char * const names[])
+ const char * const names[],
+ struct irq_affinity *desc)
{
struct virtio_ccw_device *vcdev = to_vc_device(vdev);
unsigned long *indicatorp = NULL;
diff --git a/drivers/scsi/virtio_scsi.c b/drivers/scsi/virtio_scsi.c
index ec91bd0..32a1629 100644
--- a/drivers/scsi/virtio_scsi.c
+++ b/drivers/scsi/virtio_scsi.c
@@ -932,7 +932,8 @@ static int virtscsi_init(struct virtio_device *vdev,
}
/* Discover virtqueues and write information to configuration. */
- err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names);
+ err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names,
+ NULL);
if (err)
goto out;
diff --git a/drivers/virtio/virtio_balloon.c b/drivers/virtio/virtio_balloon.c
index 181793f..36c9c8f 100644
--- a/drivers/virtio/virtio_balloon.c
+++ b/drivers/virtio/virtio_balloon.c
@@ -413,7 +413,8 @@ static int init_vqs(struct virtio_balloon *vb)
* optionally stat.
*/
nvqs = virtio_has_feature(vb->vdev, VIRTIO_BALLOON_F_STATS_VQ) ? 3 : 2;
- err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names);
+ err = vb->vdev->config->find_vqs(vb->vdev, nvqs, vqs, callbacks, names,
+ NULL);
if (err)
return err;
diff --git a/drivers/virtio/virtio_input.c b/drivers/virtio/virtio_input.c
index 350a2a5..79f1293 100644
--- a/drivers/virtio/virtio_input.c
+++ b/drivers/virtio/virtio_input.c
@@ -173,7 +173,8 @@ static int virtinput_init_vqs(struct virtio_input *vi)
static const char * const names[] = { "events", "status" };
int err;
- err = vi->vdev->config->find_vqs(vi->vdev, 2, vqs, cbs, names);
+ err = vi->vdev->config->find_vqs(vi->vdev, 2, vqs, cbs, names,
+ NULL);
if (err)
return err;
vi->evt = vqs[0];
diff --git a/drivers/virtio/virtio_mmio.c b/drivers/virtio/virtio_mmio.c
index 48bfea9..64c87be 100644
--- a/drivers/virtio/virtio_mmio.c
+++ b/drivers/virtio/virtio_mmio.c
@@ -445,7 +445,8 @@ static struct virtqueue *vm_setup_vq(struct virtio_device *vdev, unsigned index,
static int vm_find_vqs(struct virtio_device *vdev, unsigned nvqs,
struct virtqueue *vqs[],
vq_callback_t *callbacks[],
- const char * const names[])
+ const char * const names[],
+ struct irq_affinity *desc)
{
struct virtio_mmio_device *vm_dev = to_virtio_mmio_device(vdev);
unsigned int irq = platform_get_irq(vm_dev->pdev, 0);
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index 90d7975..df7164e 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -107,20 +107,25 @@ static irqreturn_t vp_interrupt(int irq, void *opaque)
return vp_vring_interrupt(irq, opaque);
}
-static int vp_setup_msix_vectors(struct virtio_device *vdev, int nvectors)
+static int vp_setup_msix_vectors(struct virtio_device *vdev, int nvectors,
+ struct irq_affinity *desc)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
const char *name = dev_name(&vp_dev->vdev.dev);
- unsigned i;
+ unsigned flags = PCI_IRQ_MSIX, i;
int err = -ENOMEM;
+ if (desc) {
+ flags |= PCI_IRQ_AFFINITY;
+ desc->pre_vectors++; /* virtio config vector */
+ }
+
/* Try one vector per queue first. */
- err = pci_alloc_irq_vectors(vp_dev->pci_dev, nvectors, nvectors,
- PCI_IRQ_MSIX);
+ err = pci_alloc_irq_vectors_affinity(vp_dev->pci_dev, nvectors,
+ nvectors, flags, desc);
if (err < 0) {
/* Fallback to one vector for config, one shared for queues. */
- err = pci_alloc_irq_vectors(vp_dev->pci_dev, 2, 2,
- PCI_IRQ_MSIX);
+ err = pci_alloc_irq_vectors(vp_dev->pci_dev, 2, 2, flags);
if (err < 0)
goto error;
nvectors = 2;
@@ -247,9 +252,8 @@ void vp_del_vqs(struct virtio_device *vdev)
}
static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
- struct virtqueue *vqs[],
- vq_callback_t *callbacks[],
- const char * const names[])
+ struct virtqueue *vqs[], vq_callback_t *callbacks[],
+ const char * const names[], struct irq_affinity *desc)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
u16 msix_vec;
@@ -265,7 +269,7 @@ static int vp_find_vqs_msix(struct virtio_device *vdev, unsigned nvqs,
nvectors++;
}
- err = vp_setup_msix_vectors(vdev, nvectors);
+ err = vp_setup_msix_vectors(vdev, nvectors, desc);
if (err < 0)
goto error_find;
nvectors = err;
@@ -353,13 +357,12 @@ static int vp_find_vqs_intx(struct virtio_device *vdev, unsigned nvqs,
/* the config->find_vqs() implementation */
int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
- struct virtqueue *vqs[],
- vq_callback_t *callbacks[],
- const char * const names[])
+ struct virtqueue *vqs[], vq_callback_t *callbacks[],
+ const char * const names[], struct irq_affinity *desc)
{
int err;
- err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names);
+ err = vp_find_vqs_msix(vdev, nvqs, vqs, callbacks, names, desc);
if (!err)
return 0;
return vp_find_vqs_intx(vdev, nvqs, vqs, callbacks, names);
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index d041cce..85010f0 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -114,9 +114,8 @@ bool vp_notify(struct virtqueue *vq);
void vp_del_vqs(struct virtio_device *vdev);
/* the config->find_vqs() implementation */
int vp_find_vqs(struct virtio_device *vdev, unsigned nvqs,
- struct virtqueue *vqs[],
- vq_callback_t *callbacks[],
- const char * const names[]);
+ struct virtqueue *vqs[], vq_callback_t *callbacks[],
+ const char * const names[], struct irq_affinity *desc);
const char *vp_bus_name(struct virtio_device *vdev);
/* Setup the affinity for a virtqueue:
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index e18d0b0..975197b 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -387,13 +387,12 @@ static struct virtqueue *setup_vq(struct virtio_pci_device *vp_dev,
}
static int vp_modern_find_vqs(struct virtio_device *vdev, unsigned nvqs,
- struct virtqueue *vqs[],
- vq_callback_t *callbacks[],
- const char * const names[])
+ struct virtqueue *vqs[], vq_callback_t *callbacks[],
+ const char * const names[], struct irq_affinity *desc)
{
struct virtio_pci_device *vp_dev = to_vp_device(vdev);
struct virtqueue *vq;
- int rc = vp_find_vqs(vdev, nvqs, vqs, callbacks, names);
+ int rc = vp_find_vqs(vdev, nvqs, vqs, callbacks, names, desc);
if (rc)
return rc;
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 26c155b..2ebe506 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -7,6 +7,8 @@
#include <linux/virtio_byteorder.h>
#include <uapi/linux/virtio_config.h>
+struct irq_affinity;
+
/**
* virtio_config_ops - operations for configuring a virtio device
* @get: read the value of a configuration field
@@ -68,9 +70,8 @@ struct virtio_config_ops {
void (*set_status)(struct virtio_device *vdev, u8 status);
void (*reset)(struct virtio_device *vdev);
int (*find_vqs)(struct virtio_device *, unsigned nvqs,
- struct virtqueue *vqs[],
- vq_callback_t *callbacks[],
- const char * const names[]);
+ struct virtqueue *vqs[], vq_callback_t *callbacks[],
+ const char * const names[], struct irq_affinity *desc);
void (*del_vqs)(struct virtio_device *);
u64 (*get_features)(struct virtio_device *vdev);
int (*finalize_features)(struct virtio_device *vdev);
@@ -169,7 +170,7 @@ struct virtqueue *virtio_find_single_vq(struct virtio_device *vdev,
vq_callback_t *callbacks[] = { c };
const char *names[] = { n };
struct virtqueue *vq;
- int err = vdev->config->find_vqs(vdev, 1, &vq, callbacks, names);
+ int err = vdev->config->find_vqs(vdev, 1, &vq, callbacks, names, NULL);
if (err < 0)
return ERR_PTR(err);
return vq;
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 936d7ee..09906db 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -489,7 +489,8 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
vsock->vdev = vdev;
ret = vsock->vdev->config->find_vqs(vsock->vdev, VSOCK_VQ_MAX,
- vsock->vqs, callbacks, names);
+ vsock->vqs, callbacks, names,
+ NULL);
if (ret < 0)
goto out;
--
2.1.4
^ permalink raw reply related
* [PATCH 09/11] virtio: provide a method to get the IRQ affinity mask for a virtqueue
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
This basically passes the pci_irq_get_affinity information up through
virtio via an optional get_vq_affinity method. It is only implemented
by the PCI backend for now, and only when we use per-virtqueue IRQs.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/virtio/virtio_pci_common.c | 10 ++++++++++
drivers/virtio/virtio_pci_common.h | 2 ++
drivers/virtio/virtio_pci_legacy.c | 1 +
drivers/virtio/virtio_pci_modern.c | 2 ++
include/linux/virtio_config.h | 3 +++
5 files changed, 18 insertions(+)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c
index df7164e..25622da 100644
--- a/drivers/virtio/virtio_pci_common.c
+++ b/drivers/virtio/virtio_pci_common.c
@@ -405,6 +405,16 @@ int vp_set_vq_affinity(struct virtqueue *vq, int cpu)
return 0;
}
+const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index)
+{
+ struct virtio_pci_device *vp_dev = to_vp_device(vdev);
+ int vec = vp_dev->vqs[index]->msix_vector;
+
+ if (vec == VIRTIO_MSI_NO_VECTOR)
+ return NULL;
+ return pci_irq_get_affinity(vp_dev->pci_dev, vec);
+}
+
#ifdef CONFIG_PM_SLEEP
static int virtio_pci_freeze(struct device *dev)
{
diff --git a/drivers/virtio/virtio_pci_common.h b/drivers/virtio/virtio_pci_common.h
index 85010f0..d6c29c5 100644
--- a/drivers/virtio/virtio_pci_common.h
+++ b/drivers/virtio/virtio_pci_common.h
@@ -125,6 +125,8 @@ const char *vp_bus_name(struct virtio_device *vdev);
*/
int vp_set_vq_affinity(struct virtqueue *vq, int cpu);
+const struct cpumask *vp_get_vq_affinity(struct virtio_device *vdev, int index);
+
#if IS_ENABLED(CONFIG_VIRTIO_PCI_LEGACY)
int virtio_pci_legacy_probe(struct virtio_pci_device *);
void virtio_pci_legacy_remove(struct virtio_pci_device *);
diff --git a/drivers/virtio/virtio_pci_legacy.c b/drivers/virtio/virtio_pci_legacy.c
index f83829f..4930b23 100644
--- a/drivers/virtio/virtio_pci_legacy.c
+++ b/drivers/virtio/virtio_pci_legacy.c
@@ -194,6 +194,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
.finalize_features = vp_finalize_features,
.bus_name = vp_bus_name,
.set_vq_affinity = vp_set_vq_affinity,
+ .get_vq_affinity = vp_get_vq_affinity,
};
/* the PCI probing function */
diff --git a/drivers/virtio/virtio_pci_modern.c b/drivers/virtio/virtio_pci_modern.c
index 975197b..cd09d78 100644
--- a/drivers/virtio/virtio_pci_modern.c
+++ b/drivers/virtio/virtio_pci_modern.c
@@ -441,6 +441,7 @@ static const struct virtio_config_ops virtio_pci_config_nodev_ops = {
.finalize_features = vp_finalize_features,
.bus_name = vp_bus_name,
.set_vq_affinity = vp_set_vq_affinity,
+ .get_vq_affinity = vp_get_vq_affinity,
};
static const struct virtio_config_ops virtio_pci_config_ops = {
@@ -456,6 +457,7 @@ static const struct virtio_config_ops virtio_pci_config_ops = {
.finalize_features = vp_finalize_features,
.bus_name = vp_bus_name,
.set_vq_affinity = vp_set_vq_affinity,
+ .get_vq_affinity = vp_get_vq_affinity,
};
/**
diff --git a/include/linux/virtio_config.h b/include/linux/virtio_config.h
index 2ebe506..8355bab 100644
--- a/include/linux/virtio_config.h
+++ b/include/linux/virtio_config.h
@@ -58,6 +58,7 @@ struct irq_affinity;
* This returns a pointer to the bus name a la pci_name from which
* the caller can then copy.
* @set_vq_affinity: set the affinity for a virtqueue.
+ * @get_vq_affinity: get the affinity for a virtqueue (optional).
*/
typedef void vq_callback_t(struct virtqueue *);
struct virtio_config_ops {
@@ -77,6 +78,8 @@ struct virtio_config_ops {
int (*finalize_features)(struct virtio_device *vdev);
const char *(*bus_name)(struct virtio_device *vdev);
int (*set_vq_affinity)(struct virtqueue *vq, int cpu);
+ const struct cpumask *(*get_vq_affinity)(struct virtio_device *vdev,
+ int index);
};
/* If driver didn't advertise the feature, it will never appear. */
--
2.1.4
^ permalink raw reply related
* [PATCH 10/11] blk-mq: provide a default queue mapping for virtio device
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
Similar to the PCI version, just calling into virtio instead.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
block/Kconfig | 5 ++++
block/Makefile | 1 +
block/blk-mq-virtio.c | 55 +++++++++++++++++++++++++++++++++++++++++++
include/linux/blk-mq-virtio.h | 10 ++++++++
4 files changed, 71 insertions(+)
create mode 100644 block/blk-mq-virtio.c
create mode 100644 include/linux/blk-mq-virtio.h
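
The mapping loop this patch adds can be modelled in user space. The following is only a minimal sketch under stated assumptions, not the kernel code: NR_CPUS, get_affinity_fn, map_fallback(), map_queues() and two_cpus_per_queue() are hypothetical names, a plain bitmask stands in for struct cpumask, and the round-robin fallback is an assumption of the model rather than what blk_mq_map_queues() does.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 8 /* hypothetical CPU count for this model */

/*
 * Hypothetical stand-in for ->get_vq_affinity(): returns a bitmask of CPUs
 * for the given vector index, or 0 when no affinity information exists.
 */
typedef unsigned int (*get_affinity_fn)(int index);

/* Model-only fallback: spread CPUs round-robin over the queues. */
void map_fallback(int *mq_map, unsigned int nr_queues)
{
	assert(nr_queues > 0);
	for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
		mq_map[cpu] = (int)(cpu % nr_queues);
}

/*
 * Mirrors the shape of the patch: ask for the affinity mask of each queue
 * and point every CPU in that mask at the queue. Any missing piece of
 * information sends us to the fallback.
 * Returns 1 if the affinity-based mapping was used, 0 on fallback.
 */
int map_queues(int *mq_map, unsigned int nr_queues, int first_vec,
	       get_affinity_fn get_affinity)
{
	if (!get_affinity) {
		map_fallback(mq_map, nr_queues);
		return 0;
	}

	for (unsigned int queue = 0; queue < nr_queues; queue++) {
		unsigned int mask = get_affinity(first_vec + (int)queue);

		if (!mask) {
			map_fallback(mq_map, nr_queues);
			return 0;
		}
		for (unsigned int cpu = 0; cpu < NR_CPUS; cpu++)
			if (mask & (1u << cpu))
				mq_map[cpu] = (int)queue;
	}
	return 1;
}

/* Example affinity provider for the model: two adjacent CPUs per queue. */
unsigned int two_cpus_per_queue(int index)
{
	return 0x3u << (2 * index);
}
```

With four queues and two_cpus_per_queue() as the provider, CPUs 0-1 end up on queue 0 and CPUs 6-7 on queue 3; passing NULL as the provider exercises the fallback path.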
diff --git a/block/Kconfig b/block/Kconfig
index 1d4d624..2f69b45 100644
--- a/block/Kconfig
+++ b/block/Kconfig
@@ -130,4 +130,9 @@ config BLK_MQ_PCI
depends on BLOCK && PCI
default y
+config BLK_MQ_VIRTIO
+ bool
+ depends on BLOCK && VIRTIO
+ default y
+
source block/Kconfig.iosched
diff --git a/block/Makefile b/block/Makefile
index 36acdd75..3530a58 100644
--- a/block/Makefile
+++ b/block/Makefile
@@ -23,3 +23,4 @@ obj-$(CONFIG_BLOCK_COMPAT) += compat_ioctl.o
obj-$(CONFIG_BLK_CMDLINE_PARSER) += cmdline-parser.o
obj-$(CONFIG_BLK_DEV_INTEGRITY) += bio-integrity.o blk-integrity.o t10-pi.o
obj-$(CONFIG_BLK_MQ_PCI) += blk-mq-pci.o
+obj-$(CONFIG_BLK_MQ_VIRTIO) += blk-mq-virtio.o
diff --git a/block/blk-mq-virtio.c b/block/blk-mq-virtio.c
new file mode 100644
index 0000000..44d3f0d
--- /dev/null
+++ b/block/blk-mq-virtio.c
@@ -0,0 +1,55 @@
+/*
+ * Copyright (c) 2016 Christoph Hellwig.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
+ * more details.
+ */
+#include <linux/device.h>
+#include <linux/blk-mq.h>
+#include <linux/blk-mq-virtio.h>
+#include <linux/virtio_config.h>
+#include <linux/module.h>
+#include "blk-mq.h"
+
+/**
+ * blk_mq_virtio_map_queues - provide a default queue mapping for virtio device
+ * @set: tagset to provide the mapping for
+ * @vdev: virtio device associated with @set.
+ * @first_vec: first interrupt vector to use for queues (usually 0)
+ *
+ * This function assumes the virtio device @vdev has at least as many available
+ * interrupt vectors as @set has queues. It will then query the vector
+ * corresponding to each queue for its affinity mask and build a queue mapping
+ * that maps a queue to the CPUs that have irq affinity for the corresponding
+ * vector.
+ */
+int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set,
+ struct virtio_device *vdev, int first_vec)
+{
+ const struct cpumask *mask;
+ unsigned int queue, cpu;
+
+ if (!vdev->config->get_vq_affinity)
+ goto fallback;
+
+ /* the very first virtqueue is used internally by virtio, skip it */
+ for (queue = 0; queue < set->nr_hw_queues; queue++) {
+ mask = vdev->config->get_vq_affinity(vdev, first_vec + queue);
+ if (!mask)
+ goto fallback;
+
+ for_each_cpu(cpu, mask)
+ set->mq_map[cpu] = queue;
+ }
+
+ return 0;
+fallback:
+ return blk_mq_map_queues(set);
+}
+EXPORT_SYMBOL_GPL(blk_mq_virtio_map_queues);
diff --git a/include/linux/blk-mq-virtio.h b/include/linux/blk-mq-virtio.h
new file mode 100644
index 0000000..b1ef6e1
--- /dev/null
+++ b/include/linux/blk-mq-virtio.h
@@ -0,0 +1,10 @@
+#ifndef _LINUX_BLK_MQ_VIRTIO_H
+#define _LINUX_BLK_MQ_VIRTIO_H
+
+struct blk_mq_tag_set;
+struct virtio_device;
+
+int blk_mq_virtio_map_queues(struct blk_mq_tag_set *set,
+ struct virtio_device *vdev, int first_vec);
+
+#endif /* _LINUX_BLK_MQ_VIRTIO_H */
--
2.1.4
^ permalink raw reply related
* [PATCH 11/11] virtio_blk: use virtio IRQ affinity
From: Christoph Hellwig @ 2016-11-17 10:43 UTC (permalink / raw)
To: mst; +Cc: axboe, linux-block, linux-kernel, virtualization
In-Reply-To: <1479379403-27880-1-git-send-email-hch@lst.de>
Use automatic IRQ affinity assignment in the virtio layer if available,
and build the blk-mq queues based on it.
Signed-off-by: Christoph Hellwig <hch@lst.de>
---
drivers/block/virtio_blk.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 689e790..7982090 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -5,6 +5,7 @@
#include <linux/hdreg.h>
#include <linux/module.h>
#include <linux/mutex.h>
+#include <linux/interrupt.h>
#include <linux/virtio.h>
#include <linux/virtio_blk.h>
#include <linux/scatterlist.h>
@@ -12,6 +13,7 @@
#include <scsi/scsi_cmnd.h>
#include <linux/idr.h>
#include <linux/blk-mq.h>
+#include <linux/blk-mq-virtio.h>
#include <linux/numa.h>
#define PART_BITS 4
@@ -383,6 +385,7 @@ static int init_vq(struct virtio_blk *vblk)
struct virtqueue **vqs;
unsigned short num_vqs;
struct virtio_device *vdev = vblk->vdev;
+ struct irq_affinity desc = { 0, };
err = virtio_cread_feature(vdev, VIRTIO_BLK_F_MQ,
struct virtio_blk_config, num_queues,
@@ -410,7 +413,7 @@ static int init_vq(struct virtio_blk *vblk)
/* Discover virtqueues and write information to configuration. */
err = vdev->config->find_vqs(vdev, num_vqs, vqs, callbacks, names,
- NULL);
+ &desc);
if (err)
goto out;
@@ -541,10 +544,18 @@ static int virtblk_init_request(void *data, struct request *rq,
return 0;
}
+static int virtblk_map_queues(struct blk_mq_tag_set *set)
+{
+ struct virtio_blk *vblk = set->driver_data;
+
+ return blk_mq_virtio_map_queues(set, vblk->vdev, 0);
+}
+
static struct blk_mq_ops virtio_mq_ops = {
.queue_rq = virtio_queue_rq,
.complete = virtblk_request_done,
.init_request = virtblk_init_request,
+ .map_queues = virtblk_map_queues,
};
static unsigned int virtblk_queue_depth;
--
2.1.4
^ permalink raw reply related
* RE: [PATCH] crypto: add virtio-crypto driver
From: Benedetto, Salvatore @ 2016-11-17 15:55 UTC (permalink / raw)
To: Gonglei, qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org,
virtualization@lists.linux-foundation.org,
linux-crypto@vger.kernel.org
Cc: weidong.huang@huawei.com, claudio.fontana@huawei.com,
mst@redhat.com, luonengjun@huawei.com, hanweidong@huawei.com,
peter.huangpeng@huawei.com, Benedetto, Salvatore,
xuquan8@huawei.com, stefanha@redhat.com, jianjay.zhou@huawei.com,
arei.gonglei@hotmail.com, davem@davemloft.net,
wu.wubin@huawei.com, herbert@gondor.apana.org.au
In-Reply-To: <1479106074-32036-1-git-send-email-arei.gonglei@huawei.com>
Hi Gonglei,
...
> +
> +static int virtio_crypto_alg_ablkcipher_init_session(
> + struct virtio_crypto_ablkcipher_ctx *ctx,
> + int alg, const uint8_t *key,
> + unsigned int keylen,
> + int encrypt)
> +{
> + struct scatterlist outhdr, key_sg, inhdr, *sgs[3];
> + unsigned int tmp;
> + struct virtio_crypto_session_input input;
> + struct virtio_crypto_op_ctrl_req ctrl;
> + struct virtio_crypto *vcrypto = ctx->vcrypto;
> + int op = encrypt ? VIRTIO_CRYPTO_OP_ENCRYPT :
> VIRTIO_CRYPTO_OP_DECRYPT;
> + int err;
> + unsigned int num_out = 0, num_in = 0;
> +
> + memset(&ctrl, 0, sizeof(ctrl));
> + memset(&input, 0, sizeof(input));
> + /* Pad ctrl header */
> + ctrl.header.opcode =
> cpu_to_le32(VIRTIO_CRYPTO_CIPHER_CREATE_SESSION);
> + ctrl.header.algo = cpu_to_le32((uint32_t)alg);
> + /* Set the default dataqueue id to 0 */
> + ctrl.header.queue_id = 0;
> +
> + input.status = cpu_to_le32(VIRTIO_CRYPTO_ERR);
> + /* Pad cipher's parameters */
> + ctrl.u.sym_create_session.op_type =
> + cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
> + ctrl.u.sym_create_session.u.cipher.para.algo = ctrl.header.algo;
> + ctrl.u.sym_create_session.u.cipher.para.keylen =
> cpu_to_le32(keylen);
> + ctrl.u.sym_create_session.u.cipher.para.op = cpu_to_le32(op);
> +
> + sg_init_one(&outhdr, &ctrl, sizeof(ctrl));
I believe this won't work when the new virtually-mapped kernel stack (VMAP_STACK)
is enabled.
Regards,
Salvatore
^ permalink raw reply
* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
From: Namhyung Kim @ 2016-11-18 3:32 UTC (permalink / raw)
To: Paolo Bonzini
Cc: virtio-dev, Tony Luck, Kees Cook, KVM,
Radim Krčmář, Michael S. Tsirkin, LKML,
Steven Rostedt, qemu-devel, Minchan Kim, Anton Vorontsov,
Anthony Liguori, Colin Cross, virtualization, Ingo Molnar
In-Reply-To: <1627682877.13077005.1479298236793.JavaMail.zimbra@redhat.com>
Hi,
Thanks for the detailed information.
On Wed, Nov 16, 2016 at 07:10:36AM -0500, Paolo Bonzini wrote:
> > Not sure how independent ERST is from ACPI and other specs. It looks
> > like referencing UEFI spec at least.
>
> It is just the format of error records that comes from the UEFI spec
> (include/linux/cper.h) but you can ignore it, I think. It should be
> handled by tools on the host side. For you, the error log address
> range contains a CPER header followed by a binary blob. In practice,
> you only need the record length field (bytes 20-23 of the header),
> though it may be a good idea to validate the signature at the beginning
> of the header.
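
To make the "record length plus signature check" idea concrete, here is a minimal user-space sketch. It assumes the CPER record header starts with the four ASCII bytes "CPER" and that the length field at bytes 20-23 is little-endian; cper_record_length() and the CPER_* constants are hypothetical names, and the sketch only requires the first 24 bytes rather than the full header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CPER_SIG            "CPER" /* assumed signature at offset 0 */
#define CPER_SIG_LEN        4
#define CPER_LENGTH_OFFSET  20     /* bytes 20-23: record length */
#define CPER_MIN_BYTES      24     /* enough to read the length field */

/*
 * Returns the record length taken from the header, or 0 if the buffer is
 * too short, the signature does not match, or the length is implausible
 * (shorter than the bytes we just read, or longer than the buffer).
 */
uint32_t cper_record_length(const uint8_t *buf, size_t buf_len)
{
	uint32_t len;

	assert(buf != NULL);

	if (buf_len < CPER_MIN_BYTES)
		return 0;
	if (memcmp(buf, CPER_SIG, CPER_SIG_LEN) != 0)
		return 0;

	/* Assemble the little-endian 32-bit length byte by byte. */
	len = (uint32_t)buf[CPER_LENGTH_OFFSET] |
	      ((uint32_t)buf[CPER_LENGTH_OFFSET + 1] << 8) |
	      ((uint32_t)buf[CPER_LENGTH_OFFSET + 2] << 16) |
	      ((uint32_t)buf[CPER_LENGTH_OFFSET + 3] << 24);

	if (len < CPER_MIN_BYTES || len > buf_len)
		return 0;
	return len;
}
```

A device backend could call something like this on the error log address range before storing a record, and reject the write when it returns 0.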
>
> > Btw, is the ERST used for pstore only (in Linux)?
>
> Yes. It can store various records, including dmesg and MCE.
>
> There are other examples in QEMU of interfaces with ACPI. They all use the
> DSDT, but the logic is similar. For example, docs/specs/acpi_mem_hotplug.txt
> documents the memory hotplug interface. In all cases, ACPI tables contain small
> programs that talk to specialized hardware registers, typically allocated to
> hard-coded I/O ports.
>
> In your case, the registers could occupy 16 consecutive I/O ports, like the
> following:
>
> 0x00 read/write operation type (0=write,1=read,2=clear,3=dummy write)
>
> 0x01 read-only bit 7: if set, operation in progress
>
> bit 0-6: operation status, see "Command Status Definition" in
> the ACPI spec
>
> 0x02 read-only when read:
>
> - read a 64-bit record id from the store to memory,
> from the address that was last written to 0x08.
>
> - if the id is valid and is not the last id in the store,
> write the next 64-bit record id to the same address
>
> - otherwise, write the first record id to the same address,
> or 0xffffffffffffffff if the store is empty
>
> 0x03 unused, read as zero
>
> 0x04-0x07 read/write offset of the error record into the error log address range
>
> 0x08-0x0b read/write when read, return number of stored records
>
> when written, the written value is a 32-bit memory address,
> which points to a 64-bit location used to communicate record ids.
>
> 0x0c-0x0f read/write when read, always return -1 (together with the "mask" field
> and READ_REGISTER, this lets ERST instructions return any value!)
>
> when written, trigger the pstore operation:
>
> - if the current operation is a dummy write, do nothing
>
> - if the current operation is a write, write a new record, using
> the written value as the base of the error log address range. The
> length must be parsed from the CPER header.
>
> - if the current operation is a clear, read the record id
> from the memory location that was last written to 0x08 and do the
> operation. The value written is ignored.
>
> - if the current operation is a read, read the record id from the
> memory location that was last written to 0x08, using the written
> value as the base of the error log address range.
>
> In addition, the firmware will need to reserve a few KB of RAM for the error log
> address range (I checked a real system and it reserves 8KB). The first eight
> bytes are needed for the record identifier interface, because there's no such
> thing as 64-bit I/O ports, and the rest can be used for the actual buffer.
Is there a limit on the size? It'd be great if it can use a few MB..
>
> QEMU already has an interface to allocate RAM and patch the address into an
> ACPI table (bios_linker_loader_alloc). Because this interface is actually meant
> to load data from QEMU into the firmware (using the "fw_cfg" interface), you
> would have to add a dummy 8KB file to fw_cfg using fw_cfg_add_file (for
> example "etc/erst-memory"), it can be just full of zeros.
>
> QEMU supports two chipsets, PIIX and ICH9, and the free I/O port ranges are
> different. You could use 0xa20 for ICH9 and 0xae20 for PIIX.
>
> All in all, the contents of the ERST table would not be very different from a
> non-virtual system, except that on real hardware the firmware would use SMIs
> as the trap mechanism. You almost have a one-to-one mapping between ERST
> actions and register accesses:
>
> BEGIN_WRITE_OPERATION write value 0 to register at 0x00
> BEGIN_READ_OPERATION write value 1 to register at 0x00
> BEGIN_CLEAR_OPERATION write value 2 to register at 0x00
> BEGIN_DUMMY_WRITE_OPERATION write value 3 to register at 0x00
> END_OPERATION no-op
> CHECK_BUSY_STATUS read register at 0x01 with mask 0x80
> GET_COMMAND_STATUS read register at 0x01 with mask 0x7f
> SET_RECORD_OFFSET write register at 0x04
> GET_RECORD_COUNT read register at 0x08
> EXECUTE_OPERATION write ERST memory base + 8 to 0x0c
> GET_ERROR_LOG_ADDRESS_RANGE read register at 0x0c (with mask = ERST memory base + 8)
> GET_ERROR_LOG_ADDRESS_RANGE_LENGTH read register at 0x0c (with mask = 8192 - 8 = 8184)
> GET_ERROR_LOG_ADDRESS_RANGE_ATTRIBUTES read register at 0x0c (with mask = 0)
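
The masks in the table above can be written down as a small user-space sketch. All names here (the ERST_* constants and erst_*() helpers) are hypothetical and simply encode the proposed layout: register 0x01 carries the busy bit and the status code, and register 0x0c reads as all ones so that the mask alone selects the returned value. The 8192-byte region size is the 8KB figure from the proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Operation types written to register 0x00, as listed in the proposal. */
enum erst_op {
	ERST_OP_WRITE       = 0,
	ERST_OP_READ        = 1,
	ERST_OP_CLEAR       = 2,
	ERST_OP_DUMMY_WRITE = 3,
};

#define ERST_STATUS_BUSY_MASK 0x80u /* bit 7: operation in progress */
#define ERST_STATUS_CODE_MASK 0x7fu /* bits 0-6: command status */
#define ERST_MEM_SIZE         8192u /* reserved RAM in the proposal */
#define ERST_RECORD_ID_SIZE   8u    /* first eight bytes hold the record id */

/* CHECK_BUSY_STATUS: read register 0x01 with mask 0x80. */
int erst_is_busy(uint8_t status_reg)
{
	return (status_reg & ERST_STATUS_BUSY_MASK) != 0;
}

/* GET_COMMAND_STATUS: read register 0x01 with mask 0x7f. */
unsigned int erst_command_status(uint8_t status_reg)
{
	return status_reg & ERST_STATUS_CODE_MASK;
}

/* Register 0x0c always reads as -1, so READ_REGISTER returns the mask. */
uint32_t erst_read_constant(uint32_t mask)
{
	return 0xffffffffu & mask;
}

/* GET_ERROR_LOG_ADDRESS_RANGE: the buffer starts after the record id. */
uint32_t erst_log_range_base(uint32_t mem_base)
{
	assert(mem_base <= UINT32_MAX - ERST_RECORD_ID_SIZE);
	return erst_read_constant(mem_base + ERST_RECORD_ID_SIZE);
}

/* GET_ERROR_LOG_ADDRESS_RANGE_LENGTH: 8192 - 8 bytes remain for the record. */
uint32_t erst_log_range_length(void)
{
	return erst_read_constant(ERST_MEM_SIZE - ERST_RECORD_ID_SIZE);
}
```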
>
> Only the get/set record identifier instructions are a little harder:
>
> GET_RECORD_IDENTIFIER write ERST memory base to register at 0x08
> read register at 0x02
> read eight bytes at ERST memory base
>
> SET_RECORD_IDENTIFIER write ERST memory base to register at 0x08
> write eight bytes at ERST memory base
>
> On top of this, you need to add the APEI UUID (see apei_osc_setup in Linux)
> to build_q35_osc_method, and use "-M q35" when you start QEMU. If you need
> more help just ask. I or others can help you with the ACPI glue, then you
> can write the file backend yourself, based on your existing virtio-pstore code.
>
> > Also I need to control pstore driver like using bigger buffer,
> > enabling specific message types and so on if ERST supports. Is it
> > possible for ERST to provide such information?
>
> It's the normal pstore driver, same as on a real server. What exactly do you
> need?
Well, I don't want to send additional pstore messages to the device if
it cannot handle them properly - for example, ftrace message should not
overwrite kmsg dump. It'd be great if device somehow could expose
acceptable message types to the driver IMHO.
Btw I prefer using kvmtool for my kernel work since it's much
simpler.
Thanks,
Namhyung
^ permalink raw reply
* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
From: Michael S. Tsirkin @ 2016-11-18 4:07 UTC (permalink / raw)
To: Namhyung Kim
Cc: virtio-dev, Tony Luck, Kees Cook, KVM,
Radim Krčmář, Anton Vorontsov, LKML,
Steven Rostedt, qemu-devel, Minchan Kim, Anthony Liguori,
Colin Cross, Paolo Bonzini, virtualization, Ingo Molnar
In-Reply-To: <20161118033206.GA15698@danjae.aot.lge.com>
On Fri, Nov 18, 2016 at 12:32:06PM +0900, Namhyung Kim wrote:
> Btw I prefer using kvmtool for my kernel work since it's much
> simpler.
>
> Thanks,
> Namhyung
Up to you but then you should extend that to support 1.0 spec.
I strongly object to adding to the list of legacy interfaces
we need to maintain.
--
MST
^ permalink raw reply
* [PATCH 1/2] vhost: remove unused feature bit
From: Jason Wang @ 2016-11-18 7:58 UTC (permalink / raw)
To: mst, jasowang; +Cc: netdev, linux-kernel, kvm, virtualization
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
include/uapi/linux/vhost.h | 2 --
1 file changed, 2 deletions(-)
diff --git a/include/uapi/linux/vhost.h b/include/uapi/linux/vhost.h
index 56b7ab5..60180c0 100644
--- a/include/uapi/linux/vhost.h
+++ b/include/uapi/linux/vhost.h
@@ -172,8 +172,6 @@ struct vhost_memory {
#define VHOST_F_LOG_ALL 26
/* vhost-net should add virtio_net_hdr for RX, and strip for TX packets. */
#define VHOST_NET_F_VIRTIO_NET_HDR 27
-/* Vhost have device IOTLB */
-#define VHOST_F_DEVICE_IOTLB 63
/* VHOST_SCSI specific definitions */
--
2.7.4
^ permalink raw reply related
* [PATCH 2/2] vhost: forbid IOTLB invalidation when not enabled
From: Jason Wang @ 2016-11-18 7:58 UTC (permalink / raw)
To: mst, jasowang; +Cc: netdev, linux-kernel, kvm, virtualization
In-Reply-To: <1479455920-3285-1-git-send-email-jasowang@redhat.com>
When IOTLB is not enabled, we should forbid IOTLB invalidation to
avoid a NULL pointer dereference.
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
drivers/vhost/vhost.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index c6f2d89..7d338d5 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -959,6 +959,10 @@ int vhost_process_iotlb_msg(struct vhost_dev *dev,
vhost_iotlb_notify_vq(dev, msg);
break;
case VHOST_IOTLB_INVALIDATE:
+ if (!dev->iotlb) {
+ ret = -EFAULT;
+ break;
+ }
vhost_del_umem_range(dev->iotlb, msg->iova,
msg->iova + msg->size - 1);
break;
--
2.7.4
^ permalink raw reply related
* Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
From: Paolo Bonzini @ 2016-11-18 9:45 UTC (permalink / raw)
To: Namhyung Kim
Cc: virtio-dev, Tony Luck, Kees Cook, KVM,
Radim Krčmář, Michael S. Tsirkin, LKML,
Steven Rostedt, qemu-devel, Minchan Kim, Anton Vorontsov,
Anthony Liguori, Colin Cross, virtualization, Ingo Molnar
In-Reply-To: <20161118033206.GA15698@danjae.aot.lge.com>
On 18/11/2016 04:32, Namhyung Kim wrote:
>> In addition, the firmware will need to reserve a few KB of RAM for the error log
>> address range (I checked a real system and it reserves 8KB). The first eight
>> bytes are needed for the record identifier interface, because there's no such
>> thing as 64-bit I/O ports, and the rest can be used for the actual buffer.
>
> Is there a limit on the size? It'd be great if it can use a few MB..
Yes, you can make it customizable.
>>> Also I need to control pstore driver like using bigger buffer,
>>> enabling specific message types and so on if ERST supports. Is it
>>> possible for ERST to provide such information?
>>
>> It's the normal pstore driver, same as on a real server. What exactly do you
>> need?
>
> Well, I don't want to send additional pstore messages to the device if
> it cannot handle them properly - for example, ftrace message should not
> overwrite kmsg dump. It'd be great if device somehow could expose
> acceptable message types to the driver IMHO.
This is something that you have to do in the usual kernel pstore
infrastructure. It should not be specific to virtualization.
Paolo
> Btw I prefer using kvmtool for my kernel work since it's much
> simpler.
^ permalink raw reply
* Re: [virtio-dev] Re: [PATCH 1/3] virtio: Basic implementation of virtio pstore driver
From: Paolo Bonzini @ 2016-11-18 9:46 UTC (permalink / raw)
To: Michael S. Tsirkin, Namhyung Kim
Cc: virtio-dev, Tony Luck, Kees Cook, KVM,
Radim Krčmář, Anton Vorontsov, LKML,
Steven Rostedt, qemu-devel, Minchan Kim, Anthony Liguori,
Colin Cross, virtualization, Ingo Molnar
In-Reply-To: <20161118060649-mutt-send-email-mst@kernel.org>
On 18/11/2016 05:07, Michael S. Tsirkin wrote:
> On Fri, Nov 18, 2016 at 12:32:06PM +0900, Namhyung Kim wrote:
>> Btw I prefer using kvmtool for my kernel work since it's much
>> simpler.
>
> Up to you but then you should extend that to support 1.0 spec.
> I strongly object to adding to the list of legacy interfaces
> we need to maintain.
I object to adding paravirtualization unless there is a good reason why
the usual mechanisms for physical machines cannot be used. The cost of
maintaining a spec, two device implementations (kvmtool+qemu) and a
driver is not small, plus it will not work on older kernels.
Paolo
^ permalink raw reply
* Re: [PATCH] crypto: add virtio-crypto driver
From: gong lei @ 2016-11-20 7:11 UTC (permalink / raw)
To: Benedetto, Salvatore, Gonglei, qemu-devel@nongnu.org,
virtio-dev@lists.oasis-open.org,
virtualization@lists.linux-foundation.org,
linux-crypto@vger.kernel.org
Cc: weidong.huang@huawei.com, claudio.fontana@huawei.com,
mst@redhat.com, luonengjun@huawei.com, hanweidong@huawei.com,
peter.huangpeng@huawei.com, xuquan8@huawei.com,
stefanha@redhat.com, jianjay.zhou@huawei.com, davem@davemloft.net,
wu.wubin@huawei.com, herbert@gondor.apana.org.au
In-Reply-To: <309B30E91F5E2846B79BD9AA9711D031A12767@IRSMSX102.ger.corp.intel.com>
on 2016/11/17 23:55, Benedetto, Salvatore wrote:
> Hi Gonglei,
>
> ...
>> +
>> +static int virtio_crypto_alg_ablkcipher_init_session(
>> + struct virtio_crypto_ablkcipher_ctx *ctx,
>> + int alg, const uint8_t *key,
>> + unsigned int keylen,
>> + int encrypt)
>> +{
>> + struct scatterlist outhdr, key_sg, inhdr, *sgs[3];
>> + unsigned int tmp;
>> + struct virtio_crypto_session_input input;
>> + struct virtio_crypto_op_ctrl_req ctrl;
>> + struct virtio_crypto *vcrypto = ctx->vcrypto;
>> + int op = encrypt ? VIRTIO_CRYPTO_OP_ENCRYPT :
>> VIRTIO_CRYPTO_OP_DECRYPT;
>> + int err;
>> + unsigned int num_out = 0, num_in = 0;
>> +
>> + memset(&ctrl, 0, sizeof(ctrl));
>> + memset(&input, 0, sizeof(input));
>> + /* Pad ctrl header */
>> + ctrl.header.opcode =
>> cpu_to_le32(VIRTIO_CRYPTO_CIPHER_CREATE_SESSION);
>> + ctrl.header.algo = cpu_to_le32((uint32_t)alg);
>> + /* Set the default dataqueue id to 0 */
>> + ctrl.header.queue_id = 0;
>> +
>> + input.status = cpu_to_le32(VIRTIO_CRYPTO_ERR);
>> + /* Pad cipher's parameters */
>> + ctrl.u.sym_create_session.op_type =
>> + cpu_to_le32(VIRTIO_CRYPTO_SYM_OP_CIPHER);
>> + ctrl.u.sym_create_session.u.cipher.para.algo = ctrl.header.algo;
>> + ctrl.u.sym_create_session.u.cipher.para.keylen =
>> cpu_to_le32(keylen);
>> + ctrl.u.sym_create_session.u.cipher.para.op = cpu_to_le32(op);
>> +
>> + sg_init_one(&outhdr, &ctrl, sizeof(ctrl));
> I believe this won't work when the new virtually-mapped kernel stack (VMAP_STACK)
> is enabled.
I see, will fix it in the next version. Thanks for your comments :)
>
> Regards,
> Salvatore
--
Regards,
-Gonglei
^ permalink raw reply
* [RFC LINUX PATCH 0/2] Virtio ring works with DMA coherent memory
From: Wendy Liang @ 2016-11-22 0:32 UTC (permalink / raw)
To: virtualization, edgari, cyrilc; +Cc: Wendy Liang
RPMsg uses dma_alloc_coherent() to allocate the memory it shares with the
remote. Because dma_alloc_coherent() does not set up struct pages for that
memory, we cannot get the physical address back from the virtual address.
Instead, we store the DMA address in sg_dma_addr and mark the sg as already
DMA mapped.
When the virtio vring sees that sg_dma_addr is already set, it does not call
dma_map_page().
The issue was once discussed here:
http://virtualization.linux-foundation.narkive.com/CfVP32Vy/rfc-0-4-rpmsg-fix-init-of-dma-able-virtqueues
Edgar E. Iglesias (1):
rpmsg: DMA map sgs passed to virtio
Wendy Liang (1):
virtio_ring: Do not call dma_map_page if sg is already mapped.
drivers/rpmsg/virtio_rpmsg_bus.c | 22 +++++++++++++++++++---
drivers/virtio/virtio_ring.c | 6 ++++++
2 files changed, 25 insertions(+), 3 deletions(-)
--
1.9.1
^ permalink raw reply