* [PATCH v4 0/3] kvm: level irqfd and new eoifd
@ 2012-07-13 19:40 Alex Williamson
2012-07-13 19:40 ` [PATCH v4 1/3] kvm: Extend irqfd to support level interrupts Alex Williamson
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Alex Williamson @ 2012-07-13 19:40 UTC (permalink / raw)
To: avi, mst; +Cc: gleb, kvm, linux-kernel, jan.kiszka
v4:
- KVM_IRQFD_FLAG_LEVEL flag now documented and coded to only be
necessary on assign.
- Lock added to struct _irq_source to maintain the source ID assertion
state to avoid repeat assertions or spurious EOIs. I couldn't figure
out a way to make this work w/o races using atomics and don't see any
way to make use of kvm_irq_line_state.
- GSI support in KVM_EOIFD is dropped since it's not immediately used
and I don't seem to be able to explain why it's useful. It makes it
optional though, which may be a good thing if other archs try to use
this interface.
- eoifd switch to mutex from spinlock (all tested as lockdep safe)
- _irqfd_fdget_lock/_irqfd_put_unlock replaces _irqfd_fdget/_irqfd_put
so we hold the lock for the irqfds ensuring the one we're binding to
doesn't go away while we get a reference to the _irq_source.
- We also pull the GSI off the irqfd rather than specifying it for eoifd.
- eoifd_deactivate renamed to eoifd_destroy to make it blindingly
obvious what it does.
Patch 3/3 is not for commit and not meant to be ready for commit, it's
just an FYI on what adding back support for a GSI based, non-de-asserting
eoifd will look like. Thanks,
Alex
---
Alex Williamson (3):
kvm: Add a GSI specification for KVM_EOIFD
kvm: KVM_EOIFD, an eventfd for EOIs
kvm: Extend irqfd to support level interrupts
Documentation/virtual/kvm/api.txt | 27 +++
arch/x86/kvm/x86.c | 3
include/linux/kvm.h | 24 +++
include/linux/kvm_host.h | 13 +
virt/kvm/eventfd.c | 349 +++++++++++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 11 +
6 files changed, 423 insertions(+), 4 deletions(-)
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH v4 1/3] kvm: Extend irqfd to support level interrupts
2012-07-13 19:40 [PATCH v4 0/3] kvm: level irqfd and new eoifd Alex Williamson
@ 2012-07-13 19:40 ` Alex Williamson
2012-07-13 19:41 ` [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs Alex Williamson
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2012-07-13 19:40 UTC (permalink / raw)
To: avi, mst; +Cc: gleb, kvm, linux-kernel, jan.kiszka
In order to inject a level interrupt from an external source using an
irqfd, we need to allocate a new irq_source_id. This allows us to
assert and (later) de-assert an interrupt line independently from
users of KVM_IRQ_LINE and avoid lost interrupts.
We also add what may appear like a bit of excessive infrastructure
around an object for storing this irq_source_id. However, notice
that we only provide a way to assert the interrupt here. A follow-on
interface will make use of the same irq_source_id to allow de-assert.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
Documentation/virtual/kvm/api.txt | 6 ++
arch/x86/kvm/x86.c | 1
include/linux/kvm.h | 3 +
virt/kvm/eventfd.c | 114 ++++++++++++++++++++++++++++++++++++-
4 files changed, 120 insertions(+), 4 deletions(-)
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 100acde..c7267d5 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1981,6 +1981,12 @@ the guest using the specified gsi pin. The irqfd is removed using
the KVM_IRQFD_FLAG_DEASSIGN flag, specifying both kvm_irqfd.fd
and kvm_irqfd.gsi.
+The KVM_IRQFD_FLAG_LEVEL flag indicates the gsi input is for a level
+triggered interrupt. In this case a new irqchip input is allocated
+which is logically OR'd with other inputs allowing multiple sources
+to independently assert level interrupts. The KVM_IRQFD_FLAG_LEVEL
+is only necessary on setup, teardown is identical to that above.
+KVM_IRQFD_FLAG_LEVEL support is indicated by KVM_CAP_IRQFD_LEVEL.
5. The kvm_run structure
------------------------
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a01a424..80bed07 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2148,6 +2148,7 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_GET_TSC_KHZ:
case KVM_CAP_PCI_2_3:
case KVM_CAP_KVMCLOCK_CTRL:
+ case KVM_CAP_IRQFD_LEVEL:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2ce09aa..b2e6e4f 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -618,6 +618,7 @@ struct kvm_ppc_smmu_info {
#define KVM_CAP_PPC_GET_SMMU_INFO 78
#define KVM_CAP_S390_COW 79
#define KVM_CAP_PPC_ALLOC_HTAB 80
+#define KVM_CAP_IRQFD_LEVEL 81
#ifdef KVM_CAP_IRQ_ROUTING
@@ -683,6 +684,8 @@ struct kvm_xen_hvm_config {
#endif
#define KVM_IRQFD_FLAG_DEASSIGN (1 << 0)
+/* Available with KVM_CAP_IRQFD_LEVEL */
+#define KVM_IRQFD_FLAG_LEVEL (1 << 1)
struct kvm_irqfd {
__u32 fd;
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 7d7e2aa..ecdbfea 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -36,6 +36,68 @@
#include "iodev.h"
/*
+ * An irq_source_id can be created from KVM_IRQFD for level interrupt
+ * injections and shared with other interfaces for EOI or de-assert.
+ * Create an object with reference counting to make it easy to use.
+ */
+struct _irq_source {
+ int id; /* the IRQ source ID */
+ bool level_asserted; /* Track assertion state and protect with lock */
+ spinlock_t lock; /* to avoid unnecessary re-assert/spurious eoi. */
+ struct kvm *kvm;
+ struct kref kref;
+};
+
+static void _irq_source_release(struct kref *kref)
+{
+ struct _irq_source *source;
+
+ source = container_of(kref, struct _irq_source, kref);
+
+ /* This also de-asserts */
+ kvm_free_irq_source_id(source->kvm, source->id);
+ kfree(source);
+}
+
+static void _irq_source_put(struct _irq_source *source)
+{
+ if (source)
+ kref_put(&source->kref, _irq_source_release);
+}
+
+static struct _irq_source *__attribute__ ((used)) /* white lie for now */
+_irq_source_get(struct _irq_source *source)
+{
+ if (source)
+ kref_get(&source->kref);
+
+ return source;
+}
+
+static struct _irq_source *_irq_source_alloc(struct kvm *kvm)
+{
+ struct _irq_source *source;
+ int id;
+
+ source = kzalloc(sizeof(*source), GFP_KERNEL);
+ if (!source)
+ return ERR_PTR(-ENOMEM);
+
+ id = kvm_request_irq_source_id(kvm);
+ if (id < 0) {
+ kfree(source);
+ return ERR_PTR(id);
+ }
+
+ kref_init(&source->kref);
+ spin_lock_init(&source->lock);
+ source->kvm = kvm;
+ source->id = id;
+
+ return source;
+}
+
+/*
* --------------------------------------------------------------------
* irqfd: Allows an fd to be used to inject an interrupt to the guest
*
@@ -52,6 +114,8 @@ struct _irqfd {
/* Used for level IRQ fast-path */
int gsi;
struct work_struct inject;
+ /* IRQ source ID for level triggered irqfds */
+ struct _irq_source *source;
/* Used for setup/shutdown */
struct eventfd_ctx *eventfd;
struct list_head list;
@@ -62,7 +126,7 @@ struct _irqfd {
static struct workqueue_struct *irqfd_cleanup_wq;
static void
-irqfd_inject(struct work_struct *work)
+irqfd_inject_edge(struct work_struct *work)
{
struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
struct kvm *kvm = irqfd->kvm;
@@ -71,6 +135,29 @@ irqfd_inject(struct work_struct *work)
kvm_set_irq(kvm, KVM_USERSPACE_IRQ_SOURCE_ID, irqfd->gsi, 0);
}
+static void
+irqfd_inject_level(struct work_struct *work)
+{
+ struct _irqfd *irqfd = container_of(work, struct _irqfd, inject);
+
+ /*
+ * Inject an interrupt only if not already asserted.
+ *
+ * We can safely ignore the kvm_set_irq return value here. If
+ * masked, the irr bit is still set and will eventually be serviced.
+ * This interface does not guarantee immediate injection. If
+ * coalesced, an eoi will be coming where we can de-assert and
+ * re-inject if necessary. NB, if you need to know if an interrupt
+ * was coalesced, this interface is not for you.
+ */
+ spin_lock(&irqfd->source->lock);
+ if (!irqfd->source->level_asserted) {
+ kvm_set_irq(irqfd->kvm, irqfd->source->id, irqfd->gsi, 1);
+ irqfd->source->level_asserted = true;
+ }
+ spin_unlock(&irqfd->source->lock);
+}
+
/*
* Race-free decouple logic (ordering is critical)
*/
@@ -96,6 +183,9 @@ irqfd_shutdown(struct work_struct *work)
* It is now safe to release the object's resources
*/
eventfd_ctx_put(irqfd->eventfd);
+
+ _irq_source_put(irqfd->source);
+
kfree(irqfd);
}
@@ -202,6 +292,7 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
{
struct kvm_irq_routing_table *irq_rt;
struct _irqfd *irqfd, *tmp;
+ struct _irq_source *source = NULL;
struct file *file = NULL;
struct eventfd_ctx *eventfd = NULL;
int ret;
@@ -214,7 +305,19 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
irqfd->kvm = kvm;
irqfd->gsi = args->gsi;
INIT_LIST_HEAD(&irqfd->list);
- INIT_WORK(&irqfd->inject, irqfd_inject);
+
+ if (args->flags & KVM_IRQFD_FLAG_LEVEL) {
+ source = _irq_source_alloc(kvm);
+ if (IS_ERR(source)) {
+ ret = PTR_ERR(source);
+ goto fail;
+ }
+
+ irqfd->source = source;
+ INIT_WORK(&irqfd->inject, irqfd_inject_level);
+ } else
+ INIT_WORK(&irqfd->inject, irqfd_inject_edge);
+
INIT_WORK(&irqfd->shutdown, irqfd_shutdown);
file = eventfd_fget(args->fd);
@@ -276,10 +379,13 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
return 0;
fail:
+ if (source && !IS_ERR(source))
+ _irq_source_put(source);
+
if (eventfd && !IS_ERR(eventfd))
eventfd_ctx_put(eventfd);
- if (!IS_ERR(file))
+ if (file && !IS_ERR(file))
fput(file);
kfree(irqfd);
@@ -340,7 +446,7 @@ kvm_irqfd_deassign(struct kvm *kvm, struct kvm_irqfd *args)
int
kvm_irqfd(struct kvm *kvm, struct kvm_irqfd *args)
{
- if (args->flags & ~KVM_IRQFD_FLAG_DEASSIGN)
+ if (args->flags & ~(KVM_IRQFD_FLAG_DEASSIGN | KVM_IRQFD_FLAG_LEVEL))
return -EINVAL;
if (args->flags & KVM_IRQFD_FLAG_DEASSIGN)
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs
2012-07-13 19:40 [PATCH v4 0/3] kvm: level irqfd and new eoifd Alex Williamson
2012-07-13 19:40 ` [PATCH v4 1/3] kvm: Extend irqfd to support level interrupts Alex Williamson
@ 2012-07-13 19:41 ` Alex Williamson
2012-07-15 16:23 ` Michael S. Tsirkin
2012-07-15 16:32 ` Michael S. Tsirkin
2012-07-13 19:41 ` [PATCH v4 3/3] kvm: Add a GSI specification for KVM_EOIFD Alex Williamson
2012-07-15 16:45 ` [PATCH v4 0/3] kvm: level irqfd and new eoifd Michael S. Tsirkin
3 siblings, 2 replies; 8+ messages in thread
From: Alex Williamson @ 2012-07-13 19:41 UTC (permalink / raw)
To: avi, mst; +Cc: gleb, kvm, linux-kernel, jan.kiszka
This new ioctl enables an eventfd to be triggered when an EOI is
written for a specified irqchip pin. The first user of this will
be external device assignment through VFIO, using a level irqfd
for asserting a PCI INTx interrupt and this interface for de-assert
and notification once the interrupt is serviced.
Here we make use of the reference counting of the _irq_source
object allowing us to share it with an irqfd and cleanup regardless
of the release order.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
Documentation/virtual/kvm/api.txt | 21 +++
arch/x86/kvm/x86.c | 2
include/linux/kvm.h | 15 ++
include/linux/kvm_host.h | 13 ++
virt/kvm/eventfd.c | 226 +++++++++++++++++++++++++++++++++++++
virt/kvm/kvm_main.c | 11 ++
6 files changed, 286 insertions(+), 2 deletions(-)
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index c7267d5..d5be635 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1988,6 +1988,27 @@ to independently assert level interrupts. The KVM_IRQFD_FLAG_LEVEL
is only necessary on setup, teardown is identical to that above.
KVM_IRQFD_FLAG_LEVEL support is indicated by KVM_CAP_IRQFD_LEVEL.
+4.77 KVM_EOIFD
+
+Capability: KVM_CAP_EOIFD
+Architectures: x86
+Type: vm ioctl
+Parameters: struct kvm_eoifd (in)
+Returns: 0 on success, -1 on error
+
+KVM_EOIFD allows userspace to receive interrupt EOI notification
+through an eventfd. kvm_eoifd.fd specifies the eventfd used for
+notification. KVM_EOIFD_FLAG_DEASSIGN is used to de-assign an eoifd
+once assigned. KVM_EOIFD also requires additional bits set in
+kvm_eoifd.flags to bind to the proper interrupt line. The
+KVM_EOIFD_FLAG_LEVEL_IRQFD indicates that kvm_eoifd.irqfd is provided
+and is an irqfd for a level triggered interrupt (configured from
+KVM_IRQFD using KVM_IRQFD_FLAG_LEVEL). The EOI notification is bound
+to the same GSI and irqchip input as the irqfd. Both kvm_eoifd.irqfd
+and KVM_EOIFD_FLAG_LEVEL_IRQFD must be specified both on assignment
+and de-assignment of KVM_EOIFD. KVM_CAP_EOIFD_LEVEL_IRQFD indicates
+support of KVM_EOIFD_FLAG_LEVEL_IRQFD.
+
5. The kvm_run structure
------------------------
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 80bed07..cc47e31 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2149,6 +2149,8 @@ int kvm_dev_ioctl_check_extension(long ext)
case KVM_CAP_PCI_2_3:
case KVM_CAP_KVMCLOCK_CTRL:
case KVM_CAP_IRQFD_LEVEL:
+ case KVM_CAP_EOIFD:
+ case KVM_CAP_EOIFD_LEVEL_IRQFD:
r = 1;
break;
case KVM_CAP_COALESCED_MMIO:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index b2e6e4f..5ca887d 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -619,6 +619,8 @@ struct kvm_ppc_smmu_info {
#define KVM_CAP_S390_COW 79
#define KVM_CAP_PPC_ALLOC_HTAB 80
#define KVM_CAP_IRQFD_LEVEL 81
+#define KVM_CAP_EOIFD 82
+#define KVM_CAP_EOIFD_LEVEL_IRQFD 83
#ifdef KVM_CAP_IRQ_ROUTING
@@ -694,6 +696,17 @@ struct kvm_irqfd {
__u8 pad[20];
};
+#define KVM_EOIFD_FLAG_DEASSIGN (1 << 0)
+/* Available with KVM_CAP_EOIFD_LEVEL_IRQFD */
+#define KVM_EOIFD_FLAG_LEVEL_IRQFD (1 << 1)
+
+struct kvm_eoifd {
+ __u32 fd;
+ __u32 flags;
+ __u32 irqfd;
+ __u8 pad[20];
+};
+
struct kvm_clock_data {
__u64 clock;
__u32 flags;
@@ -834,6 +847,8 @@ struct kvm_s390_ucas_mapping {
#define KVM_PPC_GET_SMMU_INFO _IOR(KVMIO, 0xa6, struct kvm_ppc_smmu_info)
/* Available with KVM_CAP_PPC_ALLOC_HTAB */
#define KVM_PPC_ALLOCATE_HTAB _IOWR(KVMIO, 0xa7, __u32)
+/* Available with KVM_CAP_EOIFD */
+#define KVM_EOIFD _IOW(KVMIO, 0xa8, struct kvm_eoifd)
/*
* ioctls for vcpu fds
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ae3b426..a7661c0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -285,6 +285,10 @@ struct kvm {
struct list_head items;
} irqfds;
struct list_head ioeventfds;
+ struct {
+ struct mutex lock;
+ struct list_head items;
+ } eoifds;
#endif
struct kvm_vm_stat stat;
struct kvm_arch arch;
@@ -828,6 +832,8 @@ int kvm_irqfd(struct kvm *kvm, struct kvm_irqfd *args);
void kvm_irqfd_release(struct kvm *kvm);
void kvm_irq_routing_update(struct kvm *, struct kvm_irq_routing_table *);
int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args);
+int kvm_eoifd(struct kvm *kvm, struct kvm_eoifd *args);
+void kvm_eoifd_release(struct kvm *kvm);
#else
@@ -853,6 +859,13 @@ static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
return -ENOSYS;
}
+static inline int kvm_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
+{
+ return -ENOSYS;
+}
+
+static inline void kvm_eoifd_release(struct kvm *kvm) {}
+
#endif /* CONFIG_HAVE_KVM_EVENTFD */
#ifdef CONFIG_KVM_APIC_ARCHITECTURE
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index ecdbfea..2fae198 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -65,8 +65,7 @@ static void _irq_source_put(struct _irq_source *source)
kref_put(&source->kref, _irq_source_release);
}
-static struct _irq_source *__attribute__ ((used)) /* white lie for now */
-_irq_source_get(struct _irq_source *source)
+static struct _irq_source *_irq_source_get(struct _irq_source *source)
{
if (source)
kref_get(&source->kref);
@@ -123,6 +122,39 @@ struct _irqfd {
struct work_struct shutdown;
};
+static struct _irqfd *_irqfd_fdget_lock(struct kvm *kvm, int fd)
+{
+ struct eventfd_ctx *eventfd;
+ struct _irqfd *tmp, *irqfd = NULL;
+
+ eventfd = eventfd_ctx_fdget(fd);
+ if (IS_ERR(eventfd))
+ return (struct _irqfd *)eventfd;
+
+ spin_lock_irq(&kvm->irqfds.lock);
+
+ list_for_each_entry(tmp, &kvm->irqfds.items, list) {
+ if (tmp->eventfd == eventfd) {
+ irqfd = tmp;
+ break;
+ }
+ }
+
+ if (!irqfd) {
+ spin_unlock_irq(&kvm->irqfds.lock);
+ eventfd_ctx_put(eventfd);
+ return ERR_PTR(-ENODEV);
+ }
+
+ return irqfd;
+}
+
+static void _irqfd_put_unlock(struct _irqfd *irqfd)
+{
+ eventfd_ctx_put(irqfd->eventfd);
+ spin_unlock_irq(&irqfd->kvm->irqfds.lock);
+}
+
static struct workqueue_struct *irqfd_cleanup_wq;
static void
@@ -398,6 +430,8 @@ kvm_eventfd_init(struct kvm *kvm)
spin_lock_init(&kvm->irqfds.lock);
INIT_LIST_HEAD(&kvm->irqfds.items);
INIT_LIST_HEAD(&kvm->ioeventfds);
+ mutex_init(&kvm->eoifds.lock);
+ INIT_LIST_HEAD(&kvm->eoifds.items);
}
/*
@@ -764,3 +798,191 @@ kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
return kvm_assign_ioeventfd(kvm, args);
}
+
+/*
+ * --------------------------------------------------------------------
+ * eoifd: Translate KVM APIC/IOAPIC EOI into eventfd signal.
+ *
+ * userspace can register with an eventfd for receiving
+ * notification when an EOI occurs.
+ * --------------------------------------------------------------------
+ */
+
+struct _eoifd {
+ /* eventfd triggered on EOI */
+ struct eventfd_ctx *eventfd;
+ /* irq source ID de-asserted on EOI */
+ struct _irq_source *source;
+ struct kvm *kvm;
+ struct kvm_irq_ack_notifier notifier;
+ /* reference to irqfd eventfd for de-assign matching */
+ struct eventfd_ctx *level_irqfd;
+ struct list_head list;
+};
+
+static void eoifd_event(struct kvm_irq_ack_notifier *notifier)
+{
+ struct _eoifd *eoifd;
+
+ eoifd = container_of(notifier, struct _eoifd, notifier);
+
+ /*
+ * Ack notifier is per GSI, which may be shared with others.
+ * Only de-assert and send EOI if our source ID is asserted.
+ * User needs to re-assert if device still requires service.
+ */
+ spin_lock(&eoifd->source->lock);
+ if (eoifd->source->level_asserted) {
+ kvm_set_irq(eoifd->kvm,
+ eoifd->source->id, eoifd->notifier.gsi, 0);
+ eoifd->source->level_asserted = false;
+ eventfd_signal(eoifd->eventfd, 1);
+ }
+ spin_unlock(&eoifd->source->lock);
+}
+
+static int kvm_assign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
+{
+ struct eventfd_ctx *level_irqfd = NULL, *eventfd = NULL;
+ struct _eoifd *eoifd = NULL;
+ struct _irq_source *source = NULL;
+ unsigned gsi;
+ int ret;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd)) {
+ ret = PTR_ERR(eventfd);
+ goto fail;
+ }
+
+ eoifd = kzalloc(sizeof(*eoifd), GFP_KERNEL);
+ if (!eoifd) {
+ ret = -ENOMEM;
+ goto fail;
+ }
+
+ if (args->flags & KVM_EOIFD_FLAG_LEVEL_IRQFD) {
+ struct _irqfd *irqfd = _irqfd_fdget_lock(kvm, args->irqfd);
+ if (IS_ERR(irqfd)) {
+ ret = PTR_ERR(irqfd);
+ goto fail;
+ }
+
+ gsi = irqfd->gsi;
+ level_irqfd = eventfd_ctx_get(irqfd->eventfd);
+ source = _irq_source_get(irqfd->source);
+ _irqfd_put_unlock(irqfd);
+ if (!source) {
+ ret = -EINVAL;
+ goto fail;
+ }
+ } else {
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ INIT_LIST_HEAD(&eoifd->list);
+ eoifd->kvm = kvm;
+ eoifd->eventfd = eventfd;
+ eoifd->source = source;
+ eoifd->level_irqfd = level_irqfd;
+ eoifd->notifier.gsi = gsi;
+ eoifd->notifier.irq_acked = eoifd_event;
+
+ mutex_lock(&kvm->eoifds.lock);
+
+ list_add_tail(&eoifd->list, &kvm->eoifds.items);
+ kvm_register_irq_ack_notifier(kvm, &eoifd->notifier);
+
+ mutex_unlock(&kvm->eoifds.lock);
+
+ return 0;
+
+fail:
+ if (eventfd && !IS_ERR(eventfd))
+ eventfd_ctx_put(eventfd);
+ kfree(eoifd);
+ if (level_irqfd)
+ eventfd_ctx_put(level_irqfd);
+ _irq_source_put(source);
+ return ret;
+}
+
+static void eoifd_destroy(struct kvm *kvm, struct _eoifd *eoifd)
+{
+ list_del(&eoifd->list);
+ kvm_unregister_irq_ack_notifier(kvm, &eoifd->notifier);
+ _irq_source_put(eoifd->source);
+ eventfd_ctx_put(eoifd->eventfd);
+ eventfd_ctx_put(eoifd->level_irqfd);
+ kfree(eoifd);
+}
+
+void kvm_eoifd_release(struct kvm *kvm)
+{
+ struct _eoifd *tmp, *eoifd;
+
+ mutex_lock(&kvm->eoifds.lock);
+
+ list_for_each_entry_safe(eoifd, tmp, &kvm->eoifds.items, list)
+ eoifd_destroy(kvm, eoifd);
+
+ mutex_unlock(&kvm->eoifds.lock);
+}
+
+static int kvm_deassign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
+{
+ struct eventfd_ctx *eventfd = NULL, *level_irqfd = NULL;
+ struct _eoifd *eoifd;
+ int ret = -ENOENT;
+
+ eventfd = eventfd_ctx_fdget(args->fd);
+ if (IS_ERR(eventfd)) {
+ ret = PTR_ERR(eventfd);
+ goto fail;
+ }
+
+ if (args->flags & KVM_EOIFD_FLAG_LEVEL_IRQFD) {
+ level_irqfd = eventfd_ctx_fdget(args->irqfd);
+ if (IS_ERR(level_irqfd)) {
+ ret = PTR_ERR(level_irqfd);
+ goto fail;
+ }
+ } else {
+ ret = -EINVAL;
+ goto fail;
+ }
+
+ mutex_lock(&kvm->eoifds.lock);
+
+ list_for_each_entry(eoifd, &kvm->eoifds.items, list) {
+ if (eoifd->eventfd == eventfd &&
+ eoifd->level_irqfd == level_irqfd) {
+ eoifd_destroy(kvm, eoifd);
+ ret = 0;
+ break;
+ }
+ }
+
+ mutex_unlock(&kvm->eoifds.lock);
+
+fail:
+ if (eventfd && !IS_ERR(eventfd))
+ eventfd_ctx_put(eventfd);
+ if (level_irqfd && !IS_ERR(level_irqfd))
+ eventfd_ctx_put(level_irqfd);
+
+ return ret;
+}
+
+int kvm_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
+{
+ if (args->flags & ~(KVM_EOIFD_FLAG_DEASSIGN |
+ KVM_EOIFD_FLAG_LEVEL_IRQFD))
+ return -EINVAL;
+
+ if (args->flags & KVM_EOIFD_FLAG_DEASSIGN)
+ return kvm_deassign_eoifd(kvm, args);
+
+ return kvm_assign_eoifd(kvm, args);
+}
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index b4ad14cc..5b41df1 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -620,6 +620,8 @@ static int kvm_vm_release(struct inode *inode, struct file *filp)
kvm_irqfd_release(kvm);
+ kvm_eoifd_release(kvm);
+
kvm_put_kvm(kvm);
return 0;
}
@@ -2093,6 +2095,15 @@ static long kvm_vm_ioctl(struct file *filp,
break;
}
#endif
+ case KVM_EOIFD: {
+ struct kvm_eoifd data;
+
+ r = -EFAULT;
+ if (copy_from_user(&data, argp, sizeof data))
+ goto out;
+ r = kvm_eoifd(kvm, &data);
+ break;
+ }
default:
r = kvm_arch_vm_ioctl(filp, ioctl, arg);
if (r == -ENOTTY)
^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v4 3/3] kvm: Add a GSI specification for KVM_EOIFD
2012-07-13 19:40 [PATCH v4 0/3] kvm: level irqfd and new eoifd Alex Williamson
2012-07-13 19:40 ` [PATCH v4 1/3] kvm: Extend irqfd to support level interrupts Alex Williamson
2012-07-13 19:41 ` [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs Alex Williamson
@ 2012-07-13 19:41 ` Alex Williamson
2012-07-15 16:45 ` [PATCH v4 0/3] kvm: level irqfd and new eoifd Michael S. Tsirkin
3 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2012-07-13 19:41 UTC (permalink / raw)
To: avi, mst; +Cc: gleb, kvm, linux-kernel, jan.kiszka
NOT FOR COMMIT - Here's what it would look like to add a GSI
flavor for KVM_EOIFD later. I expect we're going to need this
to support generic EOI notifiers in qemu when irqchip is enabled.
No API doc included here yet.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
include/linux/kvm.h | 8 +++++++-
virt/kvm/eventfd.c | 17 +++++++++++++++--
2 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 5ca887d..18d87fc 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -621,6 +621,7 @@ struct kvm_ppc_smmu_info {
#define KVM_CAP_IRQFD_LEVEL 81
#define KVM_CAP_EOIFD 82
#define KVM_CAP_EOIFD_LEVEL_IRQFD 83
+#define KVM_CAP_EOIFD_GSI 84
#ifdef KVM_CAP_IRQ_ROUTING
@@ -699,11 +700,16 @@ struct kvm_irqfd {
#define KVM_EOIFD_FLAG_DEASSIGN (1 << 0)
/* Available with KVM_CAP_EOIFD_LEVEL_IRQFD */
#define KVM_EOIFD_FLAG_LEVEL_IRQFD (1 << 1)
+/* Available with KVM_CAP_EOIFD_GSI */
+#define KVM_EOIFD_FLAG_GSI (1 << 2)
struct kvm_eoifd {
__u32 fd;
__u32 flags;
- __u32 irqfd;
+ union {
+ __u32 irqfd;
+ __u32 gsi;
+ };
__u8 pad[20];
};
diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 2fae198..b6d9c1c 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -826,6 +826,11 @@ static void eoifd_event(struct kvm_irq_ack_notifier *notifier)
eoifd = container_of(notifier, struct _eoifd, notifier);
+ if (!eoifd->level_irqfd) {
+ eventfd_signal(eoifd->eventfd, 1);
+ return;
+ }
+
/*
* Ack notifier is per GSI, which may be shared with others.
* Only de-assert and send EOI if our source ID is asserted.
@@ -876,6 +881,8 @@ static int kvm_assign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
ret = -EINVAL;
goto fail;
}
+ } else if (args->flags & KVM_EOIFD_FLAG_GSI) {
+ gsi = args->gsi;
} else {
ret = -EINVAL;
goto fail;
@@ -934,6 +941,7 @@ static int kvm_deassign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
{
struct eventfd_ctx *eventfd = NULL, *level_irqfd = NULL;
struct _eoifd *eoifd;
+ unsigned gsi = ~0;
int ret = -ENOENT;
eventfd = eventfd_ctx_fdget(args->fd);
@@ -948,6 +956,8 @@ static int kvm_deassign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
ret = PTR_ERR(level_irqfd);
goto fail;
}
+ } else if (args->flags & KVM_EOIFD_FLAG_GSI) {
+ gsi = args->gsi;
} else {
ret = -EINVAL;
goto fail;
@@ -957,7 +967,9 @@ static int kvm_deassign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
list_for_each_entry(eoifd, &kvm->eoifds.items, list) {
if (eoifd->eventfd == eventfd &&
- eoifd->level_irqfd == level_irqfd) {
+ ((level_irqfd && eoifd->level_irqfd == level_irqfd) ||
+ (!level_irqfd && !eoifd->level_irqfd &&
+ eoifd->notifier.gsi == gsi))) {
eoifd_destroy(kvm, eoifd);
ret = 0;
break;
@@ -978,7 +990,8 @@ fail:
int kvm_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
{
if (args->flags & ~(KVM_EOIFD_FLAG_DEASSIGN |
- KVM_EOIFD_FLAG_LEVEL_IRQFD))
+ KVM_EOIFD_FLAG_LEVEL_IRQFD |
+ KVM_EOIFD_FLAG_GSI))
return -EINVAL;
if (args->flags & KVM_EOIFD_FLAG_DEASSIGN)
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs
2012-07-13 19:41 ` [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs Alex Williamson
@ 2012-07-15 16:23 ` Michael S. Tsirkin
2012-07-15 16:32 ` Michael S. Tsirkin
1 sibling, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2012-07-15 16:23 UTC (permalink / raw)
To: Alex Williamson; +Cc: avi, gleb, kvm, linux-kernel, jan.kiszka
On Fri, Jul 13, 2012 at 01:41:05PM -0600, Alex Williamson wrote:
> This new ioctl enables an eventfd to be triggered when an EOI is
> written for a specified irqchip pin. The first user of this will
> be external device assignment through VFIO, using a level irqfd
> for asserting a PCI INTx interrupt and this interface for de-assert
> and notification once the interrupt is serviced.
>
> Here we make use of the reference counting of the _irq_source
> object allowing us to share it with an irqfd and cleanup regardless
> of the release order.
>
> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
> ---
>
> Documentation/virtual/kvm/api.txt | 21 +++
> arch/x86/kvm/x86.c | 2
> include/linux/kvm.h | 15 ++
> include/linux/kvm_host.h | 13 ++
> virt/kvm/eventfd.c | 226 +++++++++++++++++++++++++++++++++++++
> virt/kvm/kvm_main.c | 11 ++
> 6 files changed, 286 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
> index c7267d5..d5be635 100644
> --- a/Documentation/virtual/kvm/api.txt
> +++ b/Documentation/virtual/kvm/api.txt
> @@ -1988,6 +1988,27 @@ to independently assert level interrupts. The KVM_IRQFD_FLAG_LEVEL
> is only necessary on setup, teardown is identical to that above.
> KVM_IRQFD_FLAG_LEVEL support is indicated by KVM_CAP_IRQFD_LEVEL.
>
> +4.77 KVM_EOIFD
> +
> +Capability: KVM_CAP_EOIFD
> +Architectures: x86
> +Type: vm ioctl
> +Parameters: struct kvm_eoifd (in)
> +Returns: 0 on success, -1 on error
> +
> +KVM_EOIFD allows userspace to receive interrupt EOI notification
> +through an eventfd. kvm_eoifd.fd specifies the eventfd used for
> +notification. KVM_EOIFD_FLAG_DEASSIGN is used to de-assign an eoifd
> +once assigned. KVM_EOIFD also requires additional bits set in
> +kvm_eoifd.flags to bind to the proper interrupt line. The
> +KVM_EOIFD_FLAG_LEVEL_IRQFD indicates that kvm_eoifd.irqfd is provided
> +and is an irqfd for a level triggered interrupt (configured from
> +KVM_IRQFD using KVM_IRQFD_FLAG_LEVEL). The EOI notification is bound
> +to the same GSI and irqchip input as the irqfd. Both kvm_eoifd.irqfd
> +and KVM_EOIFD_FLAG_LEVEL_IRQFD must be specified both on assignment
> +and de-assignment of KVM_EOIFD. KVM_CAP_EOIFD_LEVEL_IRQFD indicates
> +support of KVM_EOIFD_FLAG_LEVEL_IRQFD.
> +
> 5. The kvm_run structure
> ------------------------
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 80bed07..cc47e31 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2149,6 +2149,8 @@ int kvm_dev_ioctl_check_extension(long ext)
> case KVM_CAP_PCI_2_3:
> case KVM_CAP_KVMCLOCK_CTRL:
> case KVM_CAP_IRQFD_LEVEL:
> + case KVM_CAP_EOIFD:
> + case KVM_CAP_EOIFD_LEVEL_IRQFD:
> r = 1;
> break;
> case KVM_CAP_COALESCED_MMIO:
> diff --git a/include/linux/kvm.h b/include/linux/kvm.h
> index b2e6e4f..5ca887d 100644
> --- a/include/linux/kvm.h
> +++ b/include/linux/kvm.h
> @@ -619,6 +619,8 @@ struct kvm_ppc_smmu_info {
> #define KVM_CAP_S390_COW 79
> #define KVM_CAP_PPC_ALLOC_HTAB 80
> #define KVM_CAP_IRQFD_LEVEL 81
> +#define KVM_CAP_EOIFD 82
> +#define KVM_CAP_EOIFD_LEVEL_IRQFD 83
>
> #ifdef KVM_CAP_IRQ_ROUTING
>
> @@ -694,6 +696,17 @@ struct kvm_irqfd {
> __u8 pad[20];
> };
>
> +#define KVM_EOIFD_FLAG_DEASSIGN (1 << 0)
> +/* Available with KVM_CAP_EOIFD_LEVEL_IRQFD */
> +#define KVM_EOIFD_FLAG_LEVEL_IRQFD (1 << 1)
> +
> +struct kvm_eoifd {
> + __u32 fd;
> + __u32 flags;
> + __u32 irqfd;
> + __u8 pad[20];
> +};
> +
> struct kvm_clock_data {
> __u64 clock;
> __u32 flags;
> @@ -834,6 +847,8 @@ struct kvm_s390_ucas_mapping {
> #define KVM_PPC_GET_SMMU_INFO _IOR(KVMIO, 0xa6, struct kvm_ppc_smmu_info)
> /* Available with KVM_CAP_PPC_ALLOC_HTAB */
> #define KVM_PPC_ALLOCATE_HTAB _IOWR(KVMIO, 0xa7, __u32)
> +/* Available with KVM_CAP_EOIFD */
> +#define KVM_EOIFD _IOW(KVMIO, 0xa8, struct kvm_eoifd)
>
> /*
> * ioctls for vcpu fds
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index ae3b426..a7661c0 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -285,6 +285,10 @@ struct kvm {
> struct list_head items;
> } irqfds;
> struct list_head ioeventfds;
> + struct {
> + struct mutex lock;
> + struct list_head items;
> + } eoifds;
> #endif
> struct kvm_vm_stat stat;
> struct kvm_arch arch;
> @@ -828,6 +832,8 @@ int kvm_irqfd(struct kvm *kvm, struct kvm_irqfd *args);
> void kvm_irqfd_release(struct kvm *kvm);
> void kvm_irq_routing_update(struct kvm *, struct kvm_irq_routing_table *);
> int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args);
> +int kvm_eoifd(struct kvm *kvm, struct kvm_eoifd *args);
> +void kvm_eoifd_release(struct kvm *kvm);
>
> #else
>
> @@ -853,6 +859,13 @@ static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> return -ENOSYS;
> }
>
> +static inline int kvm_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
> +{
> + return -ENOSYS;
> +}
> +
> +static inline void kvm_eoifd_release(struct kvm *kvm) {}
> +
> #endif /* CONFIG_HAVE_KVM_EVENTFD */
>
> #ifdef CONFIG_KVM_APIC_ARCHITECTURE
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index ecdbfea..2fae198 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -65,8 +65,7 @@ static void _irq_source_put(struct _irq_source *source)
> kref_put(&source->kref, _irq_source_release);
> }
>
> -static struct _irq_source *__attribute__ ((used)) /* white lie for now */
> -_irq_source_get(struct _irq_source *source)
> +static struct _irq_source *_irq_source_get(struct _irq_source *source)
> {
> if (source)
> kref_get(&source->kref);
> @@ -123,6 +122,39 @@ struct _irqfd {
> struct work_struct shutdown;
> };
>
> +static struct _irqfd *_irqfd_fdget_lock(struct kvm *kvm, int fd)
> +{
> + struct eventfd_ctx *eventfd;
> + struct _irqfd *tmp, *irqfd = NULL;
> +
> + eventfd = eventfd_ctx_fdget(fd);
> + if (IS_ERR(eventfd))
> + return (struct _irqfd *)eventfd;
> +
> + spin_lock_irq(&kvm->irqfds.lock);
> +
> + list_for_each_entry(tmp, &kvm->irqfds.items, list) {
> + if (tmp->eventfd == eventfd) {
> + irqfd = tmp;
> + break;
> + }
> + }
> +
> + if (!irqfd) {
> + spin_unlock_irq(&kvm->irqfds.lock);
> + eventfd_ctx_put(eventfd);
> + return ERR_PTR(-ENODEV);
> + }
> +
> + return irqfd;
> +}
> +
> +static void _irqfd_put_unlock(struct _irqfd *irqfd)
> +{
> + eventfd_ctx_put(irqfd->eventfd);
> + spin_unlock_irq(&irqfd->kvm->irqfds.lock);
> +}
> +
> static struct workqueue_struct *irqfd_cleanup_wq;
>
> static void
> @@ -398,6 +430,8 @@ kvm_eventfd_init(struct kvm *kvm)
> spin_lock_init(&kvm->irqfds.lock);
> INIT_LIST_HEAD(&kvm->irqfds.items);
> INIT_LIST_HEAD(&kvm->ioeventfds);
> + mutex_init(&kvm->eoifds.lock);
> + INIT_LIST_HEAD(&kvm->eoifds.items);
> }
>
> /*
> @@ -764,3 +798,191 @@ kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>
> return kvm_assign_ioeventfd(kvm, args);
> }
> +
> +/*
> + * --------------------------------------------------------------------
> + * eoifd: Translate KVM APIC/IOAPIC EOI into eventfd signal.
> + *
> + * userspace can register with an eventfd for receiving
> + * notification when an EOI occurs.
> + * --------------------------------------------------------------------
> + */
> +
> +struct _eoifd {
> + /* eventfd triggered on EOI */
> + struct eventfd_ctx *eventfd;
> + /* irq source ID de-asserted on EOI */
> + struct _irq_source *source;
> + struct kvm *kvm;
> + struct kvm_irq_ack_notifier notifier;
> + /* reference to irqfd eventfd for de-assign matching */
> + struct eventfd_ctx *level_irqfd;
> + struct list_head list;
> +};
> +
> +static void eoifd_event(struct kvm_irq_ack_notifier *notifier)
> +{
> + struct _eoifd *eoifd;
> +
> + eoifd = container_of(notifier, struct _eoifd, notifier);
> +
> + /*
> + * Ack notifier is per GSI, which may be shared with others.
> + * Only de-assert and send EOI if our source ID is asserted.
> + * User needs to re-assert if device still requires service.
> + */
> + spin_lock(&eoifd->source->lock);
> + if (eoifd->source->level_asserted) {
> + kvm_set_irq(eoifd->kvm,
> + eoifd->source->id, eoifd->notifier.gsi, 0);
How about we add "clear" pic callback, in addition to set, and implement
kvm_set_irq with kvm_clear_irq which returns current status?
This would avoid the need for level_asserted and for locks, won't it?
--
MST
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs
2012-07-13 19:41 ` [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs Alex Williamson
2012-07-15 16:23 ` Michael S. Tsirkin
@ 2012-07-15 16:32 ` Michael S. Tsirkin
2012-07-16 15:01 ` Alex Williamson
1 sibling, 1 reply; 8+ messages in thread
From: Michael S. Tsirkin @ 2012-07-15 16:32 UTC (permalink / raw)
To: Alex Williamson; +Cc: avi, gleb, kvm, linux-kernel, jan.kiszka
On Fri, Jul 13, 2012 at 01:41:05PM -0600, Alex Williamson wrote:
> +static int kvm_assign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
> +{
> + struct eventfd_ctx *level_irqfd = NULL, *eventfd = NULL;
> + struct _eoifd *eoifd = NULL;
> + struct _irq_source *source = NULL;
> + unsigned gsi;
> + int ret;
> +
> + eventfd = eventfd_ctx_fdget(args->fd);
> + if (IS_ERR(eventfd)) {
> + ret = PTR_ERR(eventfd);
> + goto fail;
> + }
> +
> + eoifd = kzalloc(sizeof(*eoifd), GFP_KERNEL);
> + if (!eoifd) {
> + ret = -ENOMEM;
> + goto fail;
> + }
> +
> + if (args->flags & KVM_EOIFD_FLAG_LEVEL_IRQFD) {
> + struct _irqfd *irqfd = _irqfd_fdget_lock(kvm, args->irqfd);
> + if (IS_ERR(irqfd)) {
> + ret = PTR_ERR(irqfd);
> + goto fail;
> + }
> +
> + gsi = irqfd->gsi;
> + level_irqfd = eventfd_ctx_get(irqfd->eventfd);
> + source = _irq_source_get(irqfd->source);
> + _irqfd_put_unlock(irqfd);
> + if (!source) {
> + ret = -EINVAL;
> + goto fail;
> + }
> + } else {
> + ret = -EINVAL;
> + goto fail;
> + }
> +
> + INIT_LIST_HEAD(&eoifd->list);
> + eoifd->kvm = kvm;
> + eoifd->eventfd = eventfd;
> + eoifd->source = source;
> + eoifd->level_irqfd = level_irqfd;
> + eoifd->notifier.gsi = gsi;
> + eoifd->notifier.irq_acked = eoifd_event;
> +
> + mutex_lock(&kvm->eoifds.lock);
> +
> + list_add_tail(&eoifd->list, &kvm->eoifds.items);
Do we want to disallow multiple eventfds mapping the same irqfd?
No strong opinions but preventing this might make it possible to cache
the callback in the irqfd in the future.
This will also help limit the number of eoifd-s to 1024 GSIs * 32 source ids.
As it is userspace can apparently consume unlimited kernel memory.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v4 0/3] kvm: level irqfd and new eoifd
2012-07-13 19:40 [PATCH v4 0/3] kvm: level irqfd and new eoifd Alex Williamson
` (2 preceding siblings ...)
2012-07-13 19:41 ` [PATCH v4 3/3] kvm: Add a GSI specification for KVM_EOIFD Alex Williamson
@ 2012-07-15 16:45 ` Michael S. Tsirkin
3 siblings, 0 replies; 8+ messages in thread
From: Michael S. Tsirkin @ 2012-07-15 16:45 UTC (permalink / raw)
To: Alex Williamson; +Cc: avi, gleb, kvm, linux-kernel, jan.kiszka
On Fri, Jul 13, 2012 at 01:40:48PM -0600, Alex Williamson wrote:
> v4:
> - KVM_IRQFD_FLAG_LEVEL flag now documented and coded to only be
> necessary on assign.
> - Lock added to struct _irq_source to maintain the source ID assertion
> state to avoid repeat assertions or spurious EOIs. I couldn't figure
> out a way to make this work w/o races using atomics and don't see any
> way to make use of kvm_irq_line_state.
Well not kvm_irq_line_state specifically but what's the problem
with looking at irq_state in the correct pic? That is already
done atomically. We'll need new APIs for this of course.
See specific patch for suggestion.
> - GSI support in KVM_EOIFD is dropped since it's not immediately used
> and I don't seem to be able to explain why it's useful. It makes it
> optional though, which may be a good thing if other archs try to use
> this interface.
> - eoifd switch to mutex from spinlock (all tested as lockdep safe)
> - _irqfd_fdget_lock/_irqfd_put_unlock replaces _irqfd_fdget/_irqfd_put
> so we hold the lock for the irqfds ensuring the one we're binding to
> doesn't go away while we get a reference to the _irq_source.
> - We also pull the GSI off the irqfd rather than specifying it for eoifd.
> - eoifd_deactivate renamed to eoifd_destroy to make it blindingly
> obvious what it does.
Overall looks ok to me.
Looks like as written userspace can create any # of eoifds
with same irqfd leading to unlimited kernel memory use.
See specific patch for suggestion on how to fix.
> Patch 3/3 is not for commit and not meant to be ready for commit, it's
> just an FYI on what adding back support for a GSI based, non-de-asserting
> eoifd will look like. Thanks,
>
> Alex
>
> ---
>
> Alex Williamson (3):
> kvm: Add a GSI specification for KVM_EOIFD
> kvm: KVM_EOIFD, an eventfd for EOIs
> kvm: Extend irqfd to support level interrupts
>
>
> Documentation/virtual/kvm/api.txt | 27 +++
> arch/x86/kvm/x86.c | 3
> include/linux/kvm.h | 24 +++
> include/linux/kvm_host.h | 13 +
> virt/kvm/eventfd.c | 349 +++++++++++++++++++++++++++++++++++++
> virt/kvm/kvm_main.c | 11 +
> 6 files changed, 423 insertions(+), 4 deletions(-)
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs
2012-07-15 16:32 ` Michael S. Tsirkin
@ 2012-07-16 15:01 ` Alex Williamson
0 siblings, 0 replies; 8+ messages in thread
From: Alex Williamson @ 2012-07-16 15:01 UTC (permalink / raw)
To: Michael S. Tsirkin; +Cc: avi, gleb, kvm, linux-kernel, jan.kiszka
On Sun, 2012-07-15 at 19:32 +0300, Michael S. Tsirkin wrote:
> On Fri, Jul 13, 2012 at 01:41:05PM -0600, Alex Williamson wrote:
> > +static int kvm_assign_eoifd(struct kvm *kvm, struct kvm_eoifd *args)
> > +{
> > + struct eventfd_ctx *level_irqfd = NULL, *eventfd = NULL;
> > + struct _eoifd *eoifd = NULL;
> > + struct _irq_source *source = NULL;
> > + unsigned gsi;
> > + int ret;
> > +
> > + eventfd = eventfd_ctx_fdget(args->fd);
> > + if (IS_ERR(eventfd)) {
> > + ret = PTR_ERR(eventfd);
> > + goto fail;
> > + }
> > +
> > + eoifd = kzalloc(sizeof(*eoifd), GFP_KERNEL);
> > + if (!eoifd) {
> > + ret = -ENOMEM;
> > + goto fail;
> > + }
> > +
> > + if (args->flags & KVM_EOIFD_FLAG_LEVEL_IRQFD) {
> > + struct _irqfd *irqfd = _irqfd_fdget_lock(kvm, args->irqfd);
> > + if (IS_ERR(irqfd)) {
> > + ret = PTR_ERR(irqfd);
> > + goto fail;
> > + }
> > +
> > + gsi = irqfd->gsi;
> > + level_irqfd = eventfd_ctx_get(irqfd->eventfd);
> > + source = _irq_source_get(irqfd->source);
> > + _irqfd_put_unlock(irqfd);
> > + if (!source) {
> > + ret = -EINVAL;
> > + goto fail;
> > + }
> > + } else {
> > + ret = -EINVAL;
> > + goto fail;
> > + }
> > +
> > + INIT_LIST_HEAD(&eoifd->list);
> > + eoifd->kvm = kvm;
> > + eoifd->eventfd = eventfd;
> > + eoifd->source = source;
> > + eoifd->level_irqfd = level_irqfd;
> > + eoifd->notifier.gsi = gsi;
> > + eoifd->notifier.irq_acked = eoifd_event;
> > +
> > + mutex_lock(&kvm->eoifds.lock);
> > +
> > + list_add_tail(&eoifd->list, &kvm->eoifds.items);
>
> Do we want to disallow multiple eventfds mapping the same irqfd?
> No strong opinions but preventing this might make it possible to cache
> the callback in the irqfd in the future.
>
> This will also help limit the number of eoifd-s to 1024 GSIs * 32 source ids.
> As it is userspace can apparently consume unlimited kernel memory.
So your concern is that if you have a single irqfd and a single eventfd,
KVM_EOIFD could be called in a loop and each time time through we'd
allocate a struct _eoifd until the kernel falls over? Sounds like a
problem. You've already pushed for allowing the eventfd to be
multiplexed, so seems like our only choice is to test for irqfd re-use.
So each irqfd can only be tied to a single eoifd, but multiple irqfds
may be tied to the same eoifd. Seems like a fairly easy change.
Thanks,
Alex
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2012-07-16 15:01 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-07-13 19:40 [PATCH v4 0/3] kvm: level irqfd and new eoifd Alex Williamson
2012-07-13 19:40 ` [PATCH v4 1/3] kvm: Extend irqfd to support level interrupts Alex Williamson
2012-07-13 19:41 ` [PATCH v4 2/3] kvm: KVM_EOIFD, an eventfd for EOIs Alex Williamson
2012-07-15 16:23 ` Michael S. Tsirkin
2012-07-15 16:32 ` Michael S. Tsirkin
2012-07-16 15:01 ` Alex Williamson
2012-07-13 19:41 ` [PATCH v4 3/3] kvm: Add a GSI specification for KVM_EOIFD Alex Williamson
2012-07-15 16:45 ` [PATCH v4 0/3] kvm: level irqfd and new eoifd Michael S. Tsirkin
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).