linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/6] KVM: MMU: fast zap all shadow pages
@ 2013-03-13  4:55 Xiao Guangrong
  2013-03-13  4:55 ` [PATCH 1/6] KVM: MMU: move mmu related members into a separate struct Xiao Guangrong
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-13  4:55 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Gleb Natapov, LKML, KVM

[-- Attachment #1: Type: text/plain, Size: 1467 bytes --]

* Issue
The current kvm_mmu_zap_all is really slow - it holds mmu-lock while it
walks and zaps all shadow pages one by one, and it also has to zap every
guest page's rmap and every shadow page's parent spte list. Things get
worse as the guest uses more memory or vcpus, so it does not scale well.

* Idea
Since all shadow pages will be zapped, we can directly reset the mmu-cache
and the rmaps so that vcpus fault on the new mmu-cache; after that, we can
free the memory used by the old mmu-cache directly.

The root shadow pages are a little special since they are currently in use
by vcpus and cannot be freed directly. So we zap the root shadow pages and
re-add them to the new mmu-cache.
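
For illustration, here is a rough sketch of the flow, condensed from patch 6
below (the restart handling, used-page accounting, TLB flushing and the
pte_list_desc freeing are omitted; the function name is only for this sketch):

/* Condensed, illustrative sketch only - see patch 6 for the real code. */
void kvm_mmu_zap_all_sketch(struct kvm *kvm)
{
	LIST_HEAD(root_mmu_pages);
	LIST_HEAD(invalid_list);
	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
	struct kvm_mmu_page *sp, *node;

	spin_lock(&kvm->mmu_lock);

	/* Zap the root pages (still used by vcpus) and keep them aside. */
	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
		if (sp->root_count) {
			kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
			list_move(&sp->link, &root_mmu_pages);
		}

	/* Detach every remaining shadow page in one splice. */
	list_splice(&cache->active_mmu_pages, &invalid_list);

	/* Reset the mmu cache and re-add the roots; vcpus will refault. */
	memset(cache, 0, sizeof(*cache));
	kvm_mmu_init(kvm);
	list_replace(&root_mmu_pages, &cache->active_mmu_pages);

	/* Reset gfn rmaps and lpage info. */
	kvm_clear_all_gfn_page_info(kvm);

	spin_unlock(&kvm->mmu_lock);

	/* Free the detached shadow pages outside of mmu-lock. */
	list_for_each_entry_safe(sp, node, &invalid_list, link)
		kvm_mmu_free_page(sp);
}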

* TODO
To avoid Marcelo beating me to it :), here are some follow-up works not
included here, to keep the patchset smaller:
(1): batch kvm_reload_remote_mmus for zapping root shadow pages
(2): free shadow pages by using generation-number
(3): remove unneeded kvm_reload_remote_mmus after kvm_mmu_zap_all
(4): drop unnecessary @npages from kvm_arch_create_memslot
(5): rename init_kvm_mmu to init_vcpu_mmu

* Performance
The attached testcase measures the time to delete / add a memslot. During
the test all vcpus are idle, i.e. there is no mmu-lock contention. I expect
the result to look even better when other vcpus and the mmu notifiers also
have to take the mmu-lock.

Guest VCPU:6, Mem:2048M

before: Run 10 times, Avg time:46078825 ns.

after: Run 10 times, Avg time:21558774 ns. (~2.1x, i.e. + 113%)

[-- Attachment #2: migrate-perf.tar.bz2 --]
[-- Type: application/x-bzip, Size: 107820 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/6] KVM: MMU: move mmu related members into a separate struct
  2013-03-13  4:55 [PATCH 0/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
@ 2013-03-13  4:55 ` Xiao Guangrong
  2013-03-13  4:56 ` [PATCH 2/6] KVM: MMU: introduce mmu_cache->pte_list_descs Xiao Guangrong
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-13  4:55 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, Gleb Natapov, LKML, KVM

Move all mmu-related members from kvm_arch into a separate struct named
kvm_mmu_cache, so that we can easily reset the mmu cache when we zap all
shadow pages.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/include/asm/kvm_host.h |    6 +++++-
 arch/x86/kvm/mmu.c              |   36 ++++++++++++++++++++----------------
 arch/x86/kvm/mmu.h              |    4 ++--
 arch/x86/kvm/mmu_audit.c        |    2 +-
 arch/x86/kvm/x86.c              |   11 ++++++-----
 5 files changed, 34 insertions(+), 25 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 635a74d..85291b08 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -525,7 +525,7 @@ struct kvm_apic_map {
 	struct kvm_lapic *logical_map[16][16];
 };

-struct kvm_arch {
+struct kvm_mmu_cache {
 	unsigned int n_used_mmu_pages;
 	unsigned int n_requested_mmu_pages;
 	unsigned int n_max_mmu_pages;
@@ -535,6 +535,10 @@ struct kvm_arch {
 	 * Hash table of struct kvm_mmu_page.
 	 */
 	struct list_head active_mmu_pages;
+};
+
+struct kvm_arch {
+	struct kvm_mmu_cache mmu_cache;
 	struct list_head assigned_dev_head;
 	struct iommu_domain *iommu_domain;
 	int iommu_flags;
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index fdacabb..c52d147 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -751,7 +751,7 @@ static void account_shadowed(struct kvm *kvm, gfn_t gfn)
 		linfo = lpage_info_slot(gfn, slot, i);
 		linfo->write_count += 1;
 	}
-	kvm->arch.indirect_shadow_pages++;
+	kvm->arch.mmu_cache.indirect_shadow_pages++;
 }

 static void unaccount_shadowed(struct kvm *kvm, gfn_t gfn)
@@ -767,7 +767,7 @@ static void unaccount_shadowed(struct kvm *kvm, gfn_t gfn)
 		linfo->write_count -= 1;
 		WARN_ON(linfo->write_count < 0);
 	}
-	kvm->arch.indirect_shadow_pages--;
+	kvm->arch.mmu_cache.indirect_shadow_pages--;
 }

 static int has_wrprotected_page(struct kvm *kvm,
@@ -1456,7 +1456,7 @@ static int is_empty_shadow_page(u64 *spt)
  */
 static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, int nr)
 {
-	kvm->arch.n_used_mmu_pages += nr;
+	kvm->arch.mmu_cache.n_used_mmu_pages += nr;
 	percpu_counter_add(&kvm_total_used_mmu_pages, nr);
 }

@@ -1507,7 +1507,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu,
 	if (!direct)
 		sp->gfns = mmu_memory_cache_alloc(&vcpu->arch.mmu_page_cache);
 	set_page_private(virt_to_page(sp->spt), (unsigned long)sp);
-	list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages);
+	list_add(&sp->link, &vcpu->kvm->arch.mmu_cache.active_mmu_pages);
 	sp->parent_ptes = 0;
 	mmu_page_add_parent_pte(vcpu, sp, parent_pte);
 	kvm_mod_used_mmu_pages(vcpu->kvm, +1);
@@ -1646,7 +1646,8 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm,

 #define for_each_gfn_sp(_kvm, _sp, _gfn)				\
 	hlist_for_each_entry(_sp,					\
-	  &(_kvm)->arch.mmu_page_hash[kvm_page_table_hashfn(_gfn)], hash_link) \
+	&(_kvm)->arch.mmu_cache.mmu_page_hash[kvm_page_table_hashfn(_gfn)],\
+	       hash_link)						\
 		if ((_sp)->gfn != (_gfn)) {} else

 #define for_each_gfn_indirect_valid_sp(_kvm, _sp, _gfn)			\
@@ -1842,6 +1843,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 					     unsigned access,
 					     u64 *parent_pte)
 {
+	struct kvm_mmu_cache *cache;
 	union kvm_mmu_page_role role;
 	unsigned quadrant;
 	struct kvm_mmu_page *sp;
@@ -1886,8 +1888,9 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu,
 		return sp;
 	sp->gfn = gfn;
 	sp->role = role;
+	cache = &vcpu->kvm->arch.mmu_cache;
 	hlist_add_head(&sp->hash_link,
-		&vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]);
+		       &cache->mmu_page_hash[kvm_page_table_hashfn(gfn)]);
 	if (!direct) {
 		if (rmap_write_protect(vcpu->kvm, gfn))
 			kvm_flush_remote_tlbs(vcpu->kvm);
@@ -2076,7 +2079,7 @@ static int kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 		list_move(&sp->link, invalid_list);
 		kvm_mod_used_mmu_pages(kvm, -1);
 	} else {
-		list_move(&sp->link, &kvm->arch.active_mmu_pages);
+		list_move(&sp->link, &kvm->arch.mmu_cache.active_mmu_pages);
 		kvm_reload_remote_mmus(kvm);
 	}

@@ -2115,10 +2118,10 @@ static bool prepare_zap_oldest_mmu_page(struct kvm *kvm,
 {
 	struct kvm_mmu_page *sp;

-	if (list_empty(&kvm->arch.active_mmu_pages))
+	if (list_empty(&kvm->arch.mmu_cache.active_mmu_pages))
 		return false;

-	sp = list_entry(kvm->arch.active_mmu_pages.prev,
+	sp = list_entry(kvm->arch.mmu_cache.active_mmu_pages.prev,
 			struct kvm_mmu_page, link);
 	kvm_mmu_prepare_zap_page(kvm, sp, invalid_list);

@@ -2135,17 +2138,17 @@ void kvm_mmu_change_mmu_pages(struct kvm *kvm, unsigned int goal_nr_mmu_pages)

 	spin_lock(&kvm->mmu_lock);

-	if (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages) {
+	if (kvm->arch.mmu_cache.n_used_mmu_pages > goal_nr_mmu_pages) {
 		/* Need to free some mmu pages to achieve the goal. */
-		while (kvm->arch.n_used_mmu_pages > goal_nr_mmu_pages)
+		while (kvm->arch.mmu_cache.n_used_mmu_pages > goal_nr_mmu_pages)
 			if (!prepare_zap_oldest_mmu_page(kvm, &invalid_list))
 				break;

 		kvm_mmu_commit_zap_page(kvm, &invalid_list);
-		goal_nr_mmu_pages = kvm->arch.n_used_mmu_pages;
+		goal_nr_mmu_pages = kvm->arch.mmu_cache.n_used_mmu_pages;
 	}

-	kvm->arch.n_max_mmu_pages = goal_nr_mmu_pages;
+	kvm->arch.mmu_cache.n_max_mmu_pages = goal_nr_mmu_pages;

 	spin_unlock(&kvm->mmu_lock);
 }
@@ -3941,7 +3944,7 @@ void kvm_mmu_pte_write(struct kvm_vcpu *vcpu, gpa_t gpa,
 	 * If we don't have indirect shadow pages, it means no page is
 	 * write-protected, so we can exit simply.
 	 */
-	if (!ACCESS_ONCE(vcpu->kvm->arch.indirect_shadow_pages))
+	if (!ACCESS_ONCE(vcpu->kvm->arch.mmu_cache.indirect_shadow_pages))
 		return;

 	zap_page = remote_flush = local_flush = false;
@@ -4178,7 +4181,8 @@ void kvm_mmu_zap_all(struct kvm *kvm)

 	spin_lock(&kvm->mmu_lock);
 restart:
-	list_for_each_entry_safe(sp, node, &kvm->arch.active_mmu_pages, link)
+	list_for_each_entry_safe(sp, node,
+	      &kvm->arch.mmu_cache.active_mmu_pages, link)
 		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
 			goto restart;

@@ -4214,7 +4218,7 @@ static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
 		 * want to shrink a VM that only started to populate its MMU
 		 * anyway.
 		 */
-		if (!kvm->arch.n_used_mmu_pages)
+		if (!kvm->arch.mmu_cache.n_used_mmu_pages)
 			continue;

 		idx = srcu_read_lock(&kvm->srcu);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 6987108..2e61c24 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -57,8 +57,8 @@ int kvm_init_shadow_mmu(struct kvm_vcpu *vcpu, struct kvm_mmu *context);

 static inline unsigned int kvm_mmu_available_pages(struct kvm *kvm)
 {
-	return kvm->arch.n_max_mmu_pages -
-		kvm->arch.n_used_mmu_pages;
+	return kvm->arch.mmu_cache.n_max_mmu_pages -
+		kvm->arch.mmu_cache.n_used_mmu_pages;
 }

 static inline void kvm_mmu_free_some_pages(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/mmu_audit.c b/arch/x86/kvm/mmu_audit.c
index daff69e..a2712c1 100644
--- a/arch/x86/kvm/mmu_audit.c
+++ b/arch/x86/kvm/mmu_audit.c
@@ -89,7 +89,7 @@ static void walk_all_active_sps(struct kvm *kvm, sp_handler fn)
 {
 	struct kvm_mmu_page *sp;

-	list_for_each_entry(sp, &kvm->arch.active_mmu_pages, link)
+	list_for_each_entry(sp, &kvm->arch.mmu_cache.active_mmu_pages, link)
 		fn(kvm, sp);
 }

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 35b4912..9cb899c 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3274,7 +3274,7 @@ static int kvm_vm_ioctl_set_nr_mmu_pages(struct kvm *kvm,
 	mutex_lock(&kvm->slots_lock);

 	kvm_mmu_change_mmu_pages(kvm, kvm_nr_mmu_pages);
-	kvm->arch.n_requested_mmu_pages = kvm_nr_mmu_pages;
+	kvm->arch.mmu_cache.n_requested_mmu_pages = kvm_nr_mmu_pages;

 	mutex_unlock(&kvm->slots_lock);
 	return 0;
@@ -3282,7 +3282,7 @@ static int kvm_vm_ioctl_set_nr_mmu_pages(struct kvm *kvm,

 static int kvm_vm_ioctl_get_nr_mmu_pages(struct kvm *kvm)
 {
-	return kvm->arch.n_max_mmu_pages;
+	return kvm->arch.mmu_cache.n_max_mmu_pages;
 }

 static int kvm_vm_ioctl_get_irqchip(struct kvm *kvm, struct kvm_irqchip *chip)
@@ -4795,10 +4795,11 @@ static bool reexecute_instruction(struct kvm_vcpu *vcpu, gva_t cr2,

 	/* The instructions are well-emulated on direct mmu. */
 	if (vcpu->arch.mmu.direct_map) {
+		struct kvm_mmu_cache *cache = &vcpu->kvm->arch.mmu_cache;
 		unsigned int indirect_shadow_pages;

 		spin_lock(&vcpu->kvm->mmu_lock);
-		indirect_shadow_pages = vcpu->kvm->arch.indirect_shadow_pages;
+		indirect_shadow_pages = cache->indirect_shadow_pages;
 		spin_unlock(&vcpu->kvm->mmu_lock);

 		if (indirect_shadow_pages)
@@ -6756,7 +6757,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (type)
 		return -EINVAL;

-	INIT_LIST_HEAD(&kvm->arch.active_mmu_pages);
+	INIT_LIST_HEAD(&kvm->arch.mmu_cache.active_mmu_pages);
 	INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);

 	/* Reserve bit 0 of irq_sources_bitmap for userspace irq source */
@@ -6952,7 +6953,7 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 			       "failed to munmap memory\n");
 	}

-	if (!kvm->arch.n_requested_mmu_pages)
+	if (!kvm->arch.mmu_cache.n_requested_mmu_pages)
 		nr_mmu_pages = kvm_mmu_calculate_mmu_pages(kvm);

 	if (nr_mmu_pages)
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/6] KVM: MMU: introduce mmu_cache->pte_list_descs
  2013-03-13  4:55 [PATCH 0/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
  2013-03-13  4:55 ` [PATCH 1/6] KVM: MMU: move mmu related members into a separate struct Xiao Guangrong
@ 2013-03-13  4:56 ` Xiao Guangrong
  2013-03-13  4:57 ` [PATCH 3/6] KVM: x86: introduce memslot_set_lpage_disallowed Xiao Guangrong
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-13  4:56 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, Gleb Natapov, LKML, KVM

This list links all the pte_list_descs used by the mmu cache, so that we
can easily free the memory used by the gfn rmaps and the parent spte lists.

[ The new function name kvm_mmu_init is very similar to init_kvm_mmu, which
  actually initializes the vcpu mmu; init_kvm_mmu will be renamed to
  init_vcpu_mmu. ]

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/mmu.c              |   14 +++++++++++++-
 arch/x86/kvm/mmu.h              |    1 +
 arch/x86/kvm/x86.c              |    2 +-
 4 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 85291b08..04d8897 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -535,6 +535,7 @@ struct kvm_mmu_cache {
 	 * Hash table of struct kvm_mmu_page.
 	 */
 	struct list_head active_mmu_pages;
+	struct list_head pte_list_descs;
 };

 struct kvm_arch {
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c52d147..4152766 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -156,6 +156,7 @@ module_param(dbg, bool, 0644);
 struct pte_list_desc {
 	u64 *sptes[PTE_LIST_EXT];
 	struct pte_list_desc *more;
+	struct list_head list;
 };

 struct kvm_shadow_walk_iterator {
@@ -701,11 +702,16 @@ static void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc)

 static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_vcpu *vcpu)
 {
-	return mmu_memory_cache_alloc(&vcpu->arch.mmu_pte_list_desc_cache);
+	struct pte_list_desc *desc;
+
+	desc = mmu_memory_cache_alloc(&vcpu->arch.mmu_pte_list_desc_cache);
+	list_add(&desc->list, &vcpu->kvm->arch.mmu_cache.pte_list_descs);
+	return desc;
 }

 static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc)
 {
+	list_del(&pte_list_desc->list);
 	kmem_cache_free(pte_list_desc_cache, pte_list_desc);
 }

@@ -4320,6 +4326,12 @@ int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4])
 }
 EXPORT_SYMBOL_GPL(kvm_mmu_get_spte_hierarchy);

+void kvm_mmu_init(struct kvm *kvm)
+{
+	INIT_LIST_HEAD(&kvm->arch.mmu_cache.active_mmu_pages);
+	INIT_LIST_HEAD(&kvm->arch.mmu_cache.pte_list_descs);
+}
+
 void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
 {
 	ASSERT(vcpu);
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 2e61c24..76adc5f 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -50,6 +50,7 @@
 #define PFERR_RSVD_MASK (1U << 3)
 #define PFERR_FETCH_MASK (1U << 4)

+void kvm_mmu_init(struct kvm *kvm);
 int kvm_mmu_get_spte_hierarchy(struct kvm_vcpu *vcpu, u64 addr, u64 sptes[4]);
 void kvm_mmu_set_mmio_spte_mask(u64 mmio_mask);
 int handle_mmio_page_fault_common(struct kvm_vcpu *vcpu, u64 addr, bool direct);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 9cb899c..7083568 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6757,7 +6757,7 @@ int kvm_arch_init_vm(struct kvm *kvm, unsigned long type)
 	if (type)
 		return -EINVAL;

-	INIT_LIST_HEAD(&kvm->arch.mmu_cache.active_mmu_pages);
+	kvm_mmu_init(kvm);
 	INIT_LIST_HEAD(&kvm->arch.assigned_dev_head);

 	/* Reserve bit 0 of irq_sources_bitmap for userspace irq source */
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/6] KVM: x86: introduce memslot_set_lpage_disallowed
  2013-03-13  4:55 [PATCH 0/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
  2013-03-13  4:55 ` [PATCH 1/6] KVM: MMU: move mmu related members into a separate struct Xiao Guangrong
  2013-03-13  4:56 ` [PATCH 2/6] KVM: MMU: introduce mmu_cache->pte_list_descs Xiao Guangrong
@ 2013-03-13  4:57 ` Xiao Guangrong
  2013-03-13  4:57 ` [PATCH 4/6] KVM: x86: introduce kvm_clear_all_gfn_page_info Xiao Guangrong
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-13  4:57 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, Gleb Natapov, LKML, KVM

It is used to mark large pages as disallowed on the specified level; it
will be used in a later patch.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/x86.c |   53 ++++++++++++++++++++++++++++++++++-----------------
 1 files changed, 35 insertions(+), 18 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7083568..2cf3722 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6847,12 +6847,45 @@ void kvm_arch_free_memslot(struct kvm_memory_slot *free,
 	}
 }

+static void
+memslot_set_lpage_disallowed(struct kvm_memory_slot *slot,
+			     unsigned long npages, int lpage_size,
+			     int lpages)
+{
+	struct kvm_lpage_info *lpage_info;
+	unsigned long ugfn;
+	int level = lpage_size + 1;
+
+	WARN_ON(!lpage_size);
+
+	lpage_info = slot->arch.lpage_info[lpage_size - 1];
+
+	if (slot->base_gfn & (KVM_PAGES_PER_HPAGE(level) - 1))
+		lpage_info[0].write_count = 1;
+
+	if ((slot->base_gfn + npages) & (KVM_PAGES_PER_HPAGE(level) - 1))
+		lpage_info[lpages - 1].write_count = 1;
+
+	ugfn = slot->userspace_addr >> PAGE_SHIFT;
+	/*
+	 * If the gfn and userspace address are not aligned wrt each
+	 * other, or if explicitly asked to, disable large page
+	 * support for this slot
+	 */
+	if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1) ||
+	      !kvm_largepages_enabled()) {
+		unsigned long j;
+
+		for (j = 0; j < lpages; ++j)
+			lpage_info[j].write_count = 1;
+	}
+}
+
 int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages)
 {
 	int i;

 	for (i = 0; i < KVM_NR_PAGE_SIZES; ++i) {
-		unsigned long ugfn;
 		int lpages;
 		int level = i + 1;

@@ -6871,23 +6904,7 @@ int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages)
 		if (!slot->arch.lpage_info[i - 1])
 			goto out_free;

-		if (slot->base_gfn & (KVM_PAGES_PER_HPAGE(level) - 1))
-			slot->arch.lpage_info[i - 1][0].write_count = 1;
-		if ((slot->base_gfn + npages) & (KVM_PAGES_PER_HPAGE(level) - 1))
-			slot->arch.lpage_info[i - 1][lpages - 1].write_count = 1;
-		ugfn = slot->userspace_addr >> PAGE_SHIFT;
-		/*
-		 * If the gfn and userspace address are not aligned wrt each
-		 * other, or if explicitly asked to, disable large page
-		 * support for this slot
-		 */
-		if ((slot->base_gfn ^ ugfn) & (KVM_PAGES_PER_HPAGE(level) - 1) ||
-		    !kvm_largepages_enabled()) {
-			unsigned long j;
-
-			for (j = 0; j < lpages; ++j)
-				slot->arch.lpage_info[i - 1][j].write_count = 1;
-		}
+		memslot_set_lpage_disallowed(slot, npages, i, lpages);
 	}

 	return 0;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/6] KVM: x86: introduce kvm_clear_all_gfn_page_info
  2013-03-13  4:55 [PATCH 0/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
                   ` (2 preceding siblings ...)
  2013-03-13  4:57 ` [PATCH 3/6] KVM: x86: introduce memslot_set_lpage_disallowed Xiao Guangrong
@ 2013-03-13  4:57 ` Xiao Guangrong
  2013-03-13  4:58 ` [PATCH 5/6] KVM: MMU: delete shadow page from hash list in kvm_mmu_prepare_zap_page Xiao Guangrong
  2013-03-13  4:59 ` [PATCH 6/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
  5 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-13  4:57 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, Gleb Natapov, LKML, KVM

This function resets the rmaps and the lpage info of all guest pages; it
will be used in a later patch.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/x86.c       |   31 +++++++++++++++++++++++++++++++
 include/linux/kvm_host.h |    1 +
 2 files changed, 32 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 2cf3722..3cb7685 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6881,6 +6881,37 @@ memslot_set_lpage_disallowed(struct kvm_memory_slot *slot,
 	}
 }

+static void clear_memslot_page_info(struct kvm_memory_slot *slot)
+{
+	int i;
+
+	for (i = 0; i < KVM_NR_PAGE_SIZES; ++i) {
+		int lpages;
+		int level = i + 1;
+
+		lpages = gfn_to_index(slot->base_gfn + slot->npages - 1,
+				      slot->base_gfn, level) + 1;
+
+		memset(slot->arch.rmap[i], 0,
+		       lpages * sizeof(*slot->arch.rmap[i]));
+
+		if (i) {
+			memset(slot->arch.lpage_info[i - 1], 0,
+			       sizeof(*slot->arch.lpage_info[i - 1]));
+			memslot_set_lpage_disallowed(slot, slot->npages, i,
+						     lpages);
+		}
+	}
+}
+
+void kvm_clear_all_gfn_page_info(struct kvm *kvm)
+{
+	struct kvm_memory_slot *slot;
+
+	kvm_for_each_memslot(slot, kvm->memslots)
+		clear_memslot_page_info(slot);
+}
+
 int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages)
 {
 	int i;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 9fa13eb..449126f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -490,6 +490,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
 			    struct kvm_userspace_memory_region *mem);
 void kvm_arch_free_memslot(struct kvm_memory_slot *free,
 			   struct kvm_memory_slot *dont);
+void kvm_clear_all_gfn_page_info(struct kvm *kvm);
 int kvm_arch_create_memslot(struct kvm_memory_slot *slot, unsigned long npages);
 int kvm_arch_prepare_memory_region(struct kvm *kvm,
 				struct kvm_memory_slot *memslot,
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/6] KVM: MMU: delete shadow page from hash list in kvm_mmu_prepare_zap_page
  2013-03-13  4:55 [PATCH 0/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
                   ` (3 preceding siblings ...)
  2013-03-13  4:57 ` [PATCH 4/6] KVM: x86: introduce kvm_clear_all_gfn_page_info Xiao Guangrong
@ 2013-03-13  4:58 ` Xiao Guangrong
  2013-03-13  4:59 ` [PATCH 6/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
  5 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-13  4:58 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, Gleb Natapov, LKML, KVM

Move the deletion of a shadow page from the hash list from
kvm_mmu_commit_zap_page to kvm_mmu_prepare_zap_page, so that we can free
the shadow page outside of mmu-lock.

Also, delete invalid shadow pages from the hash list, since such pages can
not be reused anymore. This makes resetting the mmu-cache easier - we do
not need to care about stale hash entries after the reset.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index 4152766..e326099 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -1469,7 +1469,7 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, int nr)
 static void kvm_mmu_free_page(struct kvm_mmu_page *sp)
 {
 	ASSERT(is_empty_shadow_page(sp->spt));
-	hlist_del(&sp->hash_link);
+
 	list_del(&sp->link);
 	free_page((unsigned long)sp->spt);
 	if (!sp->role.direct)
@@ -1658,7 +1658,8 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm,

 #define for_each_gfn_indirect_valid_sp(_kvm, _sp, _gfn)			\
 	for_each_gfn_sp(_kvm, _sp, _gfn)				\
-		if ((_sp)->role.direct || (_sp)->role.invalid) {} else
+		if ((_sp)->role.direct ||				\
+		      ((_sp)->role.invalid && WARN_ON(1))) {} else

 /* @sp->gfn should be write-protected at the call site */
 static int __kvm_sync_page(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
@@ -2079,6 +2080,9 @@ static int kvm_mmu_prepare_zap_page(struct kvm *kvm, struct kvm_mmu_page *sp,
 		unaccount_shadowed(kvm, sp->gfn);
 	if (sp->unsync)
 		kvm_unlink_unsync_page(kvm, sp);
+
+	hlist_del_init(&sp->hash_link);
+
 	if (!sp->root_count) {
 		/* Count self */
 		ret++;
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-13  4:55 [PATCH 0/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
                   ` (4 preceding siblings ...)
  2013-03-13  4:58 ` [PATCH 5/6] KVM: MMU: delete shadow page from hash list in kvm_mmu_prepare_zap_page Xiao Guangrong
@ 2013-03-13  4:59 ` Xiao Guangrong
  2013-03-14  1:07   ` Marcelo Tosatti
  2013-03-18 20:46   ` Marcelo Tosatti
  5 siblings, 2 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-13  4:59 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Marcelo Tosatti, Gleb Natapov, LKML, KVM

The current kvm_mmu_zap_all is really slow - it holds mmu-lock while it
walks and zaps all shadow pages one by one, and it also has to zap every
guest page's rmap and every shadow page's parent spte list. Things get
worse as the guest uses more memory or vcpus, so it does not scale well.

Since all shadow pages will be zapped, we can directly reset the mmu-cache
and the rmaps so that vcpus fault on the new mmu-cache; after that, we can
free the memory used by the old mmu-cache directly.

The root shadow pages are a little special since they are currently in use
by vcpus and cannot be freed directly. So we zap the root shadow pages and
re-add them to the new mmu-cache.

After this patch, kvm_mmu_zap_all is 113% faster than before.

Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
---
 arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
 1 files changed, 56 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index e326099..536d9ce 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)

 void kvm_mmu_zap_all(struct kvm *kvm)
 {
-	struct kvm_mmu_page *sp, *node;
+	LIST_HEAD(root_mmu_pages);
 	LIST_HEAD(invalid_list);
+	struct list_head pte_list_descs;
+	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
+	struct kvm_mmu_page *sp, *node;
+	struct pte_list_desc *desc, *ndesc;
+	int root_sp = 0;

 	spin_lock(&kvm->mmu_lock);
+
 restart:
-	list_for_each_entry_safe(sp, node,
-	      &kvm->arch.mmu_cache.active_mmu_pages, link)
-		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
-			goto restart;
+	/*
+	 * The root shadow pages are being used on vcpus that can not
+	 * directly removed, we filter them out and re-add them to the
+	 * new mmu cache.
+	 */
+	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
+		if (sp->root_count) {
+			int ret;
+
+			root_sp++;
+			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
+			list_move(&sp->link, &root_mmu_pages);
+			if (ret)
+				goto restart;
+		}
+
+	list_splice(&cache->active_mmu_pages, &invalid_list);
+	list_replace(&cache->pte_list_descs, &pte_list_descs);
+
+	/*
+	 * Reset the mmu cache so that later vcpu will fault on the new
+	 * mmu cache.
+	 */
+	memset(cache, 0, sizeof(*cache));
+	kvm_mmu_init(kvm);
+
+	/*
+	 * Now, the mmu cache has been reset, we can re-add the root shadow
+	 * pages into the cache.
+	 */
+	list_replace(&root_mmu_pages, &cache->active_mmu_pages);
+	kvm_mod_used_mmu_pages(kvm, root_sp);
+
+	/* Reset gfn's rmap and lpage info. */
+	kvm_clear_all_gfn_page_info(kvm);
+
+	/*
+	 * Flush all TLBs so that vcpu can not use the invalid mappings.
+	 * Do not disturb vcpus if root shadow pages have been zapped
+	 * since KVM_REQ_MMU_RELOAD will force TLB to be flushed.
+	 */
+	if (!root_sp && !list_empty(&invalid_list))
+		kvm_flush_remote_tlbs(kvm);

-	kvm_mmu_commit_zap_page(kvm, &invalid_list);
 	spin_unlock(&kvm->mmu_lock);
+
+	list_for_each_entry_safe(sp, node, &invalid_list, link)
+		kvm_mmu_free_page(sp);
+
+	list_for_each_entry_safe(desc, ndesc, &pte_list_descs, list)
+		mmu_free_pte_list_desc(desc);
 }

 static int mmu_shrink(struct shrinker *shrink, struct shrink_control *sc)
-- 
1.7.7.6


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-13  4:59 ` [PATCH 6/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
@ 2013-03-14  1:07   ` Marcelo Tosatti
  2013-03-14  1:35     ` Marcelo Tosatti
  2013-03-18 20:46   ` Marcelo Tosatti
  1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2013-03-14  1:07 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Gleb Natapov, LKML, KVM

On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
> The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
> walk and zap all shadow pages one by one, also it need to zap all guest
> page's rmap and all shadow page's parent spte list. Particularly, things
> become worse if guest uses more memory or vcpus. It is not good for
> scalability.
> 
> Since all shadow page will be zapped, we can directly zap the mmu-cache
> and rmap so that vcpu will fault on the new mmu-cache, after that, we can
> directly free the memory used by old mmu-cache.
> 
> The root shadow page is little especial since they are currently used by
> vcpus, we can not directly free them. So, we zap the root shadow pages and
> re-add them into the new mmu-cache.
> 
> After this patch, kvm_mmu_zap_all can be faster 113% than before
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 files changed, 56 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index e326099..536d9ce 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
> 
>  void kvm_mmu_zap_all(struct kvm *kvm)
>  {
> -	struct kvm_mmu_page *sp, *node;
> +	LIST_HEAD(root_mmu_pages);
>  	LIST_HEAD(invalid_list);
> +	struct list_head pte_list_descs;
> +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
> +	struct kvm_mmu_page *sp, *node;
> +	struct pte_list_desc *desc, *ndesc;
> +	int root_sp = 0;
> 
>  	spin_lock(&kvm->mmu_lock);
> +
>  restart:
> -	list_for_each_entry_safe(sp, node,
> -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
> -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> -			goto restart;
> +	/*
> +	 * The root shadow pages are being used on vcpus that can not
> +	 * directly removed, we filter them out and re-add them to the
> +	 * new mmu cache.
> +	 */
> +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
> +		if (sp->root_count) {
> +			int ret;
> +
> +			root_sp++;
> +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
> +			list_move(&sp->link, &root_mmu_pages);
> +			if (ret)
> +				goto restart;
> +		}

Why is it safe to skip flushing of root pages, for all
kvm_flush_shadow() callers?

Should revisit KVM_REQ_MMU_RELOAD... not clear it is necessary for NPT
(unrelated).


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-14  1:07   ` Marcelo Tosatti
@ 2013-03-14  1:35     ` Marcelo Tosatti
  2013-03-14  4:42       ` Xiao Guangrong
  0 siblings, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2013-03-14  1:35 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Gleb Natapov, LKML, KVM

On Wed, Mar 13, 2013 at 10:07:06PM -0300, Marcelo Tosatti wrote:
> On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
> > The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
> > walk and zap all shadow pages one by one, also it need to zap all guest
> > page's rmap and all shadow page's parent spte list. Particularly, things
> > become worse if guest uses more memory or vcpus. It is not good for
> > scalability.
> > 
> > Since all shadow page will be zapped, we can directly zap the mmu-cache
> > and rmap so that vcpu will fault on the new mmu-cache, after that, we can
> > directly free the memory used by old mmu-cache.
> > 
> > The root shadow page is little especial since they are currently used by
> > vcpus, we can not directly free them. So, we zap the root shadow pages and
> > re-add them into the new mmu-cache.
> > 
> > After this patch, kvm_mmu_zap_all can be faster 113% than before
> > 
> > Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> > ---
> >  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
> >  1 files changed, 56 insertions(+), 6 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> > index e326099..536d9ce 100644
> > --- a/arch/x86/kvm/mmu.c
> > +++ b/arch/x86/kvm/mmu.c
> > @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
> > 
> >  void kvm_mmu_zap_all(struct kvm *kvm)
> >  {
> > -	struct kvm_mmu_page *sp, *node;
> > +	LIST_HEAD(root_mmu_pages);
> >  	LIST_HEAD(invalid_list);
> > +	struct list_head pte_list_descs;
> > +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
> > +	struct kvm_mmu_page *sp, *node;
> > +	struct pte_list_desc *desc, *ndesc;
> > +	int root_sp = 0;
> > 
> >  	spin_lock(&kvm->mmu_lock);
> > +
> >  restart:
> > -	list_for_each_entry_safe(sp, node,
> > -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
> > -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> > -			goto restart;
> > +	/*
> > +	 * The root shadow pages are being used on vcpus that can not
> > +	 * directly removed, we filter them out and re-add them to the
> > +	 * new mmu cache.
> > +	 */
> > +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
> > +		if (sp->root_count) {
> > +			int ret;
> > +
> > +			root_sp++;
> > +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
> > +			list_move(&sp->link, &root_mmu_pages);
> > +			if (ret)
> > +				goto restart;
> > +		}
> 
> Why is it safe to skip flushing of root pages, for all
> kvm_flush_shadow() callers?

You are not skipping the flush, only moving to the new mmu cache.

> Should revisit KVM_REQ_MMU_RELOAD... not clear it is necessary for NPT
> (unrelated).

Actually, what I meant is: you can batch the KVM_REQ_MMU_RELOAD requests at
the end of kvm_mmu_zap_all. Waking up vcpus is not optimal since they are
going to contend for mmu_lock anyway.

I need more time to give more useful comments on this patchset, sorry.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-14  1:35     ` Marcelo Tosatti
@ 2013-03-14  4:42       ` Xiao Guangrong
  0 siblings, 0 replies; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-14  4:42 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Gleb Natapov, LKML, KVM

On 03/14/2013 09:35 AM, Marcelo Tosatti wrote:
> On Wed, Mar 13, 2013 at 10:07:06PM -0300, Marcelo Tosatti wrote:
>> On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
>>> The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
>>> walk and zap all shadow pages one by one, also it need to zap all guest
>>> page's rmap and all shadow page's parent spte list. Particularly, things
>>> become worse if guest uses more memory or vcpus. It is not good for
>>> scalability.
>>>
>>> Since all shadow page will be zapped, we can directly zap the mmu-cache
>>> and rmap so that vcpu will fault on the new mmu-cache, after that, we can
>>> directly free the memory used by old mmu-cache.
>>>
>>> The root shadow page is little especial since they are currently used by
>>> vcpus, we can not directly free them. So, we zap the root shadow pages and
>>> re-add them into the new mmu-cache.
>>>
>>> After this patch, kvm_mmu_zap_all can be faster 113% than before
>>>
>>> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
>>> ---
>>>  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
>>>  1 files changed, 56 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>>> index e326099..536d9ce 100644
>>> --- a/arch/x86/kvm/mmu.c
>>> +++ b/arch/x86/kvm/mmu.c
>>> @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
>>>
>>>  void kvm_mmu_zap_all(struct kvm *kvm)
>>>  {
>>> -	struct kvm_mmu_page *sp, *node;
>>> +	LIST_HEAD(root_mmu_pages);
>>>  	LIST_HEAD(invalid_list);
>>> +	struct list_head pte_list_descs;
>>> +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
>>> +	struct kvm_mmu_page *sp, *node;
>>> +	struct pte_list_desc *desc, *ndesc;
>>> +	int root_sp = 0;
>>>
>>>  	spin_lock(&kvm->mmu_lock);
>>> +
>>>  restart:
>>> -	list_for_each_entry_safe(sp, node,
>>> -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
>>> -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
>>> -			goto restart;
>>> +	/*
>>> +	 * The root shadow pages are being used on vcpus that can not
>>> +	 * directly removed, we filter them out and re-add them to the
>>> +	 * new mmu cache.
>>> +	 */
>>> +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
>>> +		if (sp->root_count) {
>>> +			int ret;
>>> +
>>> +			root_sp++;
>>> +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
>>> +			list_move(&sp->link, &root_mmu_pages);
>>> +			if (ret)
>>> +				goto restart;
>>> +		}
>>
>> Why is it safe to skip flushing of root pages, for all
>> kvm_flush_shadow() callers?
> 
> You are not skipping the flush, only moving to the new mmu cache.
> 
>> Should revisit KVM_REQ_MMU_RELOAD... not clear it is necessary for NPT
>> (unrelated).
> 
> Actually, what i meant is: you can batch KVM_REQ_MMU_RELOAD requests to
> the end of kvm_mmu_zap_all. Waking up vcpus is not optimal since they're
> going to contend for mmu_lock anyway.

Yes, I agree. Will move KVM_REQ_MMU_RELOAD to the end of kvm_mmu_zap_all in
the V2.

BTW, the TLB flush is not needed if no root shadow page was zapped, since
in that case no vcpu is using shadow pages. The code may be simplified to
(after batching KVM_REQ_MMU_RELOAD):

if (root_sp)
	kvm_reload_remote_mmus()
> 
> Need more time to have more useful comments to this patchset, sorry.

No problem. ;) The current comments are really useful to me.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-13  4:59 ` [PATCH 6/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
  2013-03-14  1:07   ` Marcelo Tosatti
@ 2013-03-18 20:46   ` Marcelo Tosatti
  2013-03-19  3:06     ` Xiao Guangrong
  1 sibling, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2013-03-18 20:46 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Gleb Natapov, LKML, KVM

On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
> The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
> walk and zap all shadow pages one by one, also it need to zap all guest
> page's rmap and all shadow page's parent spte list. Particularly, things
> become worse if guest uses more memory or vcpus. It is not good for
> scalability.
> 
> Since all shadow page will be zapped, we can directly zap the mmu-cache
> and rmap so that vcpu will fault on the new mmu-cache, after that, we can
> directly free the memory used by old mmu-cache.
> 
> The root shadow page is little especial since they are currently used by
> vcpus, we can not directly free them. So, we zap the root shadow pages and
> re-add them into the new mmu-cache.
> 
> After this patch, kvm_mmu_zap_all can be faster 113% than before
> 
> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> ---
>  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
>  1 files changed, 56 insertions(+), 6 deletions(-)
> 
> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> index e326099..536d9ce 100644
> --- a/arch/x86/kvm/mmu.c
> +++ b/arch/x86/kvm/mmu.c
> @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
> 
>  void kvm_mmu_zap_all(struct kvm *kvm)
>  {
> -	struct kvm_mmu_page *sp, *node;
> +	LIST_HEAD(root_mmu_pages);
>  	LIST_HEAD(invalid_list);
> +	struct list_head pte_list_descs;
> +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
> +	struct kvm_mmu_page *sp, *node;
> +	struct pte_list_desc *desc, *ndesc;
> +	int root_sp = 0;
> 
>  	spin_lock(&kvm->mmu_lock);
> +
>  restart:
> -	list_for_each_entry_safe(sp, node,
> -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
> -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> -			goto restart;
> +	/*
> +	 * The root shadow pages are being used on vcpus that can not
> +	 * directly removed, we filter them out and re-add them to the
> +	 * new mmu cache.
> +	 */
> +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
> +		if (sp->root_count) {
> +			int ret;
> +
> +			root_sp++;
> +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
> +			list_move(&sp->link, &root_mmu_pages);
> +			if (ret)
> +				goto restart;
> +		}
> +
> +	list_splice(&cache->active_mmu_pages, &invalid_list);
> +	list_replace(&cache->pte_list_descs, &pte_list_descs);
> +
> +	/*
> +	 * Reset the mmu cache so that later vcpu will fault on the new
> +	 * mmu cache.
> +	 */
> +	memset(cache, 0, sizeof(*cache));
> +	kvm_mmu_init(kvm);

Xiao,

I suppose zeroing of kvm_mmu_cache can be avoided, if the links are
removed at prepare_zap_page. So perhaps

- spin_lock(mmu_lock)
- for each page
	- zero sp->spt[], remove page from linked lists
- flush remote TLB (batched)
- spin_unlock(mmu_lock)
- free data (which is safe because freeing has its own serialization)
- spin_lock(mmu_lock)
- account for the pages freed
- spin_unlock(mmu_lock)

(or if you think of some other way to not have the mmu_cache zeroing step).

Note that the "account for the pages freed" step happens after the pages
are actually freed: as discussed with Takuya, having page freeing and
freed-page accounting out of sync across mmu_lock is potentially
problematic: kvm->arch.n_used_mmu_pages and friends do not reflect reality,
which can cause problems for SLAB freeing and page allocation throttling.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-18 20:46   ` Marcelo Tosatti
@ 2013-03-19  3:06     ` Xiao Guangrong
  2013-03-19 14:40       ` Marcelo Tosatti
  0 siblings, 1 reply; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-19  3:06 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Gleb Natapov, LKML, KVM

On 03/19/2013 04:46 AM, Marcelo Tosatti wrote:
> On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
>> The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
>> walk and zap all shadow pages one by one, also it need to zap all guest
>> page's rmap and all shadow page's parent spte list. Particularly, things
>> become worse if guest uses more memory or vcpus. It is not good for
>> scalability.
>>
>> Since all shadow page will be zapped, we can directly zap the mmu-cache
>> and rmap so that vcpu will fault on the new mmu-cache, after that, we can
>> directly free the memory used by old mmu-cache.
>>
>> The root shadow page is little especial since they are currently used by
>> vcpus, we can not directly free them. So, we zap the root shadow pages and
>> re-add them into the new mmu-cache.
>>
>> After this patch, kvm_mmu_zap_all can be faster 113% than before
>>
>> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
>> ---
>>  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
>>  1 files changed, 56 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
>> index e326099..536d9ce 100644
>> --- a/arch/x86/kvm/mmu.c
>> +++ b/arch/x86/kvm/mmu.c
>> @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
>>
>>  void kvm_mmu_zap_all(struct kvm *kvm)
>>  {
>> -	struct kvm_mmu_page *sp, *node;
>> +	LIST_HEAD(root_mmu_pages);
>>  	LIST_HEAD(invalid_list);
>> +	struct list_head pte_list_descs;
>> +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
>> +	struct kvm_mmu_page *sp, *node;
>> +	struct pte_list_desc *desc, *ndesc;
>> +	int root_sp = 0;
>>
>>  	spin_lock(&kvm->mmu_lock);
>> +
>>  restart:
>> -	list_for_each_entry_safe(sp, node,
>> -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
>> -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
>> -			goto restart;
>> +	/*
>> +	 * The root shadow pages are being used on vcpus that can not
>> +	 * directly removed, we filter them out and re-add them to the
>> +	 * new mmu cache.
>> +	 */
>> +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
>> +		if (sp->root_count) {
>> +			int ret;
>> +
>> +			root_sp++;
>> +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
>> +			list_move(&sp->link, &root_mmu_pages);
>> +			if (ret)
>> +				goto restart;
>> +		}
>> +
>> +	list_splice(&cache->active_mmu_pages, &invalid_list);
>> +	list_replace(&cache->pte_list_descs, &pte_list_descs);
>> +
>> +	/*
>> +	 * Reset the mmu cache so that later vcpu will fault on the new
>> +	 * mmu cache.
>> +	 */
>> +	memset(cache, 0, sizeof(*cache));
>> +	kvm_mmu_init(kvm);
> 
> Xiao,
> 
> I suppose zeroing of kvm_mmu_cache can be avoided, if the links are
> removed at prepare_zap_page. So perhaps

The purpose of zeroing kvm_mmu_cache is to reset the hash table and some of
the counters.
[ .n_requested_mmu_pages and .n_max_mmu_pages should not be changed, I will
  fix this. ]

> 
> - spin_lock(mmu_lock)
> - for each page
> 	- zero sp->spt[], remove page from linked lists

sizeof(mmu_cache) is:
(1 << 10) * sizeof (hlist_head) + 4 * sizeof(unsigned int) = 2^13 + 16
and it is constant. In your way, for every sp, we need to zap:
512 entries + a hash-node = 2^12 + 8
and, more importantly, that workload grows with the size of guest memory.
Why do you think this way is better?

> - flush remote TLB (batched)
> - spin_unlock(mmu_lock)
> - free data (which is safe because freeing has its own serialization)

We should free the root sp in mmu-lock like my patch.

> - spin_lock(mmu_lock)
> - account for the pages freed
> - spin_unlock(mmu_lock)

The counters are still inconsistent if another thread takes mmu-lock
between zeroing the shadow pages and re-doing the accounting.

Marcelo, I am really confused about what the benefit of this approach is,
but I might have completely misunderstood it.


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-19  3:06     ` Xiao Guangrong
@ 2013-03-19 14:40       ` Marcelo Tosatti
  2013-03-19 15:37         ` Xiao Guangrong
  0 siblings, 1 reply; 15+ messages in thread
From: Marcelo Tosatti @ 2013-03-19 14:40 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Gleb Natapov, LKML, KVM

On Tue, Mar 19, 2013 at 11:06:35AM +0800, Xiao Guangrong wrote:
> On 03/19/2013 04:46 AM, Marcelo Tosatti wrote:
> > On Wed, Mar 13, 2013 at 12:59:12PM +0800, Xiao Guangrong wrote:
> >> The current kvm_mmu_zap_all is really slow - it is holding mmu-lock to
> >> walk and zap all shadow pages one by one, also it need to zap all guest
> >> page's rmap and all shadow page's parent spte list. Particularly, things
> >> become worse if guest uses more memory or vcpus. It is not good for
> >> scalability.
> >>
> >> Since all shadow page will be zapped, we can directly zap the mmu-cache
> >> and rmap so that vcpu will fault on the new mmu-cache, after that, we can
> >> directly free the memory used by old mmu-cache.
> >>
> >> The root shadow page is little especial since they are currently used by
> >> vcpus, we can not directly free them. So, we zap the root shadow pages and
> >> re-add them into the new mmu-cache.
> >>
> >> After this patch, kvm_mmu_zap_all can be faster 113% than before
> >>
> >> Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
> >> ---
> >>  arch/x86/kvm/mmu.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++-----
> >>  1 files changed, 56 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
> >> index e326099..536d9ce 100644
> >> --- a/arch/x86/kvm/mmu.c
> >> +++ b/arch/x86/kvm/mmu.c
> >> @@ -4186,18 +4186,68 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, int slot)
> >>
> >>  void kvm_mmu_zap_all(struct kvm *kvm)
> >>  {
> >> -	struct kvm_mmu_page *sp, *node;
> >> +	LIST_HEAD(root_mmu_pages);
> >>  	LIST_HEAD(invalid_list);
> >> +	struct list_head pte_list_descs;
> >> +	struct kvm_mmu_cache *cache = &kvm->arch.mmu_cache;
> >> +	struct kvm_mmu_page *sp, *node;
> >> +	struct pte_list_desc *desc, *ndesc;
> >> +	int root_sp = 0;
> >>
> >>  	spin_lock(&kvm->mmu_lock);
> >> +
> >>  restart:
> >> -	list_for_each_entry_safe(sp, node,
> >> -	      &kvm->arch.mmu_cache.active_mmu_pages, link)
> >> -		if (kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list))
> >> -			goto restart;
> >> +	/*
> >> +	 * The root shadow pages are being used on vcpus that can not
> >> +	 * directly removed, we filter them out and re-add them to the
> >> +	 * new mmu cache.
> >> +	 */
> >> +	list_for_each_entry_safe(sp, node, &cache->active_mmu_pages, link)
> >> +		if (sp->root_count) {
> >> +			int ret;
> >> +
> >> +			root_sp++;
> >> +			ret = kvm_mmu_prepare_zap_page(kvm, sp, &invalid_list);
> >> +			list_move(&sp->link, &root_mmu_pages);
> >> +			if (ret)
> >> +				goto restart;
> >> +		}
> >> +
> >> +	list_splice(&cache->active_mmu_pages, &invalid_list);
> >> +	list_replace(&cache->pte_list_descs, &pte_list_descs);
> >> +
> >> +	/*
> >> +	 * Reset the mmu cache so that later vcpu will fault on the new
> >> +	 * mmu cache.
> >> +	 */
> >> +	memset(cache, 0, sizeof(*cache));
> >> +	kvm_mmu_init(kvm);
> > 
> > Xiao,
> > 
> > I suppose zeroing of kvm_mmu_cache can be avoided, if the links are
> > removed at prepare_zap_page. So perhaps
> 
> The purpose of zeroing of kvm_mmu_cache is resetting the hashtable and
> some count numbers.
> [.n_request_mmu_pages and .n_max_mmu_pages should not be changed, i will
> fix this].
> 
> > 
> > - spin_lock(mmu_lock)
> > - for each page
> > 	- zero sp->spt[], remove page from linked lists
> 
> sizeof(mmu_cache) is:
> (1 << 10) * sizeof (hlist_head) + 4 * sizeof(unsigned int) = 2^13 + 16
> and it is constant. In your way, for every sp, we need to zap:
> 512 entries + a hash-node = 2^12 + 8
> especially the workload depends on the size of guest memory.
> Why you think this way is better?

It's not, of course.

> > - flush remote TLB (batched)
> > - spin_unlock(mmu_lock)
> > - free data (which is safe because freeing has its own serialization)
> 
> We should free the root sp in mmu-lock like my patch.
> 
> > - spin_lock(mmu_lock)
> > - account for the pages freed
> > - spin_unlock(mmu_lock)
> 
> The count numbers are still inconsistent if other thread hold mmu-lock between
> zero shadow page and recount.
> 
> Marcelo, i really confused what is the benefit in this way but i might
> completely misunderstand it.

I misunderstood the benefit of your idea (now I get it: zapping the roots
and flushing the TLB guarantees vcpus will refault). What I'd like to avoid
is
memset(cache, 0, sizeof(*cache));
kvm_mmu_init(kvm);

I'd prefer normal operations on those data structures (in mmu_cache).
And also the page accounting is a problem.

Perhaps you can use a generation number to consider whether shadow pages
are valid? So: 

find_sp(gfn_t gfn)
lookup hash
if sp->generation_number != mmu->current_generation_number
	initialize page as if it were just allocated (but keep it in the hash list)

And on kvm_mmu_zap_all()
spin_lock(mmu_lock)
for each page
if page->root_count
	zero sp->spt[]

flush TLB
mmu->current_generation_number++
spin_unlock(mmu_lock)

Then have kvm_mmu_free_all() that actually frees all data.
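
For concreteness, the generation-number idea above might look roughly like
this (mmu_valid_gen, is_obsolete_sp and kvm_mmu_free_all are hypothetical
names used only for illustration, not existing code):

/*
 * Illustrative sketch only: assumes a new per-sp field sp->mmu_valid_gen
 * and a matching mmu_valid_gen counter in struct kvm_mmu_cache.
 */
static bool is_obsolete_sp(struct kvm *kvm, struct kvm_mmu_page *sp)
{
	return sp->mmu_valid_gen != kvm->arch.mmu_cache.mmu_valid_gen;
}

/*
 * kvm_mmu_get_page() would then re-initialize an obsolete sp found in the
 * hash as if it were freshly allocated, while kvm_mmu_zap_all() only zeroes
 * the spt[] of in-use root pages, bumps mmu_valid_gen and flushes TLBs; the
 * actual freeing is left to a later kvm_mmu_free_all().
 */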

Hmm, not sure if that's any better than your current patchset.
Well, maybe resend the patchset with bug fixes / improvements and
we go from there.



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-19 14:40       ` Marcelo Tosatti
@ 2013-03-19 15:37         ` Xiao Guangrong
  2013-03-19 22:37           ` Marcelo Tosatti
  0 siblings, 1 reply; 15+ messages in thread
From: Xiao Guangrong @ 2013-03-19 15:37 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Gleb Natapov, LKML, KVM

On 03/19/2013 10:40 PM, Marcelo Tosatti wrote:

> 
> I misunderstood the benefit of your idea (now i got it: zap root
> and flush TLB guarantees vcpus will refault). What i'd like to avoid is
> 
> memset(cache, 0, sizeof(*cache));
> kvm_mmu_init(kvm);
> 
> I'd prefer normal operations on those data structures (in mmu_cache).
> And also the page accounting is a problem.
> 
> Perhaps you can use a generation number to consider whether shadow pages
> are valid? So: 
> 
> find_sp(gfn_t gfn)
> lookup hash
> if sp->generation_number != mmu->current_generation_number
> 	initialize page as if it were just allocated (but keep it in the hash list)
> 
> And on kvm_mmu_zap_all()
> spin_lock(mmu_lock)
> for each page
> if page->root_count
> 	zero sp->spt[]
> 
> flush TLB
> mmu->current_generation_number++
> spin_unlock(mmu_lock)
> 
> Then have kvm_mmu_free_all() that actually frees all data.
> 
> Hum, not sure if thats any better than your current patchset.

I also had the idea of a generation number like yours, which I mentioned
in [PATCH 0/6]:

* TODO
To avoid Marcelo beating me to it :), here are some follow-up works not
included here, to keep the patchset smaller:
(1): batch kvm_reload_remote_mmus for zapping root shadow pages
(2): free shadow pages by using generation-number
(3): remove unneeded kvm_reload_remote_mmus after kvm_mmu_zap_all
(4): drop unnecessary @npages from kvm_arch_create_memslot
(5): rename init_kvm_mmu to init_vcpu_mmu

> Well, maybe resend patchset with bug fixes / improvements and 
> we go from there.

I agree. Thanks for your time, Marcelo!



^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 6/6] KVM: MMU: fast zap all shadow pages
  2013-03-19 15:37         ` Xiao Guangrong
@ 2013-03-19 22:37           ` Marcelo Tosatti
  0 siblings, 0 replies; 15+ messages in thread
From: Marcelo Tosatti @ 2013-03-19 22:37 UTC (permalink / raw)
  To: Xiao Guangrong; +Cc: Gleb Natapov, LKML, KVM

On Tue, Mar 19, 2013 at 11:37:38PM +0800, Xiao Guangrong wrote:
> On 03/19/2013 10:40 PM, Marcelo Tosatti wrote:
> 
> > 
> > I misunderstood the benefit of your idea (now i got it: zap root
> > and flush TLB guarantees vcpus will refault). What i'd like to avoid is
> > 
> > memset(cache, 0, sizeof(*cache));
> > kvm_mmu_init(kvm);
> > 
> > I'd prefer normal operations on those data structures (in mmu_cache).
> > And also the page accounting is a problem.
> > 
> > Perhaps you can use a generation number to consider whether shadow pages
> > are valid? So: 
> > 
> > find_sp(gfn_t gfn)
> > lookup hash
> > if sp->generation_number != mmu->current_generation_number
> > 	initialize page as if it were just allocated (but keep it in the hash list)
> > 
> > And on kvm_mmu_zap_all()
> > spin_lock(mmu_lock)
> > for each page
> > if page->root_count
> > 	zero sp->spt[]
> > 
> > flush TLB
> > mmu->current_generation_number++
> > spin_unlock(mmu_lock)
> > 
> > Then have kvm_mmu_free_all() that actually frees all data.
> > 
> > Hum, not sure if thats any better than your current patchset.
> 
> I also got the idea of generation number like yours which i mentioned
> it in the [PATCH 0/6]:
> 
> * TODO
> Avoiding Marcelo beat me :), they are some works not attached to make the
> patchset more smaller:
> (1): batch kvm_reload_remote_mmus for zapping root shadow pages
> (2): free shadow pages by using generation-number
> (3): remove unneeded kvm_reload_remote_mmus after kvm_mmu_zap_all
> (4): drop unnecessary @npages from kvm_arch_create_memslot
> (5): rename init_kvm_mmu to init_vcpu_mmu

Long patch sets are OK. The problem is unrelated changes in a patchset.


^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread

Thread overview: 15+ messages
2013-03-13  4:55 [PATCH 0/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
2013-03-13  4:55 ` [PATCH 1/6] KVM: MMU: move mmu related members into a separate struct Xiao Guangrong
2013-03-13  4:56 ` [PATCH 2/6] KVM: MMU: introduce mmu_cache->pte_list_descs Xiao Guangrong
2013-03-13  4:57 ` [PATCH 3/6] KVM: x86: introduce memslot_set_lpage_disallowed Xiao Guangrong
2013-03-13  4:57 ` [PATCH 4/6] KVM: x86: introduce kvm_clear_all_gfn_page_info Xiao Guangrong
2013-03-13  4:58 ` [PATCH 5/6] KVM: MMU: delete shadow page from hash list in kvm_mmu_prepare_zap_page Xiao Guangrong
2013-03-13  4:59 ` [PATCH 6/6] KVM: MMU: fast zap all shadow pages Xiao Guangrong
2013-03-14  1:07   ` Marcelo Tosatti
2013-03-14  1:35     ` Marcelo Tosatti
2013-03-14  4:42       ` Xiao Guangrong
2013-03-18 20:46   ` Marcelo Tosatti
2013-03-19  3:06     ` Xiao Guangrong
2013-03-19 14:40       ` Marcelo Tosatti
2013-03-19 15:37         ` Xiao Guangrong
2013-03-19 22:37           ` Marcelo Tosatti
