From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	stable@vger.kernel.org, "K. Y. Srinivasan" <kys@microsoft.com>,
	Rolf Neugebauer <rolf.neugebauer@docker.com>
Subject: [PATCH 4.9 37/60] Drivers: hv: vmbus: On write cleanup the logic to interrupt the host
Date: Mon, 13 Feb 2017 05:04:09 -0800
Message-ID: <20170213130338.264194685@linuxfoundation.org>
In-Reply-To: <20170213130333.057515084@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: K. Y. Srinivasan <kys@microsoft.com>

commit 1f6ee4e7d83586c8b10bd4f2f4346353d04ce884 upstream.

Signal the host when we determine the host is to be signaled.
The current code determines the need to signal in the ringbuffer
code and actually issues the signal elsewhere. This can result
in the host viewing this interrupt as spurious since the host may also
poll the channel. Make the necessary adjustments.

Signed-off-by: K. Y. Srinivasan <kys@microsoft.com>
Cc: Rolf Neugebauer <rolf.neugebauer@docker.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

---
 drivers/hv/channel.c      |   99 ++++------------------------------------------
 drivers/hv/hyperv_vmbus.h |    6 +-
 drivers/hv/ring_buffer.c  |   30 +++++++++----
 include/linux/hyperv.h    |    1 
 4 files changed, 35 insertions(+), 101 deletions(-)
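
Note for review (not part of the upstream commit): the sketch below is a
hypothetical userspace model of the write-path rule this patch consolidates,
namely that the producer signals the host only when a write moves the ring
from empty to non-empty and the host has not set interrupt_mask. Every name
prefixed with model_ is an assumption made for illustration and is not a
kernel API. Memory barriers, locking, and the payload copy are left out, and
kick_q is omitted because hv_signal_on_write() accepts it but does not
consult it in the hunks shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical ring size for the model; real rings are sized per channel. */
#define MODEL_RING_SIZE 64u

struct model_ring {
	uint32_t read_index;     /* advanced by the host (consumer) */
	uint32_t write_index;    /* advanced by the guest (producer) */
	uint32_t interrupt_mask; /* nonzero: host is polling, do not signal */
	unsigned int signals;    /* number of times the host was signaled */
};

/* Stand-in for vmbus_setevent(): only counts the interrupt. */
static void model_setevent(struct model_ring *r)
{
	r->signals++;
}

/*
 * Same shape as hv_signal_on_write(): the decision and the signal
 * happen in one place instead of being split across two files.
 */
static void model_signal_on_write(uint32_t old_write, struct model_ring *r)
{
	if (r->interrupt_mask)
		return;

	/* Signal only if the ring was empty before this write. */
	if (old_write == r->read_index)
		model_setevent(r);
}

/* Reserve len bytes in the ring; returns 0 on success, -1 if no room. */
static int model_ring_write(struct model_ring *r, uint32_t len)
{
	uint32_t used = (r->write_index - r->read_index + MODEL_RING_SIZE) %
			MODEL_RING_SIZE;
	uint32_t avail = MODEL_RING_SIZE - used;
	uint32_t old_write = r->write_index;

	assert(len > 0);

	/* A completely full ring would be indistinguishable from an empty one. */
	if (avail <= len)
		return -1;

	r->write_index = (old_write + len) % MODEL_RING_SIZE;
	model_signal_on_write(old_write, r);
	return 0;
}
```

In this model, two back-to-back writes into an empty ring raise one signal,
and a write made while interrupt_mask is set raises none.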

--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -39,7 +39,7 @@
  * vmbus_setevent- Trigger an event notification on the specified
  * channel.
  */
-static void vmbus_setevent(struct vmbus_channel *channel)
+void vmbus_setevent(struct vmbus_channel *channel)
 {
 	struct hv_monitor_page *monitorpage;
 
@@ -65,6 +65,7 @@ static void vmbus_setevent(struct vmbus_
 		vmbus_set_event(channel);
 	}
 }
+EXPORT_SYMBOL_GPL(vmbus_setevent);
 
 /*
  * vmbus_open - Open the specified channel.
@@ -635,8 +636,6 @@ int vmbus_sendpacket_ctl(struct vmbus_ch
 	u32 packetlen_aligned = ALIGN(packetlen, sizeof(u64));
 	struct kvec bufferlist[3];
 	u64 aligned_data = 0;
-	int ret;
-	bool signal = false;
 	bool lock = channel->acquire_ring_lock;
 	int num_vecs = ((bufferlen != 0) ? 3 : 1);
 
@@ -656,41 +655,9 @@ int vmbus_sendpacket_ctl(struct vmbus_ch
 	bufferlist[2].iov_base = &aligned_data;
 	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
 
-	ret = hv_ringbuffer_write(&channel->outbound, bufferlist, num_vecs,
-				  &signal, lock, channel->signal_policy);
-
-	/*
-	 * Signalling the host is conditional on many factors:
-	 * 1. The ring state changed from being empty to non-empty.
-	 *    This is tracked by the variable "signal".
-	 * 2. The variable kick_q tracks if more data will be placed
-	 *    on the ring. We will not signal if more data is
-	 *    to be placed.
-	 *
-	 * Based on the channel signal state, we will decide
-	 * which signaling policy will be applied.
-	 *
-	 * If we cannot write to the ring-buffer; signal the host
-	 * even if we may not have written anything. This is a rare
-	 * enough condition that it should not matter.
-	 * NOTE: in this case, the hvsock channel is an exception, because
-	 * it looks the host side's hvsock implementation has a throttling
-	 * mechanism which can hurt the performance otherwise.
-	 *
-	 * KYS: Oct. 30, 2016:
-	 * It looks like Windows hosts have logic to deal with DOS attacks that
-	 * can be triggered if it receives interrupts when it is not expecting
-	 * the interrupt. The host expects interrupts only when the ring
-	 * transitions from empty to non-empty (or full to non full on the guest
-	 * to host ring).
-	 * So, base the signaling decision solely on the ring state until the
-	 * host logic is fixed.
-	 */
-
-	if (((ret == 0) && signal))
-		vmbus_setevent(channel);
+	return hv_ringbuffer_write(channel, bufferlist, num_vecs,
+				   lock, kick_q);
 
-	return ret;
 }
 EXPORT_SYMBOL(vmbus_sendpacket_ctl);
 
@@ -731,7 +698,6 @@ int vmbus_sendpacket_pagebuffer_ctl(stru
 				     u32 flags,
 				     bool kick_q)
 {
-	int ret;
 	int i;
 	struct vmbus_channel_packet_page_buffer desc;
 	u32 descsize;
@@ -739,7 +705,6 @@ int vmbus_sendpacket_pagebuffer_ctl(stru
 	u32 packetlen_aligned;
 	struct kvec bufferlist[3];
 	u64 aligned_data = 0;
-	bool signal = false;
 	bool lock = channel->acquire_ring_lock;
 
 	if (pagecount > MAX_PAGE_BUFFER_COUNT)
@@ -777,38 +742,8 @@ int vmbus_sendpacket_pagebuffer_ctl(stru
 	bufferlist[2].iov_base = &aligned_data;
 	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
 
-	ret = hv_ringbuffer_write(&channel->outbound, bufferlist, 3,
-				  &signal, lock, channel->signal_policy);
-
-	/*
-	 * Signalling the host is conditional on many factors:
-	 * 1. The ring state changed from being empty to non-empty.
-	 *    This is tracked by the variable "signal".
-	 * 2. The variable kick_q tracks if more data will be placed
-	 *    on the ring. We will not signal if more data is
-	 *    to be placed.
-	 *
-	 * Based on the channel signal state, we will decide
-	 * which signaling policy will be applied.
-	 *
-	 * If we cannot write to the ring-buffer; signal the host
-	 * even if we may not have written anything. This is a rare
-	 * enough condition that it should not matter.
-	 *
-	 * KYS: Oct. 30, 2016:
-	 * It looks like Windows hosts have logic to deal with DOS attacks that
-	 * can be triggered if it receives interrupts when it is not expecting
-	 * the interrupt. The host expects interrupts only when the ring
-	 * transitions from empty to non-empty (or full to non full on the guest
-	 * to host ring).
-	 * So, base the signaling decision solely on the ring state until the
-	 * host logic is fixed.
-	 */
-
-	if (((ret == 0) && signal))
-		vmbus_setevent(channel);
-
-	return ret;
+	return hv_ringbuffer_write(channel, bufferlist, 3,
+				   lock, kick_q);
 }
 EXPORT_SYMBOL_GPL(vmbus_sendpacket_pagebuffer_ctl);
 
@@ -839,12 +774,10 @@ int vmbus_sendpacket_mpb_desc(struct vmb
 			      u32 desc_size,
 			      void *buffer, u32 bufferlen, u64 requestid)
 {
-	int ret;
 	u32 packetlen;
 	u32 packetlen_aligned;
 	struct kvec bufferlist[3];
 	u64 aligned_data = 0;
-	bool signal = false;
 	bool lock = channel->acquire_ring_lock;
 
 	packetlen = desc_size + bufferlen;
@@ -865,13 +798,8 @@ int vmbus_sendpacket_mpb_desc(struct vmb
 	bufferlist[2].iov_base = &aligned_data;
 	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
 
-	ret = hv_ringbuffer_write(&channel->outbound, bufferlist, 3,
-				  &signal, lock, channel->signal_policy);
-
-	if (ret == 0 && signal)
-		vmbus_setevent(channel);
-
-	return ret;
+	return hv_ringbuffer_write(channel, bufferlist, 3,
+				   lock, true);
 }
 EXPORT_SYMBOL_GPL(vmbus_sendpacket_mpb_desc);
 
@@ -883,14 +811,12 @@ int vmbus_sendpacket_multipagebuffer(str
 				struct hv_multipage_buffer *multi_pagebuffer,
 				void *buffer, u32 bufferlen, u64 requestid)
 {
-	int ret;
 	struct vmbus_channel_packet_multipage_buffer desc;
 	u32 descsize;
 	u32 packetlen;
 	u32 packetlen_aligned;
 	struct kvec bufferlist[3];
 	u64 aligned_data = 0;
-	bool signal = false;
 	bool lock = channel->acquire_ring_lock;
 	u32 pfncount = NUM_PAGES_SPANNED(multi_pagebuffer->offset,
 					 multi_pagebuffer->len);
@@ -930,13 +856,8 @@ int vmbus_sendpacket_multipagebuffer(str
 	bufferlist[2].iov_base = &aligned_data;
 	bufferlist[2].iov_len = (packetlen_aligned - packetlen);
 
-	ret = hv_ringbuffer_write(&channel->outbound, bufferlist, 3,
-				  &signal, lock, channel->signal_policy);
-
-	if (ret == 0 && signal)
-		vmbus_setevent(channel);
-
-	return ret;
+	return hv_ringbuffer_write(channel, bufferlist, 3,
+				   lock, true);
 }
 EXPORT_SYMBOL_GPL(vmbus_sendpacket_multipagebuffer);
 
--- a/drivers/hv/hyperv_vmbus.h
+++ b/drivers/hv/hyperv_vmbus.h
@@ -527,10 +527,10 @@ int hv_ringbuffer_init(struct hv_ring_bu
 
 void hv_ringbuffer_cleanup(struct hv_ring_buffer_info *ring_info);
 
-int hv_ringbuffer_write(struct hv_ring_buffer_info *ring_info,
+int hv_ringbuffer_write(struct vmbus_channel *channel,
 		    struct kvec *kv_list,
-		    u32 kv_count, bool *signal, bool lock,
-		    enum hv_signal_policy policy);
+		    u32 kv_count, bool lock,
+		    bool kick_q);
 
 int hv_ringbuffer_read(struct hv_ring_buffer_info *inring_info,
 		       void *buffer, u32 buflen, u32 *buffer_actual_len,
--- a/drivers/hv/ring_buffer.c
+++ b/drivers/hv/ring_buffer.c
@@ -66,14 +66,25 @@ u32 hv_end_read(struct hv_ring_buffer_in
  *	   once the ring buffer is empty, it will clear the
  *	   interrupt_mask and re-check to see if new data has
  *	   arrived.
+ *
+ * KYS: Oct. 30, 2016:
+ * It looks like Windows hosts have logic to deal with DOS attacks that
+ * can be triggered if it receives interrupts when it is not expecting
+ * the interrupt. The host expects interrupts only when the ring
+ * transitions from empty to non-empty (or full to non full on the guest
+ * to host ring).
+ * So, base the signaling decision solely on the ring state until the
+ * host logic is fixed.
  */
 
-static bool hv_need_to_signal(u32 old_write, struct hv_ring_buffer_info *rbi,
-			      enum hv_signal_policy policy)
+static void hv_signal_on_write(u32 old_write, struct vmbus_channel *channel,
+			       bool kick_q)
 {
+	struct hv_ring_buffer_info *rbi = &channel->outbound;
+
 	virt_mb();
 	if (READ_ONCE(rbi->ring_buffer->interrupt_mask))
-		return false;
+		return;
 
 	/* check interrupt_mask before read_index */
 	virt_rmb();
@@ -82,9 +93,9 @@ static bool hv_need_to_signal(u32 old_wr
 	 * ring transitions from being empty to non-empty.
 	 */
 	if (old_write == READ_ONCE(rbi->ring_buffer->read_index))
-		return true;
+		vmbus_setevent(channel);
 
-	return false;
+	return;
 }
 
 /* Get the next write location for the specified ring buffer. */
@@ -273,9 +284,9 @@ void hv_ringbuffer_cleanup(struct hv_rin
 }
 
 /* Write to the ring buffer. */
-int hv_ringbuffer_write(struct hv_ring_buffer_info *outring_info,
-		    struct kvec *kv_list, u32 kv_count, bool *signal, bool lock,
-		    enum hv_signal_policy policy)
+int hv_ringbuffer_write(struct vmbus_channel *channel,
+		    struct kvec *kv_list, u32 kv_count, bool lock,
+		    bool kick_q)
 {
 	int i = 0;
 	u32 bytes_avail_towrite;
@@ -285,6 +296,7 @@ int hv_ringbuffer_write(struct hv_ring_b
 	u32 old_write;
 	u64 prev_indices = 0;
 	unsigned long flags = 0;
+	struct hv_ring_buffer_info *outring_info = &channel->outbound;
 
 	for (i = 0; i < kv_count; i++)
 		totalbytes_towrite += kv_list[i].iov_len;
@@ -337,7 +349,7 @@ int hv_ringbuffer_write(struct hv_ring_b
 	if (lock)
 		spin_unlock_irqrestore(&outring_info->ring_lock, flags);
 
-	*signal = hv_need_to_signal(old_write, outring_info, policy);
+	hv_signal_on_write(old_write, channel, kick_q);
 	return 0;
 }
 
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1447,6 +1447,7 @@ void hv_event_tasklet_enable(struct vmbu
 
 void hv_process_channel_removal(struct vmbus_channel *channel, u32 relid);
 
+void vmbus_setevent(struct vmbus_channel *channel);
 /*
  * Negotiated version with the Host.
  */



