From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
stable@vger.kernel.org, Dave Jiang <dave.jiang@intel.com>,
Vinod Koul <vinod.koul@intel.com>,
Alexander Duyck <alexander.h.duyck@intel.com>,
David Whipple <whipple@securedatainnovations.ch>,
"David S. Miller" <davem@davemloft.net>,
Dan Williams <dan.j.williams@intel.com>
Subject: [PATCH 3.4 18/43] net_dma: mark broken
Date: Mon, 6 Jan 2014 14:39:35 -0800 [thread overview]
Message-ID: <20140106223942.772275601@linuxfoundation.org> (raw)
In-Reply-To: <20140106223942.259651490@linuxfoundation.org>
3.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams <dan.j.williams@intel.com>
commit 77873803363c9e831fc1d1e6895c084279090c22 upstream.
net_dma can cause data to be copied to a stale mapping if a
copy-on-write fault occurs during dma. The application sees missing
data.
The following trace is triggered by modifying the kernel to WARN if it
ever triggers copy-on-write on a page that is undergoing dma:
WARNING: CPU: 24 PID: 2529 at lib/dma-debug.c:485 debug_dma_assert_idle+0xd2/0x120()
ioatdma 0000:00:04.0: DMA-API: cpu touching an active dma mapped page [pfn=0x16bcd9]
Modules linked in: iTCO_wdt iTCO_vendor_support ioatdma lpc_ich pcspkr dca
CPU: 24 PID: 2529 Comm: linbug Tainted: G W 3.13.0-rc1+ #353
00000000000001e5 ffff88016f45f688 ffffffff81751041 ffff88017ab0ef70
ffff88016f45f6d8 ffff88016f45f6c8 ffffffff8104ed9c ffffffff810f3646
ffff8801768f4840 0000000000000282 ffff88016f6cca10 00007fa2bb699349
Call Trace:
[<ffffffff81751041>] dump_stack+0x46/0x58
[<ffffffff8104ed9c>] warn_slowpath_common+0x8c/0xc0
[<ffffffff810f3646>] ? ftrace_pid_func+0x26/0x30
[<ffffffff8104ee86>] warn_slowpath_fmt+0x46/0x50
[<ffffffff8139c062>] debug_dma_assert_idle+0xd2/0x120
[<ffffffff81154a40>] do_wp_page+0xd0/0x790
[<ffffffff811582ac>] handle_mm_fault+0x51c/0xde0
[<ffffffff813830b9>] ? copy_user_enhanced_fast_string+0x9/0x20
[<ffffffff8175fc2c>] __do_page_fault+0x19c/0x530
[<ffffffff8175c196>] ? _raw_spin_lock_bh+0x16/0x40
[<ffffffff810f3539>] ? trace_clock_local+0x9/0x10
[<ffffffff810fa1f4>] ? rb_reserve_next_event+0x64/0x310
[<ffffffffa0014c00>] ? ioat2_dma_prep_memcpy_lock+0x60/0x130 [ioatdma]
[<ffffffff8175ffce>] do_page_fault+0xe/0x10
[<ffffffff8175c862>] page_fault+0x22/0x30
[<ffffffff81643991>] ? __kfree_skb+0x51/0xd0
[<ffffffff813830b9>] ? copy_user_enhanced_fast_string+0x9/0x20
[<ffffffff81388ea2>] ? memcpy_toiovec+0x52/0xa0
[<ffffffff8164770f>] skb_copy_datagram_iovec+0x5f/0x2a0
[<ffffffff8169d0f4>] tcp_rcv_established+0x674/0x7f0
[<ffffffff816a68c5>] tcp_v4_do_rcv+0x2e5/0x4a0
[..]
---[ end trace e30e3b01191b7617 ]---
Mapped at:
[<ffffffff8139c169>] debug_dma_map_page+0xb9/0x160
[<ffffffff8142bf47>] dma_async_memcpy_pg_to_pg+0x127/0x210
[<ffffffff8142cce9>] dma_memcpy_pg_to_iovec+0x119/0x1f0
[<ffffffff81669d3c>] dma_skb_copy_datagram_iovec+0x11c/0x2b0
[<ffffffff8169d1ca>] tcp_rcv_established+0x74a/0x7f0:
...the problem is that the receive path falls back to cpu-copy in
several locations and this trace is just one of the areas. A few
options were considered to fix this:
1/ sync all dma whenever a cpu copy branch is taken
2/ modify the page fault handler to hold off while dma is in-flight
Option 1 adds yet more cpu overhead to an "offload" that struggles to compete
with cpu-copy. Option 2 adds checks for behavior that is already documented as
broken when using get_user_pages(). At a minimum a debug mode is warranted to
catch and flag these violations of the dma-api vs get_user_pages().
Thanks to David for his reproducer.
Cc: Dave Jiang <dave.jiang@intel.com>
Cc: Vinod Koul <vinod.koul@intel.com>
Cc: Alexander Duyck <alexander.h.duyck@intel.com>
Reported-by: David Whipple <whipple@securedatainnovations.ch>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
drivers/dma/Kconfig | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/dma/Kconfig
+++ b/drivers/dma/Kconfig
@@ -269,6 +269,7 @@ config NET_DMA
bool "Network: TCP receive copy offload"
depends on DMA_ENGINE && NET
default (INTEL_IOATDMA || FSL_DMA)
+ depends on BROKEN
help
This enables the use of DMA engines in the network stack to
offload receive copy-to-user operations, freeing CPU cycles.
next prev parent reply other threads:[~2014-01-06 22:39 UTC|newest]
Thread overview: 46+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-06 22:39 [PATCH 3.4 00/43] 3.4.76-stable review Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 01/43] USB: serial: fix race in generic write Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 02/43] ceph: cleanup aborted requests when re-sending requests Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 03/43] ceph: wake up safe waiters when unregistering request Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 04/43] powerpc: kvm: fix rare but potential deadlock scene Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 05/43] TTY: pmac_zilog, check existence of ports in pmz_console_init() Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 06/43] ASoC: wm8904: fix DSP mode B configuration Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 07/43] ALSA: Add SNDRV_PCM_STATE_PAUSED case in wait_for_avail function Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 09/43] selinux: fix broken peer recv check Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 10/43] selinux: selinux_setprocattr()->ptrace_parent() needs rcu_read_lock() Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 11/43] ftrace: Initialize the ftrace profiler for each possible cpu Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 12/43] intel_idle: initial IVB support Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 13/43] intel_idle: enable IVB Xeon support Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 14/43] ext4: fix use-after-free in ext4_mb_new_blocks Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 16/43] sched/rt: Fix rqs cpupri leak while enqueue/dequeue child RT entities Greg Kroah-Hartman
2014-01-06 22:39 ` Greg Kroah-Hartman [this message]
2014-01-06 22:39 ` [PATCH 3.4 19/43] drm/radeon: fix asic gfx values for scrapper asics Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 20/43] drm/radeon: 0x9649 is SUMO2 not SUMO Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 21/43] ceph: Avoid data inconsistency due to d-cache aliasing in readpage() Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 22/43] tg3: Expand 4g_overflow_test workaround to skb fragments of any size Greg Kroah-Hartman
2014-01-06 22:58 ` Eric Dumazet
2014-01-06 23:05 ` Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 23/43] dm9601: fix reception of full size ethernet frames on dm9620/dm9621a Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 24/43] dm9601: work around tx fifo sync issue on dm962x Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 25/43] ath9k: Fix interrupt handling for the AR9002 family Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 26/43] ath9k_htc: properly set MAC address and BSSID mask Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 27/43] powerpc: Fix bad stack check in exception entry Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 28/43] powerpc: Align p_end Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 29/43] cpupower: Fix segfault due to incorrect getopt_long arugments Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 30/43] libata: add ATA_HORKAGE_BROKEN_FPDMA_AA quirk for Seagate Momentus SpinPoint M8 Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 31/43] radiotap: fix bitmap-end-finding buffer overrun Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 32/43] rtlwifi: pci: Fix oops on driver unload Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 33/43] mm/hugetlb: check for pte NULL pointer in __page_check_address() Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 34/43] Input: allocate absinfo data when setting ABS capability Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 35/43] GFS2: dont hold s_umount over blkdev_put Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 36/43] GFS2: Fix incorrect invalidation for DIO/buffered I/O Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 37/43] jbd2: dont BUG but return ENOSPC if a handle runs out of space Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 38/43] gpio: msm: Fix irq mask/unmask by writing bits instead of numbers Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 39/43] sched: Avoid throttle_cfs_rq() racing with period_timer stopping Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 40/43] sh: always link in helper functions extracted from libgcc Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 41/43] selinux: look for IPsec labels on both inbound and outbound packets Greg Kroah-Hartman
2014-01-06 22:39 ` [PATCH 3.4 42/43] selinux: process labeled IPsec TCP SYN-ACK packets properly in selinux_ip_postroute() Greg Kroah-Hartman
2014-01-06 22:40 ` [PATCH 3.4 43/43] hwmon: (w83l768ng) Fix fan speed control range Greg Kroah-Hartman
2014-01-07 5:01 ` [PATCH 3.4 00/43] 3.4.76-stable review Guenter Roeck
2014-01-07 15:22 ` Greg Kroah-Hartman
2014-01-07 19:09 ` Shuah Khan
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140106223942.772275601@linuxfoundation.org \
--to=gregkh@linuxfoundation.org \
--cc=alexander.h.duyck@intel.com \
--cc=dan.j.williams@intel.com \
--cc=dave.jiang@intel.com \
--cc=davem@davemloft.net \
--cc=linux-kernel@vger.kernel.org \
--cc=stable@vger.kernel.org \
--cc=vinod.koul@intel.com \
--cc=whipple@securedatainnovations.ch \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).