* [PATCH 0/3] Series short description
@ 2012-04-04 23:10 Dave Jiang
2012-04-04 23:10 ` [PATCH 1/3] ioat: ring size variables need to be 32bit to avoid overflow Dave Jiang
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Dave Jiang @ 2012-04-04 23:10 UTC (permalink / raw)
To: dan.j.williams, linux-kernel; +Cc: netdev, davem
The following series addresses issues with the ioatdma driver. The first
patch fixes a potential ring size overflow. The next two patches add an
alignment requirement to work around a silicon erratum in the ioatdma
hardware that affects M2M ops and impacts NETDMA. The last patch will
probably need to be ack'd by David Miller as it touches the network subsystem.
---
Dave Jiang (3):
netdma: adding alignment check for NETDMA ops
ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata
ioat: ring size variables need to be 32bit to avoid overflow
drivers/dma/dmaengine.c | 14 ++++++++++++++
drivers/dma/ioat/dma_v2.c | 4 ++--
drivers/dma/ioat/dma_v2.h | 4 ++--
drivers/dma/ioat/dma_v3.c | 41 +++++++++++++++++++++++++++++++++++++++++
include/linux/dmaengine.h | 1 +
net/ipv4/tcp.c | 4 ++--
net/ipv4/tcp_input.c | 2 +-
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
9 files changed, 65 insertions(+), 9 deletions(-)
--
* [PATCH 1/3] ioat: ring size variables need to be 32bit to avoid overflow
2012-04-04 23:10 [PATCH 0/3] Series short description Dave Jiang
@ 2012-04-04 23:10 ` Dave Jiang
2012-04-04 23:10 ` [PATCH 2/3] ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata Dave Jiang
2012-04-04 23:10 ` [PATCH 3/3] netdma: adding alignment check for NETDMA ops Dave Jiang
2 siblings, 0 replies; 7+ messages in thread
From: Dave Jiang @ 2012-04-04 23:10 UTC (permalink / raw)
To: dan.j.williams, linux-kernel; +Cc: netdev, davem, Dave Jiang
The alloc order can be up to 16, and 1 << 16 will overflow a 16-bit
integer. Change the appropriate variables to 32-bit to avoid overflow.
Reported-by: Jim Harris <james.r.harris@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
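The truncation described above can be reproduced in user space. The
following is only a minimal sketch, not the driver code: ring_size_u16()
and ring_size_u32() are hypothetical stand-ins for ioat2_ring_size()
before and after this patch, and uint16_t/uint32_t stand in for the
kernel's u16/u32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the old helper: the result is stored in 16 bits. */
static uint16_t ring_size_u16(unsigned int order)
{
	/* 1 << 16 is 65536, which does not fit in 16 bits and wraps to 0. */
	return (uint16_t)(1u << order);
}

/* Hypothetical model of the patched helper: 32 bits hold 1 << 16. */
static uint32_t ring_size_u32(unsigned int order)
{
	return 1u << order;
}

/* Small self-check covering the largest safe order and the failing one. */
static void ring_size_selfcheck(void)
{
	assert(ring_size_u16(15) == 32768u); /* still representable */
	assert(ring_size_u16(16) == 0u);     /* wrapped: ring looks empty */
	assert(ring_size_u32(16) == 65536u); /* correct with 32 bits */
}
```

With an order of 16 the 16-bit variant reports a ring size of 0, so any
size comparison made during a reshape would be working from a wrong value.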
drivers/dma/ioat/dma_v2.c | 4 ++--
drivers/dma/ioat/dma_v2.h | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
index cb8864d..143cb1b 100644
--- a/drivers/dma/ioat/dma_v2.c
+++ b/drivers/dma/ioat/dma_v2.c
@@ -575,9 +575,9 @@ bool reshape_ring(struct ioat2_dma_chan *ioat, int order)
*/
struct ioat_chan_common *chan = &ioat->base;
struct dma_chan *c = &chan->common;
- const u16 curr_size = ioat2_ring_size(ioat);
+ const u32 curr_size = ioat2_ring_size(ioat);
const u16 active = ioat2_ring_active(ioat);
- const u16 new_size = 1 << order;
+ const u32 new_size = 1 << order;
struct ioat_ring_ent **ring;
u16 i;
diff --git a/drivers/dma/ioat/dma_v2.h b/drivers/dma/ioat/dma_v2.h
index a2c413b..be2a55b 100644
--- a/drivers/dma/ioat/dma_v2.h
+++ b/drivers/dma/ioat/dma_v2.h
@@ -74,7 +74,7 @@ static inline struct ioat2_dma_chan *to_ioat2_chan(struct dma_chan *c)
return container_of(chan, struct ioat2_dma_chan, base);
}
-static inline u16 ioat2_ring_size(struct ioat2_dma_chan *ioat)
+static inline u32 ioat2_ring_size(struct ioat2_dma_chan *ioat)
{
return 1 << ioat->alloc_order;
}
@@ -91,7 +91,7 @@ static inline u16 ioat2_ring_pending(struct ioat2_dma_chan *ioat)
return CIRC_CNT(ioat->head, ioat->issued, ioat2_ring_size(ioat));
}
-static inline u16 ioat2_ring_space(struct ioat2_dma_chan *ioat)
+static inline u32 ioat2_ring_space(struct ioat2_dma_chan *ioat)
{
return ioat2_ring_size(ioat) - ioat2_ring_active(ioat);
}
* [PATCH 2/3] ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata
2012-04-04 23:10 [PATCH 0/3] Series short description Dave Jiang
2012-04-04 23:10 ` [PATCH 1/3] ioat: ring size variables need to be 32bit to avoid overflow Dave Jiang
@ 2012-04-04 23:10 ` Dave Jiang
2012-04-04 23:10 ` [PATCH 3/3] netdma: adding alignment check for NETDMA ops Dave Jiang
2 siblings, 0 replies; 7+ messages in thread
From: Dave Jiang @ 2012-04-04 23:10 UTC (permalink / raw)
To: dan.j.williams, linux-kernel; +Cc: netdev, davem, Dave Jiang
This works around a silicon erratum: when RAID and legacy descriptors are
mixed, legacy (memcpy and friends) operations must be aligned to 64 bytes
to avoid a hang. This affects Intel Xeon C55xx, C35xx, and E5-2600.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
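For readers unfamiliar with copy_align: it is a shift count, so the value
6 set in this patch means 1 << 6 = 64 bytes. The sketch below models how
such a constraint is typically checked. copy_is_aligned() is a
hypothetical user-space helper written for illustration; it is assumed to
mirror the mask test behind is_dma_copy_aligned(), not copied from it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of an alignment check: 'align' is a shift count,
 * and both offsets and the length must be multiples of 1 << align.
 */
static bool copy_is_aligned(unsigned int align, size_t src_off,
			    size_t dst_off, size_t len)
{
	size_t mask;

	if (align == 0)
		return true; /* no alignment constraint */

	mask = ((size_t)1 << align) - 1; /* align = 6 gives mask 0x3f */

	/* Any low bit set in either offset or the length breaks alignment. */
	return ((src_off | dst_off | len) & mask) == 0;
}

/* Small self-check for the 64-byte case used by the workaround. */
static void copy_align_selfcheck(void)
{
	assert(copy_is_aligned(6, 0, 64, 128));    /* all multiples of 64 */
	assert(!copy_is_aligned(6, 64, 64, 100));  /* length not aligned */
	assert(copy_is_aligned(0, 1, 1, 1));       /* unconstrained device */
}
```

Under this model a device with copy_align = 6 only accepts memcpy
requests whose source offset, destination offset, and length are all
multiples of 64.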
drivers/dma/ioat/dma_v3.c | 41 +++++++++++++++++++++++++++++++++++++++++
1 files changed, 41 insertions(+), 0 deletions(-)
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 2dbf32b..c5cc3ba 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -1147,6 +1147,44 @@ static int ioat3_reset_hw(struct ioat_chan_common *chan)
return ioat2_reset_sync(chan, msecs_to_jiffies(200));
}
+static bool is_jf_ioat(struct pci_dev *pdev)
+{
+ switch (pdev->device) {
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF0:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF1:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF2:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF3:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF4:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF5:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF6:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF7:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF8:
+ case PCI_DEVICE_ID_INTEL_IOAT_JSF9:
+ return true;
+ default:
+ return false;
+ }
+}
+
+static bool is_snb_ioat(struct pci_dev *pdev)
+{
+ switch (pdev->device) {
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB0:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB1:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB2:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB3:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB4:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB5:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB6:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB7:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB8:
+ case PCI_DEVICE_ID_INTEL_IOAT_SNB9:
+ return true;
+ default:
+ return false;
+ }
+}
+
int __devinit ioat3_dma_probe(struct ioatdma_device *device, int dca)
{
struct pci_dev *pdev = device->pdev;
@@ -1167,6 +1205,9 @@ int __devinit ioat3_dma_probe(struct ioatdma_device *device, int dca)
dma->device_alloc_chan_resources = ioat2_alloc_chan_resources;
dma->device_free_chan_resources = ioat2_free_chan_resources;
+ if (is_jf_ioat(pdev) || is_snb_ioat(pdev))
+ dma->copy_align = 6;
+
dma_cap_set(DMA_INTERRUPT, dma->cap_mask);
dma->device_prep_dma_interrupt = ioat3_prep_interrupt_lock;
* [PATCH 3/3] netdma: adding alignment check for NETDMA ops
2012-04-04 23:10 [PATCH 0/3] Series short description Dave Jiang
2012-04-04 23:10 ` [PATCH 1/3] ioat: ring size variables need to be 32bit to avoid overflow Dave Jiang
2012-04-04 23:10 ` [PATCH 2/3] ioatdma: DMA copy alignment needed to address IOAT DMA silicon errata Dave Jiang
@ 2012-04-04 23:10 ` Dave Jiang
2012-04-05 0:18 ` David Miller
2 siblings, 1 reply; 7+ messages in thread
From: Dave Jiang @ 2012-04-04 23:10 UTC (permalink / raw)
To: dan.j.williams, linux-kernel; +Cc: netdev, davem, Dave Jiang
This is the fallout from adding the memcpy alignment workaround for certain
IOATDMA hardware. NetDMA will only use a DMA engine that can handle
byte-aligned ops.
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
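The (1, 1, 1) arguments passed to is_dma_copy_aligned() in this patch ask
whether the device can handle a 1-byte copy at odd offsets, i.e. whether
it has no alignment constraint at all. The following user-space sketch
illustrates the resulting behavior. All model_* names are hypothetical,
and the mask test is an assumption about how the alignment check behaves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the dmaengine structures. */
struct model_dma_device {
	unsigned int copy_align; /* shift count; 0 means unconstrained */
};

struct model_dma_chan {
	struct model_dma_device *device;
};

/* Assumed alignment test: offsets and length must fit the mask. */
static bool model_is_copy_aligned(const struct model_dma_device *dev,
				  size_t off1, size_t off2, size_t len)
{
	size_t mask;

	if (dev->copy_align == 0)
		return true;

	mask = ((size_t)1 << dev->copy_align) - 1;
	return ((off1 | off2 | len) & mask) == 0;
}

/*
 * Model of the new lookup: 'candidate' plays the role of the channel
 * returned by dma_find_channel(DMA_MEMCPY). It is rejected when the
 * device cannot do byte-aligned copies.
 */
static struct model_dma_chan *
model_net_dma_find_channel(struct model_dma_chan *candidate)
{
	if (candidate && !model_is_copy_aligned(candidate->device, 1, 1, 1))
		return NULL;

	return candidate;
}
```

In this model a channel on a device with copy_align = 6 is never handed
to NetDMA, while a device with copy_align = 0 is used as before, and a
NULL candidate stays NULL.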
drivers/dma/dmaengine.c | 14 ++++++++++++++
include/linux/dmaengine.h | 1 +
net/ipv4/tcp.c | 4 ++--
net/ipv4/tcp_input.c | 2 +-
net/ipv4/tcp_ipv4.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
6 files changed, 20 insertions(+), 5 deletions(-)
diff --git a/drivers/dma/dmaengine.c b/drivers/dma/dmaengine.c
index a6c6051..0f1ca74 100644
--- a/drivers/dma/dmaengine.c
+++ b/drivers/dma/dmaengine.c
@@ -332,6 +332,20 @@ struct dma_chan *dma_find_channel(enum dma_transaction_type tx_type)
}
EXPORT_SYMBOL(dma_find_channel);
+/*
+ * net_dma_find_channel - find a channel for net_dma
+ * net_dma has alignment requirements
+ */
+struct dma_chan *net_dma_find_channel(void)
+{
+ struct dma_chan *chan = dma_find_channel(DMA_MEMCPY);
+ if (chan && !is_dma_copy_aligned(chan->device, 1, 1, 1))
+ return NULL;
+
+ return chan;
+}
+EXPORT_SYMBOL(net_dma_find_channel);
+
/**
* dma_issue_pending_all - flush all pending operations across all channels
*/
diff --git a/include/linux/dmaengine.h b/include/linux/dmaengine.h
index 679b349..a5bb3ad 100644
--- a/include/linux/dmaengine.h
+++ b/include/linux/dmaengine.h
@@ -948,6 +948,7 @@ int dma_async_device_register(struct dma_device *device);
void dma_async_device_unregister(struct dma_device *device);
void dma_run_dependencies(struct dma_async_tx_descriptor *tx);
struct dma_chan *dma_find_channel(enum dma_transaction_type tx_type);
+struct dma_chan *net_dma_find_channel(void);
#define dma_request_channel(mask, x, y) __dma_request_channel(&(mask), x, y)
/* --- Helper iov-locking functions --- */
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 22ef5f9..8712c5d 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1450,7 +1450,7 @@ int tcp_recvmsg(struct kiocb *iocb, struct sock *sk, struct msghdr *msg,
if ((available < target) &&
(len > sysctl_tcp_dma_copybreak) && !(flags & MSG_PEEK) &&
!sysctl_tcp_low_latency &&
- dma_find_channel(DMA_MEMCPY)) {
+ net_dma_find_channel()) {
preempt_enable_no_resched();
tp->ucopy.pinned_list =
dma_pin_iovec_pages(msg->msg_iov, len);
@@ -1665,7 +1665,7 @@ do_prequeue:
if (!(flags & MSG_TRUNC)) {
#ifdef CONFIG_NET_DMA
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
- tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
+ tp->ucopy.dma_chan = net_dma_find_channel();
if (tp->ucopy.dma_chan) {
tp->ucopy.dma_cookie = dma_skb_copy_datagram_iovec(
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index b5e315f..27c676d 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -5190,7 +5190,7 @@ static int tcp_dma_try_early_copy(struct sock *sk, struct sk_buff *skb,
return 0;
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
- tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
+ tp->ucopy.dma_chan = net_dma_find_channel();
if (tp->ucopy.dma_chan && skb_csum_unnecessary(skb)) {
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index fd54c5f..3810b6f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -1727,7 +1727,7 @@ process:
#ifdef CONFIG_NET_DMA
struct tcp_sock *tp = tcp_sk(sk);
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
- tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
+ tp->ucopy.dma_chan = net_dma_find_channel();
if (tp->ucopy.dma_chan)
ret = tcp_v4_do_rcv(sk, skb);
else
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 3edd05a..fcb3e4f 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1755,7 +1755,7 @@ process:
#ifdef CONFIG_NET_DMA
struct tcp_sock *tp = tcp_sk(sk);
if (!tp->ucopy.dma_chan && tp->ucopy.pinned_list)
- tp->ucopy.dma_chan = dma_find_channel(DMA_MEMCPY);
+ tp->ucopy.dma_chan = net_dma_find_channel();
if (tp->ucopy.dma_chan)
ret = tcp_v6_do_rcv(sk, skb);
else
* [PATCH 0/3] Series short description
@ 2010-11-30 10:14 Balbir Singh
0 siblings, 0 replies; 7+ messages in thread
From: Balbir Singh @ 2010-11-30 10:14 UTC (permalink / raw)
To: linux-mm, Christoph Lameter; +Cc: akpm, linux-kernel, kvm
The following series implements page cache control.
It is a split-out version of patch 1 of version 3 of the
page cache optimization patches posted earlier at
http://www.mail-archive.com/kvm@vger.kernel.org/msg43654.html
Christoph Lameter recommended splitting out patch 1, which
is what this series does.
Detailed Description
====================
This patch implements unmapped page cache control via preferred
page cache reclaim. The current patch hooks into kswapd and reclaims
page cache if the user has requested unmapped page control.
This is useful in the following scenario:
- In a virtualized environment with cache=writethrough, we see
double caching - (one in the host and one in the guest). As
we try to scale guests, cache usage across the system grows.
The goal of this patch is to reclaim page cache when Linux is running
as a guest and get the host to hold the page cache and manage it.
There might be temporary duplication, but in the long run, memory
in the guests would be used for mapped pages.
- The feature is controlled via a boot option, so the administrator
can selectively turn it on as needed.
A lot of the code is borrowed from zone_reclaim_mode logic for
__zone_reclaim(). One might argue that with ballooning and
KSM this feature is not very useful, but even with ballooning,
we need extra logic to balloon multiple VMs, and it is hard
to figure out the correct amount of memory to balloon. With these
patches applied, each guest has a sufficient amount of free memory
available, which can be easily seen and reclaimed by the balloon driver.
The additional memory in the guest can be reused for additional
applications or used to start additional guests/balance memory in
the host.
KSM currently does not de-duplicate host and guest page cache. The goal
of this patch is to help automatically balance unmapped page cache when
instructed to do so.
There are some magic numbers in use in the code: UNMAPPED_PAGE_RATIO
and the number of pages to reclaim when the unmapped_page_control argument
is supplied. These numbers were chosen to avoid reaping page cache too
aggressively or too often, while still providing control.
The sysctl for min_unmapped_ratio provides further control from
within the guest on the amount of unmapped pages to reclaim.
For a single VM - running kernbench
Enabled
Optimal load -j 8 run number 1...
Optimal load -j 8 run number 2...
Optimal load -j 8 run number 3...
Optimal load -j 8 run number 4...
Optimal load -j 8 run number 5...
Average Optimal load -j 8 Run (std deviation):
Elapsed Time 273.726 (1.2683)
User Time 190.014 (0.589941)
System Time 298.758 (1.72574)
Percent CPU 178 (0)
Context Switches 119953 (865.74)
Sleeps 38758 (795.074)
Disabled
Optimal load -j 8 run number 1...
Optimal load -j 8 run number 2...
Optimal load -j 8 run number 3...
Optimal load -j 8 run number 4...
Optimal load -j 8 run number 5...
Average Optimal load -j 8 Run (std deviation):
Elapsed Time 272.672 (0.453178)
User Time 189.7 (0.718157)
System Time 296.77 (0.845606)
Percent CPU 178 (0)
Context Switches 118822 (277.434)
Sleeps 37542.8 (545.922)
More data on the test results with the earlier patch is
at http://www.mail-archive.com/kvm@vger.kernel.org/msg43655.html
---
Balbir Singh (3):
Move zone_reclaim() outside of CONFIG_NUMA
Refactor zone_reclaim, move reusable functionality outside
Provide control over unmapped pages
include/linux/mmzone.h | 4 +-
include/linux/swap.h | 5 +-
mm/page_alloc.c | 7 ++-
mm/vmscan.c | 109 +++++++++++++++++++++++++++++++++++++++++-------
4 files changed, 104 insertions(+), 21 deletions(-)
--
Balbir
* [PATCH 0/3] Series short description
@ 2009-07-10 22:59 Ben Blum
0 siblings, 0 replies; 7+ messages in thread
From: Ben Blum @ 2009-07-10 22:59 UTC (permalink / raw)
To: linux-kernel, containers, akpm, serue, lizf, menage
The following series implements...
---
Ben Blum (3):
Quick vmalloc vs kmalloc fix to the case where array size > KMALLOC_MAX_SIZE
Ensures correct concurrent opening/reading of pidlists across pid namespaces
Adds a read-only "procs" file similar to "tasks" that shows only unique tgids
include/linux/cgroup.h | 46 +++++-
kernel/cgroup.c | 378 ++++++++++++++++++++++++++++++++++--------------
2 files changed, 309 insertions(+), 115 deletions(-)