Netdev List
* Re: [PATCH] 8139cp: Prevent dev_close/cp_interrupt race on MTU change
From: David Woodhouse @ 2012-12-19 20:55 UTC (permalink / raw)
  To: David Miller; +Cc: jogreene, netdev
In-Reply-To: <20121219.124014.891279563982531558.davem@davemloft.net>

[-- Attachment #1: Type: text/plain, Size: 438 bytes --]

On Wed, 2012-12-19 at 12:40 -0800, David Miller wrote:
> You sent this as a "request for testing" last week, but I saw
> no testing on real hardware whatsoever.

Thanks for the reminder :)

Seems to work fine here. I haven't confirmed whether I actually see the
race or not but changing MTU on a live device works fine, even when it's
being ping-flooded.

Tested-by: David Woodhouse <David.Woodhouse@intel.com>

-- 
dwmw2


[-- Attachment #2: smime.p7s --]
[-- Type: application/x-pkcs7-signature, Size: 6171 bytes --]

^ permalink raw reply

* Re: TCP delayed ACK heuristic
From: David Miller @ 2012-12-19 20:59 UTC (permalink / raw)
  To: rick.jones2
  Cc: amwang, David.Laight, netdev, greearb, eric.dumazet, shemminger,
	tgraf
In-Reply-To: <50D209E9.2000504@hp.com>

From: Rick Jones <rick.jones2@hp.com>
Date: Wed, 19 Dec 2012 10:39:37 -0800

> On 12/18/2012 11:00 PM, Cong Wang wrote:
>> On Tue, 2012-12-18 at 16:39 +0000, David Laight wrote:
>>> There are problems with only implementing the acks
>>> specified by RFC1122.
>>
>> Yeah, the problem is if we can violate this RFC for getting better
>> performance. Or it is just a no-no?
>>
>> Although RFC 2525 mentions this as "Stretch ACK Violation", I am still
>> not sure if that means we can violate RFC1122 legally.
> 
> The term used in RFC1122 is "SHOULD" not "MUST."  Same for RFC2525
> when it talks about "Stretch ACK Violation."  A TCP stack may have
> behaviour which differs from a SHOULD so long as there is a reasonable
> reason for it.

Yes, but RFC2525 makes it very clear why we should not even
consider doing crap like this.

ACKs are the only information we have to detect loss.

And, for the same reasons that TCP VEGAS is fundamentally broken, we
cannot measure the pipe or some other receiver-side-visible piece of
information to determine when it's "safe" to stretch ACK.

And even if it's "safe", we should not do it so that losses are
accurately detected and we don't spuriously retransmit.

The only way to know when the bandwidth increases is to "test" it, by
sending more and more packets until drops happen.  That's why all
successful congestion control algorithms must operate on explicitly
tested pieces of information.

Similarly, it's not really possible to universally know if it's safe
to stretch ACK or not.

Can we please drop this idea?  It has zero value and all downside as
far as I'm concerned.

Thanks.

^ permalink raw reply
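
The probing argument in the message above ("sending more and more packets
until drops happen") can be illustrated with a toy additive-increase /
multiplicative-decrease loop. This is only a minimal sketch: the names
aimd_state, aimd_on_rtt_acked and aimd_on_loss are hypothetical, and it is
not how the Linux TCP stack is written. It shows why the sender depends on
ACK feedback: both the growth step and the back-off step are driven by it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sender state; cwnd is counted in segments. */
struct aimd_state {
	uint32_t cwnd;
};

/* A full window was ACKed without loss: probe for more bandwidth. */
static void aimd_on_rtt_acked(struct aimd_state *s)
{
	s->cwnd += 1;
}

/* A loss was inferred from ACK feedback: back off, but never below 1. */
static void aimd_on_loss(struct aimd_state *s)
{
	s->cwnd /= 2;
	if (s->cwnd < 1)
		s->cwnd = 1;
}
```

With fewer ACKs coming back, both callbacks fire later and less often, which
is the cost of stretching ACKs that the message argues against.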

* [PATCH 2/4] solos-pci: remove superfluous debug output
From: David Woodhouse @ 2012-12-19 21:01 UTC (permalink / raw)
  To: netdev; +Cc: Nathan Williams
In-Reply-To: <1355950881-31550-1-git-send-email-dwmw2@infradead.org>

From: Nathan Williams <nathan@traverse.com.au>

Signed-off-by: Nathan Williams <nathan@traverse.com.au>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/atm/solos-pci.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/atm/solos-pci.c b/drivers/atm/solos-pci.c
index 473d808..50e5c56 100644
--- a/drivers/atm/solos-pci.c
+++ b/drivers/atm/solos-pci.c
@@ -452,7 +452,6 @@ static ssize_t console_show(struct device *dev, struct device_attribute *attr,
 
 	len = skb->len;
 	memcpy(buf, skb->data, len);
-	dev_dbg(&card->dev->dev, "len: %d\n", len);
 
 	kfree_skb(skb);
 	return len;
-- 
1.8.0.1

^ permalink raw reply related

* [PATCH 1/4] solos-pci: add GPIO support for newer versions on Geos board
From: David Woodhouse @ 2012-12-19 21:01 UTC (permalink / raw)
  To: netdev; +Cc: Nathan Williams

From: Nathan Williams <nathan@traverse.com.au>

dwmw2: Tidy up a little, simpler matching on which GPIO is being accessed,
       only register on newer boards, register under PCI device instead of
       duplicating them under each ATM device.

Signed-off-by: Nathan Williams <nathan@traverse.com.au>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/atm/solos-pci.c | 105 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 105 insertions(+)

diff --git a/drivers/atm/solos-pci.c b/drivers/atm/solos-pci.c
index c909b7b..473d808 100644
--- a/drivers/atm/solos-pci.c
+++ b/drivers/atm/solos-pci.c
@@ -56,6 +56,7 @@
 #define FLASH_BUSY	0x60
 #define FPGA_MODE	0x5C
 #define FLASH_MODE	0x58
+#define GPIO_STATUS	0x54
 #define TX_DMA_ADDR(port)	(0x40 + (4 * (port)))
 #define RX_DMA_ADDR(port)	(0x30 + (4 * (port)))
 
@@ -498,6 +499,78 @@ static ssize_t console_store(struct device *dev, struct device_attribute *attr,
 	return err?:count;
 }
 
+struct geos_gpio_attr {
+	struct device_attribute attr;
+	int offset;
+};
+
+#define SOLOS_GPIO_ATTR(_name, _mode, _show, _store, _offset)	\
+	struct geos_gpio_attr gpio_attr_##_name = {		\
+		.attr = __ATTR(_name, _mode, _show, _store),	\
+		.offset = _offset }
+
+static ssize_t geos_gpio_store(struct device *dev, struct device_attribute *attr,
+			       const char *buf, size_t count)
+{
+	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+	struct geos_gpio_attr *gattr = container_of(attr, struct geos_gpio_attr, attr);
+	struct solos_card *card = pci_get_drvdata(pdev);
+	uint32_t data32;
+
+	if (count != 1 && (count != 2 || buf[1] != '\n'))
+		return -EINVAL;
+
+	spin_lock_irq(&card->param_queue_lock);
+	data32 = ioread32(card->config_regs + GPIO_STATUS);
+	if (buf[0] == '1') {
+		data32 |= 1 << gattr->offset;
+		iowrite32(data32, card->config_regs + GPIO_STATUS);
+	} else if (buf[0] == '0') {
+		data32 &= ~(1 << gattr->offset);
+		iowrite32(data32, card->config_regs + GPIO_STATUS);
+	} else {
+		count = -EINVAL;
+	}
+	spin_unlock_irq(&card->param_queue_lock);
+	return count;
+}
+
+static ssize_t geos_gpio_show(struct device *dev, struct device_attribute *attr,
+			      char *buf)
+{
+	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+	struct geos_gpio_attr *gattr = container_of(attr, struct geos_gpio_attr, attr);
+	struct solos_card *card = pci_get_drvdata(pdev);
+	uint32_t data32;
+
+	data32 = ioread32(card->config_regs + GPIO_STATUS);
+	data32 = (data32 >> gattr->offset) & 1;
+
+	return sprintf(buf, "%d\n", data32);
+}
+
+static ssize_t hardware_show(struct device *dev, struct device_attribute *attr,
+			     char *buf)
+{
+	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+	struct geos_gpio_attr *gattr = container_of(attr, struct geos_gpio_attr, attr);
+	struct solos_card *card = pci_get_drvdata(pdev);
+	uint32_t data32;
+
+	data32 = ioread32(card->config_regs + GPIO_STATUS);
+	switch (gattr->offset) {
+	case 0:
+		/* HardwareVersion */
+		data32 = data32 & 0x1F;
+		break;
+	case 1:
+		/* HardwareVariant */
+		data32 = (data32 >> 5) & 0x0F;
+		break;
+	}
+	return sprintf(buf, "%d\n", data32);
+}
+
 static DEVICE_ATTR(console, 0644, console_show, console_store);
 
 
@@ -506,6 +579,14 @@ static DEVICE_ATTR(console, 0644, console_show, console_store);
 
 #include "solos-attrlist.c"
 
+static SOLOS_GPIO_ATTR(GPIO1, 0644, geos_gpio_show, geos_gpio_store, 9);
+static SOLOS_GPIO_ATTR(GPIO2, 0644, geos_gpio_show, geos_gpio_store, 10);
+static SOLOS_GPIO_ATTR(GPIO3, 0644, geos_gpio_show, geos_gpio_store, 11);
+static SOLOS_GPIO_ATTR(GPIO4, 0644, geos_gpio_show, geos_gpio_store, 12);
+static SOLOS_GPIO_ATTR(GPIO5, 0644, geos_gpio_show, geos_gpio_store, 13);
+static SOLOS_GPIO_ATTR(PushButton, 0444, geos_gpio_show, NULL, 14);
+static SOLOS_GPIO_ATTR(HardwareVersion, 0444, hardware_show, NULL, 0);
+static SOLOS_GPIO_ATTR(HardwareVariant, 0444, hardware_show, NULL, 1);
 #undef SOLOS_ATTR_RO
 #undef SOLOS_ATTR_RW
 
@@ -522,6 +603,23 @@ static struct attribute_group solos_attr_group = {
 	.name = "parameters",
 };
 
+static struct attribute *gpio_attrs[] = {
+	&gpio_attr_GPIO1.attr.attr,
+	&gpio_attr_GPIO2.attr.attr,
+	&gpio_attr_GPIO3.attr.attr,
+	&gpio_attr_GPIO4.attr.attr,
+	&gpio_attr_GPIO5.attr.attr,
+	&gpio_attr_PushButton.attr.attr,
+	&gpio_attr_HardwareVersion.attr.attr,
+	&gpio_attr_HardwareVariant.attr.attr,
+	NULL
+};
+
+static struct attribute_group gpio_attr_group = {
+	.attrs = gpio_attrs,
+	.name = "gpio",
+};
+
 static int flash_upgrade(struct solos_card *card, int chip)
 {
 	const struct firmware *fw;
@@ -1179,6 +1277,10 @@ static int fpga_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	if (err)
 		goto out_free_irq;
 
+	if (card->fpga_version >= DMA_SUPPORTED &&
+	    sysfs_create_group(&card->dev->dev.kobj, &gpio_attr_group))
+		dev_err(&card->dev->dev, "Could not register parameter group for GPIOs\n");
+
 	return 0;
 
  out_free_irq:
@@ -1289,6 +1391,9 @@ static void fpga_remove(struct pci_dev *dev)
 	iowrite32(1, card->config_regs + FPGA_MODE);
 	(void)ioread32(card->config_regs + FPGA_MODE); 
 
+	if (card->fpga_version >= DMA_SUPPORTED)
+		sysfs_remove_group(&card->dev->dev.kobj, &gpio_attr_group);
+
 	atm_remove(card);
 
 	free_irq(dev->irq, card);
-- 
1.8.0.1

^ permalink raw reply related
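
The bit handling in geos_gpio_store() and geos_gpio_show() above can be
tried outside the kernel. The following is a minimal userspace sketch of
the same logic, assuming a plain uint32_t stands in for the GPIO_STATUS
register; gpio_apply_write and gpio_read_bit are hypothetical names and
no locking or MMIO is modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Accept exactly "0" or "1", optionally followed by a single newline.
 * Returns 0 and updates *reg on success, -1 on malformed input.
 */
static int gpio_apply_write(uint32_t *reg, unsigned int offset,
			    const char *buf, size_t count)
{
	if (count != 1 && (count != 2 || buf[1] != '\n'))
		return -1;

	if (buf[0] == '1')
		*reg |= UINT32_C(1) << offset;
	else if (buf[0] == '0')
		*reg &= ~(UINT32_C(1) << offset);
	else
		return -1;

	return 0;
}

/* Read back one GPIO bit as 0 or 1. */
static unsigned int gpio_read_bit(uint32_t reg, unsigned int offset)
{
	return (reg >> offset) & 1;
}
```

In the patch, GPIO1 sits at offset 9, so writing "1\n" sets bit 9 (0x200)
and leaves the other bits of the register untouched.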

* [PATCH 3/4] solos-pci: add firmware upgrade support for new models
From: David Woodhouse @ 2012-12-19 21:01 UTC (permalink / raw)
  To: netdev; +Cc: Nathan Williams
In-Reply-To: <1355950881-31550-1-git-send-email-dwmw2@infradead.org>

From: Nathan Williams <nathan@traverse.com.au>

Signed-off-by: Nathan Williams <nathan@traverse.com.au>
Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/atm/solos-pci.c | 53 +++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 42 insertions(+), 11 deletions(-)

diff --git a/drivers/atm/solos-pci.c b/drivers/atm/solos-pci.c
index 50e5c56..aa4f35d 100644
--- a/drivers/atm/solos-pci.c
+++ b/drivers/atm/solos-pci.c
@@ -42,7 +42,8 @@
 #include <linux/swab.h>
 #include <linux/slab.h>
 
-#define VERSION "0.07"
+#define VERSION "1.04"
+#define DRIVER_VERSION 0x01
 #define PTAG "solos-pci"
 
 #define CONFIG_RAM_SIZE	128
@@ -57,16 +58,20 @@
 #define FPGA_MODE	0x5C
 #define FLASH_MODE	0x58
 #define GPIO_STATUS	0x54
+#define DRIVER_VER	0x50
 #define TX_DMA_ADDR(port)	(0x40 + (4 * (port)))
 #define RX_DMA_ADDR(port)	(0x30 + (4 * (port)))
 
 #define DATA_RAM_SIZE	32768
 #define BUF_SIZE	2048
 #define OLD_BUF_SIZE	4096 /* For FPGA versions <= 2*/
-#define FPGA_PAGE	528 /* FPGA flash page size*/
-#define SOLOS_PAGE	512 /* Solos flash page size*/
-#define FPGA_BLOCK	(FPGA_PAGE * 8) /* FPGA flash block size*/
-#define SOLOS_BLOCK	(SOLOS_PAGE * 8) /* Solos flash block size*/
+/* Old boards use ATMEL AT45DB161D flash */
+#define ATMEL_FPGA_PAGE	528 /* FPGA flash page size*/
+#define ATMEL_SOLOS_PAGE	512 /* Solos flash page size*/
+#define ATMEL_FPGA_BLOCK	(ATMEL_FPGA_PAGE * 8) /* FPGA block size*/
+#define ATMEL_SOLOS_BLOCK	(ATMEL_SOLOS_PAGE * 8) /* Solos block size*/
+/* Current boards use M25P/M25PE SPI flash */
+#define SPI_FLASH_BLOCK	(256 * 64)
 
 #define RX_BUF(card, nr) ((card->buffers) + (nr)*(card->buffer_size)*2)
 #define TX_BUF(card, nr) ((card->buffers) + (nr)*(card->buffer_size)*2 + (card->buffer_size))
@@ -128,6 +133,7 @@ struct solos_card {
 	int using_dma;
 	int fpga_version;
 	int buffer_size;
+	int atmel_flash;
 };
 
 
@@ -630,16 +636,25 @@ static int flash_upgrade(struct solos_card *card, int chip)
 	switch (chip) {
 	case 0:
 		fw_name = "solos-FPGA.bin";
-		blocksize = FPGA_BLOCK;
+		if (card->atmel_flash)
+			blocksize = ATMEL_FPGA_BLOCK;
+		else
+			blocksize = SPI_FLASH_BLOCK;
 		break;
 	case 1:
 		fw_name = "solos-Firmware.bin";
-		blocksize = SOLOS_BLOCK;
+		if (card->atmel_flash)
+			blocksize = ATMEL_SOLOS_BLOCK;
+		else
+			blocksize = SPI_FLASH_BLOCK;
 		break;
 	case 2:
 		if (card->fpga_version > LEGACY_BUFFERS){
 			fw_name = "solos-db-FPGA.bin";
-			blocksize = FPGA_BLOCK;
+			if (card->atmel_flash)
+				blocksize = ATMEL_FPGA_BLOCK;
+			else
+				blocksize = SPI_FLASH_BLOCK;
 		} else {
 			dev_info(&card->dev->dev, "FPGA version doesn't support"
 					" daughter board upgrades\n");
@@ -649,7 +664,10 @@ static int flash_upgrade(struct solos_card *card, int chip)
 	case 3:
 		if (card->fpga_version > LEGACY_BUFFERS){
 			fw_name = "solos-Firmware.bin";
-			blocksize = SOLOS_BLOCK;
+			if (card->atmel_flash)
+				blocksize = ATMEL_SOLOS_BLOCK;
+			else
+				blocksize = SPI_FLASH_BLOCK;
 		} else {
 			dev_info(&card->dev->dev, "FPGA version doesn't support"
 					" daughter board upgrades\n");
@@ -665,6 +683,9 @@ static int flash_upgrade(struct solos_card *card, int chip)
 
 	dev_info(&card->dev->dev, "Flash upgrade starting\n");
 
+	/* New FPGAs require driver version before permitting flash upgrades */
+	iowrite32(DRIVER_VERSION, card->config_regs + DRIVER_VER);
+
 	numblocks = fw->size / blocksize;
 	dev_info(&card->dev->dev, "Firmware size: %zd\n", fw->size);
 	dev_info(&card->dev->dev, "Number of blocks: %d\n", numblocks);
@@ -694,9 +715,13 @@ static int flash_upgrade(struct solos_card *card, int chip)
 		/* dev_info(&card->dev->dev, "Set FPGA Flash mode to Block Write\n"); */
 		iowrite32(((chip * 2) + 1), card->config_regs + FLASH_MODE);
 
-		/* Copy block to buffer, swapping each 16 bits */
+		/* Copy block to buffer, swapping each 16 bits for Atmel flash */
 		for(i = 0; i < blocksize; i += 4) {
-			uint32_t word = swahb32p((uint32_t *)(fw->data + offset + i));
+			uint32_t word;
+			if (card->atmel_flash)
+				word = swahb32p((uint32_t *)(fw->data + offset + i));
+			else
+				word = *(uint32_t *)(fw->data + offset + i);
 			if(card->fpga_version > LEGACY_BUFFERS)
 				iowrite32(word, FLASH_BUF + i);
 			else
@@ -1230,6 +1255,12 @@ static int fpga_probe(struct pci_dev *dev, const struct pci_device_id *id)
 		db_fpga_upgrade = db_firmware_upgrade = 0;
 	}
 
+	/* Stopped using Atmel flash after 0.03-38 */
+	if (fpga_ver < 39)
+		card->atmel_flash = 1;
+	else
+		card->atmel_flash = 0;
+
 	if (card->fpga_version >= DMA_SUPPORTED) {
 		pci_set_master(dev);
 		card->using_dma = 1;
-- 
1.8.0.1

^ permalink raw reply related
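
Two details of the upgrade loop above are easy to check in isolation: the
per-halfword byte swap applied only for Atmel flash, and the block count
derived from the image size. A minimal userspace sketch follows;
swap_halfword_bytes is a hypothetical stand-in written to behave like the
kernel's swahb32(), and whole_blocks mirrors the integer division
numblocks = fw->size / blocksize in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes inside each 16-bit half: 0xAABBCCDD -> 0xBBAADDCC. */
static uint32_t swap_halfword_bytes(uint32_t x)
{
	return ((x & 0x00FF00FFu) << 8) | ((x & 0xFF00FF00u) >> 8);
}

/* Number of whole blocks in an image; a trailing partial block is dropped. */
static size_t whole_blocks(size_t image_size, size_t block_size)
{
	return image_size / block_size;
}
```

With the constants from the patch, an Atmel FPGA block is 528 * 8 = 4224
bytes and an SPI flash block is 256 * 64 = 16384 bytes, so the same image
yields a different block count depending on the flash type.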

* [PATCH 4/4] solos-pci: ensure all TX packets are aligned to 4 bytes
From: David Woodhouse @ 2012-12-19 21:01 UTC (permalink / raw)
  To: netdev; +Cc: David Woodhouse
In-Reply-To: <1355950881-31550-1-git-send-email-dwmw2@infradead.org>

From: David Woodhouse <David.Woodhouse@intel.com>

The FPGA can't handle unaligned DMA (yet). So copy into an aligned buffer,
if skb->data isn't suitably aligned.

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/atm/solos-pci.c | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/drivers/atm/solos-pci.c b/drivers/atm/solos-pci.c
index aa4f35d..d70abe7 100644
--- a/drivers/atm/solos-pci.c
+++ b/drivers/atm/solos-pci.c
@@ -128,9 +128,11 @@ struct solos_card {
 	struct sk_buff_head cli_queue[4];
 	struct sk_buff *tx_skb[4];
 	struct sk_buff *rx_skb[4];
+	unsigned char *dma_bounce;
 	wait_queue_head_t param_wq;
 	wait_queue_head_t fw_wq;
 	int using_dma;
+	int dma_alignment;
 	int fpga_version;
 	int buffer_size;
 	int atmel_flash;
@@ -1083,7 +1085,12 @@ static uint32_t fpga_tx(struct solos_card *card)
 				tx_started |= 1 << port;
 				oldskb = skb; /* We're done with this skb already */
 			} else if (skb && card->using_dma) {
-				SKB_CB(skb)->dma_addr = pci_map_single(card->dev, skb->data,
+				unsigned char *data = skb->data;
+				if ((unsigned long)data & card->dma_alignment) {
+					data = card->dma_bounce + (BUF_SIZE * port);
+					memcpy(data, skb->data, skb->len);
+				}
+				SKB_CB(skb)->dma_addr = pci_map_single(card->dev, data,
 								       skb->len, PCI_DMA_TODEVICE);
 				card->tx_skb[port] = skb;
 				iowrite32(SKB_CB(skb)->dma_addr,
@@ -1261,18 +1268,27 @@ static int fpga_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	else
 		card->atmel_flash = 0;
 
+	data32 = ioread32(card->config_regs + PORTS);
+	card->nr_ports = (data32 & 0x000000FF);
+
 	if (card->fpga_version >= DMA_SUPPORTED) {
 		pci_set_master(dev);
 		card->using_dma = 1;
+		if (1) { /* All known FPGA versions so far */
+			card->dma_alignment = 3;
+			card->dma_bounce = kmalloc(card->nr_ports * BUF_SIZE, GFP_KERNEL);
+			if (!card->dma_bounce) {
+				dev_warn(&card->dev->dev, "Failed to allocate DMA bounce buffers\n");
+				/* Fallback to MMIO doesn't work */
+				goto out_unmap_both;
+			}
+		}
 	} else {
 		card->using_dma = 0;
 		/* Set RX empty flag for all ports */
 		iowrite32(0xF0, card->config_regs + FLAGS_ADDR);
 	}
 
-	data32 = ioread32(card->config_regs + PORTS);
-	card->nr_ports = (data32 & 0x000000FF);
-
 	pci_set_drvdata(dev, card);
 
 	tasklet_init(&card->tlet, solos_bh, (unsigned long)card);
@@ -1319,6 +1335,7 @@ static int fpga_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	tasklet_kill(&card->tlet);
 	
  out_unmap_both:
+	kfree(card->dma_bounce);
 	pci_set_drvdata(dev, NULL);
 	pci_iounmap(dev, card->buffers);
  out_unmap_config:
@@ -1429,6 +1446,8 @@ static void fpga_remove(struct pci_dev *dev)
 	free_irq(dev->irq, card);
 	tasklet_kill(&card->tlet);
 
+	kfree(card->dma_bounce);
+
 	/* Release device from reset */
 	iowrite32(0, card->config_regs + FPGA_MODE);
 	(void)ioread32(card->config_regs + FPGA_MODE); 
-- 
1.8.0.1

^ permalink raw reply related
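
The alignment test in fpga_tx() above uses a mask rather than a modulus:
with dma_alignment = 3, any address with one of its two low bits set is not
4-byte aligned and goes through the bounce buffer. A minimal userspace
sketch of that decision, assuming the caller supplies an aligned bounce
buffer of at least len bytes; dma_source is a hypothetical helper, not a
function in the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Return the buffer that should be handed to the DMA mapping call:
 * src itself when it is already aligned, otherwise bounce after copying.
 * align_mask is (alignment - 1), e.g. 3 for 4-byte alignment.
 */
static const unsigned char *dma_source(const unsigned char *src, size_t len,
				       unsigned char *bounce,
				       uintptr_t align_mask)
{
	if (((uintptr_t)src & align_mask) == 0)
		return src;

	memcpy(bounce, src, len);
	return bounce;
}
```

The patch gives each port its own BUF_SIZE slice of the bounce area
(dma_bounce + BUF_SIZE * port), so concurrent transmits on different ports
do not overwrite each other's copy.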

* Re: [PATCH 0/2] iputils: minor ninfod and ping6 fixes
From: YOSHIFUJI Hideaki @ 2012-12-19 21:05 UTC (permalink / raw)
  To: Jan Synacek; +Cc: netdev
In-Reply-To: <1354868724-15549-1-git-send-email-jsynacek@redhat.com>

Jan Synacek wrote:

> When calling ping6 with the flowlabel (e.g. `ping6 -F 123 ::1'), it exited with
> an error. For some reason, the errno was set when it should not have been. Maybe
> it shouldn't be checked at all, maybe just checking flowlabel for values below
> zero would be enough. I wanted to be on the safer side so I left the errno check
> in there.
> 
> Also, I fixed the rest of the unused variables in ninfod.
> 
> Jan Synacek (2):
>   ninfod: Fix more unused variables.
>   ping6: Fix -F switch.
> 
>  ninfod/ni_ifaddrs.c | 8 +-------
>  ping6.c             | 3 ++-
>  2 files changed, 3 insertions(+), 8 deletions(-)
> 

Fixes committed. Thank you!

--yoshfuji

^ permalink raw reply
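
The thread does not say why errno was set or what the committed fix looks
like. One common cause of this symptom is a stale errno left over from an
earlier library call, which is why errno is normally cleared right before
a strtol()-family conversion. The following is a minimal sketch of that
pattern only; parse_flowlabel is a hypothetical helper, base 0 and the
20-bit range check are assumptions, and none of this is the ping6 code.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Parse a flow label argument; returns 0 on success, -1 on failure. */
static int parse_flowlabel(const char *arg, unsigned long *label)
{
	char *end;
	unsigned long val;

	errno = 0;	/* discard any stale value from earlier calls */
	val = strtoul(arg, &end, 0);
	if (errno != 0 || end == arg || *end != '\0')
		return -1;
	if (val > 0xFFFFFul)	/* an IPv6 flow label is 20 bits wide */
		return -1;

	*label = val;
	return 0;
}
```

Checking the end pointer as well as errno catches empty and partly numeric
arguments, which strtoul() does not report through errno.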

* Re: [PATCH] 8139cp: Prevent dev_close/cp_interrupt race on MTU change
From: David Miller @ 2012-12-19 22:31 UTC (permalink / raw)
  To: dwmw2; +Cc: jogreene, netdev
In-Reply-To: <1355950547.18919.93.camel@shinybook.infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 19 Dec 2012 20:55:47 +0000

> On Wed, 2012-12-19 at 12:40 -0800, David Miller wrote:
>> You sent this as a "request for testing" last week, but I saw
>> no testing on real hardware whatsoever.
> 
> Thanks for the reminder :)
> 
> Seems to work fine here. I haven't confirmed whether I actually see the
> race or not but changing MTU on a live device works fine, even when it's
> being ping-flooded.
> 
> Tested-by: David Woodhouse <David.Woodhouse@intel.com>

That's more like it, applied, thanks everyone. :-)

^ permalink raw reply

* Re: [PATCH 1/4] solos-pci: add GPIO support for newer versions on Geos board
From: David Miller @ 2012-12-19 22:54 UTC (permalink / raw)
  To: dwmw2; +Cc: netdev, nathan
In-Reply-To: <1355950881-31550-1-git-send-email-dwmw2@infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 19 Dec 2012 21:01:18 +0000

> From: Nathan Williams <nathan@traverse.com.au>
> 
> dwmw2: Tidy up a little, simpler matching on which GPIO is being accessed,
>        only register on newer boards, register under PCI device instead of
>        duplicating them under each ATM device.
> 
> Signed-off-by: Nathan Williams <nathan@traverse.com.au>
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

Applied.

^ permalink raw reply

* Re: [PATCH 2/4] solos-pci: remove superfluous debug output
From: David Miller @ 2012-12-19 22:54 UTC (permalink / raw)
  To: dwmw2; +Cc: netdev, nathan
In-Reply-To: <1355950881-31550-2-git-send-email-dwmw2@infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 19 Dec 2012 21:01:19 +0000

> From: Nathan Williams <nathan@traverse.com.au>
> 
> Signed-off-by: Nathan Williams <nathan@traverse.com.au>
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

Applied.

^ permalink raw reply

* Re: [PATCH 3/4] solos-pci: add firmware upgrade support for new models
From: David Miller @ 2012-12-19 22:54 UTC (permalink / raw)
  To: dwmw2; +Cc: netdev, nathan
In-Reply-To: <1355950881-31550-3-git-send-email-dwmw2@infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 19 Dec 2012 21:01:20 +0000

> From: Nathan Williams <nathan@traverse.com.au>
> 
> Signed-off-by: Nathan Williams <nathan@traverse.com.au>
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

Applied.

^ permalink raw reply

* Re: [PATCH 4/4] solos-pci: ensure all TX packets are aligned to 4 bytes
From: David Miller @ 2012-12-19 22:54 UTC (permalink / raw)
  To: dwmw2; +Cc: netdev, David.Woodhouse
In-Reply-To: <1355950881-31550-4-git-send-email-dwmw2@infradead.org>

From: David Woodhouse <dwmw2@infradead.org>
Date: Wed, 19 Dec 2012 21:01:21 +0000

> From: David Woodhouse <David.Woodhouse@intel.com>
> 
> The FPGA can't handle unaligned DMA (yet). So copy into an aligned buffer,
> if skb->data isn't suitably aligned.
> 
> Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next V4 00/13] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-12-19 22:58 UTC (permalink / raw)
  To: Andrew Collins
  Cc: netdev, shemminger, davem, or.gerlitz, jhs, mst, erdnetdev, jiri
In-Reply-To: <CAKTPYJTAB-oOW5UE9EbNxwA+XbhmJu1FLrvq_mU8B1Qi6trxeA@mail.gmail.com>

On 12/19/2012 05:54 PM, Andrew Collins wrote:
> On Wed, Dec 19, 2012 at 10:48 AM, Vlad Yasevich <vyasevic@redhat.com> wrote:
>> This series of patches provides an ability to add VLANs to the bridge
>> ports.  This is similar to what can be found in most switches.  The bridge
>> port may have any number of VLANs added to it including vlan 0 priority tagged
>> traffic.  When vlans are added to the port, only traffic tagged with particular
>> vlan will be forwarded over this port.  Additionally, vlan ids are added to FDB
>> entries and become part of the lookup.  This way we correctly identify the FDB
>> entry.
>
> This is likely well beyond the scope of this change, but I figured I'd
> throw out the question anyway.  This changeset looks to bring the
> Linux bridging code closer to the 802.1Q-2005 definition of a bridge,
> which is nice to see, I'm curious if this changeset also opens up the
> possibility of supporting MSTP in the future?  The big thing I see
> missing is per-VLAN port state, although I'm not very familiar with
> the current STP/bridge interactions.  Has anyone put any thought into
> what other necessary bridge pieces might be missing for MSTP support?
> (specifically regarding bridge/vlan interaction, obviously something
> to handle the MSTP protocol itself would need to exist as well)
>

heh..  opening up all sorts of cans of worms today... :)

Have only given it some very passing thoughts.  Absolutely nothing 
concrete here.  Maybe someone else has.

-vlad

^ permalink raw reply

* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Vlad Yasevich @ 2012-12-19 22:59 UTC (permalink / raw)
  To: vyasevic; +Cc: Shmulik Ladkani, netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <50D21D98.7020907@redhat.com>

On 12/19/2012 03:03 PM, Vlad Yasevich wrote:
> On 12/19/2012 02:37 PM, Shmulik Ladkani wrote:
>> Hi Vlad,
>>
>> On Wed, 19 Dec 2012 09:13:10 -0500 Vlad Yasevich <vyasevic@redhat.com>
>> wrote:
>>>> Why the "untagged vlan" is per-bridge global?
>>> It's not.  There is a per port untagged pointer where you can designate
>>> which VLAN is untagged/native on a port.
>>
>> Ok (misinterpreted the text in the cover letter).
>>
>>>> 802.1q switches usually allow configuring per-vlan, per-port
>>>> tagged/untagged egress policy: each vid has its port membership map and
>>>> an accompanying port egress-policy map.
>>>> This gives great flexibility defining all sorts of configurations.
>>>
>>> Right, and that's what's provided here.
>>>    * Each VLAN has port membership map (net_bridge_vlan.portgroup).
>>>    * Each port has a list of vlans configured as well
>>> (net_port_vlan.vlan_list).
>>>    * Each port also has a single vlan that can be untagged
>>> (net_bridge_port.untagged).
>>>    * The bridge also has a single untagged vlan (net_bridge.untagged)
>>>
>>> The limitation (in switches as well) is that only a single VLAN
>>> may be untagged on any 1 port.
>>
>> Switches usually allow you to configure each port's egress policy per
>> vlan, and allow you to configure multiple vlans to _egress_ untagged
>> on a port.
>>
>>> If you have more than 1, you don't know
>>> which VLAN the untagged traffic belongs to.
>>
>> The port's PVID uniquely determines VID to associate with the frame
>> during _ingress_ on that port - in the case frame arrived untagged.
>>
>> This is unrelated to whether a frame having a specific VID would _egress_
>> tagged or untagged on that port.
>>
>
>
> Ahh...  I see what you mean.  You would like to separate
> ingress policy and egress policy with regard to how tags are applied...
> I haven't seen that type of config before.
>
> I did say "Basic VLAN support". :)
>
> In this set of patches ingress and egress policies are hardcoded the
> same...
>
> So, consider that what I am calling "untagged" in this series is
> really vlan associated with PVID.  To change the egress policy, we
> could add an untagged bitmap into the vlan.  Then the bitmap from the
> vlan would determine the egress policy.  If the port is in the "tagged"
> bitmap, frame leaves tagged. If the port is in the "untagged" bitmap,
> frame leaves untagged.
>
> The code to make this work would be simple enough.  The more
> interesting part would be the configuration :)


Actually, this looks much simpler than I originally thought.  I think I
might have something half-baked tomorrow.

-vlad

>
>
>>>> Personally, I'd prefer a fully flexible vlan bridge allowing all sorts
>>>> of configurations (as available in 802.1q switches).
>>>>
>>>> What's the reason limiting such configurations?
>>>
>>> So, what do you see that's missing?
>>
>
> [ snip good example ]
>
>>
>> The bridge constructs needed for supporting such setups are:
>> - per port: PVID
>> - per VLAN: port membership map
>> - per VLAN: port egress policy map
>
> Ok, so from above, membership map is the existing port_bitmap.  Egress
> policy map could be new untagged_bitmap.  We wouldn't need a tagged
> policy map since a port can't be "in egress policy, but not in
> membership map".
>
> Membership port_bitmap is consulted on egress for basic forward/drop
> decision (just as it is now).  Egress policy (untagged bitmap) is
> consulted to see how the forwarding is done.
>
> Sounds about right?  If so, I could probably work something up.
> Will probably leave the configuration for later as that might take a bit
> longer to figure out.
>
> -vlad
>
>>
>> I agree, tools other than a vlan bridge may implement such setups, but
>> using the vlan bridge would be preferred, mainly due to the simplicity.
>>
>> Regards,
>> Shmulik
>>
>

^ permalink raw reply
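
The egress scheme discussed in this message (membership bitmap decides
forward or drop, untagged bitmap decides how the frame leaves) can be
sketched as follows. This is a minimal illustration only: struct
vlan_policy, vlan_egress and the 32-port limit are hypothetical and are
not taken from the patch series.

```c
#include <assert.h>
#include <stdint.h>

enum egress_action { EGRESS_DROP, EGRESS_TAGGED, EGRESS_UNTAGGED };

/* Hypothetical per-VLAN state: one bit per bridge port (up to 32). */
struct vlan_policy {
	uint32_t port_bitmap;		/* ports that are members of the VLAN */
	uint32_t untagged_bitmap;	/* members that egress untagged */
};

/* Decide what happens to a frame of this VLAN leaving through port. */
static enum egress_action vlan_egress(const struct vlan_policy *v,
				      unsigned int port)
{
	uint32_t bit = UINT32_C(1) << port;

	if (!(v->port_bitmap & bit))
		return EGRESS_DROP;

	return (v->untagged_bitmap & bit) ? EGRESS_UNTAGGED : EGRESS_TAGGED;
}
```

Because membership is checked first, a bit set in untagged_bitmap for a
non-member port has no effect, which matches the remark that a separate
"tagged" map is not needed.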

* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jamal Hadi Salim @ 2012-12-19 23:00 UTC (permalink / raw)
  To: Hasan Chowdhury
  Cc: Stephen Hemminger, Jan Engelhardt, Yury Stankevich,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <CAASe=fQZGwjM_2PStRE0tje33Doi6TuwJJ3p7x-SRcwq3mQvRg@mail.gmail.com>


On 12-12-19 10:51 AM, Hasan Chowdhury wrote:
> Hi Jamal,
> I will test it once I get the opportunity, but I would like to know,
> even before any testing:
>
> 1. What will be the new procedure to compile iproute2 after the patch
> apply (any new library or any configuration that needs to be adjusted )

git pull Stephen's latest tree.
Apply patch 1 and compile.
When you are done compiling, an m_xt.so will sit in the tc directory
(it is unfortunate that we are putting out these shared libs). Back up your
distro version and copy the new one over to that location. On ubuntu 12.04:
sudo cp tc/m_xt.so /usr/lib/tc/m_xt.so
will do it.

> 2.  tc filter add dev eth0 protocol ip parent 1: prio 3 u32 match ip src
> 192.168.0.0/16 flowid 1:1  action xt  -j MARK
> --set-mark 3
>
>
> is this still a valid command ?
>

Indeed it is.

cheers,
jamal

^ permalink raw reply

* Re: [PATCH net-next V4 00/13] Add basic VLAN support to bridges
From: Andrew Collins @ 2012-12-19 22:54 UTC (permalink / raw)
  To: Vlad Yasevich
  Cc: netdev, shemminger, davem, or.gerlitz, jhs, mst, erdnetdev, jiri
In-Reply-To: <1355939304-21804-1-git-send-email-vyasevic@redhat.com>

On Wed, Dec 19, 2012 at 10:48 AM, Vlad Yasevich <vyasevic@redhat.com> wrote:
> This series of patches provides an ability to add VLANs to the bridge
> ports.  This is similar to what can be found in most switches.  The bridge
> port may have any number of VLANs added to it including vlan 0 priority tagged
> traffic.  When vlans are added to the port, only traffic tagged with particular
> vlan will be forwarded over this port.  Additionally, vlan ids are added to FDB
> entries and become part of the lookup.  This way we correctly identify the FDB
> entry.

This is likely well beyond the scope of this change, but I figured I'd
throw out the question anyway.  This changeset looks to bring the
Linux bridging code closer to the 802.1Q-2005 definition of a bridge,
which is nice to see, I'm curious if this changeset also opens up the
possibility of supporting MSTP in the future?  The big thing I see
missing is per-VLAN port state, although I'm not very familiar with
the current STP/bridge interactions.  Has anyone put any thought into
what other necessary bridge pieces might be missing for MSTP support?
(specifically regarding bridge/vlan interaction, obviously something
to handle the MSTP protocol itself would need to exist as well)

^ permalink raw reply

* Re: [PATCH] pkt_sched: act_xt support new Xtables interface
From: Jamal Hadi Salim @ 2012-12-19 23:05 UTC (permalink / raw)
  To: Jan Engelhardt
  Cc: Hasan Chowdhury, Stephen Hemminger, Yury Stankevich,
	netdev@vger.kernel.org, pablo, netfilter-devel
In-Reply-To: <alpine.LNX.2.01.1212191648490.13317@nerf07.vanv.qr>

On 12-12-19 10:52 AM, Jan Engelhardt wrote:
>
>
> Humm... that's a huge patch for what seems to be equal to act_ipt.c
> Let's do a cross-diff:
>

I was thinking of our little discussion when doing that.
The one reason i separated the two is so when the time is right you
can patch on top of only act_xt.c and eventually act_ipt.c will die..
Do changes on top of act_xt.c sound palatable to you?
Otherwise, you are right - it is overkill.

cheers,
jamal

^ permalink raw reply

* Re: TCP delayed ACK heuristic
From: Eric Dumazet @ 2012-12-19 23:08 UTC (permalink / raw)
  To: Rick Jones
  Cc: Cong Wang, David Laight, netdev, Ben Greear, David Miller,
	Stephen Hemminger, Thomas Graf
In-Reply-To: <50D209E9.2000504@hp.com>

On Wed, 2012-12-19 at 10:39 -0800, Rick Jones wrote:
> On 12/18/2012 11:00 PM, Cong Wang wrote:
> > On Tue, 2012-12-18 at 16:39 +0000, David Laight wrote:
> >> There are problems with only implementing the acks
> >> specified by RFC1122.
> >
> > Yeah, the problem is if we can violate this RFC for getting better
> > performance. Or it is just a no-no?
> >
> > Although RFC 2525 mentions this as "Stretch ACK Violation", I am still
> > not sure if that means we can violate RFC1122 legally.
> 
> The term used in RFC1122 is "SHOULD" not "MUST."  Same for RFC2525 when 
> it talks about "Stretch ACK Violation."   A TCP stack may have behaviour 
> which differs from a SHOULD so long as there is a reasonable reason for it.

Generally speaking, there are no reasonable reasons, unless you control
both sender and receiver, and the path between.

ACKs can be incredibly useful for recovering from losses in a short time.

The vast majority of TCP sessions are short-lived, and we send one ACK
per received segment anyway at the beginning [1] or on retransmits to let
the sender smoothly increase its cwnd, so an auto-tuning facility won't
help them that much.

For long and fast sessions, we have the LRO/GRO heuristic.

This leaves a fraction of flows where the ACK rate should not really
matter.


[1] This refers to the quickack mode

^ permalink raw reply

* Re: [PATCH] build: unbreak linkage of m_xt.so
From: Stephen Hemminger @ 2012-12-20  0:03 UTC (permalink / raw)
  To: Mike Frysinger
  Cc: Jan Engelhardt, stephen.hemminger, netdev, jhs, urykhy, shemonc,
	pablo, netfilter-devel
In-Reply-To: <201212181348.00379.vapier@gentoo.org>

On Tue, 18 Dec 2012 13:47:58 -0500
Mike Frysinger <vapier@gentoo.org> wrote:

> this patch is no longer necessary one you merged my:
> 	configure: move toolchain init to a function
> 
> it's actually undesirable to apply this after that since it makes the configure 
> script less clear again ...
> 
> sorry if my commit message wasn't obvious.
> -mike

ok, went back to old way.

^ permalink raw reply

* [GIT] Networking
From: David Miller @ 2012-12-20  0:06 UTC (permalink / raw)
  To: torvalds; +Cc: akpm, netdev, linux-kernel


1) Really fix tuntap SKB use after free bug, from Eric Dumazet.

2) Adjust SKB data pointer to point past the transport header before
   calling icmpv6_notify() so that the headers are in the state which
   that function expects.  From Duan Jiong.

3) Fix ambiguities in the new tuntap multi-queue APIs.  From
   Jason Wang.

4) mISDN needs to use del_timer_sync(), from Konstantin Khlebnikov.

5) Don't destroy mutex after freeing up device private in mac802154,
   fix also from Konstantin Khlebnikov.

6) Fix INET request socket leak in TCP and DCCP, from Christoph Paasch.

7) SCTP HMAC kconfig rework, from Neil Horman.

8) Fix SCTP jprobes function signature, otherwise things explode,
   from Daniel Borkmann.

9) Fix typo in ipv6-offload Makefile variable reference, from
   Simon Arlott.

10) Don't fail USBNET open just because remote wakeup isn't
    supported, from Oliver Neukum.

11) be2net driver bug fixes from Sathya Perla.

12) SOLOS PCI ATM driver bug fixes from Nathan Williams and David
    Woodhouse.
   
13) Fix MTU changing regression in 8139cp driver, from John Greene.

Please pull, thanks a lot!

The following changes since commit 17bc14b767cf0692420c43dbe5310ae98a5a7836:

  Revert "sched: Update_cfs_shares at period edge" (2012-12-14 07:20:43 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/davem/net master

for you to fetch changes up to 152a2a8b5e1d4cbe91a7c66f1028db15164a3766:

  solos-pci: ensure all TX packets are aligned to 4 bytes (2012-12-19 14:53:53 -0800)

----------------------------------------------------------------
Amerigo Wang (2):
      bridge: update selinux perm table for RTM_NEWMDB and RTM_DELMDB
      bridge: add flags to distinguish permanent mdb entires

Ang Way Chuang (1):
      bridge: remove temporary variable for MLDv2 maximum response code computation

Bjørn Mork (1):
      net: qmi_wwan: add ZTE MF880

Christoph Paasch (1):
      inet: Fix kmemleak in tcp_v4/6_syn_recv_sock and dccp_v4/6_request_recv_sock

Cong Ding (1):
      ipv6: addrconf.c: remove unnecessary "if"

Dan Williams (3):
      i2400m: add Intel 6150 device IDs
      qmi_wwan/cdc_ether: add Dell Wireless 5800 (Novatel E362) USB IDs
      cdc_ether: cleanup: use USB_DEVICE_AND_INTERFACE_INFO for Novatel 551/E362

Daniel Borkmann (1):
      sctp: jsctp_sf_eat_sack: fix jprobes function signature mismatch

David Woodhouse (1):
      solos-pci: ensure all TX packets are aligned to 4 bytes

Duan Jiong (1):
      ipv6: Change skb->data before using icmpv6_notify() to propagate redirect

Eric Dumazet (1):
      tuntap: reset network header before calling skb_get_rxhash()

Gabor Juhos (1):
      rt2x00: zero-out rx_status

Hannes Frederic Sowa (2):
      netlink: change presentation of portid in procfs to unsigned
      netlink: validate addr_len on bind

Jason Wang (2):
      tuntap: fix ambigious multiqueue API
      tuntap: fix sparse warning

John Greene (1):
      8139cp: Prevent dev_close/cp_interrupt race on MTU change

John W. Linville (1):
      Merge branch 'master' of git://git.kernel.org/.../linville/wireless into for-davem

Konstantin Khlebnikov (4):
      mISDN: fix race in timer canceling on module unloading
      stmmac: fix platform driver unregistering
      bonding: do not cancel works in bond_uninit()
      mac802154: fix destructon ordering for ieee802154 devices

Lennert Buytenhek (1):
      ksz884x: fix receive polling race condition

Marc Kleine-Budde (1):
      can: sja1000: fix compilation on x86

Nathan Williams (3):
      solos-pci: add GPIO support for newer versions on Geos board
      solos-pci: remove superfluous debug output
      solos-pci: add firmware upgrade support for new models

Neil Horman (1):
      sctp: Change defaults on cookie hmac selection

Oliver Neukum (3):
      usbnet: handle PM failure gracefully
      usbnet: generic manage_power()
      use generic usbnet_manage_power()

Sachin Kamat (2):
      drivers/net: Use of_match_ptr() macro in smc91x.c
      drivers/net: Use of_match_ptr() macro in smsc911x.c

Sathya Perla (2):
      be2net: fix be_close() to ensure all events are ack'ed
      be2net: fix wrong frag_idx reported by RX CQ

Shahed Shaikh (1):
      qlcnic: fix unused variable warnings

Shawn Guo (1):
      net: fec: forbid FEC_PTP on SoCs that do not support

Signed-off-by: Sony Chacko (1):
      qlcnic: update driver version

Simon Arlott (1):
      ipv6: Fix Makefile offload objects

Tony Lindgren (1):
      cpts: Fix build error caused by include of plat/clock.h

Vlad Yasevich (2):
      bridge: Do not unregister all PF_BRIDGE rtnl operations
      bridge: Correctly encode addresses when dumping mdb entries

Vladimir Kondratiev (1):
      wireless: fix Atheros drivers compilation

chas williams - CONTRACTOR (1):
      atm: use scnprintf() instead of sprintf()

 drivers/atm/solos-pci.c                              | 186 ++++++++++++++++++++++++++++++++++++++++----
 drivers/isdn/mISDN/dsp_core.c                        |   3 +-
 drivers/net/bonding/bond_main.c                      |   2 -
 drivers/net/can/sja1000/sja1000_of_platform.c        |   2 +-
 drivers/net/ethernet/emulex/benet/be.h               |   2 +-
 drivers/net/ethernet/emulex/benet/be_cmds.c          |   5 ++
 drivers/net/ethernet/emulex/benet/be_main.c          |  59 ++++++++++----
 drivers/net/ethernet/freescale/Kconfig               |   3 +-
 drivers/net/ethernet/micrel/ksz884x.c                |  12 ++-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic.h          |   4 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_ctx.c      |   5 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_hw.c       |   5 +-
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_main.c     |   5 --
 drivers/net/ethernet/qlogic/qlcnic/qlcnic_minidump.c |   3 +-
 drivers/net/ethernet/realtek/8139cp.c                |  18 +++--
 drivers/net/ethernet/smsc/smc91x.c                   |   4 +-
 drivers/net/ethernet/smsc/smsc911x.c                 |   4 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac.h         |   6 +-
 drivers/net/ethernet/stmicro/stmmac/stmmac_main.c    |  22 +++---
 drivers/net/ethernet/ti/cpts.c                       |   2 -
 drivers/net/tun.c                                    |  87 +++++++++++++++------
 drivers/net/usb/cdc_ether.c                          |  45 +++++------
 drivers/net/usb/cdc_ncm.c                            |  10 +--
 drivers/net/usb/qmi_wwan.c                           |  15 ++++
 drivers/net/usb/usbnet.c                             |  25 ++++--
 drivers/net/wimax/i2400m/i2400m-usb.h                |   3 +
 drivers/net/wimax/i2400m/usb.c                       |   6 ++
 drivers/net/wireless/Makefile                        |   2 +-
 drivers/net/wireless/rt2x00/rt2x00dev.c              |   8 ++
 include/linux/usb/usbnet.h                           |   3 +
 include/net/inet_connection_sock.h                   |   1 +
 include/net/ndisc.h                                  |   7 ++
 include/uapi/linux/if_bridge.h                       |   3 +
 net/atm/atm_sysfs.c                                  |  40 ++++------
 net/bridge/br_mdb.c                                  |  22 ++++--
 net/bridge/br_multicast.c                            |  13 ++--
 net/bridge/br_netlink.c                              |   1 -
 net/bridge/br_private.h                              |   5 +-
 net/dccp/ipv4.c                                      |   4 +-
 net/dccp/ipv6.c                                      |   3 +-
 net/ipv4/inet_connection_sock.c                      |  16 ++++
 net/ipv4/tcp_ipv4.c                                  |   6 +-
 net/ipv6/Makefile                                    |   2 +-
 net/ipv6/addrconf.c                                  |   3 +-
 net/ipv6/ndisc.c                                     |  17 ++++
 net/ipv6/tcp_ipv6.c                                  |   3 +-
 net/mac802154/ieee802154_dev.c                       |   4 +-
 net/netlink/af_netlink.c                             |   5 +-
 net/sctp/Kconfig                                     |  27 ++++++-
 net/sctp/probe.c                                     |   3 +-
 net/sctp/protocol.c                                  |   4 +-
 security/selinux/nlmsgtab.c                          |   2 +
 52 files changed, 545 insertions(+), 202 deletions(-)

^ permalink raw reply

* Re: TCP delayed ACK heuristic
From: Cong Wang @ 2012-12-20  3:23 UTC (permalink / raw)
  To: David Miller
  Cc: rick.jones2, David.Laight, netdev, greearb, eric.dumazet,
	shemminger, tgraf
In-Reply-To: <20121219.125939.1674292599518627751.davem@davemloft.net>

On Wed, 2012-12-19 at 12:59 -0800, David Miller wrote:
> 
> Yes, but RFC2525 makes it very clear why we should not even
> consider doing crap like this.
> 
> ACKs are the only information we have to detect loss.
> 
> And, for the same reasons that TCP VEGAS is fundamentally broken, we
> cannot measure the pipe or some other receiver-side-visible piece of
> information to determine when it's "safe" to stretch ACK.
> 
> And even if it's "safe", we should not do it so that losses are
> accurately detected and we don't spuriously retransmit.
> 
> The only way to know when the bandwidth increases is to "test" it, by
> sending more and more packets until drops happen.  That's why all
> successful congestion control algorithms must operate on explicited
> tested pieces of information.
> 
> Similarly, it's not really possible to universally know if it's safe
> to stretch ACK or not.

Sounds reasonable. Thanks for your explanation.

> 
> Can we please drop this idea?  It has zero value and all downside as
> far as I'm concerned.
> 

Yeah, I am just trying to see if there is any way to get a reasonable
heuristic.

So, can we at least have a sysctl to control the delayed-ACK timeout?
I mean the 40ms minimum. TCP_QUICKACK can help too, but it requires
modifying the receiving application, and it has to be set again on every
recv() call.

Thanks!
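
To illustrate the TCP_QUICKACK point above, here is a minimal userspace
sketch, assuming Linux and glibc. The helper name recv_quickack() is
hypothetical and is not part of any existing API. Per tcp(7) the flag is
not permanent, which is why the sketch sets it again after each read.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * recv() wrapper that re-arms TCP_QUICKACK after every read.
 * recv_quickack() is a hypothetical helper name used for illustration.
 */
static ssize_t recv_quickack(int fd, void *buf, size_t len, int flags)
{
	ssize_t n = recv(fd, buf, len, flags);
	int one = 1;

	/* Best effort: if this fails, ACKs may simply be delayed as usual. */
	(void)setsockopt(fd, IPPROTO_TCP, TCP_QUICKACK, &one, sizeof(one));
	return n;
}
```

Every receive path in the application would have to go through such a
wrapper, which is the cost being described.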

^ permalink raw reply

* Re: [PATCH V2 00/12] Add basic VLAN support to bridges
From: Shmulik Ladkani @ 2012-12-20  7:00 UTC (permalink / raw)
  To: vyasevic; +Cc: netdev, shemminger, davem, or.gerlitz, jhs, mst
In-Reply-To: <50D21D98.7020907@redhat.com>

Hi Vlad,

On Wed, 19 Dec 2012 15:03:36 -0500 Vlad Yasevich <vyasevic@redhat.com> wrote:
> > The port's PVID uniquely determines VID to associate with the frame
> > during _ingress_ on that port - in the case frame arrived untagged.
> >
> > This is unrelated to whether a frame having a specific VID would _egress_
> > tagged or untagged on that port.
> 
> Ahh...  I see what you mean.  You would like to separate
> ingress policy and egress policy with regard to how tags are applied...

Exactly.
Those are two different things; sometimes their configurations collide,
sometimes not.

> > The bridge constructs needed for supporting such setups are:
> > - per port: PVID
> > - per VLAN: port membership map
> > - per VLAN: port egress policy map
> 
> Ok, so from above, membership map is the exiting port_bitmap.

Ok.

> Egress policy map could be new untagged_bitmap.  We wouldn't need a tagged 
> policy map since a port can't be "in egress policy, but not in 
> membership map".

Yes, that is correct.

However I wouldn't call it "untagged_bitmap".
The name might suggest that "egress untagged" is an anomaly, where
"normal" behavior is egress tagged.
But as said, both are valid; it's just a matter of configuration.

You basically need one more bit for each member port, stating
egress tagged/untagged.

> Sounds about right?  If so, I could probably work something up.

Yes, looking forward to reviewing the code.

P.S.
Sorry for spotting this late; I don't follow netdev regularly.
I hope to take a look at the code soon and see if I have any meaningful
comments.

Regards,
Shmulik
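
The three constructs listed above (per-port PVID, per-VLAN membership map,
per-VLAN egress policy map) could be modelled roughly as follows. This is
a standalone userspace sketch, not the bridge code: all type, field and
function names are hypothetical, a 32-port limit is assumed, and a VID of
0 is assumed to mean "frame arrived untagged".

```c
#include <stdint.h>

#define MAX_PORTS 32	/* assumption: all ports fit in one 32-bit map */

/* Per-VLAN state: which ports are members, and how members egress. */
struct vlan_entry {
	uint32_t member_map;		/* bit p set: port p is a member */
	uint32_t egress_tagged_map;	/* bit p set: port p egresses tagged */
};

/* Per-port state: VID given to untagged frames on ingress. */
struct port_cfg {
	uint16_t pvid;
};

enum egress_action { EGRESS_DROP, EGRESS_UNTAGGED, EGRESS_TAGGED };

/* Ingress: an untagged frame (vid == 0 here) is classified to the PVID. */
static uint16_t ingress_vid(const struct port_cfg *port, uint16_t frame_vid)
{
	return frame_vid ? frame_vid : port->pvid;
}

/* Egress: the policy bit is only consulted for member ports. */
static enum egress_action egress_policy(const struct vlan_entry *vlan,
					unsigned int port)
{
	uint32_t bit;

	if (port >= MAX_PORTS)
		return EGRESS_DROP;
	bit = UINT32_C(1) << port;
	if (!(vlan->member_map & bit))
		return EGRESS_DROP;
	return (vlan->egress_tagged_map & bit) ? EGRESS_TAGGED : EGRESS_UNTAGGED;
}
```

The ingress and egress decisions read different state, so a port's PVID
and its egress tagging for the same VLAN can be configured independently.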

^ permalink raw reply

* Re: PMTU discovery is broken on kernel 3.7.1 for UDP sockets
From: Yurij M. Plotnikov @ 2012-12-20  7:14 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: netdev, Alexandra N. Kossovsky
In-Reply-To: <1355945864.2676.21.camel@bwh-desktop.uk.solarflarecom.com>

On 12/19/12 23:37, Ben Hutchings wrote:
> On Wed, 2012-12-19 at 18:27 +0400, Yurij M. Plotnikov wrote:
>    
>> On 12/19/12 17:35, Ben Hutchings wrote:
>>      
>>> On Wed, 2012-12-19 at 17:10 +0400, Yurij M. Plotnikov wrote:
>>>
>>>        
>>>> On kernel 3.7.1 I get strange behaviour of IP_MTU_DISCOVER socket
>>>> option. The behaviour in case of IP_PMTUDISC_DO and IP_PMTUDISC_WANT
>>>> values of IP_MTU_DISCOVER socket option on SOCK_DGRAM socket are the
>>>> same and packet is always sent with "Don't Fragment" bit in case of
>>>> IP_PMTUDISC_WANT. Also, the value of IP_MTU socket option is not updated.
>>>>
>>>>          
>>> You could try reverting:
>>>
>>> commit ee9a8f7ab2edf801b8b514c310455c94acc232f6
>>> Author: Steffen Klassert<steffen.klassert@secunet.com>
>>> Date:   Mon Oct 8 00:56:54 2012 +0000
>>>
>>>       ipv4: Don't report stale pmtu values to userspace
>>>
>>>       We report cached pmtu values even if they are already expired.
>>>       Change this to not report these values after they are expired
>>>       and fix a race in the expire time calculation, as suggested by
>>>       Eric Dumazet.
>>>
>>> Still, PMTU information is not supposed to expire for 10 minutes...
>>>
>>>
>>>        
>> With reverted commit there is no such problem on 3.7.1: IP_MTU is
>> updated and DF is set only for the first packet in case of
>> IP_PMTUDISC_WANT.
>>      
> [...]
>
> So it looks like something is going wrong with the expiry calculation
> here.
>
> This change shouldn't affect the PMTU actually used by the kernel, but
> could affect Onload since that relies on netlink route updates to keep
> in synch.  You didn't say you were using Onload, but if you are then we
> should not bother netdev with this until we can demonstrate a problem
> that involves only the kernel stack.
>
>    
The results were obtained on a pure Linux kernel, without Onload.

Yurij.
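
For reference, the socket options discussed in this thread can be
exercised from userspace roughly as below. This is a minimal sketch,
assuming Linux and glibc; query_path_mtu() is a hypothetical helper name,
and it is not the test program used for the report. IP_MTU is only valid
on a connected socket, so the sketch connects the UDP socket first.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Open a UDP socket with IP_PMTUDISC_WANT, connect it to ip:port and
 * return the kernel's current path MTU estimate as reported by IP_MTU.
 * Returns -1 on any error. Hypothetical helper, for illustration only.
 */
static int query_path_mtu(const char *ip, unsigned short port)
{
	struct sockaddr_in dst = { .sin_family = AF_INET,
				   .sin_port = htons(port) };
	int mode = IP_PMTUDISC_WANT;
	int mtu = -1;
	socklen_t len = sizeof(mtu);
	int fd;

	if (inet_pton(AF_INET, ip, &dst.sin_addr) != 1)
		return -1;

	fd = socket(AF_INET, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	if (setsockopt(fd, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode)) < 0 ||
	    connect(fd, (struct sockaddr *)&dst, sizeof(dst)) < 0 ||
	    getsockopt(fd, IPPROTO_IP, IP_MTU, &mtu, &len) < 0)
		mtu = -1;

	close(fd);
	return mtu;
}
```

Calling it before and after sending a datagram larger than the path MTU
is one way to check whether the IP_MTU value gets updated.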

^ permalink raw reply

* Re: [PATCH net-next V4 03/13] bridge: Validate that vlan is permitted on ingress
From: Cong Wang @ 2012-12-20  7:27 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1355939304-21804-4-git-send-email-vyasevic@redhat.com>

On Wed, 19 Dec 2012 at 17:48 GMT, Vlad Yasevich <vyasevic@redhat.com> wrote:
> +static inline u16 br_get_vlan(const struct sk_buff *skb)
> +{
> +	u16 tag;
> +
> +	if (vlan_tx_tag_present(skb))
> +		return vlan_tx_tag_get(skb) & VLAN_VID_MASK;
> +
> +	if (vlan_get_tag(skb, &tag))
> +		return 0;
> +
> +	return tag & VLAN_VID_MASK;
> +}
> +

Nitpick:
The name br_get_vlan() is easily confused with br_vlan_find().

Also, this function does not look bridge-specific; how about moving
it to if_vlan.h?
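
The masking step in the quoted helper is generic, as this small userspace
sketch suggests. The names VID_MASK and tci_to_vid() are hypothetical;
the value 0x0fff matches the kernel's VLAN_VID_MASK, and the TCI layout
in the comment is the 802.1Q one.

```c
#include <stdint.h>

#define VID_MASK 0x0fff	/* same value as the kernel's VLAN_VID_MASK */

/* 802.1Q TCI layout: 3-bit priority | 1-bit CFI/DEI | 12-bit VLAN ID */
static uint16_t tci_to_vid(uint16_t tci)
{
	return tci & VID_MASK;
}
```

Nothing here depends on bridge state, only on the tag itself.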

^ permalink raw reply

* Re: PMTU discovery is broken on kernel 3.7.1 for UDP sockets
From: Steffen Klassert @ 2012-12-20  7:34 UTC (permalink / raw)
  To: Ben Hutchings; +Cc: Yurij M. Plotnikov, netdev, Alexandra N. Kossovsky
In-Reply-To: <1355945864.2676.21.camel@bwh-desktop.uk.solarflarecom.com>

On Wed, Dec 19, 2012 at 07:37:44PM +0000, Ben Hutchings wrote:
> On Wed, 2012-12-19 at 18:27 +0400, Yurij M. Plotnikov wrote:
> > On 12/19/12 17:35, Ben Hutchings wrote:
> > > On Wed, 2012-12-19 at 17:10 +0400, Yurij M. Plotnikov wrote:
> > >    
> > >> On kernel 3.7.1 I get strange behaviour of IP_MTU_DISCOVER socket
> > >> option. The behaviour in case of IP_PMTUDISC_DO and IP_PMTUDISC_WANT
> > >> values of IP_MTU_DISCOVER socket option on SOCK_DGRAM socket are the
> > >> same and packet is always sent with "Don't Fragment" bit in case of
> > >> IP_PMTUDISC_WANT. Also, the value of IP_MTU socket option is not updated.
> > >>      
> > > You could try reverting:
> > >
> > > commit ee9a8f7ab2edf801b8b514c310455c94acc232f6
> > > Author: Steffen Klassert<steffen.klassert@secunet.com>
> > > Date:   Mon Oct 8 00:56:54 2012 +0000
> > >
> > >      ipv4: Don't report stale pmtu values to userspace
> > >
> > >      We report cached pmtu values even if they are already expired.
> > >      Change this to not report these values after they are expired
> > >      and fix a race in the expire time calculation, as suggested by
> > >      Eric Dumazet.
> > >
> > > Still, PMTU information is not supposed to expire for 10 minutes...
> > >
> > >    
> > With reverted commit there is no such problem on 3.7.1: IP_MTU is 
> > updated and DF is set only for the first packet in case of 
> > IP_PMTUDISC_WANT.
> [...]
> 
> So it looks like something is going wrong with the expiry calculation
> here.
> 
> This change shouldn't affect the PMTU actually used by the kernel, but
> could affect Onload since that relies on netlink route updates to keep
> in synch.  You didn't say you were using Onload, but if you are then we
> should not bother netdev with this until we can demonstrate a problem
> that involves only the kernel stack.
> 

I'm really surprised that this change can have such an effect;
it changes nothing in the kernel's PMTU handling. When looking
at the code, I found that we may report an MTU value from a stale
dst_entry when we query the MTU value with the IP_MTU socket
option. But a subsequent send() should update the socket's cached
dst_entry, so at most one packet should be affected.

Does the patch below change anything?


diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index 3c9d208..1049ce0 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -1198,7 +1198,7 @@ static int do_ip_getsockopt(struct sock *sk, int level, int optname,
 	{
 		struct dst_entry *dst;
 		val = 0;
-		dst = sk_dst_get(sk);
+		dst = sk_dst_check(sk, 0);
 		if (dst) {
 			val = dst_mtu(dst);
 			dst_release(dst);

^ permalink raw reply related

