* [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver
@ 2005-06-30 10:16 Michael Ellerman
2005-06-30 10:20 ` [PATCH 1/12] iseries_veth: Make error messages more user friendly, and add a debug macro Michael Ellerman
` (12 more replies)
0 siblings, 13 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:16 UTC (permalink / raw)
To: PPC64-dev, netdev, LKML
[-- Attachment #1: Type: text/plain, Size: 508 bytes --]
Hi y'all,
The following is a series of patches for the iseries_veth driver.
They're not ready for merging yet, as we need to do more extensive testing.
However any feedback you have will be greatly appreciated.
cheers
--
Michael Ellerman
IBM OzLabs
email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)
We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 6/12] iseries_veth: Fix broken promiscuous handling
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
2005-06-30 10:20 ` [PATCH 1/12] iseries_veth: Make error messages more user friendly, and add a debug macro Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 9/12] iseries_veth: Use ref counts to track lifecycle of connection structs Michael Ellerman
` (10 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
Due to a logic bug, once promiscuous mode is enabled in the iseries_veth
driver it is never disabled.
The driver keeps two flags, promiscuous and all_mcast which have exactly the
same effect. This is because we only ever receive packets destined for us,
or multicast packets. So consolidate them into one promiscuous flag for
simplicity.
---
drivers/net/iseries_veth.c | 16 +++++-----------
1 files changed, 5 insertions(+), 11 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -159,7 +159,6 @@ struct veth_port {
rwlock_t mcast_gate;
int promiscuous;
- int all_mcast;
int num_mcast;
u64 mcast_addr[VETH_MAX_MCAST];
};
@@ -754,17 +753,15 @@ static void veth_set_multicast_list(stru
write_lock_irqsave(&port->mcast_gate, flags);
- if (dev->flags & IFF_PROMISC) { /* set promiscuous mode */
- printk(KERN_INFO "%s: Promiscuous mode enabled.\n",
- dev->name);
+ if ((dev->flags & IFF_PROMISC) || (dev->flags & IFF_ALLMULTI) ||
+ (dev->mc_count > VETH_MAX_MCAST)) {
port->promiscuous = 1;
- } else if ( (dev->flags & IFF_ALLMULTI)
- || (dev->mc_count > VETH_MAX_MCAST) ) {
- port->all_mcast = 1;
} else {
struct dev_mc_list *dmi = dev->mc_list;
int i;
+ port->promiscuous = 0;
+
/* Update table */
port->num_mcast = 0;
@@ -1147,12 +1144,9 @@ static inline int veth_frame_wanted(stru
if ( (mac_addr == port->mac_addr) || (mac_addr == 0xffffffffffff0000) )
return 1;
- if (! (((char *) &mac_addr)[0] & 0x01))
- return 0;
-
read_lock_irqsave(&port->mcast_gate, flags);
- if (port->promiscuous || port->all_mcast) {
+ if (port->promiscuous) {
wanted = 1;
goto out;
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 4/12] iseries_veth: Remove a FIXME WRT deletion of the ack_timer
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (2 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 9/12] iseries_veth: Use ref counts to track lifecycle of connection structs Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 5/12] iseries_veth: Try to avoid pathological reset behaviour Michael Ellerman
` (8 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
The iseries_veth driver has a timer which we use to send acks. When the
connection is reset or stopped we need to delete the timer.
Currently we only call del_timer() when resetting a connection, which means
the timer might run again while the connection is being re-setup. As it turns
out that's ok, because the flags the timer consults have been reset.
It's cleaner though to call del_timer_sync() once we've dropped the lock,
although the timer may still run between us dropping the lock and calling
del_timer_sync(), but as above that's ok.
---
drivers/net/iseries_veth.c | 21 +++++++++++++--------
1 files changed, 13 insertions(+), 8 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -450,13 +450,15 @@ static void veth_statemachine(void *p)
if (cnx->state & VETH_STATE_RESET) {
int i;
- del_timer(&cnx->ack_timer);
-
if (cnx->state & VETH_STATE_OPEN)
HvCallEvent_closeLpEventPath(cnx->remote_lp,
HvLpEvent_Type_VirtualLan);
- /* reset ack data */
+ /*
+ * Reset ack data. This prevents the ack_timer actually
+ * doing anything, even if it runs one more time when
+ * we drop the lock below.
+ */
memset(&cnx->pending_acks, 0xff, sizeof (cnx->pending_acks));
cnx->num_pending_acks = 0;
@@ -469,9 +471,16 @@ static void veth_statemachine(void *p)
if (cnx->msgs)
for (i = 0; i < VETH_NUMBUFFERS; ++i)
veth_recycle_msg(cnx, cnx->msgs + i);
+
+ /* Drop the lock so we can do stuff that might sleep or
+ * take other locks. */
spin_unlock_irq(&cnx->lock);
+
+ del_timer_sync(&cnx->ack_timer);
veth_flush_pending(cnx);
+
spin_lock_irq(&cnx->lock);
+
if (cnx->state & VETH_STATE_RESET)
goto restart;
}
@@ -658,12 +667,8 @@ static void veth_stop_connection(u8 rlp)
veth_kick_statemachine(cnx);
spin_unlock_irq(&cnx->lock);
+ /* Wait for the state machine to run. */
flush_scheduled_work();
-
- /* FIXME: not sure if this is necessary - will already have
- * been deleted by the state machine, just want to make sure
- * its not running any more */
- del_timer_sync(&cnx->ack_timer);
}
static void veth_destroy_connection(u8 rlp)
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 8/12] iseries_veth: Replace lock-protected atomic with an ordinary variable
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (4 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 5/12] iseries_veth: Try to avoid pathological reset behaviour Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 10/12] iseries_veth: Remove TX timeout code Michael Ellerman
` (6 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
The iseries_veth driver uses atomic ops to manipulate the in_use field of
one of its per-connection structures. However all references to the
flag occur while the connection's lock is held, so the atomic ops aren't
necessary.
---
drivers/net/iseries_veth.c | 13 +++++++------
1 files changed, 7 insertions(+), 6 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -117,7 +117,7 @@ struct veth_msg {
struct veth_msg *next;
struct VethFramesData data;
int token;
- unsigned long in_use;
+ int in_use;
struct sk_buff *skb;
struct device *dev;
};
@@ -959,6 +959,8 @@ static int veth_transmit_to_one(struct s
goto drop;
}
+ msg->in_use = 1;
+
dma_length = skb->len;
dma_address = dma_map_single(port->dev, skb->data,
dma_length, DMA_TO_DEVICE);
@@ -973,7 +975,6 @@ static int veth_transmit_to_one(struct s
msg->data.addr[0] = dma_address;
msg->data.len[0] = dma_length;
msg->data.eofmask = 1 << VETH_EOF_SHIFT;
- set_bit(0, &(msg->in_use));
rc = veth_signaldata(cnx, VethEventTypeFrames, msg->token, &msg->data);
if (rc != HvLpEvent_Rc_Good)
@@ -983,10 +984,8 @@ static int veth_transmit_to_one(struct s
return 0;
recycle_and_drop:
+ /* we free the skb below, so tell veth_recycle_msg() not to. */
msg->skb = NULL;
- /* need to set in use to make veth_recycle_msg in case this
- * was a mapping failure */
- set_bit(0, &msg->in_use);
veth_recycle_msg(cnx, msg);
drop:
port->stats.tx_errors++;
@@ -1068,12 +1067,14 @@ static int veth_start_xmit(struct sk_buf
return 0;
}
+/* You musT hold the connection's lock when you call this function. */
static void veth_recycle_msg(struct veth_lpar_connection *cnx,
struct veth_msg *msg)
{
u32 dma_address, dma_length;
- if (test_and_clear_bit(0, &msg->in_use)) {
+ if (msg->in_use) {
+ msg->in_use = 0;
dma_address = msg->data.addr[0];
dma_length = msg->data.len[0];
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 5/12] iseries_veth: Try to avoid pathological reset behaviour
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (3 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 4/12] iseries_veth: Remove a FIXME WRT deletion of the ack_timer Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 8/12] iseries_veth: Replace lock-protected atomic with an ordinary variable Michael Ellerman
` (7 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
The iseries_veth driver contains a state machine which is used to manage
how connections are setup and neogotiated between LPARs.
If one side of a connection resets for some reason, the two LPARs can get
stuck in a race to re-setup the connection. This can lead to the connection
being declared dead by one or both ends. In practice this happens ~8/10 times
a connection is reset, although it's rare for connections to be reset.
(an example here: http://michael.ellerman.id.au/files/misc/veth-trace.html)
The core of the problem is that the end that resets the connection doesn't
wait for the other end to become aware of the reset. So the resetting end
starts setting the connection back up, and then receives a reset from the
other end (which is the response to the initial reset). And so on.
We're severely limited in what we can do to fix this. The protocol between
LPARs is essentially fixed, as we have to interoperate with both OS/400
and old Linux drivers. Which also means we need a fix that only changes the
code on one end.
The only fix I've found given that, is to just blindly sleep for a bit when
resetting the connection, in the hope that the other end will get itself
sorted. Needless to say I'd love it if someone has a better idea.
This does work, I've so far been unable to get it to break, whereas without
the fix a reset of one end will lead to a dead connection ~8/10 times.
---
drivers/net/iseries_veth.c | 23 +++++++++++++++++++++--
1 files changed, 21 insertions(+), 2 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -324,8 +324,12 @@ static void veth_take_monitor_ack(struct
spin_lock_irqsave(&cnx->lock, flags);
veth_debug("cnx %d: lost connection.\n", cnx->remote_lp);
- cnx->state |= VETH_STATE_RESET;
- veth_kick_statemachine(cnx);
+ /* Avoid kicking the statemachine once we're shutdown.
+ * It's unnecessary and it could break veth_stop_connection(). */
+ if (! (cnx->state & VETH_STATE_SHUTDOWN)) {
+ cnx->state |= VETH_STATE_RESET;
+ veth_kick_statemachine(cnx);
+ }
spin_unlock_irqrestore(&cnx->lock, flags);
}
@@ -483,6 +487,12 @@ static void veth_statemachine(void *p)
if (cnx->state & VETH_STATE_RESET)
goto restart;
+
+ /* Hack, wait for the other end to reset itself. */
+ if (! (cnx->state & VETH_STATE_SHUTDOWN)) {
+ schedule_delayed_work(&cnx->statemachine_wq, 5 * HZ);
+ goto out;
+ }
}
if (cnx->state & VETH_STATE_SHUTDOWN)
@@ -667,6 +677,15 @@ static void veth_stop_connection(u8 rlp)
veth_kick_statemachine(cnx);
spin_unlock_irq(&cnx->lock);
+ /* There's a slim chance the reset code has just queued the
+ * statemachine to run in five seconds. If so we need to cancel
+ * that and requeue the work to run now. */
+ if (cancel_delayed_work(&cnx->statemachine_wq)) {
+ spin_lock_irq(&cnx->lock);
+ veth_kick_statemachine(cnx);
+ spin_unlock_irq(&cnx->lock);
+ }
+
/* Wait for the state machine to run. */
flush_scheduled_work();
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 2/12] iseries_veth: Cleanup error and debug messages
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (8 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 7/12] iseries_veth: Remove redundant message stack lock Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 12/12] iseries_veth: Simplify full-queue handling Michael Ellerman
` (2 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
This patch:
* converts uses of veth_printk() to veth_debug()/veth_error()
* makes terminology consistent, ie. always refer to LPAR not lpar
* be consistent about printing return codes as %d not %x
* make printf formats fit in 80 columns
---
drivers/net/iseries_veth.c | 87 ++++++++++++++++++++++-----------------------
1 files changed, 43 insertions(+), 44 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -287,7 +287,7 @@ static void veth_take_cap(struct veth_lp
HvLpEvent_Type_VirtualLan);
if (cnx->state & VETH_STATE_GOTCAPS) {
- veth_error("Received a second capabilities from lpar %d\n",
+ veth_error("Received a second capabilities from LPAR %d.\n",
cnx->remote_lp);
event->base_event.xRc = HvLpEvent_Rc_BufferNotAvailable;
HvCallEvent_ackLpEvent((struct HvLpEvent *) event);
@@ -306,7 +306,7 @@ static void veth_take_cap_ack(struct vet
spin_lock_irqsave(&cnx->lock, flags);
if (cnx->state & VETH_STATE_GOTCAPACK) {
- veth_error("Received a second capabilities ack from lpar %d\n",
+ veth_error("Received a second capabilities ack from LPAR %d.\n",
cnx->remote_lp);
} else {
memcpy(&cnx->cap_ack_event, event,
@@ -323,8 +323,7 @@ static void veth_take_monitor_ack(struct
unsigned long flags;
spin_lock_irqsave(&cnx->lock, flags);
- veth_printk(KERN_DEBUG, "Monitor ack returned for lpar %d\n",
- cnx->remote_lp);
+ veth_debug("cnx %d: lost connection.\n", cnx->remote_lp);
cnx->state |= VETH_STATE_RESET;
veth_kick_statemachine(cnx);
spin_unlock_irqrestore(&cnx->lock, flags);
@@ -345,8 +344,8 @@ static void veth_handle_ack(struct VethL
veth_take_monitor_ack(cnx, event);
break;
default:
- veth_error("Unknown ack type %d from lpar %d\n",
- event->base_event.xSubtype, rlp);
+ veth_error("Unknown ack type %d from LPAR %d.\n",
+ event->base_event.xSubtype, rlp);
};
}
@@ -382,8 +381,8 @@ static void veth_handle_int(struct VethL
veth_receive(cnx, event);
break;
default:
- veth_error("Unknown interrupt type %d from lpar %d\n",
- event->base_event.xSubtype, rlp);
+ veth_error("Unknown interrupt type %d from LPAR %d.\n",
+ event->base_event.xSubtype, rlp);
};
}
@@ -409,8 +408,8 @@ static int veth_process_caps(struct veth
|| (remote_caps->ack_threshold > VETH_MAX_ACKS_PER_MSG)
|| (remote_caps->ack_threshold == 0)
|| (cnx->ack_timeout == 0) ) {
- veth_error("Received incompatible capabilities from lpar %d\n",
- cnx->remote_lp);
+ veth_error("Received incompatible capabilities from LPAR %d.\n",
+ cnx->remote_lp);
return HvLpEvent_Rc_InvalidSubtypeData;
}
@@ -427,8 +426,8 @@ static int veth_process_caps(struct veth
cnx->num_ack_events += num;
if (cnx->num_ack_events < num_acks_needed) {
- veth_error("Couldn't allocate enough ack events for lpar %d\n",
- cnx->remote_lp);
+ veth_error("Couldn't allocate enough ack events "
+ "for LPAR %d.\n", cnx->remote_lp);
return HvLpEvent_Rc_BufferNotAvailable;
}
@@ -507,9 +506,8 @@ static void veth_statemachine(void *p)
} else {
if ( (rc != HvLpEvent_Rc_PartitionDead)
&& (rc != HvLpEvent_Rc_PathClosed) )
- veth_error("Error sending monitor to "
- "lpar %d, rc=%x\n",
- rlp, (int) rc);
+ veth_error("Error sending monitor to LPAR %d, "
+ "rc = %d\n", rlp, rc);
/* Oh well, hope we get a cap from the other
* end and do better when that kicks us */
@@ -532,9 +530,9 @@ static void veth_statemachine(void *p)
} else {
if ( (rc != HvLpEvent_Rc_PartitionDead)
&& (rc != HvLpEvent_Rc_PathClosed) )
- veth_error("Error sending caps to "
- "lpar %d, rc=%x\n",
- rlp, (int) rc);
+ veth_error("Error sending caps to LPAR %d, "
+ "rc = %d\n", rlp, rc);
+
/* Oh well, hope we get a cap from the other
* end and do better when that kicks us */
goto out;
@@ -574,10 +572,8 @@ static void veth_statemachine(void *p)
add_timer(&cnx->ack_timer);
cnx->state |= VETH_STATE_READY;
} else {
- veth_printk(KERN_ERR, "Caps rejected (rc=%d) by "
- "lpar %d\n",
- cnx->cap_ack_event.base_event.xRc,
- rlp);
+ veth_error("Caps rejected by LPAR %d, rc = %d\n",
+ rlp, cnx->cap_ack_event.base_event.xRc);
goto cant_cope;
}
}
@@ -590,8 +586,8 @@ static void veth_statemachine(void *p)
/* FIXME: we get here if something happens we really can't
* cope with. The link will never work once we get here, and
* all we can do is not lock the rest of the system up */
- veth_error("Badness on connection to lpar %d (state=%04lx) "
- " - shutting down\n", rlp, cnx->state);
+ veth_error("Unrecoverable error on connection to LPAR %d, shutting down"
+ " (state = 0x%04lx)\n", rlp, cnx->state);
cnx->state |= VETH_STATE_SHUTDOWN;
spin_unlock_irq(&cnx->lock);
}
@@ -623,7 +619,7 @@ static int veth_init_connection(u8 rlp)
msgs = kmalloc(VETH_NUMBUFFERS * sizeof(struct veth_msg), GFP_KERNEL);
if (! msgs) {
- veth_error("Can't allocate buffers for lpar %d\n", rlp);
+ veth_error("Can't allocate buffers for LPAR %d.\n", rlp);
return -ENOMEM;
}
@@ -639,8 +635,7 @@ static int veth_init_connection(u8 rlp)
cnx->num_events = veth_allocate_events(rlp, 2 + VETH_NUMBUFFERS);
if (cnx->num_events < (2 + VETH_NUMBUFFERS)) {
- veth_error("Can't allocate events for lpar %d, only got %d\n",
- rlp, cnx->num_events);
+ veth_error("Can't allocate enough events for LPAR %d.\n", rlp);
return -ENOMEM;
}
@@ -898,15 +893,17 @@ static struct net_device * __init veth_p
rc = register_netdev(dev);
if (rc != 0) {
- veth_printk(KERN_ERR,
- "Failed to register ethernet device for vlan %d\n",
- vlan);
+ veth_error("Failed registering net device for vlan%d.\n", vlan);
free_netdev(dev);
return NULL;
}
- veth_printk(KERN_DEBUG, "%s attached to iSeries vlan %d (lpar_map=0x%04x)\n",
- dev->name, vlan, port->lpar_map);
+ veth_info("%s attached to iSeries vlan %d.\n", dev->name, vlan);
+
+ for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
+ if (port->lpar_map & (1 << i))
+ veth_info("%s connected to LPAR %d.\n", dev->name, i);
+ }
return dev;
}
@@ -1039,7 +1036,7 @@ static int veth_start_xmit(struct sk_buf
dev_kfree_skb(skb);
} else {
if (port->pending_skb) {
- veth_error("%s: Tx while skb was pending!\n",
+ veth_error("%s: TX while skb was pending!\n",
dev->name);
dev_kfree_skb(skb);
spin_unlock_irqrestore(&port->pending_gate, flags);
@@ -1075,10 +1072,10 @@ static void veth_recycle_msg(struct veth
memset(&msg->data, 0, sizeof(msg->data));
veth_stack_push(cnx, msg);
- } else
- if (cnx->state & VETH_STATE_OPEN)
- veth_error("Bogus frames ack from lpar %d (#%d)\n",
- cnx->remote_lp, msg->token);
+ } else if (cnx->state & VETH_STATE_OPEN) {
+ veth_error("Non-pending frame (# %d) acked by LPAR %d.\n",
+ cnx->remote_lp, msg->token);
+ }
}
static void veth_flush_pending(struct veth_lpar_connection *cnx)
@@ -1188,8 +1185,8 @@ static void veth_flush_acks(struct veth_
0, &cnx->pending_acks);
if (rc != HvLpEvent_Rc_Good)
- veth_error("Error 0x%x acking frames from lpar %d!\n",
- (unsigned)rc, cnx->remote_lp);
+ veth_error("Failed acking frames from LPAR %d, rc = %d\n",
+ cnx->remote_lp, (int)rc);
cnx->num_pending_acks = 0;
memset(&cnx->pending_acks, 0xff, sizeof(cnx->pending_acks));
@@ -1225,9 +1222,10 @@ static void veth_receive(struct veth_lpa
/* make sure that we have at least 1 EOF entry in the
* remaining entries */
if (! (senddata->eofmask >> (startchunk + VETH_EOF_SHIFT))) {
- veth_error("missing EOF frag in event "
- "eofmask=0x%x startchunk=%d\n",
- (unsigned) senddata->eofmask, startchunk);
+ veth_error("Missing EOF fragment in event "
+ "eofmask = 0x%x startchunk = %d\n",
+ (unsigned)senddata->eofmask,
+ startchunk);
break;
}
@@ -1246,8 +1244,9 @@ static void veth_receive(struct veth_lpa
/* nchunks == # of chunks in this frame */
if ((length - ETH_HLEN) > VETH_MAX_MTU) {
- veth_error("Received oversize frame from lpar %d "
- "(length=%d)\n", cnx->remote_lp, length);
+ veth_error("Received oversize frame from LPAR %d "
+ "(length = %d)\n",
+ cnx->remote_lp, length);
continue;
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 7/12] iseries_veth: Remove redundant message stack lock
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (7 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 3/12] iseries_veth: Make init_connection() & destroy_connection() symmetrical Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 2/12] iseries_veth: Cleanup error and debug messages Michael Ellerman
` (3 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
The iseries_veth driver keeps a stack of messages for each connection
and a lock to protect the stack. However there is also a per-connection lock
which makes the message stack redundant.
Remove the message stack lock and document the fact that callers of the
stack-manipulation functions must hold the connection's lock.
---
drivers/net/iseries_veth.c | 12 +++---------
1 files changed, 3 insertions(+), 9 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -143,7 +143,6 @@ struct veth_lpar_connection {
struct VethCapData remote_caps;
u32 ack_timeout;
- spinlock_t msg_stack_lock;
struct veth_msg *msg_stack_head;
};
@@ -190,27 +189,23 @@ static void veth_timed_ack(unsigned long
#define veth_debug(fmt, args...) do {} while (0)
#endif
+/* You must hold the connection's lock when you call this function. */
static inline void veth_stack_push(struct veth_lpar_connection *cnx,
struct veth_msg *msg)
{
- unsigned long flags;
-
- spin_lock_irqsave(&cnx->msg_stack_lock, flags);
msg->next = cnx->msg_stack_head;
cnx->msg_stack_head = msg;
- spin_unlock_irqrestore(&cnx->msg_stack_lock, flags);
}
+/* You must hold the connection's lock when you call this function. */
static inline struct veth_msg *veth_stack_pop(struct veth_lpar_connection *cnx)
{
- unsigned long flags;
struct veth_msg *msg;
- spin_lock_irqsave(&cnx->msg_stack_lock, flags);
msg = cnx->msg_stack_head;
if (msg)
cnx->msg_stack_head = cnx->msg_stack_head->next;
- spin_unlock_irqrestore(&cnx->msg_stack_lock, flags);
+
return msg;
}
@@ -643,7 +638,6 @@ static int veth_init_connection(u8 rlp)
cnx->msgs = msgs;
memset(msgs, 0, VETH_NUMBUFFERS * sizeof(struct veth_msg));
- spin_lock_init(&cnx->msg_stack_lock);
for (i = 0; i < VETH_NUMBUFFERS; i++) {
msgs[i].token = i;
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 10/12] iseries_veth: Remove TX timeout code
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (5 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 8/12] iseries_veth: Replace lock-protected atomic with an ordinary variable Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 3/12] iseries_veth: Make init_connection() & destroy_connection() symmetrical Michael Ellerman
` (5 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
The iseries_veth driver uses the generic TX timeout watchdog, however a better
solution is in the works, so remove this code.
---
drivers/net/iseries_veth.c | 48 ---------------------------------------------
1 files changed, 48 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -813,49 +813,6 @@ static struct ethtool_ops ops = {
.get_link = veth_get_link,
};
-static void veth_tx_timeout(struct net_device *dev)
-{
- struct veth_port *port = (struct veth_port *)dev->priv;
- struct net_device_stats *stats = &port->stats;
- unsigned long flags;
- int i;
-
- stats->tx_errors++;
-
- spin_lock_irqsave(&port->pending_gate, flags);
-
- if (!port->pending_lpmask) {
- spin_unlock_irqrestore(&port->pending_gate, flags);
- return;
- }
-
- printk(KERN_WARNING "%s: Tx timeout! Resetting lp connections: %08x\n",
- dev->name, port->pending_lpmask);
-
- for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
- struct veth_lpar_connection *cnx = veth_cnx[i];
-
- if (! (port->pending_lpmask & (1<<i)))
- continue;
-
- /* If we're pending on it, we must be connected to it,
- * so we should certainly have a structure for it. */
- BUG_ON(! cnx);
-
- /* Theoretically we could be kicking a connection
- * which doesn't deserve it, but in practice if we've
- * had a Tx timeout, the pending_lpmask will have
- * exactly one bit set - the connection causing the
- * problem. */
- spin_lock(&cnx->lock);
- cnx->state |= VETH_STATE_RESET;
- veth_kick_statemachine(cnx);
- spin_unlock(&cnx->lock);
- }
-
- spin_unlock_irqrestore(&port->pending_gate, flags);
-}
-
static struct net_device * __init veth_probe_one(int vlan, struct device *vdev)
{
struct net_device *dev;
@@ -904,9 +861,6 @@ static struct net_device * __init veth_p
dev->set_multicast_list = veth_set_multicast_list;
SET_ETHTOOL_OPS(dev, &ops);
- dev->watchdog_timeo = 2 * (VETH_ACKTIMEOUT * HZ / 1000000);
- dev->tx_timeout = veth_tx_timeout;
-
SET_NETDEV_DEV(dev, vdev);
rc = register_netdev(dev);
@@ -1047,8 +1001,6 @@ static int veth_start_xmit(struct sk_buf
lpmask = veth_transmit_to_many(skb, lpmask, dev);
- dev->trans_start = jiffies;
-
if (! lpmask) {
dev_kfree_skb(skb);
} else {
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 3/12] iseries_veth: Make init_connection() & destroy_connection() symmetrical
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (6 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 10/12] iseries_veth: Remove TX timeout code Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 7/12] iseries_veth: Remove redundant message stack lock Michael Ellerman
` (4 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
This patch makes veth_init_connection() and veth_destroy_connection()
symmetrical in that they allocate/deallocate the same data.
Currently if there's an error while initialising connections (ie. ENOMEM)
we call veth_module_cleanup(), however this will oops because we call
driver_unregister() before we've called driver_register(). I've never seen
this actually happen though.
So instead we explicitly call veth_destroy_connection() in a reverse
loop for the connections we've successfully initialised.
---
drivers/net/iseries_veth.c | 22 +++++++++++-----------
1 files changed, 11 insertions(+), 11 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -664,6 +664,14 @@ static void veth_stop_connection(u8 rlp)
* been deleted by the state machine, just want to make sure
* its not running any more */
del_timer_sync(&cnx->ack_timer);
+}
+
+static void veth_destroy_connection(u8 rlp)
+{
+ struct veth_lpar_connection *cnx = veth_cnx[rlp];
+
+ if (! cnx)
+ return;
if (cnx->num_events > 0)
mf_deallocate_lp_events(cnx->remote_lp,
@@ -675,14 +683,6 @@ static void veth_stop_connection(u8 rlp)
HvLpEvent_Type_VirtualLan,
cnx->num_ack_events,
NULL, NULL);
-}
-
-static void veth_destroy_connection(u8 rlp)
-{
- struct veth_lpar_connection *cnx = veth_cnx[rlp];
-
- if (! cnx)
- return;
kfree(cnx->msgs);
kfree(cnx);
@@ -1424,15 +1424,15 @@ module_exit(veth_module_cleanup);
int __init veth_module_init(void)
{
- int i;
- int rc;
+ int i, rc;
this_lp = HvLpConfig_getLpIndex_outline();
for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) {
rc = veth_init_connection(i);
if (rc != 0) {
- veth_module_cleanup();
+ for (; i >= 0; i--)
+ veth_destroy_connection(i);
return rc;
}
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 9/12] iseries_veth: Use ref counts to track lifecycle of connection structs
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
2005-06-30 10:20 ` [PATCH 1/12] iseries_veth: Make error messages more user friendly, and add a debug macro Michael Ellerman
2005-06-30 10:20 ` [PATCH 6/12] iseries_veth: Fix broken promiscuous handling Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 4/12] iseries_veth: Remove a FIXME WRT deletion of the ack_timer Michael Ellerman
` (9 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
The iseries_veth driver can attach to multiple vlans, which correspond to
multiple net devices. However there is only 1 connection between each LPAR,
so the connection structure may be shared by multiple net devices.
This makes module removal messy, because we can't deallocate the connections
until we know there are no net devices still using them. The solution is to
use ref counts on the connections, so we can delete them (actually stop) as
soon as the ref count hits zero.
This patch fixes (part of) a bug we were seeing with IPv6 sending probes to
a dead LPAR, which would then hang us forever due to leftover skbs.
This patch has the (minor?) side effect that we only start negotiating a
connection with LPARs which are on one of our vlans. The previous behaviour
was to start negotiation with all LPARs unconditionally, will have the think
about that one.
---
drivers/net/iseries_veth.c | 89 ++++++++++++++++++++++++++++++---------------
1 files changed, 61 insertions(+), 28 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -129,6 +129,7 @@ struct veth_lpar_connection {
int num_events;
struct VethCapData local_caps;
+ struct kref refcount;
struct timer_list ack_timer;
spinlock_t lock;
@@ -620,6 +621,10 @@ static int veth_init_connection(u8 rlp)
return -ENOMEM;
memset(cnx, 0, sizeof(*cnx));
+ /* This gets us 1 reference, which is held on behalf of the driver
+ * infrastructure. It's released at module unload. */
+ kref_init(&cnx->refcount);
+
cnx->remote_lp = rlp;
spin_lock_init(&cnx->lock);
INIT_WORK(&cnx->statemachine_wq, veth_statemachine, cnx);
@@ -658,12 +663,10 @@ static int veth_init_connection(u8 rlp)
return 0;
}
-static void veth_stop_connection(u8 rlp)
+static void veth_stop_connection(struct kref *ref)
{
- struct veth_lpar_connection *cnx = veth_cnx[rlp];
-
- if (! cnx)
- return;
+ struct veth_lpar_connection *cnx;
+ cnx = container_of(ref, struct veth_lpar_connection, refcount);
spin_lock_irq(&cnx->lock);
cnx->state |= VETH_STATE_RESET | VETH_STATE_SHUTDOWN;
@@ -1352,15 +1355,31 @@ static void veth_timed_ack(unsigned long
static int veth_remove(struct vio_dev *vdev)
{
- int i = vdev->unit_address;
+ struct veth_lpar_connection *cnx;
struct net_device *dev;
+ struct veth_port *port;
+ int i;
- dev = veth_dev[i];
- if (dev != NULL) {
- veth_dev[i] = NULL;
- unregister_netdev(dev);
- free_netdev(dev);
+ dev = veth_dev[vdev->unit_address];
+
+ if (! dev)
+ return 0;
+
+ port = netdev_priv(dev);
+
+ for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
+ cnx = veth_cnx[i];
+
+ if (cnx && (port->lpar_map & (1 << i))) {
+ /* Drop our reference to connections on our VLAN */
+ kref_put(&cnx->refcount, veth_stop_connection);
+ }
}
+
+ veth_dev[vdev->unit_address] = NULL;
+ unregister_netdev(dev);
+ free_netdev(dev);
+
return 0;
}
@@ -1368,6 +1387,7 @@ static int veth_probe(struct vio_dev *vd
{
int i = vdev->unit_address;
struct net_device *dev;
+ struct veth_port *port;
dev = veth_probe_one(i, &vdev->dev);
if (dev == NULL) {
@@ -1376,11 +1396,19 @@ static int veth_probe(struct vio_dev *vd
}
veth_dev[i] = dev;
- /* Start the state machine on each connection, to commence
- * link negotiation */
- for (i = 0; i < HVMAXARCHITECTEDLPS; i++)
- if (veth_cnx[i])
+ port = (struct veth_port*)netdev_priv(dev);
+
+ /* Start the state machine on each connection on this vlan. If we're
+ * the first dev to do so this will commence link negotiation */
+ for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
+ if (! (port->lpar_map & (1 << i)))
+ continue;
+
+ if (veth_cnx[i]) {
+ kref_get(&(veth_cnx[i]->refcount));
veth_kick_statemachine(veth_cnx[i]);
+ }
+ }
return 0;
}
@@ -1409,26 +1437,31 @@ static struct vio_driver veth_driver = {
void __exit veth_module_cleanup(void)
{
int i;
+ struct veth_lpar_connection *cnx;
- /* Stop the queues first to stop any new packets being sent. */
- for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++)
- if (veth_dev[i])
- netif_stop_queue(veth_dev[i]);
+ /* Drop the driver's references to the connections. */
+ for (i = 0; i < HVMAXARCHITECTEDLPS; ++i) {
+ cnx = veth_cnx[i];
- /* Stop the connections before we unregister the driver. This
- * ensures there's no skbs lying around holding the device open. */
- for (i = 0; i < HVMAXARCHITECTEDLPS; ++i)
- veth_stop_connection(i);
+ if (cnx) {
+ kref_put(&cnx->refcount, veth_stop_connection);
+ }
+ }
- HvLpEvent_unregisterHandler(HvLpEvent_Type_VirtualLan);
+ /* Unregister the driver, which will close all the netdevs and stop
+ * the connections when they're no longer referenced. */
+ vio_unregister_driver(&veth_driver);
- /* Hypervisor callbacks may have scheduled more work while we
- * were stoping connections. Now that we've disconnected from
- * the hypervisor make sure everything's finished. */
+ /* Make sure each connection's state machine has run to completion. */
flush_scheduled_work();
- vio_unregister_driver(&veth_driver);
+ /* Disconnect our "irq" to stop events coming from the Hypervisor. */
+ HvLpEvent_unregisterHandler(HvLpEvent_Type_VirtualLan);
+
+ /* Make sure any work queued from Hypervisor callbacks is finished. */
+ flush_scheduled_work();
+ /* Deallocate everything. */
for (i = 0; i < HVMAXARCHITECTEDLPS; ++i)
veth_destroy_connection(i);
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 1/12] iseries_veth: Make error messages more user friendly, and add a debug macro
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 6/12] iseries_veth: Fix broken promiscuous handling Michael Ellerman
` (11 subsequent siblings)
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
Currently the iseries_veth driver prints the file name and line number in its
error messages. This isn't very useful for most users, so just print
"iseries_veth: message" instead.
Also add a veth_debug() and veth_info() macro to replace the current
veth_printk().
---
drivers/net/iseries_veth.c | 15 ++++++++++++---
1 files changed, 12 insertions(+), 3 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -79,6 +79,8 @@
#include <asm/iommu.h>
#include <asm/vio.h>
+#define DEBUG 1
+
#include "iseries_veth.h"
MODULE_AUTHOR("Kyle Lucke <klucke@us.ibm.com>");
@@ -176,11 +178,18 @@ static void veth_timed_ack(unsigned long
* Utility functions
*/
-#define veth_printk(prio, fmt, args...) \
- printk(prio "%s: " fmt, __FILE__, ## args)
+#define veth_info(fmt, args...) \
+ printk(KERN_INFO "iseries_veth: " fmt, ## args)
#define veth_error(fmt, args...) \
- printk(KERN_ERR "(%s:%3.3d) ERROR: " fmt, __FILE__, __LINE__ , ## args)
+ printk(KERN_ERR "iseries_veth: Error: " fmt, ## args)
+
+#ifdef DEBUG
+#define veth_debug(fmt, args...) \
+ printk(KERN_DEBUG "iseries_veth: " fmt, ## args)
+#else
+#define veth_debug(fmt, args...) do {} while (0)
+#endif
static inline void veth_stack_push(struct veth_lpar_connection *cnx,
struct veth_msg *msg)
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 12/12] iseries_veth: Simplify full-queue handling
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (9 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 2/12] iseries_veth: Cleanup error and debug messages Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 10:20 ` [PATCH 11/12] iseries_veth: Add a per-connection ack timer Michael Ellerman
2005-06-30 14:41 ` [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Jeff Garzik
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
The iseries_veth driver may have multiple netdevices sending packets over
a single connection to another LPAR. If the bandwidth to the other LPAR is
exceeded all the netdevices must have their queue's stopped.
The current code achieves this by queueing one incoming skb on the
per-netdevice port structure. When the connection is able to send more packets
it flushes the queued packet for all netdevices and restarts their queues.
This arrangement makes less sense now that we have per-connection TX timers,
rather than the per-netdevice generic TX timer.
The new code simply detects when one of the connections is full, and stops
the queue of all associated netdevices. Then when a packet is acked on that
connection (ie. there is space again) all the queues are woken up.
---
drivers/net/iseries_veth.c | 108 ++++++++++++++++++++++++++-------------------
1 files changed, 64 insertions(+), 44 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -158,10 +158,11 @@ struct veth_port {
u64 mac_addr;
HvLpIndexMap lpar_map;
- spinlock_t pending_gate;
- struct sk_buff *pending_skb;
- HvLpIndexMap pending_lpmask;
+ /* queue_lock protects the stopped_map and dev's queue. */
+ spinlock_t queue_lock;
+ HvLpIndexMap stopped_map;
+ /* mcast_gate protects promiscuous, num_mcast & mcast_addr. */
rwlock_t mcast_gate;
int promiscuous;
int num_mcast;
@@ -174,7 +175,8 @@ static struct net_device *veth_dev[HVMAX
static int veth_start_xmit(struct sk_buff *skb, struct net_device *dev);
static void veth_recycle_msg(struct veth_lpar_connection *, struct veth_msg *);
-static void veth_flush_pending(struct veth_lpar_connection *cnx);
+static void veth_wake_queues(struct veth_lpar_connection *cnx);
+static void veth_stop_queues(struct veth_lpar_connection *cnx);
static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *);
static void veth_timed_ack(unsigned long ptr);
static void veth_timed_reset(unsigned long ptr);
@@ -216,6 +218,12 @@ static inline struct veth_msg *veth_stac
return msg;
}
+/* You must hold the connection's lock when you call this function. */
+static inline int veth_stack_is_empty(struct veth_lpar_connection *cnx)
+{
+ return cnx->msg_stack_head == NULL;
+}
+
static inline HvLpEvent_Rc
veth_signalevent(struct veth_lpar_connection *cnx, u16 subtype,
HvLpEvent_AckInd ackind, HvLpEvent_AckType acktype,
@@ -384,12 +392,12 @@ static void veth_handle_int(struct VethL
}
}
- if (acked > 0)
+ if (acked > 0) {
cnx->last_contact = jiffies;
+ veth_wake_queues(cnx);
+ }
spin_unlock_irqrestore(&cnx->lock, flags);
-
- veth_flush_pending(cnx);
break;
case VethEventTypeFrames:
veth_receive(cnx, event);
@@ -485,7 +493,9 @@ static void veth_statemachine(void *p)
for (i = 0; i < VETH_NUMBUFFERS; ++i)
veth_recycle_msg(cnx, cnx->msgs + i);
}
+
cnx->outstanding_tx = 0;
+ veth_wake_queues(cnx);
/* Drop the lock so we can do stuff that might sleep or
* take other locks. */
@@ -494,8 +504,6 @@ static void veth_statemachine(void *p)
del_timer_sync(&cnx->ack_timer);
del_timer_sync(&cnx->reset_timer);
- veth_flush_pending(cnx);
-
spin_lock_irq(&cnx->lock);
if (cnx->state & VETH_STATE_RESET)
@@ -852,8 +860,9 @@ static struct net_device * __init veth_p
port = (struct veth_port *) dev->priv;
- spin_lock_init(&port->pending_gate);
+ spin_lock_init(&port->queue_lock);
rwlock_init(&port->mcast_gate);
+ port->stopped_map = 0;
for (i = 0; i < HVMAXARCHITECTEDLPS; i++) {
HvLpVirtualLanIndexMap map;
@@ -969,6 +978,9 @@ static int veth_transmit_to_one(struct s
cnx->last_contact = jiffies;
cnx->outstanding_tx++;
+ if (veth_stack_is_empty(cnx))
+ veth_stop_queues(cnx);
+
spin_unlock_irqrestore(&cnx->lock, flags);
return 0;
@@ -1012,7 +1024,6 @@ static int veth_start_xmit(struct sk_buf
{
unsigned char *frame = skb->data;
struct veth_port *port = (struct veth_port *) dev->priv;
- unsigned long flags;
HvLpIndexMap lpmask;
if (! (frame[0] & 0x01)) {
@@ -1029,27 +1040,9 @@ static int veth_start_xmit(struct sk_buf
lpmask = port->lpar_map;
}
- spin_lock_irqsave(&port->pending_gate, flags);
-
- lpmask = veth_transmit_to_many(skb, lpmask, dev);
+ veth_transmit_to_many(skb, lpmask, dev);
- if (! lpmask) {
- dev_kfree_skb(skb);
- } else {
- if (port->pending_skb) {
- veth_error("%s: TX while skb was pending!\n",
- dev->name);
- dev_kfree_skb(skb);
- spin_unlock_irqrestore(&port->pending_gate, flags);
- return 1;
- }
-
- port->pending_skb = skb;
- port->pending_lpmask = lpmask;
- netif_stop_queue(dev);
- }
-
- spin_unlock_irqrestore(&port->pending_gate, flags);
+ dev_kfree_skb(skb);
return 0;
}
@@ -1081,9 +1074,10 @@ static void veth_recycle_msg(struct veth
}
}
-static void veth_flush_pending(struct veth_lpar_connection *cnx)
+static void veth_wake_queues(struct veth_lpar_connection *cnx)
{
int i;
+
for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++) {
struct net_device *dev = veth_dev[i];
struct veth_port *port;
@@ -1097,19 +1091,45 @@ static void veth_flush_pending(struct ve
if (! (port->lpar_map & (1<<cnx->remote_lp)))
continue;
- spin_lock_irqsave(&port->pending_gate, flags);
- if (port->pending_skb) {
- port->pending_lpmask =
- veth_transmit_to_many(port->pending_skb,
- port->pending_lpmask,
- dev);
- if (! port->pending_lpmask) {
- dev_kfree_skb_any(port->pending_skb);
- port->pending_skb = NULL;
- netif_wake_queue(dev);
- }
+ spin_lock_irqsave(&port->queue_lock, flags);
+
+ port->stopped_map &= ~(1 << cnx->remote_lp);
+
+ if (0 == port->stopped_map && netif_queue_stopped(dev)) {
+ veth_debug("cnx %d: woke queue for %s.\n",
+ cnx->remote_lp, dev->name);
+ netif_wake_queue(dev);
}
- spin_unlock_irqrestore(&port->pending_gate, flags);
+ spin_unlock_irqrestore(&port->queue_lock, flags);
+ }
+}
+
+static void veth_stop_queues(struct veth_lpar_connection *cnx)
+{
+ int i;
+
+ for (i = 0; i < HVMAXARCHITECTEDVIRTUALLANS; i++) {
+ struct net_device *dev = veth_dev[i];
+ struct veth_port *port;
+
+ if (! dev)
+ continue;
+
+ port = (struct veth_port *)dev->priv;
+
+ /* If this cnx is not on the vlan for this port, continue */
+ if (! (port->lpar_map & (1 << cnx->remote_lp)))
+ continue;
+
+ spin_lock(&port->queue_lock);
+
+ netif_stop_queue(dev);
+ port->stopped_map |= (1 << cnx->remote_lp);
+
+ veth_debug("cnx %d: stopped queue for %s, map = 0x%x.\n",
+ cnx->remote_lp, dev->name, port->stopped_map);
+
+ spin_unlock(&port->queue_lock);
}
}
^ permalink raw reply [flat|nested] 14+ messages in thread
* [PATCH 11/12] iseries_veth: Add a per-connection ack timer
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (10 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 12/12] iseries_veth: Simplify full-queue handling Michael Ellerman
@ 2005-06-30 10:20 ` Michael Ellerman
2005-06-30 14:41 ` [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Jeff Garzik
12 siblings, 0 replies; 14+ messages in thread
From: Michael Ellerman @ 2005-06-30 10:20 UTC (permalink / raw)
To: linuxppc64-dev, netdev, linux-kernel
Currently the iseries_veth driver contravenes the specification in
Documentation/networking/driver.txt, in that if packets are not acked by
the other LPAR they will sit around forever.
This patch adds a per-connection timer which fires if we've had no acks for
five seconds. This is superior to the generic TX timer because it catches
the case of a small number of packets being sent and never acked.
---
drivers/net/iseries_veth.c | 75 +++++++++++++++++++++++++++++++++++++++++----
1 files changed, 69 insertions(+), 6 deletions(-)
Index: veth-dev/drivers/net/iseries_veth.c
===================================================================
--- veth-dev.orig/drivers/net/iseries_veth.c
+++ veth-dev/drivers/net/iseries_veth.c
@@ -132,6 +132,11 @@ struct veth_lpar_connection {
struct kref refcount;
struct timer_list ack_timer;
+ struct timer_list reset_timer;
+ unsigned int reset_timeout;
+ unsigned long last_contact;
+ int outstanding_tx;
+
spinlock_t lock;
unsigned long state;
HvLpInstanceId src_inst;
@@ -171,7 +176,8 @@ static int veth_start_xmit(struct sk_buf
static void veth_recycle_msg(struct veth_lpar_connection *, struct veth_msg *);
static void veth_flush_pending(struct veth_lpar_connection *cnx);
static void veth_receive(struct veth_lpar_connection *, struct VethLpEvent *);
-static void veth_timed_ack(unsigned long connectionPtr);
+static void veth_timed_ack(unsigned long ptr);
+static void veth_timed_reset(unsigned long ptr);
/*
* Utility functions
@@ -353,7 +359,7 @@ static void veth_handle_int(struct VethL
HvLpIndex rlp = event->base_event.xSourceLp;
struct veth_lpar_connection *cnx = veth_cnx[rlp];
unsigned long flags;
- int i;
+ int i, acked = 0;
BUG_ON(! cnx);
@@ -367,13 +373,22 @@ static void veth_handle_int(struct VethL
break;
case VethEventTypeFramesAck:
spin_lock_irqsave(&cnx->lock, flags);
+
for (i = 0; i < VETH_MAX_ACKS_PER_MSG; ++i) {
u16 msgnum = event->u.frames_ack_data.token[i];
- if (msgnum < VETH_NUMBUFFERS)
+ if (msgnum < VETH_NUMBUFFERS) {
veth_recycle_msg(cnx, cnx->msgs + msgnum);
+ cnx->outstanding_tx--;
+ acked++;
+ }
}
+
+ if (acked > 0)
+ cnx->last_contact = jiffies;
+
spin_unlock_irqrestore(&cnx->lock, flags);
+
veth_flush_pending(cnx);
break;
case VethEventTypeFrames:
@@ -447,8 +462,6 @@ static void veth_statemachine(void *p)
restart:
if (cnx->state & VETH_STATE_RESET) {
- int i;
-
if (cnx->state & VETH_STATE_OPEN)
HvCallEvent_closeLpEventPath(cnx->remote_lp,
HvLpEvent_Type_VirtualLan);
@@ -467,15 +480,20 @@ static void veth_statemachine(void *p)
| VETH_STATE_SENTCAPACK | VETH_STATE_READY);
/* Clean up any leftover messages */
- if (cnx->msgs)
+ if (cnx->msgs) {
+ int i;
for (i = 0; i < VETH_NUMBUFFERS; ++i)
veth_recycle_msg(cnx, cnx->msgs + i);
+ }
+ cnx->outstanding_tx = 0;
/* Drop the lock so we can do stuff that might sleep or
* take other locks. */
spin_unlock_irq(&cnx->lock);
del_timer_sync(&cnx->ack_timer);
+ del_timer_sync(&cnx->reset_timer);
+
veth_flush_pending(cnx);
spin_lock_irq(&cnx->lock);
@@ -628,9 +646,16 @@ static int veth_init_connection(u8 rlp)
cnx->remote_lp = rlp;
spin_lock_init(&cnx->lock);
INIT_WORK(&cnx->statemachine_wq, veth_statemachine, cnx);
+
init_timer(&cnx->ack_timer);
cnx->ack_timer.function = veth_timed_ack;
cnx->ack_timer.data = (unsigned long) cnx;
+
+ init_timer(&cnx->reset_timer);
+ cnx->reset_timer.function = veth_timed_reset;
+ cnx->reset_timer.data = (unsigned long) cnx;
+ cnx->reset_timeout = 5 * HZ * (VETH_ACKTIMEOUT / 1000000);
+
memset(&cnx->pending_acks, 0xff, sizeof (cnx->pending_acks));
veth_cnx[rlp] = cnx;
@@ -937,6 +962,13 @@ static int veth_transmit_to_one(struct s
if (rc != HvLpEvent_Rc_Good)
goto recycle_and_drop;
+ /* If the timer's not already running, start it now. */
+ if (0 == cnx->outstanding_tx)
+ mod_timer(&cnx->reset_timer, jiffies + cnx->reset_timeout);
+
+ cnx->last_contact = jiffies;
+ cnx->outstanding_tx++;
+
spin_unlock_irqrestore(&cnx->lock, flags);
return 0;
@@ -1081,6 +1113,37 @@ static void veth_flush_pending(struct ve
}
}
+static void veth_timed_reset(unsigned long ptr)
+{
+ struct veth_lpar_connection *cnx = (struct veth_lpar_connection *)ptr;
+ unsigned long trigger_time, flags;
+
+ /* FIXME is it possible this fires after veth_stop_connection()?
+ * That would reschedule the statemachine for 5 seconds and probably
+ * execute it after the module's been unloaded. Hmm. */
+
+ spin_lock_irqsave(&cnx->lock, flags);
+
+ if (cnx->outstanding_tx > 0) {
+ trigger_time = cnx->last_contact + cnx->reset_timeout;
+
+ if (trigger_time < jiffies) {
+ cnx->state |= VETH_STATE_RESET;
+ veth_kick_statemachine(cnx);
+ veth_error("%d packets not acked by LPAR %d within %d "
+ "seconds, resetting.\n",
+ cnx->outstanding_tx, cnx->remote_lp,
+ cnx->reset_timeout / HZ);
+ } else {
+ /* Reschedule the timer */
+ trigger_time = jiffies + cnx->reset_timeout;
+ mod_timer(&cnx->reset_timer, trigger_time);
+ }
+ }
+
+ spin_unlock_irqrestore(&cnx->lock, flags);
+}
+
/*
* Rx path
*/
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
` (11 preceding siblings ...)
2005-06-30 10:20 ` [PATCH 11/12] iseries_veth: Add a per-connection ack timer Michael Ellerman
@ 2005-06-30 14:41 ` Jeff Garzik
12 siblings, 0 replies; 14+ messages in thread
From: Jeff Garzik @ 2005-06-30 14:41 UTC (permalink / raw)
To: michael; +Cc: PPC64-dev, netdev, LKML
Michael Ellerman wrote:
> Hi y'all,
>
> The following is a series of patches for the iseries_veth driver.
>
> They're not ready for merging yet, as we need to do more extensive testing.
> However any feedback you have will be greatly appreciated.
Note, make sure to CC me, and also the new netdev list
(netdev@vger.kernel.org).
Jeff
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2005-06-30 14:41 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-06-30 10:16 [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Michael Ellerman
2005-06-30 10:20 ` [PATCH 1/12] iseries_veth: Make error messages more user friendly, and add a debug macro Michael Ellerman
2005-06-30 10:20 ` [PATCH 6/12] iseries_veth: Fix broken promiscuous handling Michael Ellerman
2005-06-30 10:20 ` [PATCH 9/12] iseries_veth: Use ref counts to track lifecycle of connection structs Michael Ellerman
2005-06-30 10:20 ` [PATCH 4/12] iseries_veth: Remove a FIXME WRT deletion of the ack_timer Michael Ellerman
2005-06-30 10:20 ` [PATCH 5/12] iseries_veth: Try to avoid pathological reset behaviour Michael Ellerman
2005-06-30 10:20 ` [PATCH 8/12] iseries_veth: Replace lock-protected atomic with an ordinary variable Michael Ellerman
2005-06-30 10:20 ` [PATCH 10/12] iseries_veth: Remove TX timeout code Michael Ellerman
2005-06-30 10:20 ` [PATCH 3/12] iseries_veth: Make init_connection() & destroy_connection() symmetrical Michael Ellerman
2005-06-30 10:20 ` [PATCH 7/12] iseries_veth: Remove redundant message stack lock Michael Ellerman
2005-06-30 10:20 ` [PATCH 2/12] iseries_veth: Cleanup error and debug messages Michael Ellerman
2005-06-30 10:20 ` [PATCH 12/12] iseries_veth: Simplify full-queue handling Michael Ellerman
2005-06-30 10:20 ` [PATCH 11/12] iseries_veth: Add a per-connection ack timer Michael Ellerman
2005-06-30 14:41 ` [RFC/PATCH 0/12] Updates & bug fixes for iseries_veth network driver Jeff Garzik
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox