Netdev List
 help / color / mirror / Atom feed
* RE: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
From: Xin, Xiaohui @ 2009-09-14  5:57 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Ira W. Snyder, netdev@vger.kernel.org,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org,
	akpm@linux-foundation.org, hpa@zytor.com,
	gregory.haskins@gmail.com, Rusty Russell, s.hetze@linux-ag.com,
	avi@redhat.com
In-Reply-To: <20090913054610.GA4446@redhat.com>

>The irqfd/ioeventfd patches are part of Avi's kvm.git tree:
>git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git
>
>I expect them to be merged by 2.6.32-rc1 - right, Avi?

Michael,

I think I have the kernel patch for kvm_irqfd and kvm_ioeventfd, but missed the qemu side patch for irqfd and ioeventfd.

I met the compile error when I compiled virtio-pci.c file in qemu-kvm like this:

/root/work/vmdq/vhost/qemu-kvm/hw/virtio-pci.c:384: error: `KVM_IRQFD` undeclared (first use in this function)
/root/work/vmdq/vhost/qemu-kvm/hw/virtio-pci.c:400: error: `KVM_IOEVENTFD` undeclared (first use in this function)

Which qemu tree or patch do you use for kvm_irqfd and kvm_ioeventfd?

Thanks
Xiaohui

-----Original Message-----
From: Michael S. Tsirkin [mailto:mst@redhat.com] 
Sent: Sunday, September 13, 2009 1:46 PM
To: Xin, Xiaohui
Cc: Ira W. Snyder; netdev@vger.kernel.org; virtualization@lists.linux-foundation.org; kvm@vger.kernel.org; linux-kernel@vger.kernel.org; mingo@elte.hu; linux-mm@kvack.org; akpm@linux-foundation.org; hpa@zytor.com; gregory.haskins@gmail.com; Rusty Russell; s.hetze@linux-ag.com; avi@redhat.com
Subject: Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server

On Fri, Sep 11, 2009 at 11:17:33PM +0800, Xin, Xiaohui wrote:
> Michael,
> We are very interested in your patch and want to have a try with it.
> I have collected your 3 patches in kernel side and 4 patches in queue side.
> The patches are listed here:
> 
> PATCHv5-1-3-mm-export-use_mm-unuse_mm-to-modules.patch
> PATCHv5-2-3-mm-reduce-atomic-use-on-use_mm-fast-path.patch
> PATCHv5-3-3-vhost_net-a-kernel-level-virtio-server.patch
> 
> PATCHv3-1-4-qemu-kvm-move-virtio-pci[1].o-to-near-pci.o.patch
> PATCHv3-2-4-virtio-move-features-to-an-inline-function.patch
> PATCHv3-3-4-qemu-kvm-vhost-net-implementation.patch
> PATCHv3-4-4-qemu-kvm-add-compat-eventfd.patch
> 
> I applied the kernel patches on v2.6.31-rc4 and the qemu patches on latest kvm qemu.
> But seems there are some patches are needed at least irqfd and ioeventfd patches on
> current qemu. I cannot create a kvm guest with "-net nic,model=virtio,vhost=vethX".
> 
> May you kindly advice us the patch lists all exactly to make it work?
> Thanks a lot. :-)
> 
> Thanks
> Xiaohui


The irqfd/ioeventfd patches are part of Avi's kvm.git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git

I expect them to be merged by 2.6.32-rc1 - right, Avi?

-- 
MST

^ permalink raw reply

* Re: [PATCH 1/8] networking/fanotify: declare fanotify socket numbers
From: Eric Paris @ 2009-09-14  1:26 UTC (permalink / raw)
  To: Jamie Lokier
  Cc: jamal, David Miller, linux-kernel, linux-fsdevel, netdev, viro,
	alan, hch, balbir
In-Reply-To: <20090914000303.GA30621@shareable.org>

On Mon, 2009-09-14 at 01:03 +0100, Jamie Lokier wrote:
> jamal wrote:
> > On Fri, 2009-09-11 at 22:42 +0100, Jamie Lokier wrote:

First let me state the 2 main new things fanotify gives so neither is
lost.

#1 fanotify implements the same basic thing as inotify except rather
than an arbitrary number (inotify speak a watch descriptor) which
userspace has to somehow convert back into a file, fanotify gives
userspace an open file descriptor to that object.  (this is the part
that requires recv side processing)

#2 fanotify allows the userspace 'listener' or 'group' (I use group to
describe it's actions in kernel and listener to describe it's action in
userspace) to request that it be allowed to arbitrate open and access
(read) security decisions.

> > > Eric's explained that it would be normal for _every_ file operation on
> > > some systems to trigger a fanotify event and possibly wait on the
> > > response, or at least in major directory trees on the filesystem.
> > > Even if it's just for the fanotify app to say "oh I don't care about
> > > that file, carry on".
> > > 
> > 
> > That doesnt sound very scalable. Should it not be you get nothing unless
> > you register for interest in something?
> 
> You do get nothing unless you register interest.  The problem is
> there's no way to register interest on just a subtree, so the fanotify
> approach is let you register for events on the whole filesystem, and
> let the userspace daemon filter paths.  At least it's decisions can be
> cached, although I'm not sure how that works when multiple processes
> want to monitor overlapping parts of the filesystem.

fanotify provides 3 options to register.
1) this inode
2) this dir and it's children
3) all files on the whole fscking system

this patch only does #1 and #2.  After it's in I'm going to take a
serious look at #4 subtrees.

Responses to access decisions are cached and checked in kernel per
fanotify listener.  So listener 1 can ignore requests for a given inode
while listener 2 still gets notification and forces the original process
to block.

> It doesn't sound scalable to me, either, and that's why I don't like
> this part, and described a solution to monitoring subtrees - which
> would also solve the problem for inotify.  (Both use fsnotify under
> the hood, and that's where subtree notification would go).

Subtree checking hasn't seen and work from me but it is something I plan
to work on.  And one thing that makes me scared to tie myself to
syscalls when I already have something that works relatively cleanly and
easily.

> Eric's mentioned interest in a way to monitor subtrees, but that
> hasn't gone anywhere as far as I know.  He doesn't seem convinced by
> my solution - or even that scalability will be an issue.  I think
> there's a bit of vision lacking here, and I'll admit I'm more
> interested in the inotify uses of fsnotify (being able to detect
> changes) than the fanotify uses (being able to _block_ or _modify_
> changes).  I think both inotify and fanotify ought to benefit from the
> same improvements to file monitoring.

sort of agree with you here.  anything that gets added to support
subtrees would have to be in the generic code.  Although I question how
inotify could be used, as a wd is not (in my mind) a reasonable way to
tell userspace about files.  (and with subtrees it would be a wd and a
pathname....)    I think fanotify with notification only (what I'm
giving in this patch series is a much better fit for subtree
notification.

> > > File performance is one of those things which really needs to be fast
> > > for a good user experience - and it's not unusual to grep the odd
> > > 10,000 files here or there (just think of what a kernel developer
> > > does), or to replace a few thousand quickly (rpm/dpkg) and things like
> > > that.
> > > 
> > 
> > So grepping 10000 files would cause 10000 events? I am not sure how the
> > scheme works; filtering of what events get delivered sounds more
> > reasonable if it happens in the kernel.
> 
> I believe it would cause 10000 events, yes, even if they are files
> that userspace policy is not interested in.  Eric, is that right?

If fanotify wants it, yes, that's exactly what happens.

> However I believe after the first grep, subsequent greps' decisions
> would be cached by marking the inodes.  I'm not sure what happens if
> two fanotify monitors both try marking the inodes.

Each can mark individually.

> Arguably if a fanotify monitor is running before those files are in
> page cache anyway, then I/O may dominate, and when the files are
> cached, fanotify has already cached it's decisions in the kernel.
> However fanotify is synchronous: each new file access involves a round
> trip to the fanotify userspace and back before it can proceed, so
> there's quite a lot of IPC and scheduling too.  Without testing, it's
> hard to guess how it'll really perform.

As I recall my old old tests on a 32 way system showed around a 10%
performance penalty when building a kernel when making userspace
arbitrate decisions and the cache was blank.  So yes, there is a serious
performance hit to making a userspace application control access
decisions.  Then again, I'd rather not have those people who need this
system wide access controls to do it in the kernel (anti-malware
vendors)

I believe that people who chose to use this interface will have to
realize there is a severe up front performance penalty.  On a steady
state system like a web server you'd see near 0% performance (a new srcu
lock, inode->i_lock, and running a short list)  But yes, controling
access to every file on a system eats performance, that's the nature of
the beast.

> > Theres a difference between events which are abbreviated in the form
> > "hey some read happened on fd you are listening on" vs "hey a read
> > of file X for 16 bytes at offset 200 by process Y just occured while
> > at the same time process Z was writting at offset 2000". The later
> > (which netlink will give you) includes a lot more attribute details
> > which could be filtered or can be extended to include a lot
> > more. The former(what epoll will give you) is merely a signal.

> But this part is irrelevant to fanotify, because there's no plan or
> intention to provide that much detail about I/O.

We have ZERO plan to include ordering.  ZERO.  inotify sorta pretends it
deals with ordering by only dropping a notification if it is the same as
the last one in the queue.  fanotify will gladly merge events which
exist anywhere in the queue clearly throwing ordering to the wind.

We do plan to include the pid, uid, and gid of the process making the
original request.  We also plan to include the f_flags of the file in
the original process when possible.

^ permalink raw reply

* Re: [PATCH 1/8] networking/fanotify: declare fanotify socket numbers
From: Jamie Lokier @ 2009-09-14  0:17 UTC (permalink / raw)
  To: Evgeniy Polyakov
  Cc: Eric Paris, David Miller, linux-kernel, linux-fsdevel, netdev,
	viro, alan, hch
In-Reply-To: <20090912094110.GB24709@ioremap.net>

Evgeniy Polyakov wrote:
> On Fri, Sep 11, 2009 at 05:51:42PM -0400, Eric Paris (eparis@redhat.com) wrote:
> > For some things yes, some things no.  I'd have to understand where loss
> > can happen to know if it's feasible.  If I know loss happens in the
> > sender context that's great.  If it's somewhere in the middle and the
> > sender doesn't immediately know it'll never be delivered, yes, I don't
> > think it can solve all my needs.  How many places can and skb get lost
> > between the sender and the receiver?
> 
> When queue is full or you do not have enough RAM. Both are reported at
> 'sending' time.

Can you ->poll() and wait reliably until the queue will accept an skb?
(A few spurious EAGAINs/ENOBUFs is ok, as long as it's not the norm).

-- Jamie

^ permalink raw reply

* Re: [PATCH 1/8] networking/fanotify: declare fanotify socket numbers
From: Jamie Lokier @ 2009-09-14  0:03 UTC (permalink / raw)
  To: jamal
  Cc: Eric Paris, David Miller, linux-kernel, linux-fsdevel, netdev,
	viro, alan, hch, balbir
In-Reply-To: <1252709567.25158.61.camel@dogo.mojatatu.com>

jamal wrote:
> On Fri, 2009-09-11 at 22:42 +0100, Jamie Lokier wrote:
> 
> > One of the uses of fanotify is as a security or auditing mechanism.
> > That can't tolerate gaps.
> > 
> > It's fundemantally different from inotify in one important respect:
> > inotify apps can recover from losing events by checking what they are
> > watching.
> > 
> > The fanotify application will know that it missed events, but what
> > happens to the other application which _caused_ those events?  Does it
> > get to do things it shouldn't, or hide them from the fanotify app, by
> > simply overloading the system?  Or the opposite, does it get access
> > denied - spurious file errors when the system is overloaded?
> > 
> > There's no way to handle that by dropping events.  A transport
> > mechanism can be dropped (say skbs), but the event itself has to be
> > kept, and then retried.
> > 
> >
> > Since you have to keep an event object around until it's handled,
> > there's no point tying it to an unreliable delivery mechanism which
> > you'd have to wrap a retry mechanism around.
> > 
> > In other words, that part of netlink is a poor match.  It would match
> > inotify much better.
> 
> Reliability is something that you should build in. Netlink provides you
> all the necessary tools. What you are asking for here is essentially 
> reliable multicasting.

Almost.  It's reliable multicasting plus unicast responses which must
be waited for.  That changes things.

> You dont have infinite memory, therefore there
> will be times when you will overload one of the users, and they wont
> have sufficient buffer space and then you have to retransmit.

If you have enough memory to remember _what_ to retransmit, then you
have enough memory to buffer a fixed-size message.  It just depends on
how you do the buffering.  To say netlink drops the message and you
can retry is just saying that the buffering is happening one step
earlier, before netlink.  That's what I mean by netlink being a
pointless complication for this, because you can just as easily write
code which gets to the message to userspace without going through
netlink and with no chance of it being dropped.

> Is the current proposed mechanism capable of reliably multicasting
> without need for retransmit?

Yes.  It uses positive acknowledge and flow control, because these
match naturally with what fanotify does at the next higher level.

The process generating the multicast (e.g. trying to write a file) is
blocked until the receiver gets the message, handles it and
acknowledges with a "yes you can" or "no you can't" response.

That's part of fanotify's design.  The pattern conveniently has no
issues with using unbounded memory for message, because the sending
process is blocked.

> > Speaking of skbs, how fast and compact are they for this?
> 
> They are largish relative to say if you trimmed down to basic necessity.
> But then you get a lot of the buffer management aspects for free.
> In this case, the concept of multicasting is built in so for one event
> to be sent to X users - you only need one skb.

True you only need one skb.  But netlink doesn't handle waiting for
positive acknowledge responses from every receiver, and combining
their value, does it?  You can't really take advantage of netlink's
built in multicast, because to known when it has all the responses,
the fanotify layer has to track the subscriber list itself anyway.

What I'm saying is perhaps skbs are useful for fanotify, but I don't
know that netlink's multicasting is useful.  But storing the messages
in skbs for transmission, and using parts of netlink to manage them,
and to provide some of the API, that might be useful.

> > Eric's explained that it would be normal for _every_ file operation on
> > some systems to trigger a fanotify event and possibly wait on the
> > response, or at least in major directory trees on the filesystem.
> > Even if it's just for the fanotify app to say "oh I don't care about
> > that file, carry on".
> > 
> 
> That doesnt sound very scalable. Should it not be you get nothing unless
> you register for interest in something?

You do get nothing unless you register interest.  The problem is
there's no way to register interest on just a subtree, so the fanotify
approach is let you register for events on the whole filesystem, and
let the userspace daemon filter paths.  At least it's decisions can be
cached, although I'm not sure how that works when multiple processes
want to monitor overlapping parts of the filesystem.

It doesn't sound scalable to me, either, and that's why I don't like
this part, and described a solution to monitoring subtrees - which
would also solve the problem for inotify.  (Both use fsnotify under
the hood, and that's where subtree notification would go).

Eric's mentioned interest in a way to monitor subtrees, but that
hasn't gone anywhere as far as I know.  He doesn't seem convinced by
my solution - or even that scalability will be an issue.  I think
there's a bit of vision lacking here, and I'll admit I'm more
interested in the inotify uses of fsnotify (being able to detect
changes) than the fanotify uses (being able to _block_ or _modify_
changes).  I think both inotify and fanotify ought to benefit from the
same improvements to file monitoring.

> > File performance is one of those things which really needs to be fast
> > for a good user experience - and it's not unusual to grep the odd
> > 10,000 files here or there (just think of what a kernel developer
> > does), or to replace a few thousand quickly (rpm/dpkg) and things like
> > that.
> > 
> 
> So grepping 10000 files would cause 10000 events? I am not sure how the
> scheme works; filtering of what events get delivered sounds more
> reasonable if it happens in the kernel.

I believe it would cause 10000 events, yes, even if they are files
that userspace policy is not interested in.  Eric, is that right?

However I believe after the first grep, subsequent greps' decisions
would be cached by marking the inodes.  I'm not sure what happens if
two fanotify monitors both try marking the inodes.

Arguably if a fanotify monitor is running before those files are in
page cache anyway, then I/O may dominate, and when the files are
cached, fanotify has already cached it's decisions in the kernel.
However fanotify is synchronous: each new file access involves a round
trip to the fanotify userspace and back before it can proceed, so
there's quite a lot of IPC and scheduling too.  Without testing, it's
hard to guess how it'll really perform.

> > While skbs and netlink aren't that slow, I suspect they're an order of
> > magnitude or two slower than, say, epoll or inotify at passing events
> > around.
> 
> not familiar with inotify.

inotify is like dnotify, and like a signal or epoll: a message that
something happened.  You register interest in individual files or
directories only, and inotify does not (yet) provide a way to monitor
the whole filesystem or a subtree.

fanotify is different: it provides access control, and can _refuse_
attempts to read file X, or even modify the file before permitting the
file to be read.

> Theres a difference between events which are abbreviated in the form
> "hey some read happened on fd you are listening on" vs "hey a read
> of file X for 16 bytes at offset 200 by process Y just occured while
> at the same time process Z was writting at offset 2000". The later
> (which netlink will give you) includes a lot more attribute details
> which could be filtered or can be extended to include a lot
> more. The former(what epoll will give you) is merely a signal.

Firstly, it's really hard to retain the ordering of userspace events
like that in a useful way, given the non-determinstic parallelism
going on with multiple processes doing I/O do the same file :-)

Second, you can't really pump messages with that much detail into
netlink and let _it_ filter them to userspace; that would be too much
processing.  You'd have to have some way of not generating that much
detail except when it's been requested, and preferably only for files
you want it for.

But this part is irrelevant to fanotify, because there's no plan or
intention to provide that much detail about I/O.

If you want, feel free to provide a stracenotify subsystem to track
everything in detail :-)

-- Jamie

^ permalink raw reply

* [PATCH resend] sky2: Make sure both ports initialize correctly
From: Mike McCormack @ 2009-09-13 23:08 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

Rework sky2_probe() so that both ports are allocated and deallocated in the
 same way.  Fail if allocating either port fails as both are assumed to be
 allocated in many places.

Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
 drivers/net/sky2.c |   71 +++++++++++++++++++++++++--------------------------
 1 files changed, 35 insertions(+), 36 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index c8ebc03..9c3fa49 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -4422,11 +4422,11 @@ static const char *sky2_name(u8 chipid, char *buf, int sz)
 static int __devinit sky2_probe(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	struct net_device *dev;
 	struct sky2_hw *hw;
 	int err, using_dac = 0, wol_default;
 	u32 reg;
 	char buf1[16];
+	int i;
 
 	err = pci_enable_device(pdev);
 	if (err) {
@@ -4512,17 +4512,30 @@ static int __devinit sky2_probe(struct pci_dev *pdev,
 
 	err = sky2_init(hw);
 	if (err)
-		goto err_out_iounmap;
+		goto err_out_free_pci;
 
 	dev_info(&pdev->dev, "Yukon-2 %s chip revision %d\n",
 		 sky2_name(hw->chip_id, buf1, sizeof(buf1)), hw->chip_rev);
 
 	sky2_reset(hw);
 
-	dev = sky2_init_netdev(hw, 0, using_dac, wol_default);
-	if (!dev) {
-		err = -ENOMEM;
-		goto err_out_free_pci;
+	for (i=0; i<hw->ports; i++)
+	{
+		struct net_device *dev;
+
+		dev = sky2_init_netdev(hw, 0, using_dac, wol_default);
+		if (!dev) {
+			err = -ENOMEM;
+			goto err_out_free_netdev;
+		}
+
+		err = register_netdev(dev);
+		if (err) {
+			dev_err(&pdev->dev, "cannot register net device\n");
+			goto err_out_free_netdev;
+		}
+
+		sky2_show_addr(dev);
 	}
 
 	if (!disable_msi && pci_enable_msi(pdev) == 0) {
@@ -4530,44 +4543,21 @@ static int __devinit sky2_probe(struct pci_dev *pdev,
 		if (err == -EOPNOTSUPP)
  			pci_disable_msi(pdev);
 		else if (err)
-			goto err_out_free_netdev;
+			goto err_out_disable_msi;
  	}
 
-	err = register_netdev(dev);
-	if (err) {
-		dev_err(&pdev->dev, "cannot register net device\n");
-		goto err_out_free_netdev;
-	}
-
-	netif_napi_add(dev, &hw->napi, sky2_poll, NAPI_WEIGHT);
+	netif_napi_add(hw->dev[0], &hw->napi, sky2_poll, NAPI_WEIGHT);
 
 	err = request_irq(pdev->irq, sky2_intr,
 			  (hw->flags & SKY2_HW_USE_MSI) ? 0 : IRQF_SHARED,
-			  dev->name, hw);
+			  hw->dev[0]->name, hw);
 	if (err) {
 		dev_err(&pdev->dev, "cannot assign irq %d\n", pdev->irq);
-		goto err_out_unregister;
+		goto err_out_disable_msi;
 	}
 	sky2_write32(hw, B0_IMSK, Y2_IS_BASE);
 	napi_enable(&hw->napi);
 
-	sky2_show_addr(dev);
-
-	if (hw->ports > 1) {
-		struct net_device *dev1;
-
-		dev1 = sky2_init_netdev(hw, 1, using_dac, wol_default);
-		if (!dev1)
-			dev_warn(&pdev->dev, "allocation for second device failed\n");
-		else if ((err = register_netdev(dev1))) {
-			dev_warn(&pdev->dev,
-				 "register of second port failed (%d)\n", err);
-			hw->dev[1] = NULL;
-			free_netdev(dev1);
-		} else
-			sky2_show_addr(dev1);
-	}
-
 	setup_timer(&hw->watchdog_timer, sky2_watchdog, (unsigned long) hw);
 	INIT_WORK(&hw->restart_work, sky2_restart);
 
@@ -4575,12 +4565,21 @@ static int __devinit sky2_probe(struct pci_dev *pdev,
 
 	return 0;
 
-err_out_unregister:
+err_out_disable_msi:
 	if (hw->flags & SKY2_HW_USE_MSI)
 		pci_disable_msi(pdev);
-	unregister_netdev(dev);
+
 err_out_free_netdev:
-	free_netdev(dev);
+	for (i=0; i<hw->ports; i++) {
+		struct net_device *dev = hw->dev[i];
+
+		if (dev) {
+			if (dev->reg_state == NETREG_REGISTERED)
+				unregister_netdev(dev);
+			free_netdev(dev);
+		}
+	}
+
 err_out_free_pci:
 	sky2_write8(hw, B0_CTST, CS_RST_SET);
 	pci_free_consistent(pdev, STATUS_LE_BYTES, hw->st_le, hw->st_dma);
-- 
1.5.6.5


^ permalink raw reply related

* [PATCH resend] sky2: Account for VLAN tag in tx_le_req
From: Mike McCormack @ 2009-09-13 23:07 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev

With VLAN enabled, tx_le_req could return fewer list elements than required,
 leading to a potential wrap around of the transmit ring buffer.

Signed-off-by: Mike McCormack <mikem@ring3k.org>
---
 drivers/net/sky2.c |   14 +++++++++++---
 1 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/drivers/net/sky2.c b/drivers/net/sky2.c
index 00bc65a..c8ebc03 100644
--- a/drivers/net/sky2.c
+++ b/drivers/net/sky2.c
@@ -1563,7 +1563,7 @@ static inline int tx_avail(const struct sky2_port *sky2)
 }
 
 /* Estimate of number of transmit list elements required */
-static unsigned tx_le_req(const struct sk_buff *skb)
+static unsigned tx_le_req(const struct sk_buff *skb, int vlan_tag)
 {
 	unsigned count;
 
@@ -1573,6 +1573,9 @@ static unsigned tx_le_req(const struct sk_buff *skb)
 	if (skb_is_gso(skb))
 		++count;
 
+	else if (sizeof(dma_addr_t) == sizeof(u32) && vlan_tag)
+		++count;
+
 	if (skb->ip_summed == CHECKSUM_PARTIAL)
 		++count;
 
@@ -1611,8 +1614,13 @@ static netdev_tx_t sky2_xmit_frame(struct sk_buff *skb,
 	u16 slot;
 	u16 mss;
 	u8 ctrl;
+	unsigned vlan_tag = 0;
+
+#ifdef SKY2_VLAN_TAG_USED
+	vlan_tag = sky2->vlgrp && vlan_tx_tag_present(skb);
+#endif
 
- 	if (unlikely(tx_avail(sky2) < tx_le_req(skb)))
+ 	if (unlikely(tx_avail(sky2) < tx_le_req(skb, vlan_tag)))
   		return NETDEV_TX_BUSY;
 
 	len = skb_headlen(skb);
@@ -1657,7 +1665,7 @@ static netdev_tx_t sky2_xmit_frame(struct sk_buff *skb,
 	ctrl = 0;
 #ifdef SKY2_VLAN_TAG_USED
 	/* Add VLAN tag, can piggyback on LRGLEN or ADDR64 */
-	if (sky2->vlgrp && vlan_tx_tag_present(skb)) {
+	if (vlan_tag) {
 		if (!le) {
 			le = get_tx_le(sky2, &slot);
 			le->addr = 0;
-- 
1.5.6.5



^ permalink raw reply related

* Re: [PATCH 5/5] Updating documentation accordingly
From: Gerrit Renker @ 2009-09-13 18:50 UTC (permalink / raw)
  To: Ivo Calado; +Cc: dccp, netdev
In-Reply-To: <4AA6A26A.6080808@embedded.ufcg.edu.br>

| Updating documentation accordingly

+/* As defined at section 8.6.1. of RFC 4342 */
 #define TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH	28
+/* Specified on section 8.7. of CCID4 draft */
 #define TFRC_DROP_OPT_MAX_LENGTH		84
Need to update the references to CCID4 draft with RFC 5622. See also 4/5.

It would be more helpful to include the documentation for structs with
the patches that use these structs.

^ permalink raw reply

* Re: [PATCH 4/5] Adds options DROPPED PACKETS and LOSS INTERVALS to receiver
From: Gerrit Renker @ 2009-09-13 18:41 UTC (permalink / raw)
  To: Ivo Calado; +Cc: dccp, netdev
In-Reply-To: <4AA6A267.7080500@embedded.ufcg.edu.br>

| Adds options DROPPED PACKETS and LOSS INTERVALS to receiver. In this patch is added the
| mechanism of gathering information about loss intervals and storing it, for later
| construction of these two options.
This also needs some more work, please see inline.

--- dccp_tree_work5.orig/net/dccp/ccids/lib/packet_history_sp.c
+++ dccp_tree_work5/net/dccp/ccids/lib/packet_history_sp.c
@@ -233,7 +233,9 @@
 }
 
 /* return 1 if a new loss event has been identified */
-static int __two_after_loss(struct tfrc_rx_hist *h, struct sk_buff *skb, u32 n3)
+static int __two_after_loss(struct tfrc_rx_hist *h,
+			    struct sk_buff *skb, u32 n3,
+			    bool *new_loss)
 {
 	u64 s0 = tfrc_rx_hist_loss_prev(h)->tfrchrx_seqno,
 	    s1 = tfrc_rx_hist_entry(h, 1)->tfrchrx_seqno,
@@ -245,6 +247,7 @@
 		tfrc_sp_rx_hist_entry_from_skb(tfrc_rx_hist_entry(h, 3),
 					       skb, n3);
 		h->num_losses = dccp_loss_count(s2, s3, n3);
+		*new_loss = 1;
 		return 1;
 	}
 
@@ -259,6 +262,7 @@
 					       skb, n3);
 		h->loss_count = 3;
 		h->num_losses = dccp_loss_count(s1, s3, n3);
+		*new_loss = 1;
 		return 1;
 	}
 
@@ -284,6 +288,7 @@
 			tfrc_sp_rx_hist_entry_from_skb(
 					tfrc_rx_hist_loss_prev(h), skb, n3);
 
+		*new_loss = 0;
 		return 0;
 	}
 
@@ -297,6 +302,7 @@
 	h->loss_count = 3;
 	h->num_losses = dccp_loss_count(s0, s3, n3);
 
+	*new_loss = 1;
 	return 1;
 }
The 'new_loss' parameter is completely redundant, since it duplicates the
return value of the function.


@@ -348,11 +354,14 @@
  *  operations when loss_count is greater than 0 after calling this function.
  */
 bool tfrc_sp_rx_congestion_event(struct tfrc_rx_hist *h,
-			      struct tfrc_loss_hist *lh,
-	 struct sk_buff *skb, const u64 ndp,
-  u32 (*first_li)(struct sock *), struct sock *sk)
+				 struct tfrc_loss_hist *lh,
+				 struct tfrc_loss_data *ld,
+				 struct sk_buff *skb, const u64 ndp,
+				 u32 (*first_li)(struct sock *),
+				 struct sock *sk)
 {
 	bool new_event = false;
+	bool new_loss = false;
 
 	if (tfrc_sp_rx_hist_duplicate(h, skb))
 		return 0;
@@ -365,12 +374,13 @@
 		__one_after_loss(h, skb, ndp);
 	} else if (h->loss_count != 2) {
 		DCCP_BUG("invalid loss_count %d", h->loss_count);
-	} else if (__two_after_loss(h, skb, ndp)) {
+	} else if (__two_after_loss(h, skb, ndp, &new_loss)) {
 		/*
 		* Update Loss Interval database and recycle RX records
 		*/
 		new_event = tfrc_sp_lh_interval_add(lh, h, first_li, sk,
 						dccp_hdr(skb)->dccph_ccval);
+		tfrc_sp_update_li_data(ld, h, skb, new_loss, new_event);
 		__three_after_loss(h);
 
 	} else if (dccp_data_packet(skb) && dccp_skb_is_ecn_ce(skb)) {
@@ -396,6 +406,8 @@
 		}
 	}
 
+	tfrc_sp_update_li_data(ld, h, skb, new_loss, new_event);
+
I don't understand why 'tfrc_sp_update_li_data' is called twice, one call seems
to be redundant. What it seems to be wanting to do is
	bool new_loss = false;

	//...
	} else if (__two_after_loss(h, skb, ndp)) {
		new_loss  = true;
		new_event = tfrc_sp_lh_interval_add(...);
		// ...
	}
	// ...
	tfrc_sp_update_li_data(ld, h, skb, new_loss, new_event);
	


--- dccp_tree_work5.orig/net/dccp/ccids/lib/loss_interval_sp.c
+++ dccp_tree_work5/net/dccp/ccids/lib/loss_interval_sp.c
 
+void tfrc_sp_ld_cleanup(struct tfrc_loss_data *ld)
+{
+	struct tfrc_loss_data_entry *next, *h = ld->head;
+
+	if (!h)
+		return;
+
+	while (h) {
+		next = h->next;
+		kmem_cache_free(tfrc_ld_slab, h);
+		h = next;
+	}
+
+	ld->head = NULL;
+	ld->counter = 0;
+}
The if(!h) statement is not needed above and prevents resetting 
the head/counter values.


+void tfrc_sp_ld_prepare_data(u8 loss_count, struct tfrc_loss_data *ld)
+{
+	u8 *li_ofs, *d_ofs;
+	struct tfrc_loss_data_entry *e;
+	u16 count;
+
+	li_ofs = &ld->loss_intervals_opts[0];
+	d_ofs = &ld->drop_opts[0];
+
+	count = 0;
+	e = ld->head;
+
+	*li_ofs = loss_count + 1;
+	li_ofs++;
+
+	while (e != NULL) {
+
+		if (count < TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH) {
+			*li_ofs = ((htonl(e->lossless_length) & 0x00FFFFFF)<<8);
+			li_ofs += 3;
+			*li_ofs = ((e->ecn_nonce_sum&0x1) << 31) &
+				  (htonl((e->loss_length & 0x00FFFFFF))<<8);
==> This does not seem right: mixing htonl with non-htonl data, using '&' where
    probably '|' is meant.
+			li_ofs += 3;
+			*li_ofs = ((htonl(e->data_length) & 0x00FFFFFF)<<8);
==> I think you mean "e->data_length & 0xFFFFFF".

+			li_ofs += 3;
+		}
+
+		if (count < TFRC_DROP_OPT_MAX_LENGTH) {
+			*d_ofs = (htonl(e->drop_count) & 0x00FFFFFF)<<8;
+			d_ofs += 3;
+		}
+
+		if ((count >= TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH) &&
+		    (count >= TFRC_DROP_OPT_MAX_LENGTH))
+			break;
==> This could be better handled using 'else' cases above, it seems the '&&' means '||'.
+
+		count++;
+		e = e->next;
+	}
+}

+void tfrc_sp_update_li_data(struct tfrc_loss_data *ld,
+			    struct tfrc_rx_hist *rh,
+			    struct sk_buff *skb,
+			    bool new_loss, bool new_event)
+{
+	struct tfrc_loss_data_entry *new, *h;
+
+	if (!dccp_data_packet(skb))
+		return;
+
+	if (ld->head == NULL) {
+		new = tfrc_ld_add_new(ld);
+		if (unlikely(new == NULL)) {
+			DCCP_CRIT("Cannot allocate new loss data registry.");
+			return;
+		}
+
+		if (new_loss) {
+			new->drop_count = rh->num_losses;
+			new->lossless_length = 1;
+			new->loss_length = rh->num_losses;
+
+			if (dccp_data_packet(skb))
+				new->data_length = 1;
+
+			if (dccp_data_packet(skb) && dccp_skb_is_ecn_ect1(skb))
+				new->ecn_nonce_sum = 1;
+			else
+				new->ecn_nonce_sum = 0;
+		} else {
+			new->drop_count = 0;
+			new->lossless_length = 1;
+			new->loss_length = 0;
+
+			if (dccp_data_packet(skb))
+				new->data_length = 1;
+
+			if (dccp_data_packet(skb) && dccp_skb_is_ecn_ect1(skb))
+				new->ecn_nonce_sum = 1;
+			else
+				new->ecn_nonce_sum = 0;
+		}
+
+		return;
+	}
+
+	if (new_event) {
+		new = tfrc_ld_add_new(ld);
+		if (unlikely(new == NULL)) {
+			DCCP_CRIT("Cannot allocate new loss data registry. \
+					Cleaning up.");
+			tfrc_sp_ld_cleanup(ld);
+			return;
+		}
+
+		new->drop_count = rh->num_losses;
+		new->lossless_length = (ld->last_loss_count - rh->loss_count);
+		new->loss_length = rh->num_losses;
+
+		new->ecn_nonce_sum = 0;
+		new->data_length = 0;
+
+		while (ld->last_loss_count > rh->loss_count) {
+			ld->last_loss_count--;
+
+			if (ld->sto_is_data & (1 << (ld->last_loss_count))) {
+				new->data_length++;
+
+				if (ld->sto_ecn & (1 << (ld->last_loss_count)))
+					new->ecn_nonce_sum =
+						!new->ecn_nonce_sum;
+			}
+		}
+
+		return;
+	}
+
+	h = ld->head;
+
+	if (rh->loss_count > ld->last_loss_count) {
+		ld->last_loss_count = rh->loss_count;
+
+		if (dccp_data_packet(skb))
+			ld->sto_is_data |= (1 << (ld->last_loss_count - 1));
+
+		if (dccp_skb_is_ecn_ect1(skb))
+			ld->sto_ecn |= (1 << (ld->last_loss_count - 1));
+
+		return;
+	}
+
+	if (new_loss) {
+		h->drop_count += rh->num_losses;
+		h->lossless_length = (ld->last_loss_count - rh->loss_count);
+		h->loss_length += h->lossless_length + rh->num_losses;
+
+		h->ecn_nonce_sum = 0;
+		h->data_length = 0;
+
+		while (ld->last_loss_count > rh->loss_count) {
+			ld->last_loss_count--;
+
+			if (ld->sto_is_data&(1 << (ld->last_loss_count))) {
+				h->data_length++;
+
+				if (ld->sto_ecn & (1 << (ld->last_loss_count)))
+					h->ecn_nonce_sum = !h->ecn_nonce_sum;
+			}
+		}
+
+		return;
+	}
+
+	if (ld->last_loss_count > rh->loss_count) {
+		while (ld->last_loss_count > rh->loss_count) {
+			ld->last_loss_count--;
+
+			h->lossless_length++;
+
+			if (ld->sto_is_data & (1 << (ld->last_loss_count))) {
+				h->data_length++;
+
+				if (ld->sto_ecn & (1 << (ld->last_loss_count)))
+					h->ecn_nonce_sum = !h->ecn_nonce_sum;
+			}
+		}
+
+		return;
+	}
+
+	h->lossless_length++;
+
+	if (dccp_data_packet(skb)) {
+		h->data_length++;
+
+		if (dccp_skb_is_ecn_ect1(skb))
+			h->ecn_nonce_sum = !h->ecn_nonce_sum;
+	}
+}
Due to the 
       if (!dccp_data_packet(skb))
               return;
at the begin of the function, all subsequent 'dccp_data_packet(skb)' are unnecessary.
Almost every 'if' statement ends in 'return', this seems ad-hoc and could be reduced
by adding if-else-if-else-if..., which would probably also reduce the code duplication.


@@ -244,8 +463,11 @@
 	tfrc_lh_slab = kmem_cache_create("tfrc_sp_li_hist",
 					 sizeof(struct tfrc_loss_interval), 0,
 					 SLAB_HWCACHE_ALIGN, NULL);
+	tfrc_ld_slab = kmem_cache_create("tfrc_sp_li_data",
+					 sizeof(struct tfrc_loss_data_entry), 0,
+					 SLAB_HWCACHE_ALIGN, NULL);
 
-	if ((tfrc_lh_slab != NULL))
+	if ((tfrc_lh_slab != NULL) || (tfrc_ld_slab != NULL))
 		return 0;
 
 	if (tfrc_lh_slab != NULL) {
@@ -253,6 +475,11 @@
 		tfrc_lh_slab = NULL;
 	}
 
+	if (tfrc_ld_slab != NULL) {
+		kmem_cache_destroy(tfrc_ld_slab);
+		tfrc_ld_slab = NULL;
+	}
+
 	return -ENOBUFS;
 }
The condition above should be '&&', not '||'. Suggested alternative:

+	if (tfrc_lh_slab == NULL)
+		goto lh_failed;
+
+	tfrc_ld_slab = kmem_cache_create("tfrc_sp_li_data",
+					 sizeof(struct tfrc_loss_data_entry), 0,
+					 SLAB_HWCACHE_ALIGN, NULL);
+	if (tfrc_ld_slab != NULL)
+		return 0;
+
+	kmem_cache_destroy(tfrc_lh_slab);
+	tfrc_lh_slab = NULL;
+lh_failed:
+	return -ENOBUFS;
 }
 



--- dccp_tree_work5.orig/net/dccp/ccids/lib/loss_interval_sp.h
+++ dccp_tree_work5/net/dccp/ccids/lib/loss_interval_sp.h
@@ -70,13 +70,52 @@
 struct tfrc_rx_hist;
 #endif
 
+struct tfrc_loss_data_entry {
+	struct tfrc_loss_data_entry	*next;
+	u32				lossless_length:24;
+	u8				ecn_nonce_sum:1;
+	u32				loss_length:24;
+	u32				data_length:24;
+	u32				drop_count:24;
+};
According to RFC 4342, 8.6.1, Loss Length is a 23-bit number, i.e.
	u32				loss_length:23;




+#define TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH	28
+#define TFRC_DROP_OPT_MAX_LENGTH		84
+#define TFRC_LI_OPT_SZ	\
+	(2 + TFRC_LOSS_INTERVALS_OPT_MAX_LENGTH*9)
+#define TFRC_DROPPED_OPT_SZ \
+	(1 + TFRC_DROP_OPT_MAX_LENGTH*3)
It would be good to have a reminder where the numbers come from, i.e.
 * the "28" comes from RFC 4342, 8.6
 * the "84" comes from RFC 5622, 8.7
 * the "9" again is from RFC 4342, 8.6
 * the 1 + TFRC_DROP_OPT_MAX_LENGTH*3 = DCCP_SINGLE_OPT_MAXLEN (linux/dccp.h)


+struct tfrc_loss_data {
+	struct tfrc_loss_data_entry	*head;
+	u16				counter;
+	u8				loss_intervals_opts[TFRC_LI_OPT_SZ];
+	u8				drop_opts[TFRC_DROPPED_OPT_SZ];
+	u8				last_loss_count;
+	u8				sto_ecn;
+	u8				sto_is_data;
+};



+static inline void tfrc_ld_init(struct tfrc_loss_data *ld)
+{
+	memset(ld, 0, sizeof(struct tfrc_loss_data));
+}
A tip from CodingStyle - using "sizeof(*ld)" will continue to work if there
are changes in the interface.


--- dccp_tree_work5.orig/net/dccp/dccp.h
+++ dccp_tree_work5/net/dccp/dccp.h
@@ -403,6 +403,16 @@
 	return (DCCP_SKB_CB(skb)->dccpd_ecn & INET_ECN_MASK) == INET_ECN_CE;
 }
 
+static inline bool dccp_skb_is_ecn_ect0(const struct sk_buff *skb)
+{
+	return (DCCP_SKB_CB(skb)->dccpd_ecn & INET_ECN_MASK) == INET_ECN_ECT_0;
+}
+
+static inline bool dccp_skb_is_ecn_ect1(const struct sk_buff *skb)
+{
+	return (DCCP_SKB_CB(skb)->dccpd_ecn & INET_ECN_MASK) == INET_ECN_ECT_0;
+}
The routines are not needed, because the dccpd_ecn field is (deliberately) only
2 bits wide. In the second case it should have been INET_ECN_ECT_1.

^ permalink raw reply

* Re: [PATCH 3/5] Implement TFRC-SP calc of mean length of loss intervals, accordingly to section 3 of RFC 4828
From: Gerrit Renker @ 2009-09-13 17:28 UTC (permalink / raw)
  To: Ivo Calado; +Cc: dccp, netdev
In-Reply-To: <4AA6A263.8000106@embedded.ufcg.edu.br>

| Implement TFRC-SP calc of mean length of loss intervals accordingly to
| section 3 of RFC 4828
Sorry this also has problems.

--- dccp_tree_work5.orig/net/dccp/ccids/lib/loss_interval_sp.c
+++ dccp_tree_work5/net/dccp/ccids/lib/loss_interval_sp.c
@@ -67,10 +67,11 @@
 		}
 }
 
-static void tfrc_sp_lh_calc_i_mean(struct tfrc_loss_hist *lh)
+static void tfrc_sp_lh_calc_i_mean(struct tfrc_loss_hist *lh, __u8 curr_ccval)
 {
 	u32 i_i, i_tot0 = 0, i_tot1 = 0, w_tot = 0;
 	int i, k = tfrc_lh_length(lh) - 1; /* k is as in rfc3448bis, 5.4 */
+	u32 losses;
 
 	if (k <= 0)
 		return;
@@ -78,6 +79,15 @@
 	for (i = 0; i <= k; i++) {
 		i_i = tfrc_lh_get_interval(lh, i);
 
+		if (SUB16(curr_ccval,
+		    tfrc_lh_get_loss_interval(lh, i)->li_ccval) <= 8) {
+
+			losses = tfrc_lh_get_loss_interval(lh, i)->li_losses;
+
+			if (losses > 0)
+				i_i = div64_u64(i_i, losses);
+		}
+
(Both 'i_i' and 'losses' are u32, so could write "i_i /= losses" instead.)

However, this computation is done in the wrong place: here it is already too
late. The N/K in RFC 4828 refers to entering each individual interval I_0, so
the division (and the check whether this is a "short" interval) needs to be
done when adding the interval, in tfrc_sp_lh_interval_add(). (There also exists
a special rule for the open interval I_0, see RFC 5348, 5.3.)

A second problem is a divide-by-zero condition encoded in the above: if
i_i < losses, the result is 0, if all intervals are 0 then I_mean = 0, and
thus p = 1/I_mean triggers a panic. In all other cases, the integer arithmetic
has the effect of floor(N/K), i.e. it decreases the interval length.

A way out (which does not fix the truncation problem) is to round up:

	if (losses > 0)
		i_i = DIV_ROUND_UP(i_i, losses);

In fact, we have to do this, to avoid the divide-by-zero condition.

The third problem is that the CCVal can be wrong, i.e. "less than 8" can
also mean that it just wrapped around one or more times. But this is noted
already in RFC 5622, it is a problem of the specification.

@@ -87,6 +97,11 @@
 	}
 
 	lh->i_mean = max(i_tot0, i_tot1) / w_tot;
+	BUG_ON(w_tot == 0);
+	if (SUB16(curr_ccval, tfrc_lh_get_loss_interval(lh, 0)->li_ccval) > 8)
+		lh->i_mean = max(i_tot0, i_tot1) / w_tot;
+	else
+		lh->i_mean = i_tot1 / w_tot;
 }
The line above the BUG_ON is probably a leftover. For the BUG_ON, this would
be too late (division by zero). 
In fact, the BUG_ON is not needed, due to testing for k <= 0, see
http://mirror.celinuxforum.org/gitstat//commit-detail.php?commit=eff253c4272cd2aac95ccff46d3d2e1a495f22b1

@@ -127,7 +142,7 @@
 		return;
 
 	cur->li_length = len;
-	tfrc_sp_lh_calc_i_mean(lh);
+	tfrc_sp_lh_calc_i_mean(lh, dccp_hdr(skb)->dccph_ccval);
 }
Should be division as per above.

--- dccp_tree_work5.orig/net/dccp/ccids/lib/loss_interval_sp.h
+++ dccp_tree_work5/net/dccp/ccids/lib/loss_interval_sp.h
@@ -73,7 +73,8 @@
 extern bool tfrc_sp_lh_interval_add(struct tfrc_loss_hist *,
 				    struct tfrc_rx_hist *,
 				    u32 (*first_li)(struct sock *),
-				    struct sock *);
+				    struct sock *,
+				    __u8 ccval);
This function already has the ccval available, so could use it instead
of just passing it through.

^ permalink raw reply

* Re: [PATCH 2/5] Implement loss counting on TFRC-SP receiver
From: Gerrit Renker @ 2009-09-13 16:12 UTC (permalink / raw)
  To: Ivo Calado; +Cc: dccp, netdev
In-Reply-To: <4AA6A25F.4060100@embedded.ufcg.edu.br>

| Implement loss counting on TFRC-SP receiver. 
| Consider transmission's hole size as loss count.
The implementation of loss counts as the basis for Dropped Packet option has
problems and can in its current form not be used. Please see below.


--- dccp_tree_work4.orig/net/dccp/ccids/lib/loss_interval_sp.c
+++ dccp_tree_work4/net/dccp/ccids/lib/loss_interval_sp.c
@@ -187,6 +187,7 @@
 		s64 len = dccp_delta_seqno(cur->li_seqno, cong_evt_seqno);
 		if ((len <= 0) ||
 		    (!tfrc_lh_closed_check(cur, cong_evt->tfrchrx_ccval))) {
+			cur->li_losses += rh->num_losses;
 			return false;
 		}
This has a multiplicative effect, since rh->num_losses is added to cur->li_losses
each time the condition is evaluated. E.g. if 3 times in a row reordered (earlier)
sequence numbers arrive, or if the CCvals do not differ (high-speed networks),
we end up with 3 * rh->num_losses, which can't be correct.



--- dccp_tree_work4.orig/net/dccp/ccids/lib/packet_history_sp.c
+++ dccp_tree_work4/net/dccp/ccids/lib/packet_history_sp.c
@@ -244,6 +244,7 @@
 		h->loss_count = 3;
 		tfrc_sp_rx_hist_entry_from_skb(tfrc_rx_hist_entry(h, 3),
 					       skb, n3);
+		h->num_losses = dccp_loss_count(s2, s3, n3);
 		return 1;
 	}
This only measures the gap between s2 and s3, but the "hole" starts at s0,
so it would need to be dccp_loss_count(s0, s3, n3). Algorithm is documented at
http://www.erg.abdn.ac.uk/users/gerrit/dccp/notes/\
ccid3/ccid3_packet_reception/loss_detection/loss_detection_algorithm_notes.txt
 
@@ -257,6 +258,7 @@
 		tfrc_sp_rx_hist_entry_from_skb(tfrc_rx_hist_entry(h, 2),
 					       skb, n3);
 		h->loss_count = 3;
+		h->num_losses = dccp_loss_count(s1, s3, n3);
 		return 1;
 	}
In this case, the "hole" is between s0 and s2, it is case VI(d) in the above.
 
@@ -293,6 +295,7 @@
 	h->loss_start = tfrc_rx_hist_index(h, 3);
 	tfrc_sp_rx_hist_entry_from_skb(tfrc_rx_hist_entry(h, 1), skb, n3);
 	h->loss_count = 3;
+	h->num_losses = dccp_loss_count(s0, s3, n3);
 
 	return 1;
 }
Here it is also between s0 and s2, not between s0 and s3. It is case VI(c.3).

However, the above still is a crude approximation, since it only measures between
the last sequence number received before the loss and the third sequence number
after the loss. It would be better to either

 * use the first sequence number after the loss (this can be s1, s2, or s3) or
 * check if there are more holes between the first/second and the second/third
   sequence numbers after the loss.

The second option would be the correct one, it should also take the NDP counts
of each gap into account. And already we have a fairly complicated algorithm.
 		  
Another observation is that this function is only called in packet_history_sp.c,
and only in __two_after_loss(), so that dccp_loss_count() could be made static,
and without the need for the WARN_ON (see below), since in all above cases it is
ensured that the first sequence number argument is "before" the second one.


--- dccp_tree_work4.orig/net/dccp/ccids/lib/packet_history_sp.h
+++ dccp_tree_work4/net/dccp/ccids/lib/packet_history_sp.h
@@ -113,6 +113,7 @@
 	u32			  packet_size,
 				  bytes_recvd;
 	ktime_t			  bytes_start;
+	u8			  num_losses;
 };
No more than 255 losses? NDP count option has space for up to 6 bytes, i.e. 2^48-1.
Suggest u64 for consistency with the other parts.
 

--- dccp_tree_work4.orig/net/dccp/dccp.h
+++ dccp_tree_work4/net/dccp/dccp.h
@@ -168,6 +168,21 @@
 	return (u64)delta <= ndp + 1;
 }
 
+static inline u64 dccp_loss_count(const u64 s1, const u64 s2, const u64 ndp)
+{
+	s64 delta, count;
+
+	delta = dccp_delta_seqno(s1, s2);
+	WARN_ON(delta < 0);
+
+	count = ndp + 1;
+	count -= delta;
+
+	count = (count > 0) ? count : 0;
+
+	return (u64) count;
+}
Let S_A, S_B be sequence numbers such that S_B is "after" S_A, and let
N_B be the NDP count of packet S_B. Then, using module-2^48 arithmetic,
 D = S_B - S_A - 1  is an upper bound of the number of lost data packets,
 D - N_B            is an approximation of the number of lost data 
                    packets (there are cases where this is not exact).

But your formula computes N_B - D, i.e. it negates this value. What you
want is dccp_loss_count(S_A, S_B, N_B) := max(S_B - S_A - 1 - N_B, 0).

A short implementation would be:

/**
 * dccp_loss_count - Approximate the number of data packets lost in a row
 * @s1:   last known sequence number before the loss ('hole')
 * @s2:   first sequence number seen after the 'hole'
 * @ndp:  ndp count associated with packet having sequence number @s2
 */
u64 dccp_loss_count(const u64 s1, const u64 s2, const u64 ndp)
{
	s64 delta = dccp_delta_seqno(s1, s2);
	
	WARN_ON(delta < 0);
	delta -= ndp + 1;

	return delta > 0 ? delta : 0;
}

But then dccp_loss_free reduces to a specialisation of the above:
bool dccp_loss_free(const u64 s1, const u64 s2, const u64 ndp)
{
	return dccp_loss_count(s1, s2, ndp) == 0;
}

But please see above -- the function needs to be called for each hole in a row.

^ permalink raw reply

* INFO: possible circular locking detected [when unloading the bonding driver]
From: Alan Jenkins @ 2009-09-13 12:26 UTC (permalink / raw)
  To: Jay Vosburgh; +Cc: linux-kernel, netdev, bonding-devel

I found this warning after loading and unloading all the modules I had
available.  I can reproduce it by hibernating the system, then loading
and unloading the bonding module. 

Regards
Alan

[  123.484225] Ethernet Channel Bonding Driver: v3.5.0 (November 4, 2008)
[  123.484229] bonding: Warning: either miimon or arp_interval and
arp_ip_target module parameters must be specified, otherwise bonding
will not detect link failures! see bonding.txt for details.
[  125.790929]
[  125.790931] =======================================================
[  125.790936] [ INFO: possible circular locking dependency detected ]
[  125.790940] 2.6.31-rc8debug #50
[  125.790942] -------------------------------------------------------
[  125.790946] modprobe/8372 is trying to acquire lock:
[  125.790948]  (cpu_add_remove_lock){+.+.+.}, at: [<ffffffff81048ef2>]
cpu_maps_update_begin+0x12/0x20
[  125.790961]
[  125.790962] but task is already holding lock:
[  125.790965]  (rtnl_mutex){+.+.+.}, at: [<ffffffff812d0aa2>]
rtnl_lock+0x12/0x20
[  125.790974]
[  125.790975] which lock already depends on the new lock.
[  125.790976]
[  125.790978]
[  125.790979] the existing dependency chain (in reverse order) is:
[  125.790982]
[  125.790983] -> #3 (rtnl_mutex){+.+.+.}:
[  125.790989]        [<ffffffff810741d9>] validate_chain+0xb99/0x11d0
[  125.790996]        [<ffffffff81074b6e>] __lock_acquire+0x35e/0xa30
[  125.791001]        [<ffffffff810752d7>] lock_acquire+0x97/0x110
[  125.791006]        [<ffffffff8134f046>] mutex_lock_nested+0x56/0x310
[  125.791012]        [<ffffffff812d0aa2>] rtnl_lock+0x12/0x20
[  125.791016]        [<ffffffff812d2079>] linkwatch_event+0x9/0x40
[  125.791021]        [<ffffffff8105cafa>] worker_thread+0x23a/0x350
[  125.791027]        [<ffffffff81060f56>] kthread+0xa6/0xb0
[  125.791032]        [<ffffffff8100ce5a>] child_rip+0xa/0x20
[  125.791037]        [<ffffffffffffffff>] 0xffffffffffffffff
[  125.791051]
[  125.791052] -> #2 ((linkwatch_work).work){+.+.+.}:
[  125.791058]        [<ffffffff810741d9>] validate_chain+0xb99/0x11d0
[  125.791063]        [<ffffffff81074b6e>] __lock_acquire+0x35e/0xa30
[  125.791068]        [<ffffffff810752d7>] lock_acquire+0x97/0x110
[  125.791073]        [<ffffffff8105caf4>] worker_thread+0x234/0x350
[  125.791077]        [<ffffffff81060f56>] kthread+0xa6/0xb0
[  125.791082]        [<ffffffff8100ce5a>] child_rip+0xa/0x20
[  125.791087]        [<ffffffffffffffff>] 0xffffffffffffffff
[  125.791092]
[  125.791093] -> #1 (events){+.+.+.}:
[  125.791098]        [<ffffffff810741d9>] validate_chain+0xb99/0x11d0
[  125.791104]        [<ffffffff81074b6e>] __lock_acquire+0x35e/0xa30
[  125.791108]        [<ffffffff810752d7>] lock_acquire+0x97/0x110
[  125.791114]        [<ffffffff8105b758>]
cleanup_workqueue_thread+0x48/0xd0
[  125.791119]        [<ffffffff8133dffa>] workqueue_cpu_callback+0x9a/0x130
[  125.791125]        [<ffffffff81353dce>] notifier_call_chain+0x5e/0xa0
[  125.791130]        [<ffffffff81065549>]
__raw_notifier_call_chain+0x9/0x10
[  125.791136]        [<ffffffff81065561>] raw_notifier_call_chain+0x11/0x20
[  125.791141]        [<ffffffff8133c213>] _cpu_down+0x173/0x2b0
[  125.791145]        [<ffffffff81049148>] disable_nonboot_cpus+0x98/0x130
[  125.791150]        [<ffffffff81081f97>] hibernation_snapshot+0x187/0x250
[  125.791156]        [<ffffffff81082178>] hibernate+0x118/0x1c0
[  125.791160]        [<ffffffff81080863>] state_store+0xd3/0xf0
[  125.791165]        [<ffffffff81199fe7>] kobj_attr_store+0x17/0x20
[  125.791172]        [<ffffffff8113e7b4>] sysfs_write_file+0xf4/0x150
[  125.791178]        [<ffffffff810e3557>] vfs_write+0xc7/0x1a0
[  125.791184]        [<ffffffff810e3ba0>] sys_write+0x50/0x90
[  125.791189]        [<ffffffff8100bdab>] system_call_fastpath+0x16/0x1b
[  125.791194]        [<ffffffffffffffff>] 0xffffffffffffffff
[  125.791205]
[  125.791205] -> #0 (cpu_add_remove_lock){+.+.+.}:
[  125.791211]        [<ffffffff81073c0d>] validate_chain+0x5cd/0x11d0
[  125.791216]        [<ffffffff81074b6e>] __lock_acquire+0x35e/0xa30
[  125.791221]        [<ffffffff810752d7>] lock_acquire+0x97/0x110
[  125.791225]        [<ffffffff8134f046>] mutex_lock_nested+0x56/0x310
[  125.791230]        [<ffffffff81048ef2>] cpu_maps_update_begin+0x12/0x20
[  125.791235]        [<ffffffff8105b842>] destroy_workqueue+0x22/0xc0
[  125.791240]        [<ffffffffa03afdfa>] bond_uninit+0x3a/0x70 [bonding]
[  125.791252]        [<ffffffff812c5e8d>] rollback_registered+0xbd/0x120
[  125.791258]        [<ffffffff812c5f0d>] unregister_netdevice+0x1d/0x70
[  125.791262]        [<ffffffffa03ad71f>] bond_free_all+0x4f/0xa0 [bonding]
[  125.791273]        [<ffffffffa03ba075>] bonding_exit+0x35/0x3c [bonding]
[  125.791283]        [<ffffffff8107ddc6>] sys_delete_module+0x176/0x230
[  125.791288]        [<ffffffff8100bdab>] system_call_fastpath+0x16/0x1b
[  125.791293]        [<ffffffffffffffff>] 0xffffffffffffffff
[  125.791298]
[  125.791299] other info that might help us debug this:
[  125.791300]
[  125.791304] 1 lock held by modprobe/8372:
[  125.791306]  #0:  (rtnl_mutex){+.+.+.}, at: [<ffffffff812d0aa2>]
rtnl_lock+0x12/0x20
[  125.791314]
[  125.791315] stack backtrace:
[  125.791319] Pid: 8372, comm: modprobe Not tainted 2.6.31-rc8debug #50
[  125.791322] Call Trace:
[  125.791328]  [<ffffffff81073494>] print_circular_bug_tail+0x94/0xe0
[  125.791333]  [<ffffffff81073c0d>] validate_chain+0x5cd/0x11d0
[  125.791338]  [<ffffffff81074b6e>] __lock_acquire+0x35e/0xa30
[  125.791343]  [<ffffffff810752d7>] lock_acquire+0x97/0x110
[  125.791349]  [<ffffffff81048ef2>] ? cpu_maps_update_begin+0x12/0x20
[  125.791354]  [<ffffffff8134f046>] mutex_lock_nested+0x56/0x310
[  125.791358]  [<ffffffff81048ef2>] ? cpu_maps_update_begin+0x12/0x20
[  125.791363]  [<ffffffff8113f974>] ? release_sysfs_dirent+0x54/0xc0
[  125.791368]  [<ffffffff81048ef2>] cpu_maps_update_begin+0x12/0x20
[  125.791373]  [<ffffffff8105b842>] destroy_workqueue+0x22/0xc0
[  125.791383]  [<ffffffffa03afdfa>] bond_uninit+0x3a/0x70 [bonding]
[  125.791388]  [<ffffffff812c5e8d>] rollback_registered+0xbd/0x120
[  125.791397]  [<ffffffffa03abfb6>] ? bond_work_cancel_all+0x16/0x130
[bonding]
[  125.791402]  [<ffffffff812c5f0d>] unregister_netdevice+0x1d/0x70
[  125.791411]  [<ffffffffa03ad71f>] bond_free_all+0x4f/0xa0 [bonding]
[  125.791421]  [<ffffffffa03ba075>] bonding_exit+0x35/0x3c [bonding]
[  125.791425]  [<ffffffff8107ddc6>] sys_delete_module+0x176/0x230
[  125.791430]  [<ffffffff8107186d>] ? trace_hardirqs_on_caller+0x14d/0x1a0
[  125.791435]  [<ffffffff81350a04>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[  125.791441]  [<ffffffff8100bdab>] system_call_fastpath+0x16/0x1b


^ permalink raw reply

* Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
From: Michael S. Tsirkin @ 2009-09-13 12:01 UTC (permalink / raw)
  To: Gregory Haskins
  Cc: Ira W. Snyder, netdev, virtualization, kvm, linux-kernel, mingo,
	linux-mm, akpm, hpa, Rusty Russell, s.hetze
In-Reply-To: <4AAA7415.5080204@gmail.com>

On Fri, Sep 11, 2009 at 12:00:21PM -0400, Gregory Haskins wrote:
> FWIW: VBUS handles this situation via the "memctx" abstraction.  IOW,
> the memory is not assumed to be a userspace address.  Rather, it is a
> memctx-specific address, which can be userspace, or any other type
> (including hardware, dma-engine, etc).  As long as the memctx knows how
> to translate it, it will work.

How would permissions be handled? it's easy to allow an app to pass in
virtual addresses in its own address space.  But we can't let the guest
specify physical addresses.

-- 
MST

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* [PATCH  kernel 2.6.31] pcnet_cs: add cis of Linksys multifunction pcmcia card
From: Ken Kawasaki @ 2009-09-13  8:22 UTC (permalink / raw)
  To: netdev; +Cc: ken_kawasaki
In-Reply-To: <20090426162929.3d036f26.ken_kawasaki@spring.nifty.jp>


pcnet_cs,serial_cs:
 
add cis of Linksys lan&modem mulitifunction pcmcia card
and some modem card(MT5634ZLX, RS-COM-2P).

  
Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
 
---

for kernel 2.6.31 

 drivers/net/pcmcia/pcnet_cs.c   |   10 +++++-----
 drivers/serial/serial_cs.c      |   14 +++++++-------
 firmware/Makefile               |    3 ++-
 firmware/WHENCE                 |   12 ++++++++++++
 firmware/cis/MT5634ZLX.cis.ihex |   11 +++++++++++
 firmware/cis/PCMLM28.cis.ihex   |   18 ++++++++++++++++++
 firmware/cis/RS-COM-2P.cis.ihex |   10 ++++++++++
 7 files changed, 65 insertions(+), 13 deletions(-)

diff -uprN linux-2.6.31.orig/drivers/net/pcmcia/pcnet_cs.c linux-2.6.31/drivers/net/pcmcia/pcnet_cs.c
--- linux-2.6.31.orig/drivers/net/pcmcia/pcnet_cs.c	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/drivers/net/pcmcia/pcnet_cs.c	2009-09-12 19:59:57.000000000 +0900
@@ -1751,11 +1751,11 @@ static struct pcmcia_device_id pcnet_ids
 	PCMCIA_DEVICE_PROD_ID2("EN-6200P2", 0xa996d078),
 	/* too generic! */
 	/* PCMCIA_DEVICE_PROD_ID12("PCMCIA", "10/100 Ethernet Card", 0x281f1c5d, 0x11b0ffc0), */
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "cis/PCMLM28.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID12(0, "DAYNA COMMUNICATIONS", "LAN AND MODEM MULTIFUNCTION", 0x8fdf8f89, 0xdd5ed9e8, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID4(0, "NSC MF LAN/Modem", 0x58fc6056, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_MANF_CARD(0, 0x0175, 0x0000, "DP83903.cis"),
diff -uprN linux-2.6.31.orig/drivers/serial/serial_cs.c linux-2.6.31/drivers/serial/serial_cs.c
--- linux-2.6.31.orig/drivers/serial/serial_cs.c	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/drivers/serial/serial_cs.c	2009-09-12 20:02:33.000000000 +0900
@@ -868,11 +868,11 @@ static struct pcmcia_device_id serial_id
 	PCMCIA_DEVICE_PROD_ID12("PCMCIA   ", "C336MX     ", 0x99bcafe9, 0xaa25bcab),
 	PCMCIA_DEVICE_PROD_ID12("Quatech Inc", "PCMCIA Dual RS-232 Serial Port Card", 0xc4420b35, 0x92abc92f),
 	PCMCIA_DEVICE_PROD_ID12("Quatech Inc", "Dual RS-232 Serial Port PC Card", 0xc4420b35, 0x031a380d),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "cis/PCMLM28.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID12(1, "DAYNA COMMUNICATIONS", "LAN AND MODEM MULTIFUNCTION", 0x8fdf8f89, 0xdd5ed9e8, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID4(1, "NSC MF LAN/Modem", 0x58fc6056, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_MANF_CARD(1, 0x0101, 0x0556, "cis/3CCFEM556.cis"),
@@ -883,10 +883,10 @@ static struct pcmcia_device_id serial_id
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x0192, 0x0710, "SW_7xx_SER.cis"),	/* Sierra Wireless AC710/AC750 GPRS Network Adapter R1 */
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x0192, 0xa555, "SW_555_SER.cis"),  /* Sierra Aircard 555 CDMA 1xrtt Modem -- pre update */
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x013f, 0xa555, "SW_555_SER.cis"),  /* Sierra Aircard 555 CDMA 1xrtt Modem -- post update */
-	PCMCIA_DEVICE_CIS_PROD_ID12("MultiTech", "PCMCIA 56K DataFax", 0x842047ee, 0xc2efcf03, "MT5634ZLX.cis"),
+	PCMCIA_DEVICE_CIS_PROD_ID12("MultiTech", "PCMCIA 56K DataFax", 0x842047ee, 0xc2efcf03, "cis/MT5634ZLX.cis"),
 	PCMCIA_DEVICE_CIS_PROD_ID12("ADVANTECH", "COMpad-32/85B-4", 0x96913a85, 0xcec8f102, "COMpad4.cis"),
 	PCMCIA_DEVICE_CIS_PROD_ID123("ADVANTECH", "COMpad-32/85", "1.0", 0x96913a85, 0x8fbe92ae, 0x0877b627, "COMpad2.cis"),
-	PCMCIA_DEVICE_CIS_PROD_ID2("RS-COM 2P", 0xad20b156, "RS-COM-2P.cis"),
+	PCMCIA_DEVICE_CIS_PROD_ID2("RS-COM 2P", 0xad20b156, "cis/RS-COM-2P.cis"),
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x0013, 0x0000, "GLOBETROTTER.cis"),
 	PCMCIA_DEVICE_PROD_ID12("ELAN DIGITAL SYSTEMS LTD, c1997.","SERIAL CARD: SL100  1.00.",0x19ca78af,0xf964f42b),
 	PCMCIA_DEVICE_PROD_ID12("ELAN DIGITAL SYSTEMS LTD, c1997.","SERIAL CARD: SL100",0x19ca78af,0x71d98e83),
diff -uprN linux-2.6.31.orig/firmware/cis/MT5634ZLX.cis.ihex linux-2.6.31/firmware/cis/MT5634ZLX.cis.ihex
--- linux-2.6.31.orig/firmware/cis/MT5634ZLX.cis.ihex	1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.31/firmware/cis/MT5634ZLX.cis.ihex	2009-09-12 19:58:04.000000000 +0900
@@ -0,0 +1,11 @@
+:100000000101FF152204014D756C74695465636824
+:100010000050434D4349412035364B2044617461C3
+:10002000466178000000FF20040002010021020266
+:10003000001A05012780FF671B0FCF418B01550177
+:10004000550155AA60F80307281B08970108AA6004
+:10005000F802071B089F0108AA60E803071B08A70E
+:0B0060000108AA60E802071400FF007E
+:00000001FF
+#
+# Replacement CIS for Multitech MT5634ZLX modems
+#
diff -uprN linux-2.6.31.orig/firmware/cis/PCMLM28.cis.ihex linux-2.6.31/firmware/cis/PCMLM28.cis.ihex
--- linux-2.6.31.orig/firmware/cis/PCMLM28.cis.ihex	1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.31/firmware/cis/PCMLM28.cis.ihex	2009-09-12 19:57:29.000000000 +0900
@@ -0,0 +1,18 @@
+:1000000001030000FF151504014C494E4B53595391
+:100010000050434D4C4D3238000000FF2004430196
+:10002000ABC0210200001A05012FF803031B10E4E6
+:1000300001190155E06100031FF8020730FFFF1BA3
+:100040000BA50108E06120031FF802071B0BA601A6
+:1000500008E06140031FF802071B0BA70108E061DD
+:1000600060031FF802071B0BA80108E06100031FD3
+:10007000E803071B0BA90108E06120031FE8030741
+:100080001B0BAA0108E06140031FE803071B0BAB31
+:100090000108E06160031FE803071B0BAC0108E0E7
+:1000A0006100031FE802071B0BAD0108E06120039C
+:1000B0001FE802071B0BAE0108E06140031FE802C6
+:1000C000071B0BAF0108E06160031FE80207140083
+:0200D000FF002F
+:00000001FF
+#
+# The on-card CIS says it is MFC-compliant, but it is not
+#
diff -uprN linux-2.6.31.orig/firmware/cis/RS-COM-2P.cis.ihex linux-2.6.31/firmware/cis/RS-COM-2P.cis.ihex
--- linux-2.6.31.orig/firmware/cis/RS-COM-2P.cis.ihex	1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.31/firmware/cis/RS-COM-2P.cis.ihex	2009-09-12 19:58:12.000000000 +0900
@@ -0,0 +1,10 @@
+:1000000001030000FF1516040150434D4349410010
+:1000100052532D434F4D203250000000FF21020269
+:10002000011A0501030001011B0EC18118AA61E834
+:100030000307E8020730B89E1B0B820108AA615033
+:1000400002075802071B0B830108AA6160020768B8
+:0600500002071400FF008E
+:00000001FF
+#
+# Replacement CIS for dual-serial-port IO card
+#
diff -uprN linux-2.6.31.orig/firmware/Makefile linux-2.6.31/firmware/Makefile
--- linux-2.6.31.orig/firmware/Makefile	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/firmware/Makefile	2009-09-12 20:12:11.000000000 +0900
@@ -47,9 +47,10 @@ fw-shipped-$(CONFIG_DVB_TTUSB_BUDGET) +=
 fw-shipped-$(CONFIG_E100) += e100/d101m_ucode.bin e100/d101s_ucode.bin \
 			     e100/d102e_ucode.bin
 fw-shipped-$(CONFIG_MYRI_SBUS) += myricom/lanai.bin
-fw-shipped-$(CONFIG_PCMCIA_PCNET) += cis/LA-PCM.cis
+fw-shipped-$(CONFIG_PCMCIA_PCNET) += cis/LA-PCM.cis cis/PCMLM28.cis
 fw-shipped-$(CONFIG_PCMCIA_3C589) += cis/3CXEM556.cis
 fw-shipped-$(CONFIG_PCMCIA_3C574) += cis/3CCFEM556.cis
+fw-shipped-$(CONFIG_SERIAL_8250_CS) += cis/MT5634ZLX.cis cis/RS-COM-2P.cis
 fw-shipped-$(CONFIG_PCMCIA_SMC91C92) += ositech/Xilinx7OD.bin
 fw-shipped-$(CONFIG_SCSI_ADVANSYS) += advansys/mcode.bin advansys/38C1600.bin \
 				      advansys/3550.bin advansys/38C0800.bin
diff -uprN linux-2.6.31.orig/firmware/WHENCE linux-2.6.31/firmware/WHENCE
--- linux-2.6.31.orig/firmware/WHENCE	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/firmware/WHENCE	2009-09-12 20:09:40.000000000 +0900
@@ -579,6 +579,7 @@ Found in hex form in kernel source.
 Driver: PCMCIA_PCNET - NE2000 compatible PCMCIA adapter
 
 File: cis/LA-PCM.cis
+      cis/PCMLM28.cis
 
 Licence: GPL
 
@@ -606,6 +607,17 @@ Originally developed by the pcmcia-cs pr
 
 --------------------------------------------------------------------------
 
+Driver: SERIAL_8250_CS - Serial PCMCIA adapter
+
+File: cis/MT5634ZLX.cis
+      cis/RS-COM-2P.cis
+
+Licence: GPL
+
+Originally developed by the pcmcia-cs project
+
+--------------------------------------------------------------------------
+
 Driver: PCMCIA_SMC91C92 - SMC 91Cxx PCMCIA
 
 File: ositech/Xilinx7OD.bin


^ permalink raw reply

* [PATCH net-next-2.6] Have atalk_route_packet() return NET_RX_SUCCESS not NET_XMIT_SUCCESS
From: Mark Smith @ 2009-09-13  6:48 UTC (permalink / raw)
  To: netdev, acme, davem

Have atalk_route_packet() return NET_RX_SUCCESS not NET_XMIT_SUCCESS

atalk_route_packet() returns NET_RX_DROP if it's call to
aarp_send_ddp() returns NET_XMIT_DROP. If aarp_send_ddp() returns
anything else atalk_route_packet() should return NET_RX_SUCCESS, not
NET_XMIT_SUCCESS.

Signed-off-by: Mark Smith <markzzzsmith@yahoo.com.au>

diff --git a/net/appletalk/ddp.c b/net/appletalk/ddp.c
index 4a6ff2b..b1a4290 100644
--- a/net/appletalk/ddp.c
+++ b/net/appletalk/ddp.c
@@ -1372,7 +1372,7 @@ static int atalk_route_packet(struct sk_buff *skb, struct net_device *dev,
 
 	if (aarp_send_ddp(rt->dev, skb, &ta, NULL) == NET_XMIT_DROP)
 		return NET_RX_DROP;
-	return NET_XMIT_SUCCESS;
+	return NET_RX_SUCCESS;
 free_it:
 	kfree_skb(skb);
 drop:

^ permalink raw reply related

* Re: [net-next PATCH] etherdevice.h: random_ether_addr update
From: Mark Smith @ 2009-09-13  6:39 UTC (permalink / raw)
  To: Joe Perches
  Cc: Stephen Hemminger, David Miller, jeffrey.t.kirsher, netdev, gospo,
	gregory.v.rose, donald.c.skidmore
In-Reply-To: <1252822152.4400.108.camel@Joe-Laptop.home>

On Sat, 12 Sep 2009 23:09:12 -0700
Joe Perches <joe@perches.com> wrote:

> On Sun, 2009-09-13 at 13:17 +0930, Mark Smith wrote:
> > On Sat, 12 Sep 2009 17:44:46 -0700
> > Joe Perches <joe@perches.com> wrote:
> > > Avoiding an initial octet of "02", which is partially
> > > assigned to 3Com and others, might be useful.
> > I wouldn't necessarily disagree. I would say that if that path was
> > taken, then you'd probably also want to be avoiding all the other
> > well known mac addresses that do or can fall within the locally
> > assigned range e.g. DECnet 0xAA addresses, Microsoft's  use of
> > 02:01:00:00:00:00 and similar addresses for their Network Load
> > Balancing software, the unicast version of the CF:00:00:00:00:00
> > multicast address use for ECTP, the unicast version of the
> > 33:33:xx:xx:xx:xx IPv6 ND multicast ranges etc.
> 
> The existing code already has the first wire bit cleared so it
> is not multicast

Agreed. However I think that if there is a well-known multicast address
that has 0x02 set, there is a future, slight possibility that unicast
addresses might be assigned out of that same LA space, and so they're
worth avoiding. It probably seems a bit paranoid, but LAs are supposed
to be private use only in the first place, and not supposed to be seen
outside of the organisation or entity assigning them (e.g. Microsoft
should have got an OUI to use with their NLB product). If people are
crossing those privacy boundaries with LA multicast addresses, I'd
suggest they might be willing to do it with LA unicast addresses in the
future too - and hopefully they'd restrict themselves to the OUI bytes
they've used for their multicast addresses.

> and has the locally assigned bit set so the
> first octet is a multiple of 2.
> 
> The suggested patch requires an initial octet >= 0x04.
> 
> Skipping AA seems a good idea.
> 
> > Having thought about this issue a bit before, another thought might be
> > to have somebody get the Linux kernel its own OUI,
> 
> That's been suggested.
> 
> > > Not drawing from entropy I think useful, but it's debatable.
> > I'm guessing there are other things in the kernel that would be taking
> > away far more entropy, far more often. IIRC, TCP connection initial
> > sequence number selection would be one example.
> 
> These MAC assignments are generally done at system startup
> when entropy often isn't available and possibly should be
> conserved.
> 
> Maybe this:
> 

I'd suggest documenting in the comment why 0x02 or 0xaa are special
values that have been avoided.

> Signed-off-by: Joe Perches <joe@perches.com>
> 
> diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
> index 3d7a668..40233db 100644
> --- a/include/linux/etherdevice.h
> +++ b/include/linux/etherdevice.h
> @@ -118,12 +118,30 @@ static inline int is_valid_ether_addr(const u8 *addr)
>   *
>   * Generate a random Ethernet address (MAC) that is not multicast
>   * and has the local assigned bit set.
> + * Does not assign a leading octet of 0x02 or 0xaa.
>   */
>  static inline void random_ether_addr(u8 *addr)
>  {
> -	get_random_bytes (addr, ETH_ALEN);
> -	addr [0] &= 0xfe;	/* clear multicast bit */
> -	addr [0] |= 0x02;	/* set local assignment bit (IEEE802) */
> +	u32 val;
> +
> +	/* not calling get_random_bytes to avoid using entropy */
> +	do {
> +		val = random32();
> +		addr[0] = val;
> +		addr[0] &= 0xfe;	/* clear multicast bit */
> +		addr[0] |= 0x02;	/* set local assignment bit (IEEE802) */
> +	} while (addr[0] == 0x02 || addr[0] == 0xaa);
> +
> +	val >>= 8;
> +	addr[1] = val;
> +	val >>= 8;
> +	addr[2] = val;
> +	val >>= 8;
> +	addr[3] = val;
> +	val = random32();
> +	addr[4] = val;
> +	val >>= 8;
> +	addr[5] = val;
>  }
>  
>  /**
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH] ppp: ppp_generic.c initialize flen before it get used
From: David Miller @ 2009-09-13  6:17 UTC (permalink / raw)
  To: jaswinder; +Cc: gabriele.paoloni, paulus, linux-ppp, netdev
In-Reply-To: <1252773691.3687.5.camel@ht.satnam>


Already dealt with in net-next-2.6, please act against the
active networking development trees when fixing warnings and
doing cleanups.

Thank you.

commit 886f9fe68310168358f55028a03553ae154dfeb6
Author: David S. Miller <davem@davemloft.net>
Date:   Wed Aug 19 13:55:55 2009 -0700

    ppp_generic: Help GCC see that 'flen' is always initialized.
    
    It's too stupid to see that we always set flen to something
    before we use it in ppp_mp_explode():
    
    drivers/net/ppp_generic.c: In function 'ppp_push':
    drivers/net/ppp_generic.c:1314: warning: 'flen' may be used uninitialized in this function
    drivers/net/ppp_generic.c:1314: note: 'flen' was declared here
    
    This started warning after commit a53a8b56827cc429c6d9f861ad558beeb5f6103f
    ("ppp: fix lost fragments in ppp_mp_explode() (resubmit)")
    
    So just put an explicit unconditional initialization there to
    hush it up.
    
    Signed-off-by: David S. Miller <davem@davemloft.net>


^ permalink raw reply

* Re: [net-next PATCH] etherdevice.h: random_ether_addr update
From: Joe Perches @ 2009-09-13  6:09 UTC (permalink / raw)
  To: Mark Smith
  Cc: Stephen Hemminger, David Miller, jeffrey.t.kirsher, netdev, gospo,
	gregory.v.rose, donald.c.skidmore
In-Reply-To: <20090913131752.462abc01@opy.nosense.org>

On Sun, 2009-09-13 at 13:17 +0930, Mark Smith wrote:
> On Sat, 12 Sep 2009 17:44:46 -0700
> Joe Perches <joe@perches.com> wrote:
> > Avoiding an initial octet of "02", which is partially
> > assigned to 3Com and others, might be useful.
> I wouldn't necessarily disagree. I would say that if that path was
> taken, then you'd probably also want to be avoiding all the other
> well known mac addresses that do or can fall within the locally
> assigned range e.g. DECnet 0xAA addresses, Microsoft's  use of
> 02:01:00:00:00:00 and similar addresses for their Network Load
> Balancing software, the unicast version of the CF:00:00:00:00:00
> multicast address use for ECTP, the unicast version of the
> 33:33:xx:xx:xx:xx IPv6 ND multicast ranges etc.

The existing code already has the first wire bit cleared so it
is not multicast and has the locally assigned bit set so the
first octet is a multiple of 2.

The suggested patch requires an initial octet >= 0x04.

Skipping AA seems a good idea.

> Having thought about this issue a bit before, another thought might be
> to have somebody get the Linux kernel its own OUI,

That's been suggested.

> > Not drawing from entropy I think useful, but it's debatable.
> I'm guessing there are other things in the kernel that would be taking
> away far more entropy, far more often. IIRC, TCP connection initial
> sequence number selection would be one example.

These MAC assignments are generally done at system startup
when entropy often isn't available and possibly should be
conserved.

Maybe this:

Signed-off-by: Joe Perches <joe@perches.com>

diff --git a/include/linux/etherdevice.h b/include/linux/etherdevice.h
index 3d7a668..40233db 100644
--- a/include/linux/etherdevice.h
+++ b/include/linux/etherdevice.h
@@ -118,12 +118,30 @@ static inline int is_valid_ether_addr(const u8 *addr)
  *
  * Generate a random Ethernet address (MAC) that is not multicast
  * and has the local assigned bit set.
+ * Does not assign a leading octet of 0x02 or 0xaa.
  */
 static inline void random_ether_addr(u8 *addr)
 {
-	get_random_bytes (addr, ETH_ALEN);
-	addr [0] &= 0xfe;	/* clear multicast bit */
-	addr [0] |= 0x02;	/* set local assignment bit (IEEE802) */
+	u32 val;
+
+	/* not calling get_random_bytes to avoid using entropy */
+	do {
+		val = random32();
+		addr[0] = val;
+		addr[0] &= 0xfe;	/* clear multicast bit */
+		addr[0] |= 0x02;	/* set local assignment bit (IEEE802) */
+	} while (addr[0] == 0x02 || addr[0] == 0xaa);
+
+	val >>= 8;
+	addr[1] = val;
+	val >>= 8;
+	addr[2] = val;
+	val >>= 8;
+	addr[3] = val;
+	val = random32();
+	addr[4] = val;
+	val >>= 8;
+	addr[5] = val;
 }
 
 /**



^ permalink raw reply related

* Re: [PATCHv5 3/3] vhost_net: a kernel-level virtio server
From: Michael S. Tsirkin @ 2009-09-13  5:46 UTC (permalink / raw)
  To: Xin, Xiaohui
  Cc: Ira W. Snyder, netdev@vger.kernel.org,
	virtualization@lists.linux-foundation.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, mingo@elte.hu, linux-mm@kvack.org,
	akpm@linux-foundation.org, hpa@zytor.com,
	gregory.haskins@gmail.com, Rusty Russell, s.hetze@linux-ag.com,
	avi
In-Reply-To: <C85CEDA13AB1CF4D9D597824A86D2B9006AECB9C1D@PDSMSX501.ccr.corp.intel.com>

On Fri, Sep 11, 2009 at 11:17:33PM +0800, Xin, Xiaohui wrote:
> Michael,
> We are very interested in your patch and want to have a try with it.
> I have collected your 3 patches in kernel side and 4 patches in queue side.
> The patches are listed here:
> 
> PATCHv5-1-3-mm-export-use_mm-unuse_mm-to-modules.patch
> PATCHv5-2-3-mm-reduce-atomic-use-on-use_mm-fast-path.patch
> PATCHv5-3-3-vhost_net-a-kernel-level-virtio-server.patch
> 
> PATCHv3-1-4-qemu-kvm-move-virtio-pci[1].o-to-near-pci.o.patch
> PATCHv3-2-4-virtio-move-features-to-an-inline-function.patch
> PATCHv3-3-4-qemu-kvm-vhost-net-implementation.patch
> PATCHv3-4-4-qemu-kvm-add-compat-eventfd.patch
> 
> I applied the kernel patches on v2.6.31-rc4 and the qemu patches on latest kvm qemu.
> But seems there are some patches are needed at least irqfd and ioeventfd patches on
> current qemu. I cannot create a kvm guest with "-net nic,model=virtio,vhost=vethX".
> 
> May you kindly advice us the patch lists all exactly to make it work?
> Thanks a lot. :-)
> 
> Thanks
> Xiaohui


The irqfd/ioeventfd patches are part of Avi's kvm.git tree:
git://git.kernel.org/pub/scm/linux/kernel/git/avi/kvm.git

I expect them to be merged by 2.6.32-rc1 - right, Avi?

-- 
MST

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [net-next PATCH] etherdevice.h: random_ether_addr update
From: Mark Smith @ 2009-09-13  3:47 UTC (permalink / raw)
  To: Joe Perches
  Cc: Stephen Hemminger, David Miller, jeffrey.t.kirsher, netdev, gospo,
	gregory.v.rose, donald.c.skidmore
In-Reply-To: <1252802686.4400.19.camel@Joe-Laptop.home>

On Sat, 12 Sep 2009 17:44:46 -0700
Joe Perches <joe@perches.com> wrote:

> On Sun, 2009-09-13 at 10:03 +0930, Mark Smith wrote:
> > Hmm, probably didn't make my point all that clear. IOW, anything that
> > can contribute to avoiding duplicate mac addresses is worth it in my
> > opinion, when compared to the time (usually hours) involved in
> > troubleshooting duplicate mac addresses.
> 
> Avoiding an initial octet of "02", which is partially
> assigned to 3Com and others, might be useful.
> 

I wouldn't necessarily disagree. I would say that if that path was
taken, then you'd probably also want to be avoiding all the other
well known mac addresses that do or can fall within the locally
assigned range e.g. DECnet 0xAA addresses, Microsoft's  use of
02:01:00:00:00:00 and similar addresses for their Network Load
Balancing software, the unicast version of the CF:00:00:00:00:00
multicast address use for ECTP, the unicast version of the
33:33:xx:xx:xx:xx IPv6 ND multicast ranges etc.

Having thought about this issue a bit before, another thought might be
to have somebody get the Linux kernel it's own OUI, and then have
addresses randomly selected out of that. As my day job is networking,
I'd find some value in being able to see a well known OUI for Linux
randomly generated addresses, rather than the complete randomness that
is the case now.

The drawback there is that there are then only 24 bits octets of
randomness in the addresses that each host can independently choose to
use, which isn't anywhere near the as random as the 2^46 the LA address
space provides. For most ethernet segments, 24 bits of randomness might
be ok, however some of the very large metro ethernet networks are
starting to carry 16000+ MAC addresses, all within the same, very
controlled broadcast domain.


> Not drawing from entropy I think useful, but it's debatable.
> 
> 

I'm guessing there are other things in the kernel that would be taking
away far more entropy, far more often. IIRC, TCP connection initial
sequence number selection would be one example.

^ permalink raw reply

* Info
From: Irish News Centre @ 2009-09-13  0:58 UTC (permalink / raw)


You have won £ 750,000 pounds. Send the required information: name, age, country


^ permalink raw reply

* Re: [net-next PATCH] etherdevice.h: random_ether_addr update
From: Joe Perches @ 2009-09-13  0:44 UTC (permalink / raw)
  To: Mark Smith
  Cc: Stephen Hemminger, David Miller, jeffrey.t.kirsher, netdev, gospo,
	gregory.v.rose, donald.c.skidmore
In-Reply-To: <20090913100342.4f6a2417@opy.nosense.org>

On Sun, 2009-09-13 at 10:03 +0930, Mark Smith wrote:
> Hmm, probably didn't make my point all that clear. IOW, anything that
> can contribute to avoiding duplicate mac addresses is worth it in my
> opinion, when compared to the time (usually hours) involved in
> troubleshooting duplicate mac addresses.

Avoiding an initial octet of "02", which is partially
assigned to 3Com and others, might be useful.

Not drawing from entropy I think useful, but it's debatable.



^ permalink raw reply

* Re: [net-next PATCH] etherdevice.h: random_ether_addr update
From: Mark Smith @ 2009-09-13  0:33 UTC (permalink / raw)
  To: Mark Smith
  Cc: Joe Perches, Stephen Hemminger, David Miller, jeffrey.t.kirsher,
	netdev, gospo, gregory.v.rose, donald.c.skidmore
In-Reply-To: <20090913094409.2dd88212@opy.nosense.org>

On Sun, 13 Sep 2009 09:44:09 +0930
Mark Smith <lk-netdev@lk-netdev.nosense.org> wrote:

> On Fri, 11 Sep 2009 17:57:47 -0700
> Joe Perches <joe@perches.com> wrote:
> 
> > On Fri, 2009-09-11 at 14:15 -0700, Stephen Hemminger wrote:
> > > On Fri, 2009-09011 at 13:20 -0700 Joe Perches wrote:
> > > > Do not use get_random_bytes to avoid drawing down entropy pool.
> > > Getting 6 bytes once is not going to be enough of a problem
> > > to drain the pool. I prefer not to weaken the randomness here.
> > 
> > I see no reason to draw down the entropy pool.
> > 
> > I have a hard time imagining that a random mac address
> > needs more than reasonably random values.
> > 
> 
> One factor to consider is the amount of effort involved in
> troubleshooting duplicate mac address problems. They don't happen
> very often, and most people won't encounter them at all. That means
> they're very low on the list of things you suspect when
> troubleshooting, so you spend a lot of time investigating other things
> first, without that time providing any value once you work out what the
> actual fault it. Once you work out what the problem is, working out
> which devices are causing it can also be an effort, as, of course, they
> now don't have unique identifiers.
> 

Hmm, probably didn't make my point all that clear. IOW, anything that
can contribute to avoiding duplicate mac addresses is worth it in my
opinion, when compared to the time (usually hours) involved in
troubleshooting duplicate mac addresses.

> 
> 
> > Why do you think it's reasonable to draw from the
> > entropy pool?
> > 
> > cheers, Joe
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe netdev" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [net-next PATCH] etherdevice.h: random_ether_addr update
From: Mark Smith @ 2009-09-13  0:14 UTC (permalink / raw)
  To: Joe Perches
  Cc: Stephen Hemminger, David Miller, jeffrey.t.kirsher, netdev, gospo,
	gregory.v.rose, donald.c.skidmore
In-Reply-To: <1252717067.29420.22.camel@Joe-Laptop.home>

On Fri, 11 Sep 2009 17:57:47 -0700
Joe Perches <joe@perches.com> wrote:

> On Fri, 2009-09-11 at 14:15 -0700, Stephen Hemminger wrote:
> > On Fri, 2009-09011 at 13:20 -0700 Joe Perches wrote:
> > > Do not use get_random_bytes to avoid drawing down entropy pool.
> > Getting 6 bytes once is not going to be enough of a problem
> > to drain the pool. I prefer not to weaken the randomness here.
> 
> I see no reason to draw down the entropy pool.
> 
> I have a hard time imagining that a random mac address
> needs more than reasonably random values.
> 

One factor to consider is the amount of effort involved in
troubleshooting duplicate mac address problems. They don't happen
very often, and most people won't encounter them at all. That means
they're very low on the list of things you suspect when
troubleshooting, so you spend a lot of time investigating other things
first, without that time providing any value once you work out what the
actual fault it. Once you work out what the problem is, working out
which devices are causing it can also be an effort, as, of course, they
now don't have unique identifiers.



> Why do you think it's reasonable to draw from the
> entropy pool?
> 
> cheers, Joe
> 
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* [PATCH] pcnet_cs: add cis of Linksys multifunction pcmcia card
From: Ken Kawasaki @ 2009-09-12 22:22 UTC (permalink / raw)
  To: netdev; +Cc: ken_kawasaki
In-Reply-To: <20090426162929.3d036f26.ken_kawasaki@spring.nifty.jp>


pcnet_cs,serial_cs:
 
add cis of Linksys lan&modem mulitifunction pcmcia card
and some modem card(MT5634ZLX, RS-COM-2P).

  
Signed-off-by: Ken Kawasaki <ken_kawasaki@spring.nifty.jp>
 
---
 
 drivers/net/pcmcia/pcnet_cs.c   |   10 +++++-----
 drivers/serial/serial_cs.c      |   14 +++++++-------
 firmware/Makefile               |    3 ++-
 firmware/WHENCE                 |   12 ++++++++++++
 firmware/cis/MT5634ZLX.cis.ihex |   11 +++++++++++
 firmware/cis/PCMLM28.cis.ihex   |   18 ++++++++++++++++++
 firmware/cis/RS-COM-2P.cis.ihex |   10 ++++++++++
 7 files changed, 65 insertions(+), 13 deletions(-)

diff -uprN linux-2.6.31.orig/drivers/net/pcmcia/pcnet_cs.c linux-2.6.31/drivers/net/pcmcia/pcnet_cs.c
--- linux-2.6.31.orig/drivers/net/pcmcia/pcnet_cs.c	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/drivers/net/pcmcia/pcnet_cs.c	2009-09-12 19:59:57.000000000 +0900
@@ -1751,11 +1751,11 @@ static struct pcmcia_device_id pcnet_ids
 	PCMCIA_DEVICE_PROD_ID2("EN-6200P2", 0xa996d078),
 	/* too generic! */
 	/* PCMCIA_DEVICE_PROD_ID12("PCMCIA", "10/100 Ethernet Card", 0x281f1c5d, 0x11b0ffc0), */
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(0, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "cis/PCMLM28.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID12(0, "DAYNA COMMUNICATIONS", "LAN AND MODEM MULTIFUNCTION", 0x8fdf8f89, 0xdd5ed9e8, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID4(0, "NSC MF LAN/Modem", 0x58fc6056, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_MANF_CARD(0, 0x0175, 0x0000, "DP83903.cis"),
diff -uprN linux-2.6.31.orig/drivers/serial/serial_cs.c linux-2.6.31/drivers/serial/serial_cs.c
--- linux-2.6.31.orig/drivers/serial/serial_cs.c	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/drivers/serial/serial_cs.c	2009-09-12 20:02:33.000000000 +0900
@@ -868,11 +868,11 @@ static struct pcmcia_device_id serial_id
 	PCMCIA_DEVICE_PROD_ID12("PCMCIA   ", "C336MX     ", 0x99bcafe9, 0xaa25bcab),
 	PCMCIA_DEVICE_PROD_ID12("Quatech Inc", "PCMCIA Dual RS-232 Serial Port Card", 0xc4420b35, 0x92abc92f),
 	PCMCIA_DEVICE_PROD_ID12("Quatech Inc", "Dual RS-232 Serial Port PC Card", 0xc4420b35, 0x031a380d),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "PCMLM28.cis"),
-	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "EN2218-LAN/MODEM", 0x281f1c5d, 0x570f348e, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "PCMCIA", "UE2218-LAN/MODEM", 0x281f1c5d, 0x6fdcacee, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet", 0xf5f025c2, 0x338e8155, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "Psion Dacom", "Gold Card V34 Ethernet GSM", 0xf5f025c2, 0x4ae85d35, "cis/PCMLM28.cis"),
+	PCMCIA_PFC_DEVICE_CIS_PROD_ID12(1, "LINKSYS", "PCMLM28", 0xf7cb0b07, 0x66881874, "cis/PCMLM28.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID12(1, "DAYNA COMMUNICATIONS", "LAN AND MODEM MULTIFUNCTION", 0x8fdf8f89, 0xdd5ed9e8, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_PROD_ID4(1, "NSC MF LAN/Modem", 0x58fc6056, "DP83903.cis"),
 	PCMCIA_MFC_DEVICE_CIS_MANF_CARD(1, 0x0101, 0x0556, "cis/3CCFEM556.cis"),
@@ -883,10 +883,10 @@ static struct pcmcia_device_id serial_id
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x0192, 0x0710, "SW_7xx_SER.cis"),	/* Sierra Wireless AC710/AC750 GPRS Network Adapter R1 */
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x0192, 0xa555, "SW_555_SER.cis"),  /* Sierra Aircard 555 CDMA 1xrtt Modem -- pre update */
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x013f, 0xa555, "SW_555_SER.cis"),  /* Sierra Aircard 555 CDMA 1xrtt Modem -- post update */
-	PCMCIA_DEVICE_CIS_PROD_ID12("MultiTech", "PCMCIA 56K DataFax", 0x842047ee, 0xc2efcf03, "MT5634ZLX.cis"),
+	PCMCIA_DEVICE_CIS_PROD_ID12("MultiTech", "PCMCIA 56K DataFax", 0x842047ee, 0xc2efcf03, "cis/MT5634ZLX.cis"),
 	PCMCIA_DEVICE_CIS_PROD_ID12("ADVANTECH", "COMpad-32/85B-4", 0x96913a85, 0xcec8f102, "COMpad4.cis"),
 	PCMCIA_DEVICE_CIS_PROD_ID123("ADVANTECH", "COMpad-32/85", "1.0", 0x96913a85, 0x8fbe92ae, 0x0877b627, "COMpad2.cis"),
-	PCMCIA_DEVICE_CIS_PROD_ID2("RS-COM 2P", 0xad20b156, "RS-COM-2P.cis"),
+	PCMCIA_DEVICE_CIS_PROD_ID2("RS-COM 2P", 0xad20b156, "cis/RS-COM-2P.cis"),
 	PCMCIA_DEVICE_CIS_MANF_CARD(0x0013, 0x0000, "GLOBETROTTER.cis"),
 	PCMCIA_DEVICE_PROD_ID12("ELAN DIGITAL SYSTEMS LTD, c1997.","SERIAL CARD: SL100  1.00.",0x19ca78af,0xf964f42b),
 	PCMCIA_DEVICE_PROD_ID12("ELAN DIGITAL SYSTEMS LTD, c1997.","SERIAL CARD: SL100",0x19ca78af,0x71d98e83),
diff -uprN linux-2.6.31.orig/firmware/cis/MT5634ZLX.cis.ihex linux-2.6.31/firmware/cis/MT5634ZLX.cis.ihex
--- linux-2.6.31.orig/firmware/cis/MT5634ZLX.cis.ihex	1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.31/firmware/cis/MT5634ZLX.cis.ihex	2009-09-12 19:58:04.000000000 +0900
@@ -0,0 +1,11 @@
+:100000000101FF152204014D756C74695465636824
+:100010000050434D4349412035364B2044617461C3
+:10002000466178000000FF20040002010021020266
+:10003000001A05012780FF671B0FCF418B01550177
+:10004000550155AA60F80307281B08970108AA6004
+:10005000F802071B089F0108AA60E803071B08A70E
+:0B0060000108AA60E802071400FF007E
+:00000001FF
+#
+# Replacement CIS for Multitech MT5634ZLX modems
+#
diff -uprN linux-2.6.31.orig/firmware/cis/PCMLM28.cis.ihex linux-2.6.31/firmware/cis/PCMLM28.cis.ihex
--- linux-2.6.31.orig/firmware/cis/PCMLM28.cis.ihex	1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.31/firmware/cis/PCMLM28.cis.ihex	2009-09-12 19:57:29.000000000 +0900
@@ -0,0 +1,18 @@
+:1000000001030000FF151504014C494E4B53595391
+:100010000050434D4C4D3238000000FF2004430196
+:10002000ABC0210200001A05012FF803031B10E4E6
+:1000300001190155E06100031FF8020730FFFF1BA3
+:100040000BA50108E06120031FF802071B0BA601A6
+:1000500008E06140031FF802071B0BA70108E061DD
+:1000600060031FF802071B0BA80108E06100031FD3
+:10007000E803071B0BA90108E06120031FE8030741
+:100080001B0BAA0108E06140031FE803071B0BAB31
+:100090000108E06160031FE803071B0BAC0108E0E7
+:1000A0006100031FE802071B0BAD0108E06120039C
+:1000B0001FE802071B0BAE0108E06140031FE802C6
+:1000C000071B0BAF0108E06160031FE80207140083
+:0200D000FF002F
+:00000001FF
+#
+# The on-card CIS says it is MFC-compliant, but it is not
+#
diff -uprN linux-2.6.31.orig/firmware/cis/RS-COM-2P.cis.ihex linux-2.6.31/firmware/cis/RS-COM-2P.cis.ihex
--- linux-2.6.31.orig/firmware/cis/RS-COM-2P.cis.ihex	1970-01-01 09:00:00.000000000 +0900
+++ linux-2.6.31/firmware/cis/RS-COM-2P.cis.ihex	2009-09-12 19:58:12.000000000 +0900
@@ -0,0 +1,10 @@
+:1000000001030000FF1516040150434D4349410010
+:1000100052532D434F4D203250000000FF21020269
+:10002000011A0501030001011B0EC18118AA61E834
+:100030000307E8020730B89E1B0B820108AA615033
+:1000400002075802071B0B830108AA6160020768B8
+:0600500002071400FF008E
+:00000001FF
+#
+# Replacement CIS for dual-serial-port IO card
+#
diff -uprN linux-2.6.31.orig/firmware/Makefile linux-2.6.31/firmware/Makefile
--- linux-2.6.31.orig/firmware/Makefile	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/firmware/Makefile	2009-09-12 20:12:11.000000000 +0900
@@ -47,9 +47,10 @@ fw-shipped-$(CONFIG_DVB_TTUSB_BUDGET) +=
 fw-shipped-$(CONFIG_E100) += e100/d101m_ucode.bin e100/d101s_ucode.bin \
 			     e100/d102e_ucode.bin
 fw-shipped-$(CONFIG_MYRI_SBUS) += myricom/lanai.bin
-fw-shipped-$(CONFIG_PCMCIA_PCNET) += cis/LA-PCM.cis
+fw-shipped-$(CONFIG_PCMCIA_PCNET) += cis/LA-PCM.cis cis/PCMLM28.cis
 fw-shipped-$(CONFIG_PCMCIA_3C589) += cis/3CXEM556.cis
 fw-shipped-$(CONFIG_PCMCIA_3C574) += cis/3CCFEM556.cis
+fw-shipped-$(CONFIG_SERIAL_8250_CS) += cis/MT5634ZLX.cis cis/RS-COM-2P.cis
 fw-shipped-$(CONFIG_PCMCIA_SMC91C92) += ositech/Xilinx7OD.bin
 fw-shipped-$(CONFIG_SCSI_ADVANSYS) += advansys/mcode.bin advansys/38C1600.bin \
 				      advansys/3550.bin advansys/38C0800.bin
diff -uprN linux-2.6.31.orig/firmware/WHENCE linux-2.6.31/firmware/WHENCE
--- linux-2.6.31.orig/firmware/WHENCE	2009-09-10 07:13:59.000000000 +0900
+++ linux-2.6.31/firmware/WHENCE	2009-09-12 20:09:40.000000000 +0900
@@ -579,6 +579,7 @@ Found in hex form in kernel source.
 Driver: PCMCIA_PCNET - NE2000 compatible PCMCIA adapter
 
 File: cis/LA-PCM.cis
+      cis/PCMLM28.cis
 
 Licence: GPL
 
@@ -606,6 +607,17 @@ Originally developed by the pcmcia-cs pr
 
 --------------------------------------------------------------------------
 
+Driver: SERIAL_8250_CS - Serial PCMCIA adapter
+
+File: cis/MT5634ZLX.cis
+      cis/RS-COM-2P.cis
+
+Licence: GPL
+
+Originally developed by the pcmcia-cs project
+
+--------------------------------------------------------------------------
+
 Driver: PCMCIA_SMC91C92 - SMC 91Cxx PCMCIA
 
 File: ositech/Xilinx7OD.bin


^ permalink raw reply

* Re: [iproute2] tc action mirred    question
From: jamal @ 2009-09-12 21:49 UTC (permalink / raw)
  To: Xiaofei Wu; +Cc: linux netdev
In-Reply-To: <463375.95638.qm@web111612.mail.gq1.yahoo.com>

On Sat, 2009-09-12 at 09:01 -0700, Xiaofei Wu wrote:

> I can mirror the packets (A -> B) to D. Node D forwards them to C.
> Sometimes the same packets through the path A-B-C, A-D-C are not lost. Node C will receive the
> same packets twice. How to handle the duplicate packet on node C ?
> 
> Maybe the duplicate packets have side effects to TCP.
> 

I will leave this to you. This is where you go and do interesting
things, publish your results and maybe write a paper.
I dont think its a big deal to receive duplicate packets.

cheers,
jamal



^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox