Netdev List


* Re: [RFC] net: change bridge/macvlan hook to be generic
From: Ben Greear @ 2010-05-10 17:14 UTC (permalink / raw)
  To: Scott Feldman; +Cc: Stephen Hemminger, David Miller, Patrick McHardy, netdev
In-Reply-To: <C80610E0.2E2EC%scofeldm@cisco.com>

On 05/04/2010 05:58 PM, Scott Feldman wrote:
> On 5/4/10 3:37 PM, "Stephen Hemminger"<shemminger@vyatta.com>  wrote:
>
>> The existing macvlan and bridge have special hooks in the packet input
>> path. This patch changes it to a generic hook chain, like the packet type
>> processing. I have been wanting to look into flow based switching, etc...
>
> Can this be further simplified by saying that a netdev can only be hooked by
> one mux (macvlan, bridge, etc) at any given time, so there is never more
> than one element in the hook chain.  If so, then we just need a single hook,
> not a chain.
>
> It seems odd to me that a dev would have both macvlan_port != NULL and
> br_port != NULL.  Can dev be in a macvlan and a bridge at the same time?

If we did add the generic hook list, then we could support other hooks, like
a pktgen rx hook to gather latency, pkt-loss, and similar stats, for example.
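
A minimal userspace sketch of such a hook list, assuming hypothetical names throughout (rx_hook, fake_dev, hook_add, dev_rx, stats_hook, mux_hook); none of them come from the patch under discussion. It only shows how a passive statistics hook could sit in the same chain as a consuming mux such as a bridge or macvlan:

```c
#include <stddef.h>

/* Hypothetical model: a hook either passes the packet on or consumes it. */
enum rx_verdict { RX_PASS, RX_CONSUMED };

struct pkt {
	size_t len;
};

struct rx_hook {
	enum rx_verdict (*fn)(struct pkt *p, void *priv);
	void *priv;
	struct rx_hook *next;
};

/* Stand-in for a net device; only the hook chain is modelled. */
struct fake_dev {
	struct rx_hook *hooks;
};

/* Append a hook to the end of the device's chain. */
static void hook_add(struct fake_dev *dev, struct rx_hook *h)
{
	struct rx_hook **pp = &dev->hooks;

	while (*pp)
		pp = &(*pp)->next;
	h->next = NULL;
	*pp = h;
}

/* Walk the chain in order; stop at the first hook that consumes the packet. */
static enum rx_verdict dev_rx(struct fake_dev *dev, struct pkt *p)
{
	struct rx_hook *h;

	for (h = dev->hooks; h; h = h->next)
		if (h->fn(p, h->priv) == RX_CONSUMED)
			return RX_CONSUMED;
	return RX_PASS;
}

/* A passive hook, in the spirit of a stats-gathering rx hook. */
struct rx_stats {
	unsigned long packets;
	unsigned long bytes;
};

static enum rx_verdict stats_hook(struct pkt *p, void *priv)
{
	struct rx_stats *st = priv;

	st->packets++;
	st->bytes += p->len;
	return RX_PASS;
}

/* A consuming hook, standing in for a bridge or macvlan mux. */
static enum rx_verdict mux_hook(struct pkt *p, void *priv)
{
	(void)p;
	(void)priv;
	return RX_CONSUMED;
}
```

With only a single hook slot per device, the stats hook and the mux could not both be attached; that is the trade-off against the simpler one-hook design suggested above.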

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com


* Re: [PATCHv7] add mergeable buffers support to vhost_net
From: Michael S. Tsirkin @ 2010-05-10 17:25 UTC (permalink / raw)
  To: David Stevens; +Cc: kvm, kvm-owner, netdev, virtualization
In-Reply-To: <OF1F65A110.7C63CDB3-ON8825771F.005DD80E-8825771F.005E3509@us.ibm.com>

On Mon, May 10, 2010 at 10:09:03AM -0700, David Stevens wrote:
> Since "datalen" carries the difference and will be negative by that amount
> from the original loop, what about just adding something like:
> 
>         }
>         if (headcount)
>                 heads[headcount-1].len += datalen;
> [and really, headcount >0 since datalen > 0, so just:
> 
>         heads[headcount-1].len += datalen;
> 
>                                                 +-DLS

This works too, just does more checks and comparisons.
I am still surprised that you were unable to reproduce the problem.
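
To compare the two approaches side by side, here is a small standalone model, assuming hypothetical helpers heads_break() and heads_fixup() that work on plain arrays of buffer lengths instead of virtqueue descriptors. The first mirrors the "break before the last buffer" loop, the second the post-loop fix-up quoted above; both assume datalen > 0:

```c
/*
 * In-loop variant: stop before recording the last buffer, then store
 * the remaining data length as that buffer's used length.
 * Returns the number of heads used, or -1 if the buffers run out.
 */
static int heads_break(const long *buf_len, int nbuf, long datalen,
		       long *head_len)
{
	int n = 0;

	for (;;) {
		if (n >= nbuf)
			return -1;
		if (datalen <= buf_len[n])
			break;
		head_len[n] = buf_len[n];
		datalen -= buf_len[n];
		n++;
	}
	head_len[n] = datalen;
	return n + 1;
}

/*
 * Fix-up variant: record full buffer lengths while data remains, then
 * add the (zero or negative) leftover to the last head so that it
 * reflects the data actually written.
 */
static int heads_fixup(const long *buf_len, int nbuf, long datalen,
		       long *head_len)
{
	int n = 0;

	while (datalen > 0) {
		if (n >= nbuf)
			return -1;
		head_len[n] = buf_len[n];
		datalen -= buf_len[n];
		n++;
	}
	if (n)
		head_len[n - 1] += datalen;
	return n;
}
```

For a 2000-byte packet and 1500-byte buffers, both variants report two heads with lengths 1500 and 500, which is the value the guest should see instead of the full iovec length.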

> 
> kvm-owner@vger.kernel.org wrote on 05/10/2010 09:43:03 AM:
> 
> > On Wed, Apr 28, 2010 at 01:57:12PM -0700, David L Stevens wrote:
> > > @@ -218,18 +248,19 @@ static void handle_rx(struct vhost_net *
> > >     use_mm(net->dev.mm);
> > >     mutex_lock(&vq->mutex);
> > >     vhost_disable_notify(vq);
> > > -   hdr_size = vq->hdr_size;
> > > +   vhost_hlen = vq->vhost_hlen;
> > > 
> > >     vq_log = unlikely(vhost_has_feature(&net->dev, VHOST_F_LOG_ALL)) ?
> > >        vq->log : NULL;
> > > 
> > > -   for (;;) {
> > > -      head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
> > > -                ARRAY_SIZE(vq->iov),
> > > -                &out, &in,
> > > -                vq_log, &log);
> > > +   while ((datalen = vhost_head_len(vq, sock->sk))) {
> > > +      headcount = vhost_get_desc_n(vq, vq->heads,
> > > +                    datalen + vhost_hlen,
> > > +                    &in, vq_log, &log);
> > > +      if (headcount < 0)
> > > +         break;
> > >        /* OK, now we need to know about added descriptors. */
> > > -      if (head == vq->num) {
> > > +      if (!headcount) {
> > >           if (unlikely(vhost_enable_notify(vq))) {
> > >              /* They have slipped one in as we were
> > >               * doing that: check again. */
> > > @@ -241,46 +272,53 @@ static void handle_rx(struct vhost_net *
> > >           break;
> > >        }
> > >        /* We don't need to be notified again. */
> > > -      if (out) {
> > > -         vq_err(vq, "Unexpected descriptor format for RX: "
> > > -                "out %d, int %d\n",
> > > -                out, in);
> > > -         break;
> > > -      }
> > > -      /* Skip header. TODO: support TSO/mergeable rx buffers. */
> > > -      s = move_iovec_hdr(vq->iov, vq->hdr, hdr_size, in);
> > > +      if (vhost_hlen)
> > > +         /* Skip header. TODO: support TSO. */
> > > +         s = move_iovec_hdr(vq->iov, vq->hdr, vhost_hlen, in);
> > > +      else
> > > +         s = copy_iovec_hdr(vq->iov, vq->hdr, vq->sock_hlen, in);
> > >        msg.msg_iovlen = in;
> > >        len = iov_length(vq->iov, in);
> > >        /* Sanity check */
> > >        if (!len) {
> > >           vq_err(vq, "Unexpected header len for RX: "
> > >                  "%zd expected %zd\n",
> > > -                iov_length(vq->hdr, s), hdr_size);
> > > +                iov_length(vq->hdr, s), vhost_hlen);
> > >           break;
> > >        }
> > >        err = sock->ops->recvmsg(NULL, sock, &msg,
> > >                  len, MSG_DONTWAIT | MSG_TRUNC);
> > >        /* TODO: Check specific error and bomb out unless EAGAIN? */
> > >        if (err < 0) {
> > > -         vhost_discard_vq_desc(vq);
> > > +         vhost_discard_desc(vq, headcount);
> > >           break;
> > >        }
> > > -      /* TODO: Should check and handle checksum. */
> > > -      if (err > len) {
> > > -         pr_err("Discarded truncated rx packet: "
> > > -                " len %d > %zd\n", err, len);
> > > -         vhost_discard_vq_desc(vq);
> > > +      if (err != datalen) {
> > > +         pr_err("Discarded rx packet: "
> > > +                " len %d, expected %zd\n", err, datalen);
> > > +         vhost_discard_desc(vq, headcount);
> > >           continue;
> > >        }
> > >        len = err;
> > > -      err = memcpy_toiovec(vq->hdr, (unsigned char *)&hdr, hdr_size);
> > > -      if (err) {
> > > -         vq_err(vq, "Unable to write vnet_hdr at addr %p: %d\n",
> > > -                vq->iov->iov_base, err);
> > > +      if (vhost_hlen &&
> > > +          memcpy_toiovecend(vq->hdr, (unsigned char *)&hdr, 0,
> > > +                  vhost_hlen)) {
> > > +         vq_err(vq, "Unable to write vnet_hdr at addr %p\n",
> > > +                vq->iov->iov_base);
> > >           break;
> > >        }
> > > -      len += hdr_size;
> > > -      vhost_add_used_and_signal(&net->dev, vq, head, len);
> > > +      /* TODO: Should check and handle checksum. */
> > > +      if (vhost_has_feature(&net->dev, VIRTIO_NET_F_MRG_RXBUF) &&
> > > +          memcpy_toiovecend(vq->hdr, (unsigned char *)&headcount,
> > > +                  offsetof(typeof(hdr), num_buffers),
> > > +                  sizeof(hdr.num_buffers))) {
> > > +         vq_err(vq, "Failed num_buffers write");
> > > +         vhost_discard_desc(vq, headcount);
> > > +         break;
> > > +      }
> > > +      len += vhost_hlen;
> > > +      vhost_add_used_and_signal_n(&net->dev, vq, vq->heads,
> > > +                   headcount);
> > >        if (unlikely(vq_log))
> > >           vhost_log_write(vq, vq_log, log, len);
> > >        total_len += len;
> > 
> > OK I think I see the bug here: vhost_add_used_and_signal_n
> > does not get the actual length, it gets the iovec length from vhost.
> > Guest virtio uses this as packet length, with bad results.
> > 
> > So I have applied the following and it seems to have fixed the problem:
> > 
> > diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
> > index c16db02..9d7496d 100644
> > --- a/drivers/vhost/net.c
> > +++ b/drivers/vhost/net.c
> > @@ -219,7 +219,7 @@ static int peek_head_len(struct vhost_virtqueue *vq, struct sock *sk)
> >  /* This is a multi-buffer version of vhost_get_desc, that works if
> >   *   vq has read descriptors only.
> >   * @vq      - the relevant virtqueue
> > - * @datalen   - data length we'll be reading
> > + * @datalen   - data length we'll be reading. must be > 0
> >   * @iovcount   - returned count of io vectors we fill
> >   * @log      - vhost log
> >   * @log_num   - log offset
> > @@ -236,9 +236,10 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
> >     int seg = 0;
> >     int headcount = 0;
> >     unsigned d;
> > +   size_t len;
> >     int r, nlogs = 0;
> > 
> > -   while (datalen > 0) {
> > +   for (;;) {
> >        if (unlikely(headcount >= VHOST_NET_MAX_SG)) {
> >           r = -ENOBUFS;
> >           goto err;
> > @@ -260,16 +261,20 @@ static int get_rx_bufs(struct vhost_virtqueue *vq,
> >           nlogs += *log_num;
> >           log += *log_num;
> >        }
> > +      len = iov_length(vq->iov + seg, in);
> > +      seg += in;
> >        heads[headcount].id = d;
> > -      heads[headcount].len = iov_length(vq->iov + seg, in);
> > -      datalen -= heads[headcount].len;
> > +      if (datalen <= len)
> > +         break;
> > +      heads[headcount].len = len;
> >        ++headcount;
> > -      seg += in;
> > +      datalen -= len;
> >     }
> > +   heads[headcount].len = datalen;
> >     *iovcount = seg;
> >     if (unlikely(log))
> >        *log_num = nlogs;
> > -   return headcount;
> > +   return headcount + 1;
> >  err:
> >     vhost_discard_desc(vq, headcount);
> >     return r;
> > 
> > -- 
> > MST


* Re: TCP-MD5 checksum failure on x86_64 SMP
From: Bijay Singh @ 2010-05-10 17:27 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Stephen Hemminger, David Miller, <bhaskie@gmail.com>,
	<bhutchings@solarflare.com>, <netdev@vger.kernel.org>
In-Reply-To: <1273504693.2221.17.camel@edumazet-laptop>

Hi Eric,

I didn't intend to mix the issues. It was a hack intended to calm the nerves. I am going to apply the proper patches ASAP.

About the latest problem of MD5 not working with the MTU set to 4470: I noticed this when I needed to change the MTU for some other purpose.

Since it was a production box, I first have to set up my own box with the right NIC to reproduce this and try debugging it. In the meantime, any clues will help.

Thanks,
Bijay
 
On 10-May-2010, at 8:48 PM, Eric Dumazet wrote:

> Le lundi 10 mai 2010 à 14:55 +0000, Bijay Singh a écrit :
>> Hi,
>> I had noticed the corruption in the context and actually did what is mentioned.
>> 
>> I allocated the context on the stack and plugged in the md5.c functions. I was able to temporarily solve the problem, all this before I got a response on this thread.
>> 
>> But now I am seeing another problem: when I change the MTU on the interface from 1500 to 4470, none of the messages from the peer get through and I get a hash failed message. I am wondering if this is another bug getting hit in this scenario.
> 
> That's very fine, but you mix very different problems.
> 
> Step by step resolution is required, and clean patches too, because
> plugging md5.c functions is not an option for stable series :)
> 
> Obviously, nobody seriously used TCP-MD5 on linux, but you...
> 
> 
> 



* Re: [PATCHv7] add mergeable buffers support to vhost_net
From: Michael S. Tsirkin @ 2010-05-10 17:31 UTC (permalink / raw)
  To: David L Stevens; +Cc: netdev, kvm, virtualization
In-Reply-To: <1272488232.11307.4.camel@w-dls.beaverton.ibm.com>

On Wed, Apr 28, 2010 at 01:57:12PM -0700, David L Stevens wrote:
> @@ -218,18 +248,19 @@ static void handle_rx(struct vhost_net *
>  	use_mm(net->dev.mm);
>  	mutex_lock(&vq->mutex);
>  	vhost_disable_notify(vq);
> -	hdr_size = vq->hdr_size;
> +	vhost_hlen = vq->vhost_hlen;
>  
>  	vq_log = unlikely(vhost_has_feature(&net->dev, VHOST_F_LOG_ALL)) ?
>  		vq->log : NULL;
>  
> -	for (;;) {
> -		head = vhost_get_vq_desc(&net->dev, vq, vq->iov,
> -					 ARRAY_SIZE(vq->iov),
> -					 &out, &in,
> -					 vq_log, &log);
> +	while ((datalen = vhost_head_len(vq, sock->sk))) {
> +		headcount = vhost_get_desc_n(vq, vq->heads,
> +					     datalen + vhost_hlen,
> +					     &in, vq_log, &log);
> +		if (headcount < 0)
> +			break;
>  		/* OK, now we need to know about added descriptors. */
> -		if (head == vq->num) {
> +		if (!headcount) {
>  			if (unlikely(vhost_enable_notify(vq))) {
>  				/* They have slipped one in as we were
>  				 * doing that: check again. */

So I think this breaks handling for a failure mode where
we get an skb that is larger than the max packet guest
can get. The right thing to do in this case is to
drop the skb, we currently do this by passing
truncate flag to recvmsg.

In particular, with mergeable buffers off, if we get an skb
that does not fit in a single packet, this code will
spread it over multiple buffers.

You should be able to reproduce this fairly easily
by disabling both indirect buffers and mergeable buffers
on qemu command line. With current code TCP still
works by falling back on small packets. I think
with your code it will get stuck forever once
we get an skb that is too large for us to handle.
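
The rule I have in mind can be sketched as a tiny decision function. This is only an illustration under stated assumptions: rx_decide() and its parameters are hypothetical names and do not exist in vhost; first_buf_len stands for the capacity of a single guest buffer (descriptor chain) and total_buf_len for everything currently available:

```c
#include <stdbool.h>
#include <stddef.h>

enum rx_action { RX_DELIVER, RX_DROP };

/*
 * Without mergeable buffers a packet must fit in one guest buffer;
 * with mergeable buffers it may be spread over all available ones.
 * Anything larger than that capacity is dropped, not split.
 */
static enum rx_action rx_decide(size_t pkt_len, size_t first_buf_len,
				size_t total_buf_len, bool mergeable)
{
	size_t cap = mergeable ? total_buf_len : first_buf_len;

	return pkt_len <= cap ? RX_DELIVER : RX_DROP;
}
```

So a 9000-byte skb against 4096-byte buffers would be dropped with mergeable buffers off, even when three such buffers are free, and the sender falls back to smaller packets.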

-- 
MST


* pull request: wireless-2.6 2010-05-10
From: John W. Linville @ 2010-05-10 17:43 UTC (permalink / raw)
  To: davem; +Cc: linux-wireless, netdev, linux-kernel

Dave,

Here are three more candidates for 2.6.34.  I hesitated to push them,
but at least two of them are documented regressions and the other (i.e.
"iwlwifi: work around passive scan issue") avoids some rather annoying
firmware restarts at little or no risk.  I think it would be good to
take these now rather than later.

Please let me know if there are problems!

Thanks,

John

---

The following changes since commit 80ea76bb2575c426154b8d61d324197ee3592baa:
  David S. Miller (1):
        phy: Fix initialization in micrel driver.

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

Christian Lamparter (1):
      ar9170: wait for asynchronous firmware loading

Johannes Berg (1):
      iwlwifi: work around passive scan issue

Reinette Chatre (1):
      mac80211: remove association work when processing deauth request

 drivers/net/wireless/ath/ar9170/usb.c       |   11 +++++++++++
 drivers/net/wireless/ath/ar9170/usb.h       |    1 +
 drivers/net/wireless/iwlwifi/iwl-commands.h |    4 +++-
 drivers/net/wireless/iwlwifi/iwl-scan.c     |   23 ++++++++++++++++++-----
 drivers/net/wireless/iwlwifi/iwl3945-base.c |    3 ++-
 net/mac80211/mlme.c                         |    3 ++-
 6 files changed, 37 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/ath/ar9170/usb.c b/drivers/net/wireless/ath/ar9170/usb.c
index 6b1cb70..24dc555 100644
--- a/drivers/net/wireless/ath/ar9170/usb.c
+++ b/drivers/net/wireless/ath/ar9170/usb.c
@@ -726,12 +726,16 @@ static void ar9170_usb_firmware_failed(struct ar9170_usb *aru)
 {
 	struct device *parent = aru->udev->dev.parent;
 
+	complete(&aru->firmware_loading_complete);
+
 	/* unbind anything failed */
 	if (parent)
 		down(&parent->sem);
 	device_release_driver(&aru->udev->dev);
 	if (parent)
 		up(&parent->sem);
+
+	usb_put_dev(aru->udev);
 }
 
 static void ar9170_usb_firmware_finish(const struct firmware *fw, void *context)
@@ -760,6 +764,8 @@ static void ar9170_usb_firmware_finish(const struct firmware *fw, void *context)
 	if (err)
 		goto err_unrx;
 
+	complete(&aru->firmware_loading_complete);
+	usb_put_dev(aru->udev);
 	return;
 
  err_unrx:
@@ -857,6 +863,7 @@ static int ar9170_usb_probe(struct usb_interface *intf,
 	init_usb_anchor(&aru->tx_pending);
 	init_usb_anchor(&aru->tx_submitted);
 	init_completion(&aru->cmd_wait);
+	init_completion(&aru->firmware_loading_complete);
 	spin_lock_init(&aru->tx_urb_lock);
 
 	aru->tx_pending_urbs = 0;
@@ -876,6 +883,7 @@ static int ar9170_usb_probe(struct usb_interface *intf,
 	if (err)
 		goto err_freehw;
 
+	usb_get_dev(aru->udev);
 	return request_firmware_nowait(THIS_MODULE, 1, "ar9170.fw",
 				       &aru->udev->dev, GFP_KERNEL, aru,
 				       ar9170_usb_firmware_step2);
@@ -895,6 +903,9 @@ static void ar9170_usb_disconnect(struct usb_interface *intf)
 		return;
 
 	aru->common.state = AR9170_IDLE;
+
+	wait_for_completion(&aru->firmware_loading_complete);
+
 	ar9170_unregister(&aru->common);
 	ar9170_usb_cancel_urbs(aru);
 
diff --git a/drivers/net/wireless/ath/ar9170/usb.h b/drivers/net/wireless/ath/ar9170/usb.h
index a2ce3b1..919b060 100644
--- a/drivers/net/wireless/ath/ar9170/usb.h
+++ b/drivers/net/wireless/ath/ar9170/usb.h
@@ -71,6 +71,7 @@ struct ar9170_usb {
 	unsigned int tx_pending_urbs;
 
 	struct completion cmd_wait;
+	struct completion firmware_loading_complete;
 	int readlen;
 	u8 *readbuf;
 
diff --git a/drivers/net/wireless/iwlwifi/iwl-commands.h b/drivers/net/wireless/iwlwifi/iwl-commands.h
index 6383d9f..f4e59ae 100644
--- a/drivers/net/wireless/iwlwifi/iwl-commands.h
+++ b/drivers/net/wireless/iwlwifi/iwl-commands.h
@@ -2621,7 +2621,9 @@ struct iwl_ssid_ie {
 #define PROBE_OPTION_MAX_3945		4
 #define PROBE_OPTION_MAX		20
 #define TX_CMD_LIFE_TIME_INFINITE	cpu_to_le32(0xFFFFFFFF)
-#define IWL_GOOD_CRC_TH			cpu_to_le16(1)
+#define IWL_GOOD_CRC_TH_DISABLED	0
+#define IWL_GOOD_CRC_TH_DEFAULT		cpu_to_le16(1)
+#define IWL_GOOD_CRC_TH_NEVER		cpu_to_le16(0xffff)
 #define IWL_MAX_SCAN_SIZE 1024
 #define IWL_MAX_CMD_SIZE 4096
 #define IWL_MAX_PROBE_REQUEST		200
diff --git a/drivers/net/wireless/iwlwifi/iwl-scan.c b/drivers/net/wireless/iwlwifi/iwl-scan.c
index 5062f4e..2367286 100644
--- a/drivers/net/wireless/iwlwifi/iwl-scan.c
+++ b/drivers/net/wireless/iwlwifi/iwl-scan.c
@@ -812,16 +812,29 @@ static void iwl_bg_request_scan(struct work_struct *data)
 			rate = IWL_RATE_1M_PLCP;
 			rate_flags = RATE_MCS_CCK_MSK;
 		}
-		scan->good_CRC_th = 0;
+		scan->good_CRC_th = IWL_GOOD_CRC_TH_DISABLED;
 	} else if (priv->scan_bands & BIT(IEEE80211_BAND_5GHZ)) {
 		band = IEEE80211_BAND_5GHZ;
 		rate = IWL_RATE_6M_PLCP;
 		/*
-		 * If active scaning is requested but a certain channel
-		 * is marked passive, we can do active scanning if we
-		 * detect transmissions.
+		 * If active scanning is requested but a certain channel is
+		 * marked passive, we can do active scanning if we detect
+		 * transmissions.
+		 *
+		 * There is an issue with some firmware versions that triggers
+		 * a sysassert on a "good CRC threshold" of zero (== disabled),
+		 * on a radar channel even though this means that we should NOT
+		 * send probes.
+		 *
+		 * The "good CRC threshold" is the number of frames that we
+		 * need to receive during our dwell time on a channel before
+		 * sending out probes -- setting this to a huge value will
+		 * mean we never reach it, but at the same time work around
+		 * the aforementioned issue. Thus use IWL_GOOD_CRC_TH_NEVER
+		 * here instead of IWL_GOOD_CRC_TH_DISABLED.
 		 */
-		scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH : 0;
+		scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH_DEFAULT :
+						IWL_GOOD_CRC_TH_NEVER;
 
 		/* Force use of chains B and C (0x6) for scan Rx for 4965
 		 * Avoid A (0x1) because of its off-channel reception on A-band.
diff --git a/drivers/net/wireless/iwlwifi/iwl3945-base.c b/drivers/net/wireless/iwlwifi/iwl3945-base.c
index e276f2a..2f47d93 100644
--- a/drivers/net/wireless/iwlwifi/iwl3945-base.c
+++ b/drivers/net/wireless/iwlwifi/iwl3945-base.c
@@ -2966,7 +2966,8 @@ static void iwl3945_bg_request_scan(struct work_struct *data)
 		 * is marked passive, we can do active scanning if we
 		 * detect transmissions.
 		 */
-		scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH : 0;
+		scan->good_CRC_th = is_active ? IWL_GOOD_CRC_TH_DEFAULT :
+						IWL_GOOD_CRC_TH_DISABLED;
 		band = IEEE80211_BAND_5GHZ;
 	} else {
 		IWL_WARN(priv, "Invalid scan band count\n");
diff --git a/net/mac80211/mlme.c b/net/mac80211/mlme.c
index 8a96503..6ccd48e 100644
--- a/net/mac80211/mlme.c
+++ b/net/mac80211/mlme.c
@@ -2029,7 +2029,8 @@ int ieee80211_mgd_deauth(struct ieee80211_sub_if_data *sdata,
 				continue;
 
 			if (wk->type != IEEE80211_WORK_DIRECT_PROBE &&
-			    wk->type != IEEE80211_WORK_AUTH)
+			    wk->type != IEEE80211_WORK_AUTH &&
+			    wk->type != IEEE80211_WORK_ASSOC)
 				continue;
 
 			if (memcmp(req->bss->bssid, wk->filter_ta, ETH_ALEN))
-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.


* Re: [PATCHv7] add mergeable buffers support to vhost_net
From: David Stevens @ 2010-05-10 17:46 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: kvm, kvm-owner, netdev, netdev-owner, virtualization
In-Reply-To: <20100510172557.GD28798@redhat.com>

netdev-owner@vger.kernel.org wrote on 05/10/2010 10:25:57 AM:

> On Mon, May 10, 2010 at 10:09:03AM -0700, David Stevens wrote:
> > Since "datalen" carries the difference and will be negative by that amount
> > from the original loop, what about just adding something like:
> > 
> >         }
> >         if (headcount)
> >                 heads[headcount-1].len += datalen;
> > [and really, headcount >0 since datalen > 0, so just:
> > 
> >         heads[headcount-1].len += datalen;
> > 
> >                                                 +-DLS
> 
> This works too, just does more checks and comparisons.
> I am still surprised that you were unable to reproduce the problem.
> 

I'm sure it happened, and probably had a performance
penalty on my systems too, but not as much as yours.
I didn't see any obvious performance difference running
with the patch, though; not sure why. I'll instrument to
see how often it's happening, I think.
        But fixed now, good catch!

                                                        +-DLS



* Re: [PATCH] virtif: initial interface extensions
From: Scott Feldman @ 2010-05-10 18:56 UTC (permalink / raw)
  To: Stefan Berger, netdev
In-Reply-To: <loom.20100510T172617-53@post.gmane.org>

On 5/10/10 8:37 AM, "Stefan Berger" <stefanb@us.ibm.com> wrote:

> Arnd Bergmann <arnd <at> arndb.de> writes:
> 
> [...]
> 
>> + if (tb[IFLA_VIRTIF]) {
>> +  struct ifla_virtif_port_profile *ivp;
>> +  struct nlattr *virtif[IFLA_VIRTIF_MAX+1];
>> +  u32 vf;
>> +
>> +  err = nla_parse_nested(virtif, IFLA_VIRTIF_MAX,
>> +           tb[IFLA_VIRTIF], ifla_virtif_policy);
>> +  if (err < 0)
>> +   return err;
>> +
>> +  if (!virtif[IFLA_VIRTIF_VF] || !virtif[IFLA_VIRTIF_PORT_PROFILE])
>> +   goto novirtif; /* IFLA_VIRTIF may be directed at user space */
> 
> 
> In what case would the IFLA_VIRTIF_PORT_PROFILE be provided? Would libvirt for
> example need to be aware of whether the Ethernet device can handle the setup
> protocol via its firmware and in this case provide the port profile parameter
> and in other cases provide other parameters? Certainly the user or upper layer
> management software would have to know it when creating the domain XML and in
> fact different types of parameters were needed.

> Obviously we should have one
> common set of (XML) parameters that go into the netlink message and that can
> be handled by the kernel driver if the firmware knows how to handle it or by
> LLDPAD. 

With Arnd's latest additions, we have a single netlink msg, but the
parameter sets are disjoint between VDP/CDCP and what we need for the kernel
driver.  So the sender (libvirt in this case) needs to know about
both setups to send a single netlink msg.  An alternative is to have two
netlink msgs, one for each setup.  That still requires the sender to know
about both setups.

> Libvirt would send the parameters via netlink message to trigger the
> setup protocol and the message may be received by kernel and LLDPAD.
 
That was the original idea behind having libvirt send the netlink msg using
multicast.



* [PATCH 0/4] bridge: patches for net-next
From: Stephen Hemminger @ 2010-05-10 19:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, bridge

These are only partially related patches for 2.6.35.
They supersede earlier (unaccepted) patches in net-next.




* [PATCH 1/4] bridge: netpoll cleanup
From: Stephen Hemminger @ 2010-05-10 19:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, bridge
In-Reply-To: <20100510193107.722574297@vyatta.com>

[-- Attachment #1: br-netpoll-cleanup.patch --]
[-- Type: text/plain, Size: 3983 bytes --]

Move code around so that the ifdefs for CONFIG_NET_POLL_CONTROLLER don't have to
show up in the main code path. The control functions should be in helpers
that are only compiled if needed.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>
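
For readers unfamiliar with the pattern, a minimal standalone sketch follows. It assumes a made-up switch DEMO_NETPOLL in place of CONFIG_NET_POLL_CONTROLLER and hypothetical demo_* helpers. Unlike the patch below, which uses empty macros for the disabled case, the sketch uses static inline stubs so that arguments are still type-checked:

```c
/* Number of ports currently attached (illustrative state only). */
static int demo_ports;

#ifdef DEMO_NETPOLL
/* Real helpers, compiled only when the feature is enabled. */
static int demo_netpoll_ports;

static inline void demo_netpoll_enable(int port)
{
	(void)port;
	demo_netpoll_ports++;
}

static inline void demo_netpoll_disable(int port)
{
	(void)port;
	demo_netpoll_ports--;
}
#else
/* Stubs: callers compile unchanged and the calls optimize away. */
static inline void demo_netpoll_enable(int port)
{
	(void)port;
}

static inline void demo_netpoll_disable(int port)
{
	(void)port;
}
#endif

/* Main code path: no #ifdef needed at the call sites. */
static int demo_add_port(int port)
{
	demo_ports++;
	demo_netpoll_enable(port);
	return demo_ports;
}

static int demo_del_port(int port)
{
	demo_netpoll_disable(port);
	demo_ports--;
	return demo_ports;
}
```

Either form keeps the conditional compilation in one place; the choice between macros and inline stubs is a matter of taste and type safety.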


--- a/net/bridge/br_device.c	2010-05-10 09:51:51.568057462 -0700
+++ b/net/bridge/br_device.c	2010-05-10 11:19:04.867327762 -0700
@@ -191,7 +191,7 @@ static int br_set_tx_csum(struct net_dev
 }
 
 #ifdef CONFIG_NET_POLL_CONTROLLER
-bool br_devices_support_netpoll(struct net_bridge *br)
+static bool br_devices_support_netpoll(struct net_bridge *br)
 {
 	struct net_bridge_port *p;
 	bool ret = true;
@@ -217,9 +217,9 @@ static void br_poll_controller(struct ne
 		netpoll_poll_dev(np->real_dev);
 }
 
-void br_netpoll_cleanup(struct net_device *br_dev)
+void br_netpoll_cleanup(struct net_device *dev)
 {
-	struct net_bridge *br = netdev_priv(br_dev);
+	struct net_bridge *br = netdev_priv(dev);
 	struct net_bridge_port *p, *n;
 	const struct net_device_ops *ops;
 
@@ -235,10 +235,30 @@ void br_netpoll_cleanup(struct net_devic
 	}
 }
 
-#else
-
-void br_netpoll_cleanup(struct net_device *br_dev)
+void br_netpoll_disable(struct net_bridge *br,
+			struct net_device *dev)
 {
+	if (br_devices_support_netpoll(br))
+		br->dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
+	if (dev->netdev_ops->ndo_netpoll_cleanup)
+		dev->netdev_ops->ndo_netpoll_cleanup(dev);
+	else
+		dev->npinfo = NULL;
+}
+
+void br_netpoll_enable(struct net_bridge *br,
+		       struct net_device *dev)
+{
+	if (br_devices_support_netpoll(br)) {
+		br->dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
+		if (br->dev->npinfo)
+			dev->npinfo = br->dev->npinfo;
+	} else if (!(br->dev->priv_flags & IFF_DISABLE_NETPOLL)) {
+		br->dev->priv_flags |= IFF_DISABLE_NETPOLL;
+		printk(KERN_INFO "%s:new device %s"
+			" does not support netpoll (disabling)",
+			br->dev->name, dev->name);
+	}
 }
 
 #endif
--- a/net/bridge/br_if.c	2010-05-10 09:51:47.878057482 -0700
+++ b/net/bridge/br_if.c	2010-05-10 10:47:51.089679264 -0700
@@ -154,14 +154,7 @@ static void del_nbp(struct net_bridge_po
 	kobject_uevent(&p->kobj, KOBJ_REMOVE);
 	kobject_del(&p->kobj);
 
-#ifdef CONFIG_NET_POLL_CONTROLLER
-	if (br_devices_support_netpoll(br))
-		br->dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
-	if (dev->netdev_ops->ndo_netpoll_cleanup)
-		dev->netdev_ops->ndo_netpoll_cleanup(dev);
-	else
-		dev->npinfo = NULL;
-#endif
+	br_netpoll_disable(br, dev);
 	call_rcu(&p->rcu, destroy_nbp_rcu);
 }
 
@@ -455,19 +448,7 @@ int br_add_if(struct net_bridge *br, str
 
 	kobject_uevent(&p->kobj, KOBJ_ADD);
 
-#ifdef CONFIG_NET_POLL_CONTROLLER
-	if (br_devices_support_netpoll(br)) {
-		br->dev->priv_flags &= ~IFF_DISABLE_NETPOLL;
-		if (br->dev->npinfo)
-			dev->npinfo = br->dev->npinfo;
-	} else if (!(br->dev->priv_flags & IFF_DISABLE_NETPOLL)) {
-		br->dev->priv_flags |= IFF_DISABLE_NETPOLL;
-		printk(KERN_INFO "New device %s does not support netpoll\n",
-			dev->name);
-		printk(KERN_INFO "Disabling netpoll for %s\n",
-			br->dev->name);
-	}
-#endif
+	br_netpoll_enable(br, dev);
 
 	return 0;
 err2:
--- a/net/bridge/br_private.h	2010-05-10 09:51:55.267744944 -0700
+++ b/net/bridge/br_private.h	2010-05-10 10:08:09.117432563 -0700
@@ -253,8 +253,18 @@ static inline int br_is_root_bridge(cons
 extern void br_dev_setup(struct net_device *dev);
 extern netdev_tx_t br_dev_xmit(struct sk_buff *skb,
 			       struct net_device *dev);
-extern bool br_devices_support_netpoll(struct net_bridge *br);
-extern void br_netpoll_cleanup(struct net_device *br_dev);
+#ifdef CONFIG_NET_POLL_CONTROLLER
+extern void br_netpoll_cleanup(struct net_device *dev);
+extern void br_netpoll_enable(struct net_bridge *br,
+			      struct net_device *dev);
+extern void br_netpoll_disable(struct net_bridge *br,
+			       struct net_device *dev);
+#else
+#define br_netpoll_cleanup(br)
+#define br_netpoll_enable(br, dev)
+#define br_netpoll_disable(br, dev)
+
+#endif
 
 /* br_fdb.c */
 extern int br_fdb_init(void);




* [PATCH 2/4] bridge: change console message interface
From: Stephen Hemminger @ 2010-05-10 19:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, bridge
In-Reply-To: <20100510193107.722574297@vyatta.com>

[-- Attachment #1: bridge-msg.patch --]
[-- Type: text/plain, Size: 12036 bytes --]

Use one set of macros for all bridge messages.

Note: can't use the netdev_XXX macros because the bridge is purely
virtual and has no device parent.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

---
 net/bridge/br.c           |    2 +-
 net/bridge/br_fdb.c       |    9 ++++-----
 net/bridge/br_ioctl.c     |    2 +-
 net/bridge/br_multicast.c |   32 +++++++++++++-------------------
 net/bridge/br_netlink.c   |    8 +++++---
 net/bridge/br_private.h   |   15 +++++++++++++++
 net/bridge/br_stp.c       |   11 +++++------
 net/bridge/br_stp_if.c    |   16 ++++++----------
 net/bridge/br_stp_timer.c |   24 ++++++++++--------------
 9 files changed, 60 insertions(+), 59 deletions(-)
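
The idea behind the new macros can be sketched in userspace as follows. This is an illustration, not the br_private.h hunk: demo_bridge, demo_br_log(), demo_br_info() and demo_br_warn() are hypothetical names, the level strings "<6>" and "<4>" are assumed stand-ins for KERN_INFO and KERN_WARNING, and output goes to a caller-supplied buffer instead of printk():

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for struct net_bridge; only the name is modelled. */
struct demo_bridge {
	const char *name;
};

/* Format "<level><bridge name>: <message>" into buf. */
static int demo_br_log(char *buf, size_t size, const char *level,
		       const struct demo_bridge *br, const char *fmt, ...)
{
	va_list ap;
	int n, m;

	n = snprintf(buf, size, "%s%s: ", level, br->name);
	if (n < 0 || (size_t)n >= size)
		return n;

	va_start(ap, fmt);
	m = vsnprintf(buf + n, size - (size_t)n, fmt, ap);
	va_end(ap);

	return m < 0 ? m : n + m;
}

/* One wrapper per level; every message gets the bridge name prefix. */
#define demo_br_info(buf, size, br, ...) \
	demo_br_log(buf, size, "<6>", br, __VA_ARGS__)
#define demo_br_warn(buf, size, br, ...) \
	demo_br_log(buf, size, "<4>", br, __VA_ARGS__)
```

Call sites then pass only the bridge and the message, which is why the converted printk() calls in the hunks below lose their explicit br->dev->name argument.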

--- a/net/bridge/br.c	2010-05-10 11:50:28.856612469 -0700
+++ b/net/bridge/br.c	2010-05-10 11:50:34.547229645 -0700
@@ -38,7 +38,7 @@ static int __init br_init(void)
 
 	err = stp_proto_register(&br_stp_proto);
 	if (err < 0) {
-		printk(KERN_ERR "bridge: can't register sap for STP\n");
+		pr_err("bridge: can't register sap for STP\n");
 		return err;
 	}
 
--- a/net/bridge/br_fdb.c	2010-05-10 11:50:28.886629735 -0700
+++ b/net/bridge/br_fdb.c	2010-05-10 12:00:32.239308637 -0700
@@ -353,8 +353,7 @@ static int fdb_insert(struct net_bridge 
 		 */
 		if (fdb->is_local)
 			return 0;
-
-		printk(KERN_WARNING "%s adding interface with same address "
+		br_warn(br, "adding interface %s with same address "
 		       "as a received packet\n",
 		       source->dev->name);
 		fdb_delete(fdb);
@@ -397,9 +396,9 @@ void br_fdb_update(struct net_bridge *br
 		/* attempt to update an entry for a local interface */
 		if (unlikely(fdb->is_local)) {
 			if (net_ratelimit())
-				printk(KERN_WARNING "%s: received packet with "
-				       "own address as source address\n",
-				       source->dev->name);
+				br_warn(br, "received packet on %s with "
+					"own address as source address\n",
+					source->dev->name);
 		} else {
 			/* fastpath: update of existing entry */
 			fdb->dst = source;
--- a/net/bridge/br_multicast.c	2010-05-10 11:50:28.936627985 -0700
+++ b/net/bridge/br_multicast.c	2010-05-10 12:04:33.316906966 -0700
@@ -585,10 +585,9 @@ static struct net_bridge_mdb_entry *br_m
 
 	if (unlikely(count > br->hash_elasticity && count)) {
 		if (net_ratelimit())
-			printk(KERN_INFO "%s: Multicast hash table "
-			       "chain limit reached: %s\n",
-			       br->dev->name, port ? port->dev->name :
-						     br->dev->name);
+			br_info(br, "Multicast hash table "
+				"chain limit reached: %s\n",
+				port ? port->dev->name : br->dev->name);
 
 		elasticity = br->hash_elasticity;
 	}
@@ -596,11 +595,9 @@ static struct net_bridge_mdb_entry *br_m
 	if (mdb->size >= max) {
 		max *= 2;
 		if (unlikely(max >= br->hash_max)) {
-			printk(KERN_WARNING "%s: Multicast hash table maximum "
-			       "reached, disabling snooping: %s, %d\n",
-			       br->dev->name, port ? port->dev->name :
-						     br->dev->name,
-			       max);
+			br_warn(br, "Multicast hash table maximum "
+				"reached, disabling snooping: %s, %d\n",
+				port ? port->dev->name : br->dev->name, max);
 			err = -E2BIG;
 disable:
 			br->multicast_disabled = 1;
@@ -611,22 +608,19 @@ disable:
 	if (max > mdb->max || elasticity) {
 		if (mdb->old) {
 			if (net_ratelimit())
-				printk(KERN_INFO "%s: Multicast hash table "
-				       "on fire: %s\n",
-				       br->dev->name, port ? port->dev->name :
-							     br->dev->name);
+				br_info(br, "Multicast hash table "
+					"on fire: %s\n",
+					port ? port->dev->name : br->dev->name);
 			err = -EEXIST;
 			goto err;
 		}
 
 		err = br_mdb_rehash(&br->mdb, max, elasticity);
 		if (err) {
-			printk(KERN_WARNING "%s: Cannot rehash multicast "
-			       "hash table, disabling snooping: "
-			       "%s, %d, %d\n",
-			       br->dev->name, port ? port->dev->name :
-						     br->dev->name,
-			       mdb->size, err);
+			br_warn(br, "Cannot rehash multicast "
+				"hash table, disabling snooping: %s, %d, %d\n",
+				port ? port->dev->name : br->dev->name,
+				mdb->size, err);
 			goto disable;
 		}
 
--- a/net/bridge/br_stp_if.c	2010-05-10 11:50:28.916629479 -0700
+++ b/net/bridge/br_stp_if.c	2010-05-10 12:04:37.100899166 -0700
@@ -85,17 +85,16 @@ void br_stp_enable_port(struct net_bridg
 {
 	br_init_port(p);
 	br_port_state_selection(p->br);
+	br_log_state(p);
 }
 
 /* called under bridge lock */
 void br_stp_disable_port(struct net_bridge_port *p)
 {
-	struct net_bridge *br;
+	struct net_bridge *br = p->br;
 	int wasroot;
 
-	br = p->br;
-	printk(KERN_INFO "%s: port %i(%s) entering %s state\n",
-	       br->dev->name, p->port_no, p->dev->name, "disabled");
+	br_log_state(p);
 
 	wasroot = br_is_root_bridge(br);
 	br_become_designated_port(p);
@@ -127,11 +126,10 @@ static void br_stp_start(struct net_brid
 	r = call_usermodehelper(BR_STP_PROG, argv, envp, UMH_WAIT_PROC);
 	if (r == 0) {
 		br->stp_enabled = BR_USER_STP;
-		printk(KERN_INFO "%s: userspace STP started\n", br->dev->name);
+		br_debug(br, "userspace STP started\n");
 	} else {
 		br->stp_enabled = BR_KERNEL_STP;
-		printk(KERN_INFO "%s: starting userspace STP failed, "
-				"starting kernel STP\n", br->dev->name);
+		br_debug(br, "using kernel STP\n");
 
 		/* To start timers on any ports left in blocking */
 		spin_lock_bh(&br->lock);
@@ -148,9 +146,7 @@ static void br_stp_stop(struct net_bridg
 
 	if (br->stp_enabled == BR_USER_STP) {
 		r = call_usermodehelper(BR_STP_PROG, argv, envp, 1);
-		printk(KERN_INFO "%s: userspace STP stopped, return code %d\n",
-			br->dev->name, r);
-
+		br_info(br, "userspace STP stopped, return code %d\n", r);
 
 		/* To start timers on any ports left in blocking */
 		spin_lock_bh(&br->lock);
--- a/net/bridge/br_ioctl.c	2010-05-10 11:50:28.906610162 -0700
+++ b/net/bridge/br_ioctl.c	2010-05-10 12:00:40.326906216 -0700
@@ -412,6 +412,6 @@ int br_dev_ioctl(struct net_device *dev,
 
 	}
 
-	pr_debug("Bridge does not support ioctl 0x%x\n", cmd);
+	br_debug(br, "Bridge does not support ioctl 0x%x\n", cmd);
 	return -EOPNOTSUPP;
 }
--- a/net/bridge/br_netlink.c	2010-05-10 11:50:28.876606911 -0700
+++ b/net/bridge/br_netlink.c	2010-05-10 12:11:33.095469978 -0700
@@ -42,8 +42,8 @@ static int br_fill_ifinfo(struct sk_buff
 	struct nlmsghdr *nlh;
 	u8 operstate = netif_running(dev) ? dev->operstate : IF_OPER_DOWN;
 
-	pr_debug("br_fill_info event %d port %s master %s\n",
-		 event, dev->name, br->dev->name);
+	br_debug(br, "br_fill_info event %d port %s master %s\n",
+		     event, dev->name, br->dev->name);
 
 	nlh = nlmsg_put(skb, pid, seq, event, sizeof(*hdr), flags);
 	if (nlh == NULL)
@@ -87,7 +87,9 @@ void br_ifinfo_notify(int event, struct 
 	struct sk_buff *skb;
 	int err = -ENOBUFS;
 
-	pr_debug("bridge notify event=%d\n", event);
+	br_debug(port->br, "port %u(%s) event %d\n",
+		 (unsigned)port->port_no, port->dev->name, event);
+
 	skb = nlmsg_new(br_nlmsg_size(), GFP_ATOMIC);
 	if (skb == NULL)
 		goto errout;
--- a/net/bridge/br_stp_timer.c	2010-05-10 11:50:28.916629479 -0700
+++ b/net/bridge/br_stp_timer.c	2010-05-10 12:09:57.504418884 -0700
@@ -35,7 +35,7 @@ static void br_hello_timer_expired(unsig
 {
 	struct net_bridge *br = (struct net_bridge *)arg;
 
-	pr_debug("%s: hello timer expired\n", br->dev->name);
+	br_debug(br, "hello timer expired\n");
 	spin_lock(&br->lock);
 	if (br->dev->flags & IFF_UP) {
 		br_config_bpdu_generation(br);
@@ -55,13 +55,9 @@ static void br_message_age_timer_expired
 	if (p->state == BR_STATE_DISABLED)
 		return;
 
-
-	pr_info("%s: neighbor %.2x%.2x.%.2x:%.2x:%.2x:%.2x:%.2x:%.2x lost on port %d(%s)\n",
-		br->dev->name,
-		id->prio[0], id->prio[1],
-		id->addr[0], id->addr[1], id->addr[2],
-		id->addr[3], id->addr[4], id->addr[5],
-		p->port_no, p->dev->name);
+	br_info(br, "port %u(%s) neighbor %.2x%.2x.%pM lost\n",
+		(unsigned) p->port_no, p->dev->name,
+		id->prio[0], id->prio[1], &id->addr);
 
 	/*
 	 * According to the spec, the message age timer cannot be
@@ -87,8 +83,8 @@ static void br_forward_delay_timer_expir
 	struct net_bridge_port *p = (struct net_bridge_port *) arg;
 	struct net_bridge *br = p->br;
 
-	pr_debug("%s: %d(%s) forward delay timer\n",
-		 br->dev->name, p->port_no, p->dev->name);
+	br_debug(br, "port %u(%s) forward delay timer\n",
+		 (unsigned) p->port_no, p->dev->name);
 	spin_lock(&br->lock);
 	if (p->state == BR_STATE_LISTENING) {
 		p->state = BR_STATE_LEARNING;
@@ -107,7 +103,7 @@ static void br_tcn_timer_expired(unsigne
 {
 	struct net_bridge *br = (struct net_bridge *) arg;
 
-	pr_debug("%s: tcn timer expired\n", br->dev->name);
+	br_debug(br, "tcn timer expired\n");
 	spin_lock(&br->lock);
 	if (br->dev->flags & IFF_UP) {
 		br_transmit_tcn(br);
@@ -121,7 +117,7 @@ static void br_topology_change_timer_exp
 {
 	struct net_bridge *br = (struct net_bridge *) arg;
 
-	pr_debug("%s: topo change timer expired\n", br->dev->name);
+	br_debug(br, "topo change timer expired\n");
 	spin_lock(&br->lock);
 	br->topology_change_detected = 0;
 	br->topology_change = 0;
@@ -132,8 +128,8 @@ static void br_hold_timer_expired(unsign
 {
 	struct net_bridge_port *p = (struct net_bridge_port *) arg;
 
-	pr_debug("%s: %d(%s) hold timer expired\n",
-		 p->br->dev->name,  p->port_no, p->dev->name);
+	br_debug(p->br, "port %u(%s) hold timer expired\n",
+		 (unsigned) p->port_no, p->dev->name);
 
 	spin_lock(&p->br->lock);
 	if (p->config_pending)
--- a/net/bridge/br_stp.c	2010-05-10 11:50:28.896578146 -0700
+++ b/net/bridge/br_stp.c	2010-05-10 12:01:40.080530431 -0700
@@ -31,10 +31,9 @@ static const char *const br_port_state_n
 
 void br_log_state(const struct net_bridge_port *p)
 {
-	pr_info("%s: port %d(%s) entering %s state\n",
-		p->br->dev->name, p->port_no, p->dev->name,
+	br_info(p->br, "port %u(%s) entering %s state\n",
+		(unsigned) p->port_no, p->dev->name,
 		br_port_state_names[p->state]);
-
 }
 
 /* called under bridge lock */
@@ -300,7 +299,7 @@ void br_topology_change_detection(struct
 	if (br->stp_enabled != BR_KERNEL_STP)
 		return;
 
-	pr_info("%s: topology change detected, %s\n", br->dev->name,
+	br_info(br, "topology change detected, %s\n",
 		isroot ? "propagating" : "sending tcn bpdu");
 
 	if (isroot) {
@@ -469,8 +468,8 @@ void br_received_config_bpdu(struct net_
 void br_received_tcn_bpdu(struct net_bridge_port *p)
 {
 	if (br_is_designated_port(p)) {
-		pr_info("%s: received tcn bpdu on port %i(%s)\n",
-		       p->br->dev->name, p->port_no, p->dev->name);
+		br_info(p->br, "port %u(%s) received tcn bpdu\n",
+			(unsigned) p->port_no, p->dev->name);
 
 		br_topology_change_detection(p->br);
 		br_topology_change_acknowledge(p);
--- a/net/bridge/br_private.h	2010-05-10 11:50:37.646944750 -0700
+++ b/net/bridge/br_private.h	2010-05-10 12:28:44.137056465 -0700
@@ -240,6 +240,21 @@ struct br_input_skb_cb {
 # define BR_INPUT_SKB_CB_MROUTERS_ONLY(__skb)	(0)
 #endif
 
+#define br_printk(level, br, format, args...)	\
+	printk(level "%s: " format, (br)->dev->name, ##args)
+
+#define br_err(__br, format, args...)			\
+	br_printk(KERN_ERR, __br, format, ##args)
+#define br_warn(__br, format, args...)			\
+	br_printk(KERN_WARNING, __br, format, ##args)
+#define br_notice(__br, format, args...)		\
+	br_printk(KERN_NOTICE, __br, format, ##args)
+#define br_info(__br, format, args...)			\
+	br_printk(KERN_INFO, __br, format, ##args)
+
+#define br_debug(br, format, args...)			\
+	pr_debug("%s: " format,  (br)->dev->name, ##args)
+
 extern struct notifier_block br_device_notifier;
 extern const u8 br_group_address[ETH_ALEN];
 
--- a/net/bridge/br_device.c	2010-05-10 12:29:30.308306178 -0700
+++ b/net/bridge/br_device.c	2010-05-10 12:30:13.027074884 -0700
@@ -255,9 +255,8 @@ void br_netpoll_enable(struct net_bridge
 			dev->npinfo = br->dev->npinfo;
 	} else if (!(br->dev->priv_flags & IFF_DISABLE_NETPOLL)) {
 		br->dev->priv_flags |= IFF_DISABLE_NETPOLL;
-		printk(KERN_INFO "%s:new device %s"
-			" does not support netpoll (disabling)",
-			br->dev->name, dev->name);
+		br_info(br,"new device %s does not support netpoll (disabling)",
+			dev->name);
 	}
 }
 



^ permalink raw reply
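
Note on the br_printk() helpers added to br_private.h in the patch above:
the pattern is a macro that prepends the bridge device name to the caller's
format string. A minimal userspace sketch of that pattern follows. The names
fake_dev, fake_bridge, demo_printk and demo_info are hypothetical stand-ins,
snprintf() replaces printk() so the result can be inspected, and "<6>" is
only an illustrative level prefix. Like the patch, it relies on the GNU C
named-variadic-macro extension (args... and ##args).

```c
#include <stdio.h>

/* Hypothetical stand-ins for struct net_device / struct net_bridge. */
struct fake_dev { const char *name; };
struct fake_bridge { struct fake_dev *dev; };

/*
 * Prepend "<level><bridge name>: " to the caller's format string.
 * Adjacent string literals are concatenated at compile time, and
 * ##args drops the preceding comma when no arguments are given
 * (a GNU C extension, as used in the patch).
 */
#define demo_printk(buf, len, level, br, format, args...) \
	snprintf(buf, len, level "%s: " format, (br)->dev->name, ##args)

/* One wrapper per log level, mirroring br_info() and friends. */
#define demo_info(buf, len, br, format, args...) \
	demo_printk(buf, len, "<6>", br, format, ##args)
```

Because the prefix is glued on by literal concatenation, the format argument
has to be a string literal, which is also true of the macros in the patch.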

* [PATCH 3/4] bridge: netfilter use net_ratelimit
From: Stephen Hemminger @ 2010-05-10 19:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, bridge
In-Reply-To: <20100510193107.722574297@vyatta.com>

[-- Attachment #1: bridge-netfilter-msg.patch --]
[-- Type: text/plain, Size: 1408 bytes --]

The function __br_dnat_complain is basically reimplementing existing
net_ratelimit.

Signed-off-by: Stephen Hemminger <shemminger@vyatta.com>

--- a/net/bridge/br_netfilter.c	2010-05-06 12:32:23.427786161 -0700
+++ b/net/bridge/br_netfilter.c	2010-05-06 12:33:37.826565965 -0700
@@ -253,17 +253,6 @@ static int br_nf_pre_routing_finish_ipv6
 	return 0;
 }
 
-static void __br_dnat_complain(void)
-{
-	static unsigned long last_complaint;
-
-	if (jiffies - last_complaint >= 5 * HZ) {
-		printk(KERN_WARNING "Performing cross-bridge DNAT requires IP "
-		       "forwarding to be enabled\n");
-		last_complaint = jiffies;
-	}
-}
-
 /* This requires some explaining. If DNAT has taken place,
  * we will need to fix up the destination Ethernet address,
  * and this is a tricky process.
@@ -382,8 +371,12 @@ static int br_nf_pre_routing_finish(stru
 				/* we are sure that forwarding is disabled, so printing
 				 * this message is no problem. Note that the packet could
 				 * still have a martian destination address, in which case
-				 * the packet could be dropped even if forwarding were enabled */
-				__br_dnat_complain();
+				 * the packet could be dropped even if forwarding were enabled
+				 */
+				if (net_ratelimit())
+					netdev_warn(dev, "Performing cross-bridge DNAT "
+						    "requires IP forwarding to be enabled\n");
+
 				dst_release((struct dst_entry *)rt);
 			}
 free_skb:



^ permalink raw reply
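
Note on the helper removed by the patch above: __br_dnat_complain() gated
its message on the time elapsed since the previous one. A minimal userspace
sketch of that gating logic is shown here, with a plain counter standing in
for jiffies. The names demo_ratelimit and demo_ratelimit_ok are hypothetical,
and this models only the removed helper, not the internals of
net_ratelimit().

```c
#include <stdbool.h>

/* Hypothetical state: time of the last message and the minimum gap. */
struct demo_ratelimit {
	unsigned long last;
	unsigned long interval;
};

/* Returns true if a message may be emitted at time `now`. */
static bool demo_ratelimit_ok(struct demo_ratelimit *rl, unsigned long now)
{
	/* Unsigned subtraction stays correct when the counter wraps. */
	if (now - rl->last >= rl->interval) {
		rl->last = now;
		return true;
	}
	return false;
}
```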

* [PATCH 4/4] bridge: update sysfs link names if port device names have changed
From: Stephen Hemminger @ 2010-05-10 19:31 UTC (permalink / raw)
  To: David S. Miller; +Cc: netdev, bridge, Simon Arlott
In-Reply-To: <20100510193107.722574297@vyatta.com>

[-- Attachment #1: br-rename-link.patch --]
[-- Type: text/plain, Size: 4844 bytes --]

From: Simon Arlott <simon@fire.lp0.eu>

Links for each port are created in sysfs using the device
name, but this could be changed after being added to the
bridge.

As well as being unable to remove interfaces after this
occurs (because userspace tools don't recognise the new
name, and the kernel won't recognise the old name), adding
another interface with the old name to the bridge will
cause an error trying to create the sysfs link.

This fixes the problem by listening for NETDEV_CHANGENAME
notifications and renaming the link.

https://bugzilla.kernel.org/show_bug.cgi?id=12743

Signed-off-by: Simon Arlott <simon@fire.lp0.eu>
Acked-by: Stephen Hemminger <shemminger@vyatta.com>

---
Modified to apply to net-next and fix checkpatch warnings -- stephen

 fs/sysfs/symlink.c       |    1 +
 net/bridge/br_if.c       |    2 +-
 net/bridge/br_notify.c   |    7 +++++++
 net/bridge/br_private.h  |    6 ++++++
 net/bridge/br_sysfs_if.c |   32 +++++++++++++++++++++++++++-----
 5 files changed, 42 insertions(+), 6 deletions(-)

--- a/fs/sysfs/symlink.c	2010-05-07 17:43:18.937936182 -0700
+++ b/fs/sysfs/symlink.c	2010-05-10 12:15:03.247827705 -0700
@@ -261,3 +261,4 @@ const struct inode_operations sysfs_syml
 
 EXPORT_SYMBOL_GPL(sysfs_create_link);
 EXPORT_SYMBOL_GPL(sysfs_remove_link);
+EXPORT_SYMBOL_GPL(sysfs_rename_link);
--- a/net/bridge/br_if.c	2010-05-10 10:47:51.089679264 -0700
+++ b/net/bridge/br_if.c	2010-05-10 12:15:03.247827705 -0700
@@ -133,7 +133,7 @@ static void del_nbp(struct net_bridge_po
 	struct net_bridge *br = p->br;
 	struct net_device *dev = p->dev;
 
-	sysfs_remove_link(br->ifobj, dev->name);
+	sysfs_remove_link(br->ifobj, p->sysfs_name);
 
 	dev_set_promiscuity(dev, -1);
 
--- a/net/bridge/br_notify.c	2010-05-07 17:43:18.927931730 -0700
+++ b/net/bridge/br_notify.c	2010-05-10 12:15:03.257881460 -0700
@@ -34,6 +34,7 @@ static int br_device_event(struct notifi
 	struct net_device *dev = ptr;
 	struct net_bridge_port *p = dev->br_port;
 	struct net_bridge *br;
+	int err;
 
 	/* not a port of a bridge */
 	if (p == NULL)
@@ -83,6 +84,12 @@ static int br_device_event(struct notifi
 		br_del_if(br, dev);
 		break;
 
+	case NETDEV_CHANGENAME:
+		err = br_sysfs_renameif(p);
+		if (err)
+			return notifier_from_errno(err);
+		break;
+
 	case NETDEV_PRE_TYPE_CHANGE:
 		/* Forbid underlaying device to change its type. */
 		return NOTIFY_BAD;
--- a/net/bridge/br_private.h	2010-05-10 12:11:13.863532891 -0700
+++ b/net/bridge/br_private.h	2010-05-10 12:15:03.267827450 -0700
@@ -139,6 +139,10 @@ struct net_bridge_port
 	struct hlist_head		mglist;
 	struct hlist_node		rlist;
 #endif
+
+#ifdef CONFIG_SYSFS
+	char				sysfs_name[IFNAMSIZ];
+#endif
 };
 
 struct br_cpu_netstats {
@@ -480,6 +484,7 @@ extern void br_ifinfo_notify(int event, 
 /* br_sysfs_if.c */
 extern const struct sysfs_ops brport_sysfs_ops;
 extern int br_sysfs_addif(struct net_bridge_port *p);
+extern int br_sysfs_renameif(struct net_bridge_port *p);
 
 /* br_sysfs_br.c */
 extern int br_sysfs_addbr(struct net_device *dev);
@@ -488,6 +493,7 @@ extern void br_sysfs_delbr(struct net_de
 #else
 
 #define br_sysfs_addif(p)	(0)
+#define br_sysfs_renameif(p)	(0)
 #define br_sysfs_addbr(dev)	(0)
 #define br_sysfs_delbr(dev)	do { } while(0)
 #endif /* CONFIG_SYSFS */
--- a/net/bridge/br_sysfs_if.c	2010-05-07 17:43:18.917930679 -0700
+++ b/net/bridge/br_sysfs_if.c	2010-05-10 12:15:03.267827450 -0700
@@ -246,7 +246,7 @@ const struct sysfs_ops brport_sysfs_ops 
 /*
  * Add sysfs entries to ethernet device added to a bridge.
  * Creates a brport subdirectory with bridge attributes.
- * Puts symlink in bridge's brport subdirectory
+ * Puts symlink in bridge's brif subdirectory
  */
 int br_sysfs_addif(struct net_bridge_port *p)
 {
@@ -257,15 +257,37 @@ int br_sysfs_addif(struct net_bridge_por
 	err = sysfs_create_link(&p->kobj, &br->dev->dev.kobj,
 				SYSFS_BRIDGE_PORT_LINK);
 	if (err)
-		goto out2;
+		return err;
 
 	for (a = brport_attrs; *a; ++a) {
 		err = sysfs_create_file(&p->kobj, &((*a)->attr));
 		if (err)
-			goto out2;
+			return err;
 	}
 
-	err = sysfs_create_link(br->ifobj, &p->kobj, p->dev->name);
-out2:
+	strlcpy(p->sysfs_name, p->dev->name, IFNAMSIZ);
+	return sysfs_create_link(br->ifobj, &p->kobj, p->sysfs_name);
+}
+
+/* Rename bridge's brif symlink */
+int br_sysfs_renameif(struct net_bridge_port *p)
+{
+	struct net_bridge *br = p->br;
+	int err;
+
+	/* If a rename fails, the rollback will cause another
+	 * rename call with the existing name.
+	 */
+	if (!strncmp(p->sysfs_name, p->dev->name, IFNAMSIZ))
+		return 0;
+
+	err = sysfs_rename_link(br->ifobj, &p->kobj,
+				p->sysfs_name, p->dev->name);
+	if (err)
+		netdev_notice(br->dev, "unable to rename link %s to %s",
+			      p->sysfs_name, p->dev->name);
+	else
+		strlcpy(p->sysfs_name, p->dev->name, IFNAMSIZ);
+
 	return err;
 }



^ permalink raw reply
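
Note on br_sysfs_renameif() in the patch above: the port remembers the name
its link was created under, skips the rename when that name already matches
the device name, and updates the stored name only after a successful rename.
A minimal userspace sketch of that bookkeeping follows. All demo_* names are
hypothetical, and the callbacks merely stand in for sysfs_rename_link().

```c
#include <stdio.h>
#include <string.h>

#define DEMO_NAMSIZ 16

struct demo_port {
	char sysfs_name[DEMO_NAMSIZ]; /* name the link was created under */
	const char *dev_name;         /* current device name */
};

/* Hypothetical rename callback; returns 0 on success. */
typedef int (*demo_rename_fn)(const char *old_name, const char *new_name);

static int demo_rename_ok(const char *o, const char *n)
{
	(void)o; (void)n;
	return 0;
}

static int demo_rename_fail(const char *o, const char *n)
{
	(void)o; (void)n;
	return -1;
}

static int demo_renameif(struct demo_port *p, demo_rename_fn rename_link)
{
	int err;

	/* Nothing to do if the link already carries the current name. */
	if (!strncmp(p->sysfs_name, p->dev_name, DEMO_NAMSIZ))
		return 0;

	err = rename_link(p->sysfs_name, p->dev_name);
	if (!err)
		/* Remember the new name only after the rename succeeded. */
		snprintf(p->sysfs_name, DEMO_NAMSIZ, "%s", p->dev_name);
	return err;
}
```

Keeping the old name on failure means a later removal still looks up the
link under the name it actually has.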

* Fw: Subnet broadcast MTU issue?
From: Gerrit Binnenmars @ 2010-05-10 19:51 UTC (permalink / raw)
  To: netdev

Hello,

I have configured an Ethernet port with MTU 9000 using ifconfig.
On this port I have an alias (with IP 192.168.100.254) for a network that
supports only an MTU of 1500, so I added: ip route change 192.168.100.0/24 dev
eth0 src 192.168.100.254 mtu lock 1500.

Then ping -s 20000 192.168.100.253 works fine, with packets being fragmented
as expected at 1500 bytes,
but ping -b -s 20000 192.168.100.255 does not work: packets are fragmented
at approx. 9000 bytes.

I tried ip route change table local broadcast 192.168.100.255 mtu lock 1500 
without success.

Is there a solution? Where in the source code is the MTU size for a 
broadcast message determined?

Thanks in advance,

Gerrit Binnenmars 

^ permalink raw reply

* Question about more headroom in skb
From: Sharat Masetty @ 2010-05-10 20:09 UTC (permalink / raw)
  To: netdev

Hello All,

For my project I need 3 words of headroom in the skb at the network driver level, to add a custom header to the Ethernet packet. I looked into the TCP code and figured out that TCP uses sk->sk_prot->max_header for the header allocation size, but I was not able to confirm that all other transport protocols use the same mechanism. For example, in UDP/ICMP I was not able to figure out from the code where the allocation and header reservation happen. (Any light here would be really helpful.)

I have also looked at the skbuff API skb_pad(), which does what I want (add either headroom or tailroom), but I want to avoid it for performance reasons (skb_pad does kmalloc and memcpy). I want to figure out a good way (maybe by tuning some parameters) to allocate an extra 3 words for any skbuff, independent of the transport protocol being used. Any light here would be very much appreciated.

Thanks,
Sharat.

^ permalink raw reply
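
Note on the headroom question above: headroom is spare space in front of the
packet data that lets a lower layer prepend a header without copying. A
minimal userspace model of reserving headroom and then pushing a header into
it is sketched here. The demo_* names are hypothetical and this is not the
kernel's sk_buff, only an illustration of the pointer arithmetic involved.

```c
#include <stddef.h>

/* Hypothetical miniature of a packet buffer; not the kernel's sk_buff. */
struct demo_buf {
	unsigned char store[128];
	unsigned char *data;  /* start of valid data */
	size_t len;           /* bytes of valid data */
};

/* Start with an empty buffer whose data pointer sits after the headroom. */
static void demo_init(struct demo_buf *b, size_t headroom)
{
	if (headroom > sizeof(b->store))
		headroom = sizeof(b->store);
	b->data = b->store + headroom;
	b->len = 0;
}

static size_t demo_headroom(const struct demo_buf *b)
{
	return (size_t)(b->data - b->store);
}

/* Prepend n bytes; returns the new data pointer, or NULL if the
 * remaining headroom is too small. */
static unsigned char *demo_push(struct demo_buf *b, size_t n)
{
	if (demo_headroom(b) < n)
		return NULL;
	b->data -= n;
	b->len += n;
	return b->data;
}
```

With 12 bytes (three 32-bit words) reserved up front, a later push of a
12-byte header succeeds without any reallocation or copy.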

* [PATCH 00/84] netfilter: netfilter update for 2.6.35
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev

Hi Dave,

apologies for not sending this earlier in smaller batches, as mentioned
earlier I ran into some problems with git. Following is a first netfilter
update for 2.6.35, containing:

- various smaller cleanups, optimizations, Kconfig updates etc.

- merging of the xt_MARK module with xt_mark and xt_CONNMARK with xt_connmark
  to decrease overhead when using modular kernels, saving 14k on 32 bit,
  from Jan

- scheduling of the NOTRACK module for removal, obsoleted by the CT module

- removal of the compat /proc directory of xt_recent

- addition of an entry reaper to the recent module, from Tim Gardner

- support for changing UID/GID of the recent /proc files, from Jan

- use of NFPROTO values in NF_HOOK calls in IPv4/IPv6/bridging/DECnet, from Jan

- a change to the xtables ->checkentry() function signature to support
  returning errno codes, from Jan

- removal of old revisions of the hashlimit, multiport and string matches,
  from Jan

- ctnetlink message size computation fixes with conntrack accounting,
  from Jiri Pirko

- hashlimit match RCU conversion, from Eric

- userspace queuing checksum fixes, from Herbert

- fixes for netfilter RCU warnings, from myself

- fixes for the LED target to avoid invalid errors when replacing the
  ruleset

- fixes for iproute compilation breakage due to XT_ALIGN cleanups, from
  Alexey Dobriyan

- bridge netfilter cleanups, simplification and comment updates from Bart

- bridge netfilter MAC header fixes when using DNAT

- bridge netfilter refragmentation fixes for PPPoE, from Bart

- a change to the IPv6 POST_ROUTING invocation to make it receive
  unfragmented packets like IPv4, from Jan

- a fix for the IPv6 xfrm lookup in ip6_route_me_harder, from Ulrich Weber

- more appropriate default log level (KERN_NOTICE instead of KERN_EMERG) for
  the IPv4 and IPv6 LOG targets, from myself

- addition of the TEE target, which can be used to clone packets and send
  them to other hosts, e.g. IDS or logging hosts, from Jan

- a patch to make iptables and ip6tables reentrant by moving the jump stack
  to a separately allocated area. This will allow getting rid of the per
  CPU ruleset duplication in the future. From Jan.

The patches won't apply cleanly because of some conflicts resolved during
merges, please pull from:

git://git.kernel.org/pub/scm/linux/kernel/git/kaber/nf-next-2.6.git master

Thanks!


^ permalink raw reply

* [PATCH 01/84] netfilter: include/linux/netfilter/nf_conntrack_tuple_common.h: Checkpatch cleanup
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Andrea Gelmini <andrea.gelmini@gelma.net>

include/linux/netfilter/nf_conntrack_tuple_common.h:5: ERROR: open brace '{' following enum go on the same line

Signed-off-by: Andrea Gelmini <andrea.gelmini@gelma.net>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 .../linux/netfilter/nf_conntrack_tuple_common.h    |    3 +--
 1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/include/linux/netfilter/nf_conntrack_tuple_common.h b/include/linux/netfilter/nf_conntrack_tuple_common.h
index 8e145f0..2ea22b0 100644
--- a/include/linux/netfilter/nf_conntrack_tuple_common.h
+++ b/include/linux/netfilter/nf_conntrack_tuple_common.h
@@ -1,8 +1,7 @@
 #ifndef _NF_CONNTRACK_TUPLE_COMMON_H
 #define _NF_CONNTRACK_TUPLE_COMMON_H
 
-enum ip_conntrack_dir
-{
+enum ip_conntrack_dir {
 	IP_CT_DIR_ORIGINAL,
 	IP_CT_DIR_REPLY,
 	IP_CT_DIR_MAX
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 02/84] netfilter: ebt_ip6: Use ipv6_masked_addr_cmp()
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>

Signed-off-by: YOSHIFUJI Hideaki <yoshfuji@linux-ipv6.org>
Signed-off-by: Bart De Schuymer <bdschuym@pandora.be>
Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 net/bridge/netfilter/ebt_ip6.c |   18 ++++--------------
 1 files changed, 4 insertions(+), 14 deletions(-)

diff --git a/net/bridge/netfilter/ebt_ip6.c b/net/bridge/netfilter/ebt_ip6.c
index bbf2534..4644cc9 100644
--- a/net/bridge/netfilter/ebt_ip6.c
+++ b/net/bridge/netfilter/ebt_ip6.c
@@ -35,8 +35,6 @@ ebt_ip6_mt(const struct sk_buff *skb, const struct xt_match_param *par)
 	struct ipv6hdr _ip6h;
 	const struct tcpudphdr *pptr;
 	struct tcpudphdr _ports;
-	struct in6_addr tmp_addr;
-	int i;
 
 	ih6 = skb_header_pointer(skb, 0, sizeof(_ip6h), &_ip6h);
 	if (ih6 == NULL)
@@ -44,18 +42,10 @@ ebt_ip6_mt(const struct sk_buff *skb, const struct xt_match_param *par)
 	if (info->bitmask & EBT_IP6_TCLASS &&
 	   FWINV(info->tclass != ipv6_get_dsfield(ih6), EBT_IP6_TCLASS))
 		return false;
-	for (i = 0; i < 4; i++)
-		tmp_addr.in6_u.u6_addr32[i] = ih6->saddr.in6_u.u6_addr32[i] &
-			info->smsk.in6_u.u6_addr32[i];
-	if (info->bitmask & EBT_IP6_SOURCE &&
-		FWINV((ipv6_addr_cmp(&tmp_addr, &info->saddr) != 0),
-			EBT_IP6_SOURCE))
-		return false;
-	for (i = 0; i < 4; i++)
-		tmp_addr.in6_u.u6_addr32[i] = ih6->daddr.in6_u.u6_addr32[i] &
-			info->dmsk.in6_u.u6_addr32[i];
-	if (info->bitmask & EBT_IP6_DEST &&
-	   FWINV((ipv6_addr_cmp(&tmp_addr, &info->daddr) != 0), EBT_IP6_DEST))
+	if (FWINV(ipv6_masked_addr_cmp(&ih6->saddr, &info->smsk,
+				       &info->saddr), EBT_IP6_SOURCE) ||
+	    FWINV(ipv6_masked_addr_cmp(&ih6->daddr, &info->dmsk,
+				       &info->daddr), EBT_IP6_DEST))
 		return false;
 	if (info->bitmask & EBT_IP6_PROTO) {
 		uint8_t nexthdr = ih6->nexthdr;
-- 
1.7.0.4


^ permalink raw reply related
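
Note on the ebt_ip6 change above: the removed loops masked each 32-bit word
of the packet address and compared the result with the configured address.
A minimal userspace sketch of that masked comparison follows. The names
demo_addr and demo_masked_addr_cmp are hypothetical, and the sketch mirrors
the removed open-coded loop; the kernel's ipv6_masked_addr_cmp() may differ
in detail.

```c
#include <stdint.h>

/* Hypothetical 128-bit address held as four 32-bit words. */
struct demo_addr { uint32_t w[4]; };

/* Returns nonzero if (a & mask) differs from b, zero if they match. */
static int demo_masked_addr_cmp(const struct demo_addr *a,
				const struct demo_addr *mask,
				const struct demo_addr *b)
{
	int i;

	for (i = 0; i < 4; i++)
		if ((a->w[i] & mask->w[i]) != b->w[i])
			return 1;
	return 0;
}
```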

* [PATCH 05/84] netfilter: xt_CT: par->family is an nfproto
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 net/netfilter/xt_CT.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/xt_CT.c b/net/netfilter/xt_CT.c
index 61c50fa..fda603e 100644
--- a/net/netfilter/xt_CT.c
+++ b/net/netfilter/xt_CT.c
@@ -37,13 +37,13 @@ static unsigned int xt_ct_target(struct sk_buff *skb,
 
 static u8 xt_ct_find_proto(const struct xt_tgchk_param *par)
 {
-	if (par->family == AF_INET) {
+	if (par->family == NFPROTO_IPV4) {
 		const struct ipt_entry *e = par->entryinfo;
 
 		if (e->ip.invflags & IPT_INV_PROTO)
 			return 0;
 		return e->ip.proto;
-	} else if (par->family == AF_INET6) {
+	} else if (par->family == NFPROTO_IPV6) {
 		const struct ip6t_entry *e = par->entryinfo;
 
 		if (e->ipv6.invflags & IP6T_INV_PROTO)
-- 
1.7.0.4


^ permalink raw reply related

* [PATCH 06/84] netfilter: xt_NFQUEUE: consolidate v4/v6 targets into one
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 net/netfilter/xt_NFQUEUE.c |   40 ++++++++++++----------------------------
 1 files changed, 12 insertions(+), 28 deletions(-)

diff --git a/net/netfilter/xt_NFQUEUE.c b/net/netfilter/xt_NFQUEUE.c
index 12dcd70..a37e216 100644
--- a/net/netfilter/xt_NFQUEUE.c
+++ b/net/netfilter/xt_NFQUEUE.c
@@ -49,17 +49,6 @@ static u32 hash_v4(const struct sk_buff *skb)
 	return jhash_2words((__force u32)ipaddr, iph->protocol, jhash_initval);
 }
 
-static unsigned int
-nfqueue_tg4_v1(struct sk_buff *skb, const struct xt_target_param *par)
-{
-	const struct xt_NFQ_info_v1 *info = par->targinfo;
-	u32 queue = info->queuenum;
-
-	if (info->queues_total > 1)
-		queue = hash_v4(skb) % info->queues_total + queue;
-	return NF_QUEUE_NR(queue);
-}
-
 #if defined(CONFIG_IP6_NF_IPTABLES) || defined(CONFIG_IP6_NF_IPTABLES_MODULE)
 static u32 hash_v6(const struct sk_buff *skb)
 {
@@ -73,18 +62,24 @@ static u32 hash_v6(const struct sk_buff *skb)
 
 	return jhash2((__force u32 *)addr, ARRAY_SIZE(addr), jhash_initval);
 }
+#endif
 
 static unsigned int
-nfqueue_tg6_v1(struct sk_buff *skb, const struct xt_target_param *par)
+nfqueue_tg_v1(struct sk_buff *skb, const struct xt_target_param *par)
 {
 	const struct xt_NFQ_info_v1 *info = par->targinfo;
 	u32 queue = info->queuenum;
 
-	if (info->queues_total > 1)
-		queue = hash_v6(skb) % info->queues_total + queue;
+	if (info->queues_total > 1) {
+		if (par->target->family == NFPROTO_IPV4)
+			queue = hash_v4(skb) % info->queues_total + queue;
+#if defined(CONFIG_IP6_NF_IPTABLES) || defined(CONFIG_IP6_NF_IPTABLES_MODULE)
+		else if (par->target->family == NFPROTO_IPV6)
+			queue = hash_v6(skb) % info->queues_total + queue;
+#endif
+	}
 	return NF_QUEUE_NR(queue);
 }
-#endif
 
 static bool nfqueue_tg_v1_check(const struct xt_tgchk_param *par)
 {
@@ -119,23 +114,12 @@ static struct xt_target nfqueue_tg_reg[] __read_mostly = {
 	{
 		.name		= "NFQUEUE",
 		.revision	= 1,
-		.family		= NFPROTO_IPV4,
-		.checkentry	= nfqueue_tg_v1_check,
-		.target		= nfqueue_tg4_v1,
-		.targetsize	= sizeof(struct xt_NFQ_info_v1),
-		.me		= THIS_MODULE,
-	},
-#if defined(CONFIG_IP6_NF_IPTABLES) || defined(CONFIG_IP6_NF_IPTABLES_MODULE)
-	{
-		.name		= "NFQUEUE",
-		.revision	= 1,
-		.family		= NFPROTO_IPV6,
+		.family		= NFPROTO_UNSPEC,
 		.checkentry	= nfqueue_tg_v1_check,
-		.target		= nfqueue_tg6_v1,
+		.target		= nfqueue_tg_v1,
 		.targetsize	= sizeof(struct xt_NFQ_info_v1),
 		.me		= THIS_MODULE,
 	},
-#endif
 };
 
 static int __init nfqueue_tg_init(void)
-- 
1.7.0.4


^ permalink raw reply related
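
Note on the consolidated NFQUEUE target above: when more than one queue is
configured, the flow hash is folded into the range of queues that starts at
queuenum. A minimal userspace sketch of that selection follows, with the
hash passed in as a plain number. The name demo_pick_queue is hypothetical.

```c
#include <stdint.h>

/* Pick a queue in [queuenum, queuenum + queues_total) from a flow hash. */
static uint32_t demo_pick_queue(uint32_t hash, uint32_t queuenum,
				uint32_t queues_total)
{
	if (queues_total > 1)
		return hash % queues_total + queuenum;
	return queuenum;
}
```

Since the hash depends only on the packet's addresses (and, for IPv4, the
protocol), packets of one flow keep landing on the same queue.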

* [PATCH 08/84] netfilter: xtables: merge xt_MARK into xt_mark
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

Two arguments for combining the two:
- xt_mark is pretty useless without xt_MARK
- the actual code is so small anyway that the kmod metadata and the module
  in its loaded state totally outweighs the combined actual code size.

i586-before:
-rw-r--r-- 1 jengelh users 3821 Feb 10 01:01 xt_MARK.ko
-rw-r--r-- 1 jengelh users 2592 Feb 10 00:04 xt_MARK.o
-rw-r--r-- 1 jengelh users 3274 Feb 10 01:01 xt_mark.ko
-rw-r--r-- 1 jengelh users 2108 Feb 10 00:05 xt_mark.o
   text    data     bss     dec     hex filename
    354     264       0     618     26a xt_MARK.o
    223     176       0     399     18f xt_mark.o
And the runtime size is like 14 KB.

i586-after:
-rw-r--r-- 1 jengelh users 3264 Feb 18 17:28 xt_mark.o

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 include/linux/netfilter/xt_MARK.h |    6 +---
 include/linux/netfilter/xt_mark.h |    4 ++
 net/netfilter/Kconfig             |   46 +++++++++++++++++++-----------
 net/netfilter/Makefile            |    5 ++-
 net/netfilter/xt_MARK.c           |   56 -------------------------------------
 net/netfilter/xt_mark.c           |   35 +++++++++++++++++++++-
 6 files changed, 70 insertions(+), 82 deletions(-)
 delete mode 100644 net/netfilter/xt_MARK.c

diff --git a/include/linux/netfilter/xt_MARK.h b/include/linux/netfilter/xt_MARK.h
index bc9561b..41c456d 100644
--- a/include/linux/netfilter/xt_MARK.h
+++ b/include/linux/netfilter/xt_MARK.h
@@ -1,10 +1,6 @@
 #ifndef _XT_MARK_H_target
 #define _XT_MARK_H_target
 
-#include <linux/types.h>
-
-struct xt_mark_tginfo2 {
-	__u32 mark, mask;
-};
+#include <linux/netfilter/xt_mark.h>
 
 #endif /*_XT_MARK_H_target */
diff --git a/include/linux/netfilter/xt_mark.h b/include/linux/netfilter/xt_mark.h
index 6607c8f..ecadc40 100644
--- a/include/linux/netfilter/xt_mark.h
+++ b/include/linux/netfilter/xt_mark.h
@@ -3,6 +3,10 @@
 
 #include <linux/types.h>
 
+struct xt_mark_tginfo2 {
+	__u32 mark, mask;
+};
+
 struct xt_mark_mtinfo1 {
 	__u32 mark, mask;
 	__u8 invert;
diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index abf4ce6..236aa20 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -314,6 +314,23 @@ config NETFILTER_XTABLES
 
 if NETFILTER_XTABLES
 
+comment "Xtables combined modules"
+
+config NETFILTER_XT_MARK
+	tristate 'nfmark target and match support'
+	default m if NETFILTER_ADVANCED=n
+	---help---
+	This option adds the "MARK" target and "mark" match.
+
+	Netfilter mark matching allows you to match packets based on the
+	"nfmark" value in the packet.
+	The target allows you to create rules in the "mangle" table which alter
+	the netfilter mark (nfmark) field associated with the packet.
+
+	Prior to routing, the nfmark can influence the routing method (see
+	"Use netfilter MARK value as routing key") and can also be used by
+	other subsystems to change their behavior.
+
 # alphabetically ordered list of targets
 
 comment "Xtables targets"
@@ -425,16 +442,12 @@ config NETFILTER_XT_TARGET_LED
 
 config NETFILTER_XT_TARGET_MARK
 	tristate '"MARK" target support'
-	default m if NETFILTER_ADVANCED=n
-	help
-	  This option adds a `MARK' target, which allows you to create rules
-	  in the `mangle' table which alter the netfilter mark (nfmark) field
-	  associated with the packet prior to routing. This can change
-	  the routing method (see `Use netfilter MARK value as routing
-	  key') and can also be used by other subsystems to change their
-	  behavior.
-
-	  To compile it as a module, choose M here.  If unsure, say N.
+	depends on NETFILTER_ADVANCED
+	select NETFILTER_XT_MARK
+	---help---
+	This is a backwards-compat option for the user's convenience
+	(e.g. when running oldconfig). It selects
+	CONFIG_NETFILTER_XT_MARK (combined mark/MARK module).
 
 config NETFILTER_XT_TARGET_NFLOG
 	tristate '"NFLOG" target support'
@@ -739,13 +752,12 @@ config NETFILTER_XT_MATCH_MAC
 
 config NETFILTER_XT_MATCH_MARK
 	tristate '"mark" match support'
-	default m if NETFILTER_ADVANCED=n
-	help
-	  Netfilter mark matching allows you to match packets based on the
-	  `nfmark' value in the packet.  This can be set by the MARK target
-	  (see below).
-
-	  To compile it as a module, choose M here.  If unsure, say N.
+	depends on NETFILTER_ADVANCED
+	select NETFILTER_XT_MARK
+	---help---
+	This is a backwards-compat option for the user's convenience
+	(e.g. when running oldconfig). It selects
+	CONFIG_NETFILTER_XT_MARK (combined mark/MARK module).
 
 config NETFILTER_XT_MATCH_MULTIPORT
 	tristate '"multiport" Multiple port match support'
diff --git a/net/netfilter/Makefile b/net/netfilter/Makefile
index f873644..19775cc 100644
--- a/net/netfilter/Makefile
+++ b/net/netfilter/Makefile
@@ -40,6 +40,9 @@ obj-$(CONFIG_NETFILTER_TPROXY) += nf_tproxy_core.o
 # generic X tables 
 obj-$(CONFIG_NETFILTER_XTABLES) += x_tables.o xt_tcpudp.o
 
+# combos
+obj-$(CONFIG_NETFILTER_XT_MARK) += xt_mark.o
+
 # targets
 obj-$(CONFIG_NETFILTER_XT_TARGET_CLASSIFY) += xt_CLASSIFY.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_CONNMARK) += xt_CONNMARK.o
@@ -48,7 +51,6 @@ obj-$(CONFIG_NETFILTER_XT_TARGET_CT) += xt_CT.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_DSCP) += xt_DSCP.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_HL) += xt_HL.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_LED) += xt_LED.o
-obj-$(CONFIG_NETFILTER_XT_TARGET_MARK) += xt_MARK.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_NFLOG) += xt_NFLOG.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_NFQUEUE) += xt_NFQUEUE.o
 obj-$(CONFIG_NETFILTER_XT_TARGET_NOTRACK) += xt_NOTRACK.o
@@ -76,7 +78,6 @@ obj-$(CONFIG_NETFILTER_XT_MATCH_IPRANGE) += xt_iprange.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_LENGTH) += xt_length.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_LIMIT) += xt_limit.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_MAC) += xt_mac.o
-obj-$(CONFIG_NETFILTER_XT_MATCH_MARK) += xt_mark.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_MULTIPORT) += xt_multiport.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_OSF) += xt_osf.o
 obj-$(CONFIG_NETFILTER_XT_MATCH_OWNER) += xt_owner.o
diff --git a/net/netfilter/xt_MARK.c b/net/netfilter/xt_MARK.c
deleted file mode 100644
index 225f8d1..0000000
--- a/net/netfilter/xt_MARK.c
+++ /dev/null
@@ -1,56 +0,0 @@
-/*
- *	xt_MARK - Netfilter module to modify the NFMARK field of an skb
- *
- *	(C) 1999-2001 Marc Boucher <marc@mbsi.ca>
- *	Copyright © CC Computer Consultants GmbH, 2007 - 2008
- *	Jan Engelhardt <jengelh@computergmbh.de>
- *
- *	This program is free software; you can redistribute it and/or modify
- *	it under the terms of the GNU General Public License version 2 as
- *	published by the Free Software Foundation.
- */
-
-#include <linux/module.h>
-#include <linux/skbuff.h>
-#include <linux/ip.h>
-#include <net/checksum.h>
-
-#include <linux/netfilter/x_tables.h>
-#include <linux/netfilter/xt_MARK.h>
-
-MODULE_LICENSE("GPL");
-MODULE_AUTHOR("Marc Boucher <marc@mbsi.ca>");
-MODULE_DESCRIPTION("Xtables: packet mark modification");
-MODULE_ALIAS("ipt_MARK");
-MODULE_ALIAS("ip6t_MARK");
-
-static unsigned int
-mark_tg(struct sk_buff *skb, const struct xt_target_param *par)
-{
-	const struct xt_mark_tginfo2 *info = par->targinfo;
-
-	skb->mark = (skb->mark & ~info->mask) ^ info->mark;
-	return XT_CONTINUE;
-}
-
-static struct xt_target mark_tg_reg __read_mostly = {
-	.name           = "MARK",
-	.revision       = 2,
-	.family         = NFPROTO_UNSPEC,
-	.target         = mark_tg,
-	.targetsize     = sizeof(struct xt_mark_tginfo2),
-	.me             = THIS_MODULE,
-};
-
-static int __init mark_tg_init(void)
-{
-	return xt_register_target(&mark_tg_reg);
-}
-
-static void __exit mark_tg_exit(void)
-{
-	xt_unregister_target(&mark_tg_reg);
-}
-
-module_init(mark_tg_init);
-module_exit(mark_tg_exit);
diff --git a/net/netfilter/xt_mark.c b/net/netfilter/xt_mark.c
index 1db07d8..035c468 100644
--- a/net/netfilter/xt_mark.c
+++ b/net/netfilter/xt_mark.c
@@ -18,9 +18,20 @@
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Marc Boucher <marc@mbsi.ca>");
-MODULE_DESCRIPTION("Xtables: packet mark match");
+MODULE_DESCRIPTION("Xtables: packet mark operations");
 MODULE_ALIAS("ipt_mark");
 MODULE_ALIAS("ip6t_mark");
+MODULE_ALIAS("ipt_MARK");
+MODULE_ALIAS("ip6t_MARK");
+
+static unsigned int
+mark_tg(struct sk_buff *skb, const struct xt_target_param *par)
+{
+	const struct xt_mark_tginfo2 *info = par->targinfo;
+
+	skb->mark = (skb->mark & ~info->mask) ^ info->mark;
+	return XT_CONTINUE;
+}
 
 static bool
 mark_mt(const struct sk_buff *skb, const struct xt_match_param *par)
@@ -30,6 +41,15 @@ mark_mt(const struct sk_buff *skb, const struct xt_match_param *par)
 	return ((skb->mark & info->mask) == info->mark) ^ info->invert;
 }
 
+static struct xt_target mark_tg_reg __read_mostly = {
+	.name           = "MARK",
+	.revision       = 2,
+	.family         = NFPROTO_UNSPEC,
+	.target         = mark_tg,
+	.targetsize     = sizeof(struct xt_mark_tginfo2),
+	.me             = THIS_MODULE,
+};
+
 static struct xt_match mark_mt_reg __read_mostly = {
 	.name           = "mark",
 	.revision       = 1,
@@ -41,12 +61,23 @@ static struct xt_match mark_mt_reg __read_mostly = {
 
 static int __init mark_mt_init(void)
 {
-	return xt_register_match(&mark_mt_reg);
+	int ret;
+
+	ret = xt_register_target(&mark_tg_reg);
+	if (ret < 0)
+		return ret;
+	ret = xt_register_match(&mark_mt_reg);
+	if (ret < 0) {
+		xt_unregister_target(&mark_tg_reg);
+		return ret;
+	}
+	return 0;
 }
 
 static void __exit mark_mt_exit(void)
 {
 	xt_unregister_match(&mark_mt_reg);
+	xt_unregister_target(&mark_tg_reg);
 }
 
 module_init(mark_mt_init);
-- 
1.7.0.4
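
For readers following the merged module, the two mark operations it now
carries can be modelled in plain userspace C. This is only a sketch: the
function names below are hypothetical stand-ins, and only the two
expressions are taken from mark_tg() and mark_mt() in the diff above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Model of the target: clear the bits selected by mask, then XOR in value. */
uint32_t mark_tg_model(uint32_t mark, uint32_t value, uint32_t mask)
{
	return (mark & ~mask) ^ value;
}

/* Model of the match: compare the masked mark, then apply invert. */
bool mark_mt_model(uint32_t mark, uint32_t value, uint32_t mask, bool invert)
{
	return ((mark & mask) == value) ^ invert;
}

/* Hypothetical self-check, not part of the patch. */
void mark_model_selfcheck(void)
{
	/* Full mask: the old mark is replaced outright. */
	assert(mark_tg_model(0x1234, 0x42, 0xffffffffu) == 0x42);
	/* mask == value: behaves like OR. */
	assert(mark_tg_model(0x1204, 0x10, 0x10) == 0x1214);
	/* mask == 0: behaves like XOR. */
	assert(mark_tg_model(0x1204, 0xff, 0) == 0x12fb);
	/* value == 0, mask == ~bits: behaves like AND with bits. */
	assert(mark_tg_model(0x1234, 0, ~0xffu) == 0x34);
	/* The match sees the bit set above; invert flips the verdict. */
	assert(mark_mt_model(0x1214, 0x10, 0x10, false));
	assert(!mark_mt_model(0x1214, 0x10, 0x10, true));
}
```

The single mask-then-XOR form is enough to express set, OR, XOR and AND
updates by choosing value and mask, which is presumably why the target
needs only one code path.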



* [PATCH 10/84] netfilter: xtables: schedule xt_NOTRACK for removal
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

It is being superseded by xt_CT (-j CT --notrack).

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 Documentation/feature-removal-schedule.txt |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/Documentation/feature-removal-schedule.txt b/Documentation/feature-removal-schedule.txt
index ed511af..8843fef 100644
--- a/Documentation/feature-removal-schedule.txt
+++ b/Documentation/feature-removal-schedule.txt
@@ -589,3 +589,11 @@ Why:	Useful in 2003, implementation is a hack.
 	Generally invoked by accident today.
 	Seen as doing more harm than good.
 Who:	Len Brown <len.brown@intel.com>
+
+---------------------------
+
+What:	xt_NOTRACK
+Files:	net/netfilter/xt_NOTRACK.c
+When:	April 2011
+Why:	Superseded by xt_CT
+Who:	Netfilter developer team <netfilter-devel@vger.kernel.org>
-- 
1.7.0.4



* [PATCH 12/84] netfilter: ebt_ip6: add principal maintainer in a MODULE_AUTHOR tag
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

Cc: Kuo-Lang Tseng <kuo-lang.tseng@intel.com>
Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 net/bridge/netfilter/ebt_ip6.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/net/bridge/netfilter/ebt_ip6.c b/net/bridge/netfilter/ebt_ip6.c
index 4cb60f1..05d0d0c 100644
--- a/net/bridge/netfilter/ebt_ip6.c
+++ b/net/bridge/netfilter/ebt_ip6.c
@@ -139,4 +139,5 @@ static void __exit ebt_ip6_fini(void)
 module_init(ebt_ip6_init);
 module_exit(ebt_ip6_fini);
 MODULE_DESCRIPTION("Ebtables: IPv6 protocol packet match");
+MODULE_AUTHOR("Kuo-Lang Tseng <kuo-lang.tseng@intel.com>");
 MODULE_LICENSE("GPL");
-- 
1.7.0.4



* [PATCH 13/84] netfilter: xt_recent: update description
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

The module has supported IPv6 for quite a while already :-)

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 net/netfilter/xt_recent.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/netfilter/xt_recent.c b/net/netfilter/xt_recent.c
index 1af74dd..bcabfbc 100644
--- a/net/netfilter/xt_recent.c
+++ b/net/netfilter/xt_recent.c
@@ -35,7 +35,7 @@
 
 MODULE_AUTHOR("Patrick McHardy <kaber@trash.net>");
 MODULE_AUTHOR("Jan Engelhardt <jengelh@medozas.de>");
-MODULE_DESCRIPTION("Xtables: \"recently-seen\" host matching for IPv4");
+MODULE_DESCRIPTION("Xtables: \"recently-seen\" host matching");
 MODULE_LICENSE("GPL");
 MODULE_ALIAS("ipt_recent");
 MODULE_ALIAS("ip6t_recent");
-- 
1.7.0.4



* [PATCH 19/84] netfilter: xtables: clean up xt_mac match routine
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 net/netfilter/xt_mac.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/net/netfilter/xt_mac.c b/net/netfilter/xt_mac.c
index c200711..2039d07 100644
--- a/net/netfilter/xt_mac.c
+++ b/net/netfilter/xt_mac.c
@@ -26,14 +26,16 @@ MODULE_ALIAS("ip6t_mac");
 
 static bool mac_mt(const struct sk_buff *skb, const struct xt_match_param *par)
 {
-    const struct xt_mac_info *info = par->matchinfo;
-
-    /* Is mac pointer valid? */
-    return skb_mac_header(skb) >= skb->head &&
-	   skb_mac_header(skb) + ETH_HLEN <= skb->data
-	   /* If so, compare... */
-	   && ((!compare_ether_addr(eth_hdr(skb)->h_source, info->srcaddr))
-		^ info->invert);
+	const struct xt_mac_info *info = par->matchinfo;
+	bool ret;
+
+	if (skb_mac_header(skb) < skb->head)
+		return false;
+	if (skb_mac_header(skb) + ETH_HLEN > skb->data)
+		return false;
+	ret  = compare_ether_addr(eth_hdr(skb)->h_source, info->srcaddr) == 0;
+	ret ^= info->invert;
+	return ret;
 }
 
 static struct xt_match mac_mt_reg __read_mostly = {
-- 
1.7.0.4
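
The control flow of the cleaned-up routine can be tried outside the kernel
with a small model. This is a sketch under stated assumptions: struct
skb_model, the MODEL_* constants and mac_mt_model() are hypothetical
stand-ins for the sk_buff fields, ETH_ALEN/ETH_HLEN and mac_mt(), and
memcmp() replaces compare_ether_addr().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_ETH_ALEN 6   /* bytes per MAC address */
#define MODEL_ETH_HLEN 14  /* dst (6) + src (6) + ethertype (2) */

/* Hypothetical stand-in for the three sk_buff pointers the match uses. */
struct skb_model {
	const unsigned char *head; /* start of the buffer */
	const unsigned char *mac;  /* claimed start of the link-layer header */
	const unsigned char *data; /* start of the current payload */
};

bool mac_mt_model(const struct skb_model *skb,
		  const unsigned char *srcaddr, bool invert)
{
	bool ret;

	/* Header must start inside the buffer... */
	if (skb->mac < skb->head)
		return false;
	/* ...and end at or before the payload. */
	if (skb->mac + MODEL_ETH_HLEN > skb->data)
		return false;
	/* The source address follows the 6-byte destination address. */
	ret  = memcmp(skb->mac + MODEL_ETH_ALEN, srcaddr, MODEL_ETH_ALEN) == 0;
	ret ^= invert;
	return ret;
}

/* Hypothetical self-check, not part of the patch. */
void mac_model_selfcheck(void)
{
	unsigned char buf[64] = {0};
	const unsigned char src[MODEL_ETH_ALEN] = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55};
	struct skb_model skb = { buf, buf + 16, buf + 30 };

	memcpy(buf + 16 + MODEL_ETH_ALEN, src, MODEL_ETH_ALEN);
	assert(mac_mt_model(&skb, src, false));
	assert(!mac_mt_model(&skb, src, true));
}
```

Note that the early returns keep the behaviour of the old single
expression: a header that fails the bounds checks never matches, even with
invert set, because the old `&&` chain also short-circuited before the XOR.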



* [PATCH 21/84] netfilter: xtables: resort osf kconfig text
From: kaber @ 2010-05-10 20:17 UTC (permalink / raw)
  To: davem; +Cc: netfilter-devel, netdev
In-Reply-To: <1273522735-24672-1-git-send-email-kaber@trash.net>

From: Jan Engelhardt <jengelh@medozas.de>

Restore alphabetical ordering of the list and put the xt_osf option
into its 'right' place again.

Signed-off-by: Jan Engelhardt <jengelh@medozas.de>
---
 net/netfilter/Kconfig |   26 +++++++++++++-------------
 1 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/net/netfilter/Kconfig b/net/netfilter/Kconfig
index 6ac28ef..8055786 100644
--- a/net/netfilter/Kconfig
+++ b/net/netfilter/Kconfig
@@ -774,6 +774,19 @@ config NETFILTER_XT_MATCH_MULTIPORT
 
 	  To compile it as a module, choose M here.  If unsure, say N.
 
+config NETFILTER_XT_MATCH_OSF
+	tristate '"osf" Passive OS fingerprint match'
+	depends on NETFILTER_ADVANCED && NETFILTER_NETLINK
+	help
+	  This option selects the Passive OS Fingerprinting match module
+	  that allows to passively match the remote operating system by
+	  analyzing incoming TCP SYN packets.
+
+	  Rules and loading software can be downloaded from
+	  http://www.ioremap.net/projects/osf
+
+	  To compile it as a module, choose M here.  If unsure, say N.
+
 config NETFILTER_XT_MATCH_OWNER
 	tristate '"owner" match support'
 	depends on NETFILTER_ADVANCED
@@ -958,19 +971,6 @@ config NETFILTER_XT_MATCH_U32
 
 	  Details and examples are in the kernel module source.
 
-config NETFILTER_XT_MATCH_OSF
-	tristate '"osf" Passive OS fingerprint match'
-	depends on NETFILTER_ADVANCED && NETFILTER_NETLINK
-	help
-	  This option selects the Passive OS Fingerprinting match module
-	  that allows to passively match the remote operating system by
-	  analyzing incoming TCP SYN packets.
-
-	  Rules and loading software can be downloaded from
-	  http://www.ioremap.net/projects/osf
-
-	  To compile it as a module, choose M here.  If unsure, say N.
-
 endif # NETFILTER_XTABLES
 
 endmenu
-- 
1.7.0.4


