Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: pull request: wireless-2.6 2010-07-26
From: David Miller @ 2010-07-26 20:30 UTC (permalink / raw)
  To: linville; +Cc: linux-wireless, netdev, linux-kernel
In-Reply-To: <20100726194919.GH3903@tuxdriver.com>

From: "John W. Linville" <linville@tuxdriver.com>
Date: Mon, 26 Jul 2010 15:49:20 -0400

> The following changes since commit 3b87956ea645fb4de7e59c7d0aa94de04be72615:
> 
>   net sched: fix race in mirred device removal (2010-07-24 21:04:20 -0700)
> 
> are available in the git repository at:
>   git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master
> 
> John W. Linville (1):
>       wireless: use netif_rx_ni in ieee80211_send_layer2_update
> 
> Ming Lei (1):
>       ath9k: fix dma direction for map/unmap in ath_rx_tasklet

Pulled, thanks John.

^ permalink raw reply

* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Michael S. Tsirkin @ 2010-07-26 20:27 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sridhar Samudrala, Peter Zijlstra, Tejun Heo, Ingo Molnar, netdev,
	lkml, kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev,
	Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <20100726180834.GA26988@redhat.com>

On Mon, Jul 26, 2010 at 08:08:34PM +0200, Oleg Nesterov wrote:
> On 07/26, Sridhar Samudrala wrote:
> >
> > I have been testing out a similar patch that uses kernel_thread() without CLONE_FILES
> > flag rather than create_kthread() and then closing the files.
> 
> !CLONE_FILES can't help. copy_files() does dup_fd() in this case.
> The child still inherits the files.
> 
> > Either version should be fine.
> 
> I think neither version is fine ;)
> 
> exit_files() is not enough too. How about the signals, reparenting?
> 
> 
> I already forgot all details, probably I missed somethig. But it
> seems to me that it is better to just export get/set affinity and
> forget about all complications.
> 
> Oleg.

Almost forgot, we need the same for priority.

-- 
MST

^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Michael S. Tsirkin @ 2010-07-26 20:19 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <4C4DDC31.9070206@kernel.org>

On Mon, Jul 26, 2010 at 09:04:17PM +0200, Tejun Heo wrote:
> On 07/26/2010 06:23 PM, Michael S. Tsirkin wrote:
> >> * Can you please keep the outer goto repeat loop?  I just don't like
> >>   outermost for (;;).
> > 
> > Okay ... can we put the code in a {} scope to make it clear
> > where does the loop starts and ends?
> 
> If we're gonna do that, it would be better to put it inside a loop
> construct.  The reason why I don't like it is that loops like that
> don't really help read/writeability much while indenting the whole
> logic unnecessarily and look more like a result of obsession against
> goto rather than any practical reason.  It's just a cosmetic
> preference and I might as well be the weirdo here, so if you feel
> strong about it, please feel free to put everything in a loop.
> 
> >> * Placing try_to_freeze() could be a bit annoying.  It shouldn't be
> >>   executed when there's a work to flush.
> > 
> > It currently seems to be executed when there is work to flush.
> > Is this wrong?
> 
> Oh, does it?  As I wrote in the other mail, things like that wouldn't
> necessarily break correctness but I think it would be better to avoid
> surprises in the generic code if not too difficult.

Let's try to define what do we want to achieve then.
Do you want code that flushes workers not to block
when workers are frozen? How will we handle work
submitted when worker is frozen?


> >> * I think A - B <= 0 test would be more familiar.  At least
> >>   time_before/after() are implemented that way.
> > 
> > I am concerned that this overflows a signed integer -
> > which I seem to remeber that C99 disallows.
> 
> Really?  Overflows of pointer isn't expected and that's why we have
> weird RELOC_HIDE() macro for such calculations but integers not
> expected to overflow is a news to me.  Are you sure?  That basically
> means time_before/after() aren't safe either.

As I said, in C99.
However,  the kernel is built with -fno-strict-overflow, so it will work.

> > timer macros are on data path so might be worth the risk there,
> > but flush is slow path so better be safe?
> 
> I don't think performance matters much here.  I just think the sign
> test is clearer / more familiar for the logic.
> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply

* Re: [PATCH] wireless: Make COMPAT_NETLINK_MESSAGES depend upon WEXT_CORE
From: Johannes Berg @ 2010-07-26 20:18 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, linux-wireless
In-Reply-To: <20100726.131506.245381776.davem@davemloft.net>

On Mon, 2010-07-26 at 13:15 -0700, David Miller wrote:
> WIRELESS_EXT is not the correct dependency.
> 
> Signed-off-by: David S. Miller <davem@davemloft.net>
> ---
> 
> Wireless folks, just FYI, I will commit this to net-next-2.6
> 
>  net/Kconfig |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/net/Kconfig b/net/Kconfig
> index b325094..e24fa08 100644
> --- a/net/Kconfig
> +++ b/net/Kconfig
> @@ -32,7 +32,7 @@ config WANT_COMPAT_NETLINK_MESSAGES
>  config COMPAT_NETLINK_MESSAGES
>  	def_bool y
>  	depends on COMPAT
> -	depends on WIRELESS_EXT || WANT_COMPAT_NETLINK_MESSAGES
> +	depends on WEXT_CORE || WANT_COMPAT_NETLINK_MESSAGES

You're right, thank you, I missed that at the time I changed this.

FWIW, it seems I broke this in 2.6.32! Guess nobody noticed since most
people actually do still have WIRELESS_EXT, especially distros.

johannes


^ permalink raw reply

* Re: Last Ipv6 address removal causes "addrconf_sysctl_unregister", which  inihibits from changing disable_ipv6 setting
From: Brian Haley @ 2010-07-26 20:17 UTC (permalink / raw)
  To: Mahesh Kelkar; +Cc: netdev, David Miller
In-Reply-To: <AANLkTimnBssEeO_faqcAAzHCy8KxQo9FVcn7vu5fru4j@mail.gmail.com>

On 07/26/2010 10:03 AM, Mahesh Kelkar wrote:
> Odd behavior associated with the patch:
> **Last address removal causes "addrconf_sysctl_unregister", which
> inihibits from changing disable_ipv6 setting
> (connected issue: With disable_ipv6 set to 1 on an interface, ff00:/8
> and fe80::/64 are still added on device UP)

This behavior doesn't seem specific to the patch that was accepted last
week to fix the disable_ipv6 behavior (64e724f6) as all of the sysctl
values are gone for eth0.

> Current sysctl config:
> net.ipv6.conf.all.disable_ipv6 = 1
> net.ipv6.conf.default.disable_ipv6 = 1
> 
> Steps:
> - Remove last IPv6 address assigned to the "eth0" interface
> - inet6_addr_del => addrconf_ifdown(idev->dev, 1) => does the device
> sysctl unregister
> ******Not sure why the addrconf_sysctl_unregister is necessary on last
> address removal*******
> - Now, "sysctl -a" does not show "net.ipv6.conf.eth0.disable_ipv6"

Do they get restored if you bring the link down and back up?

> Problem:
> - If you WANT to assign IPv6 address to eth0,
> -> Do it once, which fails due to "disable_ipv6" check in
> addrconf_add_dev OR ipv6_add_addr
> -> But, this process does "addrconf_sysctl_register" (addrconf_add_dev
> => ipv6_find_idev => ipv6_add_dev)
> -> set net.ipv6.conf.eth0.disable_ipv6=0 and then successfully assign
> ipv6-address to the eth0
> (Another alternative is to change all or default to 1; but I wanted to
> disable ipv6 by default)
> 
> ===============
> Probable Solution:
> ===============
> @@ -1948,7 +1959,7 @@ static int inet6_addr_del(int ifindex, s
>  			   disable IPv6 on this interface.
>  			 */
>  			if (idev->addr_list == NULL)
> -				addrconf_ifdown(idev->dev, 1);
> +				addrconf_ifdown(idev->dev, 0);
>  			return 0;
>  		}
>  	}
> I have tested the above solution and it seems to work fine - so far.

This code has been this way forever in 2.6 (< 2005), changing it could break
existing users.  I'm not sure why it was decided to do it this way as it
was before my time.

Also, the full comment here is:

 /* If the last address is deleted administratively,
    disable IPv6 on this interface.
  */

Which says that you're removing all the IPv6 addresses by hand using
/sbin/ip, so you're forcing the removal.  If you want to disable IPv6
on it and remove all the addresses, why don't you just set disable_ipv6=1?

-Brian

^ permalink raw reply

* [PATCH] wireless: Make COMPAT_NETLINK_MESSAGES depend upon WEXT_CORE
From: David Miller @ 2010-07-26 20:15 UTC (permalink / raw)
  To: netdev-u79uwXL29TY76Z2rM5mHXA; +Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA


WIRELESS_EXT is not the correct dependency.

Signed-off-by: David S. Miller <davem-fT/PcQaiUtIeIZ0/mPfg9Q@public.gmane.org>
---

Wireless folks, just FYI, I will commit this to net-next-2.6

 net/Kconfig |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/net/Kconfig b/net/Kconfig
index b325094..e24fa08 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -32,7 +32,7 @@ config WANT_COMPAT_NETLINK_MESSAGES
 config COMPAT_NETLINK_MESSAGES
 	def_bool y
 	depends on COMPAT
-	depends on WIRELESS_EXT || WANT_COMPAT_NETLINK_MESSAGES
+	depends on WEXT_CORE || WANT_COMPAT_NETLINK_MESSAGES
 	help
 	  This option makes it possible to send different netlink messages
 	  to tasks depending on whether the task is a compat task or not. To
-- 
1.7.1.1

--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH net-next-2.6] netlink: netlink_recvmsg() fix
From: David Miller @ 2010-07-26 20:08 UTC (permalink / raw)
  To: eric.dumazet; +Cc: johannes, netdev
In-Reply-To: <20100725.215548.112589000.davem@davemloft.net>

From: David Miller <davem@davemloft.net>
Date: Sun, 25 Jul 2010 21:55:48 -0700 (PDT)

> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 21 Jul 2010 10:43:55 +0200
> 
>> [PATCH net-next-2.6 v3] netlink: netlink_recvmsg() fix
>> 
>> commit 1dacc76d0014 
>> (net/compat/wext: send different messages to compat tasks)
>> introduced a race condition on netlink, in case MSG_PEEK is used.
>> 
>> An skb given by skb_recv_datagram() might be shared, we must copy it
>> before any modification, or risk fatal corruption.
>> 
>> Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
> 
> Applied, thanks Eric.

I bet you didn't compile test the code you modified at all, but it's
not your fault :-)

The code is protected by CONFIG_WIRELESS_EXT but that protection is
not valid.  It should be protected by something like CONFIG_WEXT_CORE
or similar.

The only way to get CONFIG_WIRELESS_EXT set it to enable one of a few
drivers, many of which are in staging.

Anyways, just a heads up, I'll fix this up.

^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Michael S. Tsirkin @ 2010-07-26 19:59 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <4C4DE2AE.40302@kernel.org>

On Mon, Jul 26, 2010 at 09:31:58PM +0200, Tejun Heo wrote:
> Hello,
> 
> On 07/26/2010 09:14 PM, Tejun Heo wrote:
> > On 07/26/2010 06:51 PM, Michael S. Tsirkin wrote:
> >> I noticed that with vhost, flush_work was getting the worker
> >> pointer as well. Can we live with this API change?
> > 
> > Yeah, the flushing mechanism wouldn't work reliably if the work is
> > queued to a different worker without flushing, so yeah passing in
> > @worker might actually be better.
> 
> Thinking a bit more about it, it kind of sucks that queueing to
> another worker from worker->func() breaks flush.  Maybe the right
> thing to do there is using atomic_t for done_seq?  It pays a bit more
> overhead but maybe that's justifiable to keep the API saner?  It would
> be great if it can be fixed somehow even if it means that the work has
> to be separately flushed for each worker it has been on before being
> destroyed.
> 
> Or, if flushing has to be associated with a specific worker anyway,
> maybe it would be better to move the sequence counter to
> kthread_worker and do it similarly with the original workqueue so that
> work can be destroyed once execution starts?  Then, it can at least
> remain semantically identical to the original workqueue.
> 
> Thanks.

This last sounds sane: in fact I didn't know there is any difference.

> -- 
> tejun

^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Michael S. Tsirkin @ 2010-07-26 19:57 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <4C4DD946.2080502@kernel.org>

On Mon, Jul 26, 2010 at 08:51:50PM +0200, Tejun Heo wrote:
> Hello,
> 
> On 07/26/2010 06:31 PM, Michael S. Tsirkin wrote:
> >> On 07/26/2010 06:05 PM, Tejun Heo wrote:
> >>> * Placing try_to_freeze() could be a bit annoying.  It shouldn't be
> >>>   executed when there's a work to flush.
> > 
> > BTW why is this important?
> > We could always get another work and flush right after
> > try_to_freeze, and then flush would block for a long time.
> > 
> > BTW the vhost patch you sent does not do this at all.
> > I am guessing it is because our thread is not freezable?
> 
> Yeap, I think so.
> 
> >> * Similar issue exists for kthread_stop().  The kthread shouldn't exit
> >>   while there's a work to flush (please note that kthread_worker
> >>   interface allows detaching / attaching worker kthread during
> >>   operation, so it should remain in consistent state with regard to
> >>   flushing).
> > 
> > Not sure I agree here. Users must synchronise flush and stop calls.
> > Otherwise a work might get queued after stop is called, and
> > you won't be able to flush it.
> 
> For freeze, it probably is okay but for stop, I think it's better to
> keep the semantics straight forward.

What are the semantics then? What do we want stop followed
by queue and flush to do?

>  It may be okay to do otherwise
> but having such oddity in generic interface is nasty and may lead to
> surprises which can be pretty difficult to track down later on.  It's
> just a bit more of annoyance while writing the generic code, so...
> 
> Thanks.
> 
> -- 
> tejun

^ permalink raw reply

* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Michael S. Tsirkin @ 2010-07-26 19:55 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Sridhar Samudrala, Peter Zijlstra, Tejun Heo, Ingo Molnar, netdev,
	lkml, kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev,
	Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <20100726180834.GA26988@redhat.com>

On Mon, Jul 26, 2010 at 08:08:34PM +0200, Oleg Nesterov wrote:
> On 07/26, Sridhar Samudrala wrote:
> >
> > I have been testing out a similar patch that uses kernel_thread() without CLONE_FILES
> > flag rather than create_kthread() and then closing the files.
> 
> !CLONE_FILES can't help. copy_files() does dup_fd() in this case.
> The child still inherits the files.
> 
> > Either version should be fine.
> 
> I think neither version is fine ;)
> 
> exit_files() is not enough too. How about the signals,

As I said, signals are unimportant as we are using this
thread to base a worker on - it sleeps uninterruptibly.

> reparenting?

That's actually a feature: it lets us find out which process
owns the device using the thread by looking at the parent.

> 
> I already forgot all details, probably I missed somethig. But it
> seems to me that it is better to just export get/set affinity and
> forget about all complications.
> 
> Oleg.

^ permalink raw reply

* pull request: wireless-2.6 2010-07-26
From: John W. Linville @ 2010-07-26 19:49 UTC (permalink / raw)
  To: davem-fT/PcQaiUtIeIZ0/mPfg9Q
  Cc: linux-wireless-u79uwXL29TY76Z2rM5mHXA,
	netdev-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA

Dave,

Here are two more late-bloomers for 2.6.35.  Both are tiny and isolated,
and they fix documented regression BUGs as indicated in their commit logs.

The one from me changes a netif_rx to a netif_rx_ni -- the execution
path is in process context, coming from userland.  The bug reporter
confirmed that it addressed the problem he reported.

The one from Ming Lei has been in net-next-2.6 for a while already.
More recently, someone posted an identical patch indicated it addressed
the BUG as reported in the cited LKML thread.

I think it would be best to get these into 2.6.35 if at all possible.
Please let me know if there are problems!

Thanks,

John

---

The following changes since commit 3b87956ea645fb4de7e59c7d0aa94de04be72615:

  net sched: fix race in mirred device removal (2010-07-24 21:04:20 -0700)

are available in the git repository at:
  git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-2.6.git master

John W. Linville (1):
      wireless: use netif_rx_ni in ieee80211_send_layer2_update

Ming Lei (1):
      ath9k: fix dma direction for map/unmap in ath_rx_tasklet

 drivers/net/wireless/ath/ath9k/recv.c |    4 ++--
 net/mac80211/cfg.c                    |    2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/ath/ath9k/recv.c b/drivers/net/wireless/ath/ath9k/recv.c
index ca6065b..e3e5291 100644
--- a/drivers/net/wireless/ath/ath9k/recv.c
+++ b/drivers/net/wireless/ath/ath9k/recv.c
@@ -844,9 +844,9 @@ int ath_rx_tasklet(struct ath_softc *sc, int flush, bool hp)
 	int dma_type;
 
 	if (edma)
-		dma_type = DMA_FROM_DEVICE;
-	else
 		dma_type = DMA_BIDIRECTIONAL;
+	else
+		dma_type = DMA_FROM_DEVICE;
 
 	qtype = hp ? ATH9K_RX_QUEUE_HP : ATH9K_RX_QUEUE_LP;
 	spin_lock_bh(&sc->rx.rxbuflock);
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c
index c7000a6..67ee34f 100644
--- a/net/mac80211/cfg.c
+++ b/net/mac80211/cfg.c
@@ -632,7 +632,7 @@ static void ieee80211_send_layer2_update(struct sta_info *sta)
 	skb->dev = sta->sdata->dev;
 	skb->protocol = eth_type_trans(skb, sta->sdata->dev);
 	memset(skb->cb, 0, sizeof(skb->cb));
-	netif_rx(skb);
+	netif_rx_ni(skb);
 }
 
 static void sta_apply_parameters(struct ieee80211_local *local,
-- 
John W. Linville		Someday the world will need a hero, and you
linville-2XuSBdqkA4R54TAoqtyWWQ@public.gmane.org			might be all we have.  Be ready.
--
To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply related

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Tejun Heo @ 2010-07-26 19:31 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <4C4DDE7E.8030406@kernel.org>

Hello,

On 07/26/2010 09:14 PM, Tejun Heo wrote:
> On 07/26/2010 06:51 PM, Michael S. Tsirkin wrote:
>> I noticed that with vhost, flush_work was getting the worker
>> pointer as well. Can we live with this API change?
> 
> Yeah, the flushing mechanism wouldn't work reliably if the work is
> queued to a different worker without flushing, so yeah passing in
> @worker might actually be better.

Thinking a bit more about it, it kind of sucks that queueing to
another worker from worker->func() breaks flush.  Maybe the right
thing to do there is using atomic_t for done_seq?  It pays a bit more
overhead but maybe that's justifiable to keep the API saner?  It would
be great if it can be fixed somehow even if it means that the work has
to be separately flushed for each worker it has been on before being
destroyed.

Or, if flushing has to be associated with a specific worker anyway,
maybe it would be better to move the sequence counter to
kthread_worker and do it similarly with the original workqueue so that
work can be destroyed once execution starts?  Then, it can at least
remain semantically identical to the original workqueue.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: tcpdump <-> FCoE support
From: Hagen Paul Pfeifer @ 2010-07-26 19:22 UTC (permalink / raw)
  To: Loke, Chetan; +Cc: devel, netdev
In-Reply-To: <D3F292ADF945FB49B35E96C94C2061B90C6F53AE@nsmail.netscout.com>

* Loke, Chetan | 2010-07-26 13:30:13 [-0400]:

>13:08:21.907971 00:yy:yy:yy:yy:yy (oui Unknown) > 00:xx:xx:xx:xx:xx (oui
>Unknown), ethertype Unknown (0x8906)
>
>On: tcpdump-4.0.0-3, tcpdump doesn't seem to have FCoE support(unless I
>capture it and pipe it to wireshark). Is there any effort currently in
>progress on adding FCoE parsers?

That's no kernel issue, it is purely related to userland tools such as
tcpdump/wireshark/whatever.

Hagen


^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Tejun Heo @ 2010-07-26 19:14 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <20100726165114.GA27353@redhat.com>

On 07/26/2010 06:51 PM, Michael S. Tsirkin wrote:
> On Mon, Jul 26, 2010 at 06:14:30PM +0200, Tejun Heo wrote:
>> Just one more thing.
> 
> I noticed that with vhost, flush_work was getting the worker
> pointer as well. Can we live with this API change?

Yeah, the flushing mechanism wouldn't work reliably if the work is
queued to a different worker without flushing, so yeah passing in
@worker might actually be better.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Tejun Heo @ 2010-07-26 19:04 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <20100726162346.GD26412@redhat.com>

On 07/26/2010 06:23 PM, Michael S. Tsirkin wrote:
>> * Can you please keep the outer goto repeat loop?  I just don't like
>>   outermost for (;;).
> 
> Okay ... can we put the code in a {} scope to make it clear
> where does the loop starts and ends?

If we're gonna do that, it would be better to put it inside a loop
construct.  The reason why I don't like it is that loops like that
don't really help read/writeability much while indenting the whole
logic unnecessarily and look more like a result of obsession against
goto rather than any practical reason.  It's just a cosmetic
preference and I might as well be the weirdo here, so if you feel
strong about it, please feel free to put everything in a loop.

>> * Placing try_to_freeze() could be a bit annoying.  It shouldn't be
>>   executed when there's a work to flush.
> 
> It currently seems to be executed when there is work to flush.
> Is this wrong?

Oh, does it?  As I wrote in the other mail, things like that wouldn't
necessarily break correctness but I think it would be better to avoid
surprises in the generic code if not too difficult.

>> * I think A - B <= 0 test would be more familiar.  At least
>>   time_before/after() are implemented that way.
> 
> I am concerned that this overflows a signed integer -
> which I seem to remeber that C99 disallows.

Really?  Overflows of pointer isn't expected and that's why we have
weird RELOC_HIDE() macro for such calculations but integers not
expected to overflow is a news to me.  Are you sure?  That basically
means time_before/after() aren't safe either.

> timer macros are on data path so might be worth the risk there,
> but flush is slow path so better be safe?

I don't think performance matters much here.  I just think the sign
test is clearer / more familiar for the logic.

Thanks.

-- 
tejun

^ permalink raw reply

* Re: [PATCH UPDATED 1/3] vhost: replace vhost_workqueue with per-vhost kthread
From: Tejun Heo @ 2010-07-26 18:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Sridhar Samudrala, netdev, lkml,
	kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev, Jiri Kosina,
	Thomas Gleixner, Ingo Molnar, Andi Kleen
In-Reply-To: <20100726163108.GE26412@redhat.com>

Hello,

On 07/26/2010 06:31 PM, Michael S. Tsirkin wrote:
>> On 07/26/2010 06:05 PM, Tejun Heo wrote:
>>> * Placing try_to_freeze() could be a bit annoying.  It shouldn't be
>>>   executed when there's a work to flush.
> 
> BTW why is this important?
> We could always get another work and flush right after
> try_to_freeze, and then flush would block for a long time.
> 
> BTW the vhost patch you sent does not do this at all.
> I am guessing it is because our thread is not freezable?

Yeap, I think so.

>> * Similar issue exists for kthread_stop().  The kthread shouldn't exit
>>   while there's a work to flush (please note that kthread_worker
>>   interface allows detaching / attaching worker kthread during
>>   operation, so it should remain in consistent state with regard to
>>   flushing).
> 
> Not sure I agree here. Users must synchronise flush and stop calls.
> Otherwise a work might get queued after stop is called, and
> you won't be able to flush it.

For freeze, it probably is okay but for stop, I think it's better to
keep the semantics straight forward.  It may be okay to do otherwise
but having such oddity in generic interface is nasty and may lead to
surprises which can be pretty difficult to track down later on.  It's
just a bit more of annoyance while writing the generic code, so...

Thanks.

-- 
tejun

^ permalink raw reply

* Re: tcpdump <-> FCoE support
From: Loke, Chetan @ 2010-07-26 18:27 UTC (permalink / raw)
  To: Joe Eykholt; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devel-s9riP+hp16TNLxjTenLetw
In-Reply-To: <4C4DCCA6.5060206-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org>

> From: Joe Eykholt [mailto:jeykholt-FYB4Gu1CFyUAvxtiuMwx3w@public.gmane.org]
> 
> Good question.  There's no effort in progress that I know of.
> I find wireshark very handy and do tend to capture with tcpdump -w
> and then read the file with wireshark.  tshark is handy if you
> want a CLI mode.
> 
> If tcpdump interpreted FCoE frames, then we would want FC, FC-ELS,
> and FCP.  Does it already dissect any SCSI frames (e.g., iSCSI)?
> I'm not sure how much work it would be, but given tshark
> is around, I don't think it's that important.
> 
Oki thanks. I'll play with tshark. If tshark handles the FIP/FC
discovery business then that's perfect. I don't have a FCoE target
configured yet. I just sent a raw fcoe-frame to check how tcpdump
responds.

> It could at least be changed to print out FCoE instead of 0x8906,
> and FIP instead of 0x8914.
> 
Agreed.


> 	Joe
Chetan Loke

^ permalink raw reply

* Re: [PATCH 4/5] wireless: use newly introduced hex_to_bin()
From: John W. Linville @ 2010-07-26 18:27 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: David S. Miller, netdev, linux-kernel, Corey Thomas,
	linux-wireless
In-Reply-To: <9d7821ff7791c30772667681c0eab829e6e93e59.1279890832.git.andy.shevchenko@gmail.com>

On Fri, Jul 23, 2010 at 04:18:09PM +0300, Andy Shevchenko wrote:
> Signed-off-by: Andy Shevchenko <andy.shevchenko@gmail.com>
> Cc: Corey Thomas <coreythomas@charter.net>
> Cc: "John W. Linville" <linville@tuxdriver.com>
> Cc: linux-wireless@vger.kernel.org

Acked-by: John W. Linville <linville@tuxdriver.com>

-- 
John W. Linville		Someday the world will need a hero, and you
linville@tuxdriver.com			might be all we have.  Be ready.

^ permalink raw reply

* Re: [PATCH V3] Export SMBIOS provided firmware instance and label to sysfs
From: Greg KH @ 2010-07-26 18:20 UTC (permalink / raw)
  To: Matt Domsch
  Cc: Narendra K, netdev, linux-hotplug, linux-pci, charles_rose,
	jordan_hargrave, vijay_nijhawan
In-Reply-To: <20100723145809.GA7629@auslistsprd01.us.dell.com>

On Fri, Jul 23, 2010 at 09:58:09AM -0500, Matt Domsch wrote:
> On Fri, Jul 23, 2010 at 06:55:57AM -0700, Greg KH wrote:
> > On Fri, Jul 23, 2010 at 08:34:56AM -0500, Narendra K wrote:
> > > --- a/Documentation/ABI/testing/sysfs-bus-pci
> > > +++ b/Documentation/ABI/testing/sysfs-bus-pci
> > > @@ -179,3 +179,30 @@ Contact:	linux-pci@vger.kernel.org
> > >  Description:
> > >  		This symbolic link points to the PCI hotplug controller driver
> > >  		module that manages the hotplug slot.
> > > +
> > > +What:		/sys/bus/pci/devices/.../label
> > > +Date:		July 2010
> > > +Contact:	linux-bugs@dell.com
> > 
> > that's not your email address.  Please don't hide behind some random
> > address, Linux is about contacting developers directly were ever
> > possible.
> 
> That's actually the public email address for the whole Linux
> engineering team (including engineers and managers) at Dell, of which
> Narendra is a part.  It's the address we publish for people to include
> on the cc: list of bugzilla issues on the kernel.org, Novell, and Red
> Hat bugzillas.  This ensures someone on the team will see bug reports
> or patches and act accordingly, likely Narendra, but could be anyone
> in the future once Narendra is promoted or changes responsibilities
> (or even employers).

Ok, I don't really like it, but if that's what Dell wants to do, I guess
it's acceptable.

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH] Driver-core: Fix bluetooth network device rename regression
From: Greg KH @ 2010-07-26 18:09 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Greg KH, Johannes Berg, Kay Sievers, Andrew Morton,
	Rafael J. Wysocki, Maciej W. Rozycki, netdev
In-Reply-To: <m1eievugon.fsf@fess.ebiederm.org>

On Thu, Jul 22, 2010 at 06:31:52PM -0700, Eric W. Biederman wrote:
> Greg KH <gregkh@suse.de> writes:
> 
> > On Thu, Jul 22, 2010 at 08:36:41PM +0200, Johannes Berg wrote:
> >> On Thu, 2010-07-22 at 11:28 -0700, Greg KH wrote:
> >> 
> >> > It worked only because no one realized that it was broken with the
> >> > DEPRECATED option enabled.  When that is enabled, it is broken, right?
> >> 
> >> I'm pretty sure I always had that enabled, and never had issues. Can't
> >> test right now since I don't have that option back yet in the tree I'm
> >> using.
> >> 
> >> > Eric's changes to sysfs to add namespace support exposed this breakage.
> >> > That's not a reason to paper over the problem, but it should be driving
> >> > someone to fix it correctly, as has been pointed out a number of times
> >> > already.
> >> 
> >> I'm just contesting that that someone should be me. I don't think you
> >> get to blame driver developers for doing something that worked and
> >> solved the problem they needed to solve. sysfs is largely opaque to most
> >> of us already, and it now sure feels like Kay decided to change the
> >> rules underneath the code in saying "this was wrong all along".
> >
> > Well, if it worked before, and it doesn't now, that's due to Eric's
> > changes, nothing Kay and I did here :)
> 
> It mostly worked before, and it works with CONFIG_SYSFS_DEPRECATED=y
> (After I my stupid bug is fixed).
> 
> With CONFIG_SYSFS_DEPRECATED=y we have in sysfs:
> ........./net:bnep0 -> /sys/class/net/bnep0
> /sys/class/net/bnep0 is a directory.
> 
> With CONFIG_SYSFS_DEPRECATED=n we have in sysfs:
> ........./bnep0
> /sys/class/net/bnep0 -> ......./bnep0
> 
> Where for a normal network device we have:
> With CONFIG_SYSFS_DEPRECATED=y we have in sysfs:
> ........./net:eth0 -> /sys/class/net/eth0
> /sys/class/net/bnep0 is a directory.
> 
> With CONFIG_SYSFS_DEPRECATED=n we have in sysfs:
> ........./net/eth0
> /sys/class/net/eth0 -> ......./net/eth0
> 
> The new sysfs layout loses the net namespace separator for bluetooth
> devices.
> 
> > But, in looking at it closer, it does seem that the code is doing things
> > that was not expected to work at all previously, and It's amazing that
> > it did.  I thought Kay offered to help fix it all up, and provided 2
> > different ways to do it.  I know they aren't trivial, but then again,
> > your usage of sysfs is not trivial either...
> 
> I don't expect much just the upper levels to work correctly.  The fact
> this works on all but a handful of drivers is what I find dist
> 
> Does this version of the change look less bleh worthy?  The effect is
> the same as my previous patch but the test is more abstract so the
> effect is not strictly limited to /sys/class/net.

What other class type has a namespace at this point in time?
Essentially this is the same exact thing, just in a different format
that obfuscates what you are really doing here.

As you have now sent this to Linus and he took it, well, it seems moot,
right?

kids these days...

greg k-h

^ permalink raw reply

* Re: [PATCH] Driver-core: Always create class directories for classses that support namespaces.
From: Greg KH @ 2010-07-26 18:07 UTC (permalink / raw)
  To: Eric W. Biederman
  Cc: Linus Torvalds, Johannes Berg, Kay Sievers, Andrew Morton,
	Rafael J. Wysocki, Maciej W. Rozycki, netdev,
	<linux-kernel@vger.kernel.org> Marcel Holtmann,
	linux-bluetooth, Greg KH
In-Reply-To: <m1tynonmk8.fsf_-_@fess.ebiederm.org>

On Sat, Jul 24, 2010 at 10:43:35PM -0700, Eric W. Biederman wrote:
> p.s. Linus my apologies for sending this directly but I have gotten an
> incredible amount of flak trying to fix this problem, and I would like
> not to leave an accidental regression whose cause is well known in
> 2.6.35 if I can help it.

"flak"?

{sigh}

No, it's just been an issue of fixing the issue in the correct manner.
You sent this patch a mere 2 days prior, and I've been away since then
and haven't even _had_ the chance to apply it to my tree and run it
through any testing.

Heck, it hasn't even been in the linux-next tree, and seen if it blows
up Andrew's notorious machine.

Next time, give me a chance to at least build your patch before getting
upset and routing around me.  The other 2 patches you have sent me have
had testing in linux-next and will go to Linus later today, and I have
half-a-mind to revert this one, but I just know you will be willing to
fix up any problems found in it now :)

What a nice way to return from a 3day vacation...

greg k-h

^ permalink raw reply

* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Oleg Nesterov @ 2010-07-26 18:08 UTC (permalink / raw)
  To: Sridhar Samudrala
  Cc: Michael S. Tsirkin, Peter Zijlstra, Tejun Heo, Ingo Molnar,
	netdev, lkml, kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev,
	Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <1280166688.3375.5.camel@localhost>

On 07/26, Sridhar Samudrala wrote:
>
> I have been testing out a similar patch that uses kernel_thread() without CLONE_FILES
> flag rather than create_kthread() and then closing the files.

!CLONE_FILES can't help. copy_files() does dup_fd() in this case.
The child still inherits the files.

> Either version should be fine.

I think neither version is fine ;)

exit_files() is not enough too. How about the signals, reparenting?

I already forgot all details, probably I missed somethig. But it
seems to me that it is better to just export get/set affinity and
forget about all complications.

Oleg.

^ permalink raw reply

* Re: tcpdump <-> FCoE support
From: Joe Eykholt @ 2010-07-26 17:57 UTC (permalink / raw)
  To: Loke, Chetan; +Cc: netdev-u79uwXL29TY76Z2rM5mHXA, devel-s9riP+hp16TNLxjTenLetw
In-Reply-To: <D3F292ADF945FB49B35E96C94C2061B90C6F53AE-2s2rCY1e8UXHBhWB4kaBDUEOCMrvLtNR@public.gmane.org>

On 7/26/10 10:30 AM, Loke, Chetan wrote:
> 13:08:21.907971 00:yy:yy:yy:yy:yy (oui Unknown)>  00:xx:xx:xx:xx:xx (oui
> Unknown), ethertype Unknown (0x8906)
>
> On: tcpdump-4.0.0-3, tcpdump doesn't seem to have FCoE support(unless I
> capture it and pipe it to wireshark). Is there any effort currently in
> progress on adding FCoE parsers?
>
>
> Chetan Loke

Good question.  There's no effort in progress that I know of.
I find wireshark very handy and do tend to capture with tcpdump -w
and then read the file with wireshark.  tshark is handy if you
want a CLI mode.

If tcpdump interpreted FCoE frames, then we would want FC, FC-ELS,
and FCP.  Does it already dissect any SCSI frames (e.g., iSCSI)?
I'm not sure how much work it would be, but given tshark
is around, I don't think it's that important.

It could at least be changed to print out FCoE instead of 0x8906,
and FIP instead of 0x8914.

	Joe

^ permalink raw reply

* Re: [PATCH repost] sched: export sched_set/getaffinity to modules
From: Sridhar Samudrala @ 2010-07-26 17:51 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Oleg Nesterov, Peter Zijlstra, Tejun Heo, Ingo Molnar, netdev,
	lkml, kvm@vger.kernel.org, Andrew Morton, Dmitri Vorobiev,
	Jiri Kosina, Thomas Gleixner, Andi Kleen
In-Reply-To: <20100726171230.GA27644@redhat.com>

On Mon, 2010-07-26 at 20:12 +0300, Michael S. Tsirkin wrote:
> On Fri, Jul 02, 2010 at 11:06:37PM +0200, Oleg Nesterov wrote:
> > On 07/02, Peter Zijlstra wrote:
> > >
> > > On Fri, 2010-07-02 at 11:01 -0700, Sridhar Samudrala wrote:
> > > >
> > > >  Does  it (Tejun's kthread_clone() patch) also  inherit the
> > > > cgroup of the caller?
> > >
> > > Of course, its a simple do_fork() which inherits everything just as you
> > > would expect from a similar sys_clone()/sys_fork() call.
> > 
> > Yes. And I'm afraid it can inherit more than we want. IIUC, this is called
> > from ioctl(), right?
> > 
> > Then the new thread becomes the natural child of the caller, and it shares
> > ->mm with the parent. And files, dup_fd() without CLONE_FS.
> > 
> > Signals. Say, if you send SIGKILL to this new thread, it can't sleep in
> > TASK_INTERRUPTIBLE or KILLABLE after that. And this SIGKILL can be sent
> > just because the parent gets SIGQUIT or abother coredumpable signal.
> > Or the new thread can recieve SIGSTOP via ^Z.
> > 
> > Perhaps this is OK, I do not know. Just to remind that kernel_thread()
> > is merely clone(CLONE_VM).
> > 
> > Oleg.
> 
> With some machinery to stop it later, yes.
> Oleg, how does the below look to you?
> 
> Here I explicitly drop the fds so we don't share them.
> CLONE_VM takes care of sharing the mm I think.
> About signals - for the vhost-net use this is OK as we use
> uninterruptible sleep anyway (like the new kthread_worker does).
> 
> This code seems to work fine for me so far - any comments?
> 
> ---
> 
> diff --git a/include/linux/kthread.h b/include/linux/kthread.h
> index aabc8a1..72c7b17 100644
> --- a/include/linux/kthread.h
> +++ b/include/linux/kthread.h
> @@ -9,6 +9,11 @@ struct task_struct *kthread_create(int (*threadfn)(void *data),
>  				   const char namefmt[], ...)
>  	__attribute__((format(printf, 3, 4)));
> 
> +struct task_struct *kthread_create_inherit(int (*threadfn)(void *data),
> +					   void *data,
> +					   const char namefmt[], ...)
> +	__attribute__((format(printf, 3, 4)));
> +
>  /**
>   * kthread_run - create and wake a thread.
>   * @threadfn: the function to run until signal_pending(current).
> diff --git a/kernel/kthread.c b/kernel/kthread.c
> index 83911c7..b81588c 100644
> --- a/kernel/kthread.c
> +++ b/kernel/kthread.c
> @@ -149,6 +149,38 @@ struct task_struct *kthread_create(int (*threadfn)(void *data),
>  }
>  EXPORT_SYMBOL(kthread_create);
> 
> +/* Same as kthread_create, but inherit attributes (cgroups, priority, CPU mask)
> + * from current. */
> +struct task_struct *kthread_create_inherit(int (*threadfn)(void *data),
> +					   void *data,
> +					   const char namefmt[],
> +					   ...)
> +{
> +	struct kthread_create_info create;
> +
> +	create.threadfn = threadfn;
> +	create.data = data;
> +	init_completion(&create.done);
> +
> +	create_kthread(&create);
> +	wait_for_completion(&create.done);
> +
> +	if (!IS_ERR(create.result)) {
> +		va_list args;
> +
> +		/* Don't share files with parent as drivers use release for
> +		 * close on exit, etc. */
> +		exit_files(create.result);
> +
> +		va_start(args, namefmt);
> +		vsnprintf(create.result->comm, sizeof(create.result->comm),
> +			  namefmt, args);
> +		va_end(args);
> +	}
> +	return create.result;
> +}
> +EXPORT_SYMBOL(kthread_create_inherit);
> +
>  /**
>   * kthread_bind - bind a just-created kthread to a cpu.
>   * @p: thread created by kthread_create().


I have been testing out a similar patch that uses kernel_thread() without CLONE_FILES
flag rather than create_kthread() and then closing the files.
Either version should be fine.

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index aabc8a1..634eaf7 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -9,6 +9,11 @@ struct task_struct *kthread_create(int (*threadfn)(void *data),
 				   const char namefmt[], ...)
 	__attribute__((format(printf, 3, 4)));
 
+struct task_struct *kthread_clone(int (*threadfn)(void *data),
+				  void *data,
+				  const char namefmt[], ...)
+	__attribute__((format(printf, 3, 4)));
+
 /**
  * kthread_run - create and wake a thread.
  * @threadfn: the function to run until signal_pending(current).
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 83911c7..806dae5 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -149,6 +149,38 @@ struct task_struct *kthread_create(int (*threadfn)(void *data),
 }
 EXPORT_SYMBOL(kthread_create);
 
+struct task_struct *kthread_clone(int (*threadfn)(void *data),
+				  void *data,
+				  const char namefmt[],
+				  ...)
+{
+	struct kthread_create_info create;
+	int pid;
+
+	create.threadfn = threadfn;
+	create.data = data;
+	init_completion(&create.done);
+	INIT_LIST_HEAD(&create.list);
+
+	pid = kernel_thread(kthread, &create, CLONE_FS);
+	if (pid < 0) {
+		create.result = ERR_PTR(pid);
+		complete(&create.done);
+	}
+	wait_for_completion(&create.done);
+
+	if (!IS_ERR(create.result)) {
+		va_list args;
+		va_start(args, namefmt);
+		vsnprintf(create.result->comm, sizeof(create.result->comm),
+			  namefmt, args);
+		va_end(args);
+	}
+
+	return create.result;
+}
+EXPORT_SYMBOL(kthread_clone);
+
 /**
  * kthread_bind - bind a just-created kthread to a cpu.
  * @p: thread created by kthread_create().






^ permalink raw reply related

* tcpdump <-> FCoE support
From: Loke, Chetan @ 2010-07-26 17:30 UTC (permalink / raw)
  To: devel-s9riP+hp16TNLxjTenLetw, netdev-u79uwXL29TY76Z2rM5mHXA

Folks,

13:08:21.907971 00:yy:yy:yy:yy:yy (oui Unknown) > 00:xx:xx:xx:xx:xx (oui
Unknown), ethertype Unknown (0x8906)

On: tcpdump-4.0.0-3, tcpdump doesn't seem to have FCoE support(unless I
capture it and pipe it to wireshark). Is there any effort currently in
progress on adding FCoE parsers?

Chetan Loke

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox