Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: DSCP values in TCP handshake
From: Stephen Hemminger @ 2011-04-18 15:38 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Joe Buehler, netdev
In-Reply-To: <1303135512.3137.335.camel@edumazet-laptop>

On Mon, 18 Apr 2011 16:05:12 +0200
Eric Dumazet <eric.dumazet@gmail.com> wrote:

> Le lundi 18 avril 2011 à 13:48 +0000, Joe Buehler a écrit :
> > What does the LINUX network code put in the DSCP field of TCP SYN-ACK packets
> > that are replies to a SYN connection request?  I have a customer sending SYN
> > with 0x2e in the DSCP field but apparently getting DSCP 0x0 in the reply.
> 
> The other way (server->client) is depending on application and listener
> only.
> 
> If it uses a standard socket, standard bind() + listen(), TOS is 0

If you want to set DSCP in the response, the application needs to apply
set it on the listening socket. 
   dscp = 0x2e;
   setsockopt(s, IPPROTO_IP, IP_TOS, &dscp, sizeof(dscp));

^ permalink raw reply

* Fw: [Bug 33502] New: Caught 64-bit read from uninitialized memory in __alloc_skb
From: Stephen Hemminger @ 2011-04-18 15:30 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Sun, 17 Apr 2011 19:29:39 GMT
From: bugzilla-daemon@bugzilla.kernel.org
To: shemminger@linux-foundation.org
Subject: [Bug 33502] New: Caught 64-bit read from uninitialized memory in __alloc_skb


https://bugzilla.kernel.org/show_bug.cgi?id=33502

           Summary: Caught 64-bit read from uninitialized memory in
                    __alloc_skb
           Product: Networking
           Version: 2.5
    Kernel Version: 2.6.39-rc3
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: IPV4
        AssignedTo: shemminger@linux-foundation.org
        ReportedBy: casteyde.christian@free.fr
        Regression: Yes


Acer Aspire 1511LMi
Athlon 64 3GHz in 64bits mode
Slackware 64 13.1

Since 2.6.39-rc3 with kmemcheck enabled, I get the following warning:
...
pcmcia_socket pcmcia_socket0: cs: memory probe 0x0c0000-0x0fffff: excluding
0xc0000-0xfffff
pcmcia_socket pcmcia_socket0: cs: memory probe 0x60000000-0x60ffffff: excluding
0x60000000-0x
60ffffff
pcmcia_socket pcmcia_socket0: cs: memory probe 0xa0000000-0xa0ffffff: excluding
0xa0000000-0x
a0ffffff
udev: renamed network interface wlan0 to eth1
WARNING: kmemcheck: Caught 64-bit read from uninitialized memory
(ffff88001b0bb800)
00b00b1b0088ffff0000000000000000cafe1dea20009b0000299a3100000000
 u u u u u u u u u u u u u u u u u u u u u u u u u u u u u u u u
 ^

Pid: 1511, comm: udevd Not tainted 2.6.39-rc3 #1 Acer,Inc. Aspire 1510  /Aspire
1510
RIP: 0010:[<ffffffff810c2f0c>]  [<ffffffff810c2f0c>]
__kmalloc_track_caller+0xbc/0x1d0
RSP: 0018:ffff88001d3a7a18  EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000010 RCX: 000000000000284f
RDX: 000000000000284e RSI: ffff88001fe5b160 RDI: ffffffff8177e39a
RBP: ffff88001d3a7a48 R08: 0000000000000000 R09: ffff88001b931100
R10: 0000000000000000 R11: 0000000000000003 R12: ffff88001b0bb800
R13: ffff88001f803840 R14: 00000000000004d0 R15: ffffffff814769c6
FS:  00007f6ee81f1700(0000) GS:ffffffff81a1b000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: ffff88001d0b3938 CR3: 000000001d38b000 CR4: 00000000000006f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
 [<ffffffff8147ccf2>] __alloc_skb+0x72/0x190
 [<ffffffff814769c6>] sock_alloc_send_pskb+0x236/0x3a0
 [<ffffffff81476b40>] sock_alloc_send_skb+0x10/0x20
 [<ffffffff81523c18>] unix_dgram_sendmsg+0x298/0x770
 [<ffffffff814715f3>] sock_sendmsg+0xe3/0x110
 [<ffffffff81472603>] sys_sendmsg+0x243/0x3c0
 [<ffffffff815e7238>] system_call_fastpath+0x16/0x1b
 [<ffffffffffffffff>] 0xffffffffffffffff
Adding 506012k swap on /dev/sda1.  Priority:-1 extents:1 across:506012k 
EXT4-fs (sda2): re-mounted. Opts: (null)
EXT4-fs (sda2): re-mounted. Opts: (null)
EXT3-fs: barriers not enabled
kjournald starting.  Commit interval 5 seconds
EXT3-fs (sda3): using internal journal
EXT3-fs (sda3): mounted filesystem with writeback data mode
b43-phy0: Loading firmware version 478.104 (2008-07-01 00:50:23)
ADDRCONF(NETDEV_UP): eth1: link is not ready
b43-phy0: Loading firmware version 478.104 (2008-07-01 00:50:23)
...

I cannot build 2.6.39-rc2, and in 2.6.39-rc1 I used to have another warning but
not this one.

-- 
Configure bugmail: https://bugzilla.kernel.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.


-- 

^ permalink raw reply

* Re: Driver build-testing (was: [PATCH] net: dm9000: Fix build)
From: Randy Dunlap @ 2011-04-18 15:28 UTC (permalink / raw)
  To: Micha? Miros?aw; +Cc: Mark Brown, netdev
In-Reply-To: <20110418151729.GA4721@rere.qmqm.pl>

On Mon, 18 Apr 2011 17:17:29 +0200 Micha? Miros?aw wrote:

> On Mon, Apr 18, 2011 at 07:55:08AM -0700, Randy Dunlap wrote:
> > On Mon, 18 Apr 2011 16:51:02 +0200 Micha? Miros?aw wrote:
> > 
> > > On Mon, Apr 18, 2011 at 01:25:22PM +0100, Mark Brown wrote:
> > > > On Mon, Apr 18, 2011 at 01:12:03PM +0200, Micha? Miros?aw wrote:
> > > > > This brings the issue of build testing effectiveness. In current code
> > > > > there is no configuration that makes all drivers build. I would like
> > > > > to see something like 'make brokenconfig' that would allow most of
> > > > > the code to be built, and not necessarily working. Maybe someone has
> > > > > an idea how to implement that?
> > > > For the drivers that genuinely are rather platform specific this tends
> > > > to fail very badly as you need headers that only come along with the
> > > > architecture.
> > > > 
> > > > In the case of DM9000 if it fails to build on your platform then the
> > > > driver is just buggy - looking at the Kconfig I rather suspect that the
> > > > dependency on architectures should just be removed.
> > > I wonder if allyesconfig/allmodconfig is supposed to include code that's
> > > known not to work for a particular architecture.
> > all*config just use whatever is in all of the various Kconfig* files.
> > If they say "depends on $somearch", then so be it.  If not, then the
> > remaining dependencies are used.
> 
> Yes, I know how it works. I just wonder if removing those dependencies so
> that all drivers (even if known not to work) are built on all*config
> is acceptable. Or maybe there should be a config like 'disable all drivers
> that are known to build but not to work on this arch'?

AFAIK, it's always the case that we prefer not to have
	depends on $somearch
for drivers, but the reality is that lots of them do depend on $ARCH
for header files etc., so they are listed that way.  There's not much
that we can do about that.  I don't think that removing those dependencies
is acceptable.  OTOH, it may be acceptable to enable CONFIG_BROKEN so that
the drivers that depend on BROKEN can try to be built.


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply

* Re: Driver build-testing (was: [PATCH] net: dm9000: Fix build)
From: Micha? Miros?aw @ 2011-04-18 15:17 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: Mark Brown, netdev
In-Reply-To: <20110418075508.f7b14f43.rdunlap@xenotime.net>

On Mon, Apr 18, 2011 at 07:55:08AM -0700, Randy Dunlap wrote:
> On Mon, 18 Apr 2011 16:51:02 +0200 Micha? Miros?aw wrote:
> 
> > On Mon, Apr 18, 2011 at 01:25:22PM +0100, Mark Brown wrote:
> > > On Mon, Apr 18, 2011 at 01:12:03PM +0200, Micha? Miros?aw wrote:
> > > > This brings the issue of build testing effectiveness. In current code
> > > > there is no configuration that makes all drivers build. I would like
> > > > to see something like 'make brokenconfig' that would allow most of
> > > > the code to be built, and not necessarily working. Maybe someone has
> > > > an idea how to implement that?
> > > For the drivers that genuinely are rather platform specific this tends
> > > to fail very badly as you need headers that only come along with the
> > > architecture.
> > > 
> > > In the case of DM9000 if it fails to build on your platform then the
> > > driver is just buggy - looking at the Kconfig I rather suspect that the
> > > dependency on architectures should just be removed.
> > I wonder if allyesconfig/allmodconfig is supposed to include code that's
> > known not to work for a particular architecture.
> all*config just use whatever is in all of the various Kconfig* files.
> If they say "depends on $somearch", then so be it.  If not, then the
> remaining dependencies are used.

Yes, I know how it works. I just wonder if removing those dependencies so
that all drivers (even if known not to work) are built on all*config
is acceptable. Or maybe there should be a config like 'disable all drivers
that are known to build but not to work on this arch'?

Best Regards,
Michał Mirosław

^ permalink raw reply

* Re: Driver build-testing (was: [PATCH] net: dm9000: Fix build)
From: Randy Dunlap @ 2011-04-18 14:55 UTC (permalink / raw)
  To: Micha? Miros?aw; +Cc: Mark Brown, netdev
In-Reply-To: <20110418145102.GA2555@rere.qmqm.pl>

On Mon, 18 Apr 2011 16:51:02 +0200 Micha? Miros?aw wrote:

> On Mon, Apr 18, 2011 at 01:25:22PM +0100, Mark Brown wrote:
> > On Mon, Apr 18, 2011 at 01:12:03PM +0200, Micha? Miros?aw wrote:
> > > On Mon, Apr 18, 2011 at 12:04:37PM +0100, Mark Brown wrote:
> > > > Commit c88fcb (net: dm9000: convert to hw_features) broke the build of
> > > > the dm9000 driver since it merged functions which use different names
> > > > for the board info structure used for I/O operations without updating
> > > > all the references to use the same name. Fix that.
> > > This brings the issue of build testing effectiveness. In current code
> > > there is no configuration that makes all drivers build. I would like
> > > to see something like 'make brokenconfig' that would allow most of
> > > the code to be built, and not necessarily working. Maybe someone has
> > > an idea how to implement that?
> > For the drivers that genuinely are rather platform specific this tends
> > to fail very badly as you need headers that only come along with the
> > architecture.
> > 
> > In the case of DM9000 if it fails to build on your platform then the
> > driver is just buggy - looking at the Kconfig I rather suspect that the
> > dependency on architectures should just be removed.
> 
> I wonder if allyesconfig/allmodconfig is supposed to include code that's
> known not to work for a particular architecture.

all*config just use whatever is in all of the various Kconfig* files.
If they say "depends on $somearch", then so be it.  If not, then the
remaining dependencies are used.

---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

^ permalink raw reply

* Re: Driver build-testing (was: [PATCH] net: dm9000: Fix build)
From: Micha? Miros?aw @ 2011-04-18 14:51 UTC (permalink / raw)
  To: Mark Brown; +Cc: netdev
In-Reply-To: <20110418122522.GC9462@sirena.org.uk>

On Mon, Apr 18, 2011 at 01:25:22PM +0100, Mark Brown wrote:
> On Mon, Apr 18, 2011 at 01:12:03PM +0200, Micha? Miros?aw wrote:
> > On Mon, Apr 18, 2011 at 12:04:37PM +0100, Mark Brown wrote:
> > > Commit c88fcb (net: dm9000: convert to hw_features) broke the build of
> > > the dm9000 driver since it merged functions which use different names
> > > for the board info structure used for I/O operations without updating
> > > all the references to use the same name. Fix that.
> > This brings the issue of build testing effectiveness. In current code
> > there is no configuration that makes all drivers build. I would like
> > to see something like 'make brokenconfig' that would allow most of
> > the code to be built, and not necessarily working. Maybe someone has
> > an idea how to implement that?
> For the drivers that genuinely are rather platform specific this tends
> to fail very badly as you need headers that only come along with the
> architecture.
> 
> In the case of DM9000 if it fails to build on your platform then the
> driver is just buggy - looking at the Kconfig I rather suspect that the
> dependency on architectures should just be removed.

I wonder if allyesconfig/allmodconfig is supposed to include code that's
known not to work for a particular architecture.

Best Regards,
Michał Mirosław

^ permalink raw reply

* Re: [PATCH] net: myri10ge: convert to hw_features
From: Jon Mason @ 2011-04-18 14:24 UTC (permalink / raw)
  To: Michał Mirosław; +Cc: David Miller, netdev, gallatin, brice
In-Reply-To: <20110418110347.GA18324@rere.qmqm.pl>

On Mon, Apr 18, 2011 at 01:03:47PM +0200, Michał Mirosław wrote:
> On Sun, Apr 17, 2011 at 11:30:53PM -0700, David Miller wrote:
> > From: Jon Mason <jon.mason@myri.com>
> > Date: Fri, 15 Apr 2011 13:29:22 -0500
> > 
> > > On Fri, Apr 15, 2011 at 04:50:50PM +0200, Michał Mirosław wrote:
> > >> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>

Acked-by: Jon Mason <jon.mason@myri.com>

> > >> ---
> > >>  drivers/net/myri10ge/myri10ge.c |   66 +++++++-------------------------------
> > >>  1 files changed, 12 insertions(+), 54 deletions(-)
> > >> 
> > >> diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
> > >> index 1446de5..a48eb92 100644
> > >> --- a/drivers/net/myri10ge/myri10ge.c
> > >> +++ b/drivers/net/myri10ge/myri10ge.c
> > >> @@ -205,7 +205,6 @@ struct myri10ge_priv {
> > >>  	int tx_boundary;	/* boundary transmits cannot cross */
> > >>  	int num_slices;
> > >>  	int running;		/* running?             */
> > >> -	int csum_flag;		/* rx_csums?            */
> > > Get rid of MXGEFW_FLAGS_CKSUM in drivers/net/myri10ge/myri10ge_mcp.h,
> > > as this was the only thing using it.
> >  ...
> > > ethtool_op_set_tso does not support TSO6.  This would remove the
> > > enable/disable of that feature.
> > Michał please fix these issues and resubmit this patch, thanks!
> 
> There are no issues. MXGEFW_FLAGS_CKSUM is used elsewhere in the driver
> and TSO6 is handled by masking netdev->hw_features at devinit time.
> 
> BTW, ethtool_op_set_tso() is not used at all in new offload changing scheme.
> 
> Best Regards,
> Michał Mirosław
> 

^ permalink raw reply

* Re: [PATCH 12/12] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage
From: Mel Gorman @ 2011-04-18 14:08 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux-MM, Linux-Netdev, LKML, Peter Zijlstra
In-Reply-To: <20110418223251.7ab148bb@notabene.brown>

On Mon, Apr 18, 2011 at 10:32:51PM +1000, NeilBrown wrote:
> On Thu, 14 Apr 2011 11:41:38 +0100 Mel Gorman <mgorman@suse.de> wrote:
> 
> > If swap is backed by network storage such as NBD, there is a risk that a
> > large number of reclaimers can hang the system by consuming all
> > PF_MEMALLOC reserves. To avoid these hangs, the administrator must tune
> > min_free_kbytes in advance. This patch will throttle direct reclaimers
> > if half the PF_MEMALLOC reserves are in use as the system is at risk of
> > hanging. A message will be displayed so the administrator knows that
> > min_free_kbytes should be tuned to a higher value to avoid the
> > throttling in the future.
> > 
> 
> (I knew there was something else).
> 
> I understand that there are suggestions that direct reclaim should always be
> serialised as this reduces lock contention and improve data patterns (or
> something like that).
> 

AFAIK, this suggestion never got much beyond the "hand-waving" stage
of development. It tended to trip up on the fact that such a feature
could also throttle processes on machines with plenty of free clean
unmapped pagecache which would be undesirable.

> Would that make this patch redundant? 

Depends on how it was being serialised but ....

> Or does this provide some extra
> guarantee that the other proposal would not?
> 

This patch could be extended to serialise direct reclaims in situations
other than PFMEMALLOC is low if someone demonstrated the benefit.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: DSCP values in TCP handshake
From: Eric Dumazet @ 2011-04-18 14:05 UTC (permalink / raw)
  To: Joe Buehler; +Cc: netdev
In-Reply-To: <loom.20110418T154445-102@post.gmane.org>

Le lundi 18 avril 2011 à 13:48 +0000, Joe Buehler a écrit :
> What does the LINUX network code put in the DSCP field of TCP SYN-ACK packets
> that are replies to a SYN connection request?  I have a customer sending SYN
> with 0x2e in the DSCP field but apparently getting DSCP 0x0 in the reply.

The other way (server->client) is depending on application and listener
only.

If it uses a standard socket, standard bind() + listen(), TOS is 0




^ permalink raw reply

* Re: [PATCH 12/12] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage
From: Mel Gorman @ 2011-04-18 14:04 UTC (permalink / raw)
  To: NeilBrown; +Cc: Linux-MM, Linux-Netdev, LKML, Peter Zijlstra
In-Reply-To: <20110418223014.22a6e490@notabene.brown>

On Mon, Apr 18, 2011 at 10:30:14PM +1000, NeilBrown wrote:
> On Thu, 14 Apr 2011 11:41:38 +0100 Mel Gorman <mgorman@suse.de> wrote:
> 
> > If swap is backed by network storage such as NBD, there is a risk that a
> > large number of reclaimers can hang the system by consuming all
> > PF_MEMALLOC reserves. To avoid these hangs, the administrator must tune
> > min_free_kbytes in advance. This patch will throttle direct reclaimers
> > if half the PF_MEMALLOC reserves are in use as the system is at risk of
> > hanging. A message will be displayed so the administrator knows that
> > min_free_kbytes should be tuned to a higher value to avoid the
> > throttling in the future.
> 
> This sounds like a much simpler approach than all the pre-allocation.
> Is it certain to work? 

It should - at least I haven't conceived of a situation where it would
fail yet nor have I triggered the throttling logic during tests. The
logic was tested with a debugging patch that set the throttling
level higher.

> Are PF_MEMALLOC reserved only used from direct
> reclaim?
> 

No. They are also used by kswapd and by a task that is being OOM killed.

> Is printing a message for the admin really a good idea? 

Ordinarily no but initially I wanted to make it brutually obvious
when throttling is hit and what got hit. Ultimately it's more likely
to be a tracepoint.

> Auto-tuning is much
> better than requiring the sysadmin to tune.

That requires the memory reservation and pre-allocation patches. To
keep complexity down, I wanted to treat that as a separate series.

> Is throttling when we are low on memory really a problem that needs to be
> tuned away? Presumably we would get over the memory shortage fairly soon and
> the throttling would stop (??).
> 

It depends on what the administrator wants really. If they don't care
about the stall, they can simply ignore the problem because as you say,
it should get resolved quickly and the throttled processes continue.

> > +	if (printk_ratelimit())
> > +		printk(KERN_INFO "Throttling %s due to reclaim pressure on "
> > +				 "network storage\n",
> > +			current->comm);
> > +	do {
> > +		prepare_to_wait(&zone->zone_pgdat->pfmemalloc_wait, &wait,
> > +							TASK_INTERRUPTIBLE);
> > +		schedule();
> > +		finish_wait(&zone->zone_pgdat->pfmemalloc_wait, &wait);
> > +	} while (!pfmemalloc_watermark_ok(zone->zone_pgdat, high_zoneidx) &&
> > +			!fatal_signal_pending(current));
> > +}
> > +
> 
> This looks racing.  It is my understanding that you should always perform the
> test between the 'prepare_to_wait' and the 'schedule'. Otherwise the wakeup
> could happen just before the prepare_to_wait and you never wake from the
> schedule.
> If that doesn't apply in this case I would appreciate a comment explaining
> why.
> 

You're right, it's racy. Well spotted.

> 
> 
> >  unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> >  				gfp_t gfp_mask, nodemask_t *nodemask)
> >  {
> > @@ -2131,6 +2188,8 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
> >  		.nodemask = nodemask,
> >  	};
> >  
> > +	throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
> > +
> >  	trace_mm_vmscan_direct_reclaim_begin(order,
> >  				sc.may_writepage,
> >  				gfp_mask);
> > @@ -2482,6 +2541,13 @@ loop_again:
> >  			}
> >  
> >  		}
> > +
> > +		/* Wake throttled direct reclaimers if low watermark is met */
> > +		if (sk_memalloc_socks() &&
> > +				waitqueue_active(&pgdat->pfmemalloc_wait) &&
> > +				pfmemalloc_watermark_ok(pgdat, MAX_NR_ZONES - 1))
> > +			wake_up_interruptible(&pgdat->pfmemalloc_wait);
> > +
> 
> This test on sk_memalloc_socks looks ugly to me.
> The VM shouldn't be checking on some networking state.
> Do we really need the test?  It is not reasonable to always throttle direct
> reclaim when mem gets really low?

It's a micro-optimisation. Throttling is not currently necessary
as backing storage such as local block devices have mempools in
place that avoid dipping into the PF_MEMALLOC reserves. On a normal
configuration, that waitqueue will simply never be active so I can
remove the sk_memalloc_socks() test.

What about in slab though? A function call in the fast path is avoided
by using the sk_memalloc_socks tests which is nice.

> If we do need the test - could networking set some global flag in the VM
> which the VM can then test.

I'd like to keep the test in slab at least but adding a new global flag
feels like a waste. I could add a VM wrapper around sk_memalloc_socks()
that would effectively be a rename but that doesn't seem much better.

> Maybe one day we will have something other than network which needs special
> care with the last dregs of memory - then it could set the global flag too
> (in which case it should probably be a global counter).
> 

When/if that happens, the naming would become more obvious. Right now,
it's network-related so doesn't seem unreasonable to have a
network-related check.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* DSCP values in TCP handshake
From: Joe Buehler @ 2011-04-18 13:48 UTC (permalink / raw)
  To: netdev

What does the LINUX network code put in the DSCP field of TCP SYN-ACK packets
that are replies to a SYN connection request?  I have a customer sending SYN
with 0x2e in the DSCP field but apparently getting DSCP 0x0 in the reply.

Joe Buehler

^ permalink raw reply

* Re: BUG: unable to handle kernel NULL pointer dereference at 000002c0 / IP: [<c04c70f2>] in6_dev_finish_destroy+0x35/0x8c
From: Patrick McHardy @ 2011-04-18 13:34 UTC (permalink / raw)
  To: Simon Arlott
  Cc: Eric Dumazet, Linux Kernel Mailing List, netdev,
	Netfilter Development Mailinglist
In-Reply-To: <4DA86FE5.8080507@simon.arlott.org.uk>

[-- Attachment #1: Type: text/plain, Size: 785 bytes --]

Am 15.04.2011 18:18, schrieb Simon Arlott:
> On 15/04/11 14:24, Eric Dumazet wrote:
>> Hmm.. a more complete patch :
>>
>> diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
>> index 0857272..6f0bed0 100644
> 
> I applied the patch by recompiling and then reloading the nf_conntrack_ipv6
> module (temporarily flushing and then restoring all ip6tables rules).
> Then this happened 10 minutes later:
> 
> [33876.950100] BUG: unable to handle kernel NULL pointer dereference at 00000014
> [33876.951060] IP: [<f9b012bb>] nf_ct_frag6_gather+0x864/0x881 [nf_conntrack_ipv6]

nf_ct_frag6_reasm() can return NULL, so we need to check for a non-NULL
ret_skb before trying to set the device.

Does this patch (based on Eric's second version) help?


[-- Attachment #2: x --]
[-- Type: text/plain, Size: 872 bytes --]

diff --git a/net/ipv6/netfilter/nf_conntrack_reasm.c b/net/ipv6/netfilter/nf_conntrack_reasm.c
index 0857272..b7ecfce 100644
--- a/net/ipv6/netfilter/nf_conntrack_reasm.c
+++ b/net/ipv6/netfilter/nf_conntrack_reasm.c
@@ -576,7 +576,9 @@ struct sk_buff *nf_ct_frag6_gather(struct sk_buff *skb, u32 user)
 	if (fq->q.last_in == (INET_FRAG_FIRST_IN | INET_FRAG_LAST_IN) &&
 	    fq->q.meat == fq->q.len) {
 		ret_skb = nf_ct_frag6_reasm(fq, dev);
-		if (ret_skb == NULL)
+		if (ret_skb != NULL)
+			ret_skb->dev = dev;
+		else
 			pr_debug("Can't reassemble fragmented packets\n");
 	}
 	spin_unlock_bh(&fq->q.lock);
@@ -602,7 +604,7 @@ void nf_ct_frag6_output(unsigned int hooknum, struct sk_buff *skb,
 
 		s2 = s->next;
 		s->next = NULL;
-
+		s->dev = in;
 		NF_HOOK_THRESH(NFPROTO_IPV6, hooknum, s, in, out, okfn,
 			       NF_IP6_PRI_CONNTRACK_DEFRAG + 1);
 		s = s2;

^ permalink raw reply related

* Re: [PATCH 12/12] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage
From: NeilBrown @ 2011-04-18 12:32 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Linux-MM, Linux-Netdev, LKML, Peter Zijlstra
In-Reply-To: <1302777698-28237-13-git-send-email-mgorman@suse.de>

On Thu, 14 Apr 2011 11:41:38 +0100 Mel Gorman <mgorman@suse.de> wrote:

> If swap is backed by network storage such as NBD, there is a risk that a
> large number of reclaimers can hang the system by consuming all
> PF_MEMALLOC reserves. To avoid these hangs, the administrator must tune
> min_free_kbytes in advance. This patch will throttle direct reclaimers
> if half the PF_MEMALLOC reserves are in use as the system is at risk of
> hanging. A message will be displayed so the administrator knows that
> min_free_kbytes should be tuned to a higher value to avoid the
> throttling in the future.
> 

(I knew there was something else).

I understand that there are suggestions that direct reclaim should always be
serialised as this reduces lock contention and improve data patterns (or
something like that).

Would that make this patch redundant?  Or does this provide some extra
guarantee that the other proposal would not?

Thanks again,

NeilBrown

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: [PATCH 12/12] mm: Throttle direct reclaimers if PF_MEMALLOC reserves are low and swap is backed by network storage
From: NeilBrown @ 2011-04-18 12:30 UTC (permalink / raw)
  To: Mel Gorman; +Cc: Linux-MM, Linux-Netdev, LKML, Peter Zijlstra
In-Reply-To: <1302777698-28237-13-git-send-email-mgorman@suse.de>

On Thu, 14 Apr 2011 11:41:38 +0100 Mel Gorman <mgorman@suse.de> wrote:

> If swap is backed by network storage such as NBD, there is a risk that a
> large number of reclaimers can hang the system by consuming all
> PF_MEMALLOC reserves. To avoid these hangs, the administrator must tune
> min_free_kbytes in advance. This patch will throttle direct reclaimers
> if half the PF_MEMALLOC reserves are in use as the system is at risk of
> hanging. A message will be displayed so the administrator knows that
> min_free_kbytes should be tuned to a higher value to avoid the
> throttling in the future.

This sounds like a much simpler approach than all the pre-allocation.
Is it certain to work?  Are PF_MEMALLOC reserved only used from direct
reclaim?

Is printing a message for the admin really a good idea?  Auto-tuning is much
better than requiring the sysadmin to tune.
Is throttling when we are low on memory really a problem that needs to be
tuned away?  Presumably we would get over the memory shortage fairly soon and
the throttling would stop (??).

> +	if (printk_ratelimit())
> +		printk(KERN_INFO "Throttling %s due to reclaim pressure on "
> +				 "network storage\n",
> +			current->comm);
> +	do {
> +		prepare_to_wait(&zone->zone_pgdat->pfmemalloc_wait, &wait,
> +							TASK_INTERRUPTIBLE);
> +		schedule();
> +		finish_wait(&zone->zone_pgdat->pfmemalloc_wait, &wait);
> +	} while (!pfmemalloc_watermark_ok(zone->zone_pgdat, high_zoneidx) &&
> +			!fatal_signal_pending(current));
> +}
> +

This looks racing.  It is my understanding that you should always perform the
test between the 'prepare_to_wait' and the 'schedule'.  Otherwise the wakeup
could happen just before the prepare_to_wait and you never wake from the
schedule.
If that doesn't apply in this case I would appreciate a comment explaining
why.

>  unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
>  				gfp_t gfp_mask, nodemask_t *nodemask)
>  {
> @@ -2131,6 +2188,8 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
>  		.nodemask = nodemask,
>  	};
>  
> +	throttle_direct_reclaim(gfp_mask, zonelist, nodemask);
> +
>  	trace_mm_vmscan_direct_reclaim_begin(order,
>  				sc.may_writepage,
>  				gfp_mask);
> @@ -2482,6 +2541,13 @@ loop_again:
>  			}
>  
>  		}
> +
> +		/* Wake throttled direct reclaimers if low watermark is met */
> +		if (sk_memalloc_socks() &&
> +				waitqueue_active(&pgdat->pfmemalloc_wait) &&
> +				pfmemalloc_watermark_ok(pgdat, MAX_NR_ZONES - 1))
> +			wake_up_interruptible(&pgdat->pfmemalloc_wait);
> +

This test on sk_memalloc_socks looks ugly to me.
The VM shouldn't be checking on some networking state.

Do we really need the test?  It is not reasonable to always throttle direct
reclaim when mem gets really low?
If we do need the test - could networking set some global flag in the VM
which the VM can then test.
Maybe one day we will have something other than network which needs special
care with the last dregs of memory - then it could set the global flag too
(in which case it should probably be a global counter).

Thanks,
NeilBrown

>  		if (all_zones_ok || (order && pgdat_balanced(pgdat, balanced, *classzone_idx)))
>  			break;		/* kswapd: all done */
>  		/*

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Fight unfair telecom internet charges in Canada: sign http://stopthemeter.ca/
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply

* Re: Driver build-testing (was: [PATCH] net: dm9000: Fix build)
From: Mark Brown @ 2011-04-18 12:25 UTC (permalink / raw)
  To: Micha? Miros?aw; +Cc: netdev
In-Reply-To: <20110418111203.GB18324@rere.qmqm.pl>

On Mon, Apr 18, 2011 at 01:12:03PM +0200, Micha? Miros?aw wrote:
> On Mon, Apr 18, 2011 at 12:04:37PM +0100, Mark Brown wrote:
> > Commit c88fcb (net: dm9000: convert to hw_features) broke the build of
> > the dm9000 driver since it merged functions which use different names
> > for the board info structure used for I/O operations without updating
> > all the references to use the same name. Fix that.

> This brings the issue of build testing effectiveness. In current code
> there is no configuration that makes all drivers build. I would like
> to see something like 'make brokenconfig' that would allow most of
> the code to be built, and not necessarily working. Maybe someone has
> an idea how to implement that?

For the drivers that genuinely are rather platform specific this tends
to fail very badly as you need headers that only come along with the
architecture.

In the case of DM9000 if it fails to build on your platform then the
driver is just buggy - looking at the Kconfig I rather suspect that the
dependency on architectures should just be removed.

^ permalink raw reply

* Re: [PATCH v2 01/27] HFI: skeleton driver
From: Ben Hutchings @ 2011-04-18 12:23 UTC (permalink / raw)
  To: dykmanj
  Cc: netdev, Piyush Chaudhary, Fu-Chung Chang, William S. Cadden,
	Wen C. Chen, Scot Sakolish, Jian Xiao, Carol L. Soto,
	Sarah J. Sheppard
In-Reply-To: <1303096919-7367-2-git-send-email-dykmanj@linux.vnet.ibm.com>

On Sun, 2011-04-17 at 23:21 -0400, dykmanj@linux.vnet.ibm.com wrote:
> From: Jim Dykman <dykmanj@linux.vnet.ibm.com>
> 
> Device driver Makefile & Kconfig plumbing plus simple mod_init and mod_exit
[...]
> --- /dev/null
> +++ b/drivers/net/hfi/core/hfidd_init.c
[...]
> +#include <linux/version.h>

Never include <linux/version.h> in an in-tree driver.

[...]
> +static int __init hfidd_mod_init(void)
> +{
> +	int			rc = 0;
> +
> +	rc = hfidd_create_class();
> +	if (rc < 0) {
> +		printk(KERN_ERR "%s: hfidd_mod_init: hfidd_create_class failed"
> +			" rc=%d\n", HFIDD_DEV_NAME, rc);
> +		return -1;

Should be 'return rc'.

[...]
> --- /dev/null
> +++ b/include/linux/hfi/hfidd_client.h
[...]
> +#ifndef _HFIDD_CLIENT_H_
> +#define _HFIDD_CLIENT_H_
> +
> +#define MAX_TORRENTS            1
> +#define MAX_HFI_PER_TORRENT     2
> +#define MAX_HFIS                (MAX_TORRENTS * MAX_HFI_PER_TORRENT)
[...]

Are you sure you want to expose these values to userland?  You can never
change them later.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Re: [PATCH v2 02/27] HFI: Add HFI adapter control structure
From: Ben Hutchings @ 2011-04-18 12:19 UTC (permalink / raw)
  To: dykmanj
  Cc: netdev, Piyush Chaudhary, Fu-Chung Chang, William S. Cadden,
	Wen C. Chen, Scot Sakolish, Jian Xiao, Carol L. Soto,
	Sarah J. Sheppard
In-Reply-To: <1303096919-7367-3-git-send-email-dykmanj@linux.vnet.ibm.com>

On Sun, 2011-04-17 at 23:21 -0400, dykmanj@linux.vnet.ibm.com wrote:
> From: Jim Dykman <dykmanj@linux.vnet.ibm.com>
> 
> Alloc/free of hfidd_acs to track the state of each HFI
[...]
> --- /dev/null
> +++ b/drivers/net/hfi/core/hfidd_adpt.c
[...]
> +int hfidd_alloc_adapter(struct hfidd_acs **adpt, dev_t devno, void *uiop)
> +{
> +
> +	struct hfidd_acs	*p_acs = NULL;
> +
> +	p_acs = kzalloc(sizeof(*p_acs), GFP_KERNEL);
> +	if (p_acs == NULL)
> +		return -ENOMEM;
> +
> +	p_acs->dev_num = devno;
> +	p_acs->index  = MINOR(devno);
> +	p_acs->state  = HFI_INVALID;
> +	snprintf(p_acs->name, HFI_DEVICE_NAME_MAX - 1,
> +			"%s%d", HFIDD_DEV_NAME, p_acs->index);

snprintf() always null-terminates so the buffer length should be
specified as HFI_DEVICE_NAME_MAX or sizeof(p_acs->name).

[...]
> --- a/drivers/net/hfi/core/hfidd_init.c
> +++ b/drivers/net/hfi/core/hfidd_init.c
[...]
>  static int __init hfidd_mod_init(void)
>  {
>  	int			rc = 0;
>  
> +	hfidd_global.acs_cnt = 0;
> +
>  	rc = hfidd_create_class();
>  	if (rc < 0) {
>  		printk(KERN_ERR "%s: hfidd_mod_init: hfidd_create_class failed"
> @@ -129,12 +172,26 @@ static int __init hfidd_mod_init(void)
>  		return -1;
>  	}
>  
> +	rc = hfidd_create_devices();
> +	if (rc < 0) {
> +		printk(KERN_ERR "%s: hfidd_mod_init: hfidd_create_devices"
> +			" failed rc = %d\n", HFIDD_DEV_NAME, rc);
> +		goto error1;
> +	}
> +
>  	printk(KERN_INFO "IBM hfi device driver loaded sucessfully\n");
>  	return 0;
> +
> +error1:
> +	hfidd_destroy_class();
> +
> +	/* Returning -1 so insmod will fail */
> +	return -1;
>  }
[...]

Should be 'return rc'.  Never return -1 as a generic failure; it means
-EPERM.

Ben.

-- 
Ben Hutchings, Senior Software Engineer, Solarflare
Not speaking for my employer; that's the marketing department's job.
They asked us to note that Solarflare product names are trademarked.


^ permalink raw reply

* Driver build-testing (was: [PATCH] net: dm9000: Fix build)
From: Michał Mirosław @ 2011-04-18 11:12 UTC (permalink / raw)
  To: netdev
In-Reply-To: <1303124677-6168-1-git-send-email-broonie@opensource.wolfsonmicro.com>

On Mon, Apr 18, 2011 at 12:04:37PM +0100, Mark Brown wrote:
> Commit c88fcb (net: dm9000: convert to hw_features) broke the build of
> the dm9000 driver since it merged functions which use different names
> for the board info structure used for I/O operations without updating
> all the references to use the same name. Fix that.

[...]

This brings the issue of build testing effectiveness. In current code
there is no configuration that makes all drivers build. I would like
to see something like 'make brokenconfig' that would allow most of
the code to be built, and not necessarily working. Maybe someone has
an idea how to implement that?

Best Regards,
Michał Mirosław

^ permalink raw reply

* [PATCH] net: dm9000: Fix build
From: Mark Brown @ 2011-04-18 11:04 UTC (permalink / raw)
  To: David S. Miller, Michał Mirosław; +Cc: netdev, patches, Mark Brown

Commit c88fcb (net: dm9000: convert to hw_features) broke the build of
the dm9000 driver since it merged functions which use different names
for the board info structure used for I/O operations without updating
all the references to use the same name. Fix that.

Signed-off-by: Mark Brown <broonie@opensource.wolfsonmicro.com>
---
 drivers/net/dm9000.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/net/dm9000.c b/drivers/net/dm9000.c
index f7bdebb..fbaff35 100644
--- a/drivers/net/dm9000.c
+++ b/drivers/net/dm9000.c
@@ -768,7 +768,7 @@ dm9000_init_dm9000(struct net_device *dev)

 	/* Checksum mode */
 	if (dev->hw_features & NETIF_F_RXCSUM)
-		iow(dm, DM9000_RCSR,
+		iow(db, DM9000_RCSR,
 			(dev->features & NETIF_F_RXCSUM) ? RCSR_CSUM : 0);

 	iow(db, DM9000_GPCR, GPCR_GEP_CNTL);	/* Let GPIO0 output */
-- 
1.7.4.1

^ permalink raw reply related

* Re: [PATCH] net: myri10ge: convert to hw_features
From: Michał Mirosław @ 2011-04-18 11:03 UTC (permalink / raw)
  To: David Miller; +Cc: jon.mason, netdev, gallatin, brice
In-Reply-To: <20110417.233053.70188684.davem@davemloft.net>

On Sun, Apr 17, 2011 at 11:30:53PM -0700, David Miller wrote:
> From: Jon Mason <jon.mason@myri.com>
> Date: Fri, 15 Apr 2011 13:29:22 -0500
> 
> > On Fri, Apr 15, 2011 at 04:50:50PM +0200, Michał Mirosław wrote:
> >> Signed-off-by: Michał Mirosław <mirq-linux@rere.qmqm.pl>
> >> ---
> >>  drivers/net/myri10ge/myri10ge.c |   66 +++++++-------------------------------
> >>  1 files changed, 12 insertions(+), 54 deletions(-)
> >> 
> >> diff --git a/drivers/net/myri10ge/myri10ge.c b/drivers/net/myri10ge/myri10ge.c
> >> index 1446de5..a48eb92 100644
> >> --- a/drivers/net/myri10ge/myri10ge.c
> >> +++ b/drivers/net/myri10ge/myri10ge.c
> >> @@ -205,7 +205,6 @@ struct myri10ge_priv {
> >>  	int tx_boundary;	/* boundary transmits cannot cross */
> >>  	int num_slices;
> >>  	int running;		/* running?             */
> >> -	int csum_flag;		/* rx_csums?            */
> > Get rid of MXGEFW_FLAGS_CKSUM in drivers/net/myri10ge/myri10ge_mcp.h,
> > as this was the only thing using it.
>  ...
> > ethtool_op_set_tso does not support TSO6.  This would remove the
> > enable/disable of that feature.
> Michał please fix these issues and resubmit this patch, thanks!

There are no issues. MXGEFW_FLAGS_CKSUM is used elsewhere in the driver
and TSO6 is handled by masking netdev->hw_features at devinit time.

BTW, ethtool_op_set_tso() is not used at all in new offload changing scheme.

Best Regards,
Michał Mirosław

^ permalink raw reply

* Re: unregister_netdevice: waiting for lo to become free. Usage count = 8
From: Hans Schillstrom @ 2011-04-18 10:43 UTC (permalink / raw)
  To: Julian Anastasov; +Cc: Simon Horman, netdev, lvs-devel, Eric W. Biederman
In-Reply-To: <201104180810.27198.hans@schillstrom.com>

Hello
On Monday, April 18, 2011 08:10:26 Hans Schillstrom wrote:
> On Friday, April 15, 2011 22:11:32 Julian Anastasov wrote:
> > 
> > 	Hello,
> > 
> > On Fri, 15 Apr 2011, Hans Schillstrom wrote:
> > 
> > > Hello Julian
> > > 
> > > I'm trying to fix the cleanup process when a namespace get "killed",
> > > which is a new feature for ipvs. However an old problem appears again
> > > 
> > > When there has been traffic trough ipvs where the destination is unreachable
> > > the usage count on loopback dev increases one for every packet....
[snip]

> > 
> > > Do you have an idea why  this happens in the ipvs case ?
> > 
> > 	Do you see with debug level 3 the "Removing destination"
> > messages. Only real servers can hold dest->dst_cache reference
> > for dev which can be a problem because the real servers are not
> > deleted immediately - on traffic they are moved to trash
> > list. 

Actually I forgot to tell there is a need for a
ip_vs_service_cleanup() due to above.
Do you see any drawbacks with it ?

/*
 *	Delete service by {netns} in the service table.
 */
static void ip_vs_service_cleanup(struct net *net)
{
	unsigned hash;
	struct ip_vs_service *svc, *tmp;

	EnterFunction(2);
	/* Check for "full" addressed entries */
	for (hash = 0; hash<IP_VS_SVC_TAB_SIZE; hash++) {
		write_lock_bh(&__ip_vs_svc_lock);
		list_for_each_entry_safe(svc, tmp, &ip_vs_svc_table[hash],
					 s_list) {
			if (net_eq(svc->net, net)) {
				ip_vs_svc_unhash(svc);
				__ip_vs_del_service(svc);
			}
		}
		list_for_each_entry_safe(svc, tmp, &ip_vs_svc_fwm_table[hash],
					 f_list) {
			if (net_eq(svc->net, net)) {
				ip_vs_svc_unhash(svc);
				__ip_vs_del_service(svc);
			}
		}
		write_unlock_bh(&__ip_vs_svc_lock);
	}
	LeaveFunction(2);
}

Called just after the __ip_vs_control_cleanup_sysctl()

static void __net_exit __ip_vs_control_cleanup(struct net *net)
{
	struct netns_ipvs *ipvs = net_ipvs(net);

	ip_vs_trash_cleanup(net);
	ip_vs_stop_estimator(net, &ipvs->tot_stats);
	__ip_vs_control_cleanup_sysctl(net);
	ip_vs_service_cleanup(net);
	proc_net_remove(net, "ip_vs_stats_percpu");
	proc_net_remove(net, "ip_vs_stats");
	proc_net_remove(net, "ip_vs");
	free_percpu(ipvs->tot_stats.cpustats);
}


> > But ip_vs_trash_cleanup() should remove any left
> > structures. You should check in debug that all servers are
> > deleted. If all real server structures are freed but
> > problem remains we should look more deeply in the
> > dest->dst_cache usage. DR or NAT is used?
> 
> I have got some wise words from Eric, 
> i.e. moved all ipvs register/unregister from subsys to device 
> that solved plenty of my issues
> (Thanks Eric)
> 
> I'll will post a Patch later on regarding this.
> 
> > 
> > 	I assume cleanup really happens in this order:
> > 
> > ip_vs_cleanup():
> > 	nf_unregister_hooks()
> 
> This will not happens in a namespace since nf_unregister_hooks() is not per netns.
> We might need a flag but I don't think so, further test will show....
> 
> > 	...
> > 	ip_vs_conn_cleanup()
> > 	...
> > 	ip_vs_control_cleanup()
> > 
> 
Regards
Hans

^ permalink raw reply

* Re: [PATCH] net: greth: convert to hw_features
From: Kristoffer Glembo @ 2011-04-18  8:11 UTC (permalink / raw)
  To: Michał Mirosław; +Cc: netdev
In-Reply-To: <20110417101547.D59E313A6F@rere.qmqm.pl>

Michał Mirosław wrote:
> Note: Driver modifies its struct net_device_ops. This will break if used for
> multiple devices that are not all the same (if that HW config is possible).
> 

True. I will fix this in the future. Not a big problem currently though, but thanks for the note and the patch!

Best regards,
Kristoffer Glembo
Aeroflex Gaisler

^ permalink raw reply

* Re: [PATCH V14 3/4] ptp: Added a clock driver for the IXP46x.
From: Arnd Bergmann @ 2011-04-18  8:26 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Thomas Gleixner, Rodolfo Giometti, Peter Zijlstra,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	devicetree-discuss-uLR06cmDAlY/bJ5BZ2RsiQ,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Russell King, John Stultz,
	Alan Cox, netdev-u79uwXL29TY76Z2rM5mHXA, Mike Frysinger,
	Christoph Lameter, linuxppc-dev-uLR06cmDAlY/bJ5BZ2RsiQ,
	David Miller, linux-arm-kernel-IAPFreCvJWM7uuMidbF8XUB+6BGkLq7r,
	Krzysztof Halasa
In-Reply-To: <201104181021.01279.arnd-r2nGTMty4D4@public.gmane.org>

On Monday 18 April 2011, Arnd Bergmann wrote:
> On Monday 18 April 2011, Richard Cochran wrote:
> > On Mon, Apr 18, 2011 at 08:56:03AM +0200, Arnd Bergmann wrote:
> > > On Monday 18 April 2011, Richard Cochran wrote:
> > > > +
> > > > +       lo = __raw_readl(&regs->channel[ch].src_uuid_lo);
> > > > +       hi = __raw_readl(&regs->channel[ch].src_uuid_hi);
> > > > +
> > > 
> > > I guess you should use readl(), not __raw_readl() here. The __raw_* functions
> > > are not meant for device drivers.
> > 
> > Krzysztof had a different opinion about this.
> > 
> >     https://lkml.org/lkml/2011/1/8/67
> > 
> > Anyway, it is his driver, and I just followed what he does elsewhere
> > in the driver. It make sense to me to keep the driver consistent.
> > Maybe we should make the change throughout?
> 
> It would certainly be useful to fix it up. I now realized that this driver
> supports both big-endian and little-endian configurations, so just using
> readl() is certainly broken.
> There should probably be an ixp specific version of the safe I/O accessors
> to deal with it on all on-chip peripherals in a safe way.

Anyway, not your problem then, just leave it like it is.

Acked-by: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>

^ permalink raw reply

* Re: [PATCH V14 3/4] ptp: Added a clock driver for the IXP46x.
From: Arnd Bergmann @ 2011-04-18  8:21 UTC (permalink / raw)
  To: Richard Cochran
  Cc: Thomas Gleixner, Rodolfo Giometti, Peter Zijlstra, linux-api,
	devicetree-discuss, linux-kernel, Russell King, Paul Mackerras,
	John Stultz, Alan Cox, netdev, Mike Frysinger, Christoph Lameter,
	linuxppc-dev, David Miller, linux-arm-kernel, Krzysztof Halasa
In-Reply-To: <20110418080606.GA3809@riccoc20.at.omicron.at>

On Monday 18 April 2011, Richard Cochran wrote:
> On Mon, Apr 18, 2011 at 08:56:03AM +0200, Arnd Bergmann wrote:
> > On Monday 18 April 2011, Richard Cochran wrote:
> > > +
> > > +       lo = __raw_readl(&regs->channel[ch].src_uuid_lo);
> > > +       hi = __raw_readl(&regs->channel[ch].src_uuid_hi);
> > > +
> > 
> > I guess you should use readl(), not __raw_readl() here. The __raw_* functions
> > are not meant for device drivers.
> 
> Krzysztof had a different opinion about this.
> 
>     https://lkml.org/lkml/2011/1/8/67
> 
> Anyway, it is his driver, and I just followed what he does elsewhere
> in the driver. It make sense to me to keep the driver consistent.
> Maybe we should make the change throughout?

It would certainly be useful to fix it up. I now realized that this driver
supports both big-endian and little-endian configurations, so just using
readl() is certainly broken.
There should probably be an ixp specific version of the safe I/O accessors
to deal with it on all on-chip peripherals in a safe way.

	Arnd

^ permalink raw reply

* Re: [PATCH V14 3/4] ptp: Added a clock driver for the IXP46x.
From: Richard Cochran @ 2011-04-18  8:16 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: linux-kernel, linux-api, netdev, devicetree-discuss,
	linux-arm-kernel, linuxppc-dev, Alan Cox, Christoph Lameter,
	David Miller, John Stultz, Krzysztof Halasa, Peter Zijlstra,
	Rodolfo Giometti, Thomas Gleixner, Benjamin Herrenschmidt,
	Mike Frysinger, Paul Mackerras, Russell King
In-Reply-To: <20110418080606.GA3809@riccoc20.at.omicron.at>


Also, I forgot to add Krzysztof's ack to the commit message, but he
did ack V12.

Thanks,
Richard

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox