Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Stefan Richter @ 2007-08-16  9:54 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816081049.GA1431@gondor.apana.org.au>

Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
>> > 
>> > Do you (or anyone else for that matter) have an example of this?
>> 
>> The only code I somewhat know, the ieee1394 subsystem, was perhaps
>> authored and is currently maintained with the expectation that each
>> occurrence of atomic_read actually results in a load operation, i.e. is
>> not optimized away.  This means all atomic_t (bus generation, packet and
>> buffer refcounts, and some other state variables)* and likewise all
>> atomic bitops in that subsystem.
> 
> Can you find an actual atomic_read code snippet there that is
> broken without the volatile modifier?

What do I have to look for?  atomic_read after another read or write
access to the same variable, in the same function scope?  Or in the sum
of scopes of functions that could be merged by function inlining?

One example was discussed here earlier:  The for (;;) loop in
nodemgr_host_thread.  There an msleep_interruptible implicitly acted as
barrier (at the moment because it's in a different translation unit; if
it were the same, then because it hopefully has own barriers).  So that
happens to work, although such an implicit barrier is bad style:  Better
enforce the desired behaviour (== guaranteed load operation) *explicitly*.
-- 
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/

^ permalink raw reply

* Re: [PATCH] [IPv6]: Invalid semicolon after if statement
From: Ilpo Järvinen @ 2007-08-16  9:25 UTC (permalink / raw)
  To: David Miller; +Cc: davej, Netdev
In-Reply-To: <20070815.175128.82362004.davem@davemloft.net>

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1996 bytes --]

On Wed, 15 Aug 2007, David Miller wrote:

> From: Dave Jones <davej@redhat.com>
> Date: Wed, 15 Aug 2007 19:52:20 -0400
> 
> > On Wed, Aug 15, 2007 at 03:08:14PM -0700, David Miller wrote:
> >  > From: "Ilpo_Järvinen" <ilpo.jarvinen@helsinki.fi>
> >  > Date: Thu, 16 Aug 2007 00:57:00 +0300 (EEST)
> >  > 
> >  > > A similar fix to netfilter from Eric Dumazet inspired me to
> >  > > look around a bit by using some grep/sed stuff as looking for
> >  > > this kind of bugs seemed easy to automate. This is one of them
> >  > > I found where it looks like this semicolon is not valid.
> >  > > 
> >  > > Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@helsinki.fi>
> >  > 
> >  > Yikes!  Makes you want to audit the entire tree for these
> >  > things :-)))
> >  
> > Indeed.  Here's another one.
> > 
> > Signed-off-by: Dave Jones <davej@redhat.com>
> 
> That got fixed the other day and is the "A similar fix to netfilter
> from Eric Dumazet" Ilpo is reffering to above :)

...heh, when I said "a bit", I meant that it took a very little effort
to check over the whole tree... :-)

...I've already reported all four things one could find from whole tree 
with this cmd (to the relevant parties), so no need for you to redo the 
effort :-) :

$ for i in `find . -name '*.[ch]'`; do echo $i;
	sed -n -e '/^\t*if *[(].*[)] *; *$/ N'
		-e '/^\(\t*\)if *[(].*[)] *; *\n\1\t/ p' $i; done | tee result

...Basically it checks if the next line has more indentation than the
if line. ...Obviously it will work only if the code follows current 
CodingStyle in indentation. I'm currently experimenting with indent 
preprocessing step... ...but I'm not yet sure if I can pull something 
useful out from that too :-)

...Better cmdlines could be invented, e.g. by manually checking every "if 
();" lines once one has first automated separation of them from "if () 
do_smthg();" lines (I haven't learned myself how to easily count/work with 
parenthesis pairs in a cmdline, which seems necessary here)...

-- 
 i.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma @ 2007-08-16  9:25 UTC (permalink / raw)
  To: Bill Fink
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney
In-Reply-To: <alpine.LFD.0.999.0708161115370.25951@enigma.security.iitk.ac.in>

[ Bill tells me in private communication he gets this already, but I
  think it's more complicated than the shoddy explanation I'd made
  earlier so would wish to make this clearer in detail one last time,
  for the benefit of others listening in or reading the archives. ]


On Thu, 16 Aug 2007, Satyam Sharma wrote:

> On Thu, 16 Aug 2007, Satyam Sharma wrote:
> [...]
> > On Wed, 15 Aug 2007, Bill Fink wrote:
> > > [...]
> > > I'm curious about one minor tangential point.  Why, instead of:
> > > 
> > > 	b = *(volatile int *)&a;
> > > 
> > > why can't this just be expressed as:
> > > 
> > > 	b = (volatile int)a;
> > > 
> > > Isn't it the contents of a that's volatile, i.e. it's value can change
> > > invisibly to the compiler, and that's why you want to force a read from
> > > memory?  Why do you need the "*(volatile int *)&" construct?
> > 
> > "b = (volatile int)a;" doesn't help us because a cast to a qualified type
> > has the same effect as a cast to an unqualified version of that type, as
> > mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
> > is a type-qualifier, not a type itself, so a cast of the _object_ itself
> > to a qualified-type i.e. (volatile int) would not make the access itself
> > volatile-qualified.

Casts don't produce lvalues, and the cast ((volatile int)a) does not
produce the object-int-a-qualified-as-"volatile" -- in fact, the
result of the above cast is whatever is the _value_ of "int a", with
the access to that object having _already_ taken place, as per the
actual type-qualification of the object (that was originally declared
as being _non-volatile_, in fact). Hence, defining atomic_read() as:

#define atomic_read(v)          ((volatile int)((v)->counter))

would be buggy and not give "volatility" semantics at all, unless the
"counter" object itself isn't volatile-qualified already (which it
isn't).

The result of the cast itself being the _value_ of the int object, and
not the object itself (i.e., not an lvalue), is thereby independent of
type-qualification in that cast itself (it just wouldn't make any
difference), hence the "cast to a qualified type has the same effect
as a cast to an unqualified version of that type" bit in section 6.5.4:4
of the standard.


> > To serve our purposes, it is necessary for us to take the address of this
> > (non-volatile) object, cast the resulting _pointer_ to the corresponding
> > volatile-qualified pointer-type, and then dereference it. This makes that
> > particular _access_ be volatile-qualified, without the object itself being
> > such. Also note that the (dereferenced) result is also a valid lvalue and
> > hence can be used in "*(volatile int *)&a = b;" kind of construction
> > (which we use for the atomic_set case).

Dereferencing using the *(pointer-type-cast)& construct, OTOH, serves
us well:

#define atomic_read(v)          (*(volatile int *)&(v)->counter)

Firstly, note that the cast here being (volatile int *) and not
(int * volatile) qualifies the type of the _object_ being pointed to
by the pointer in question as being volatile-qualified, and not the
pointer itself (6.2.5:27 of the standard, and 6.3.2.3:2 allows us to
convert from a pointer-to-non-volatile-qualified-int to a pointer-to-
volatile-qualified-int, which suits us just fine) -- but note that
the _access_ to that address itself has not yet occurred.

_After_ specifying the memory address as containing a volatile-qualified-
int-type object, (and GCC co-operates as mentioned below), we proceed to
dereference it, which is when the _actual access_ occurs, therefore with
"volatility" semantics this time.

Interesting.


> Here, I should obviously admit that the semantics of *(volatile int *)&
> aren't any neater or well-defined in the _language standard_ at all. The
> standard does say (verbatim) "precisely what constitutes as access to
> object of volatile-qualified type is implementation-defined", but GCC
> does help us out here by doing the right thing. Accessing the non-volatile
> object there using the volatile-qualified pointer-type cast makes GCC
> treat the object stored at that memory address itself as if it were a 
> volatile object, thus making the access end up having what we're calling
> as "volatility" semantics here.
> 
> Honestly, given such confusion, and the propensity of the "volatile"
> type-qualifier keyword to be ill-defined (or at least poorly understood,
> often inconsistently implemented), I'd (again) express my opinion that it
> would be best to avoid its usage, given other alternatives do exist.


Satyam

^ permalink raw reply

* Re: [GENETLINK] some thoughts on the usage
From: Richard MUSIL @ 2007-08-16  8:50 UTC (permalink / raw)
  To: Thomas Graf; +Cc: netdev
In-Reply-To: <20070815175009.GA32236@postel.suug.ch>

Thomas Graf wrote:
> * Richard MUSIL <richard.musil@st.com> 2007-08-10 10:45
>> I have noticed that although ops for each family are the same (each
>> device is functionally same) I cannot use same genl_ops struct for
>> registration, because it uses internal member to link in list. Therefore
>> it is necessary to allocate new genl_ops for each device and pass it to
>> registration. But I cannot "officially" use this list to track those
>> genl_ops (so I can properly destroy them later), because there is no
>> interface. So I need to redo the management of the structures on my own.
> 
> The intended usage of the interface in your example would be to register
> only one genetlink family, say "tpm", register one set of operations
> and then have an attribute in every message which specifies which TPM
> device to use. This helps keeping the total number of genetlink families
> down.

I got your point. The fact that there are several families of the same
device type seems however quite convenient. For example, I
create/register virtual device /dev/tpm0 and register family with the
same name for that device, the same for /dev/tpm1 etc. Then I got
straightforward association in between devices and families and get for
free the whole management what happen if I try to talk to device which
is not registered/present etc.
I could multiplex it over one channel, but I will need to make the
communication protocol more complex and make me handle all exceptions
myself.
Since in my case there will be probably not more than one device
present, and the device itself is quite "exotic" I would probably not
rewrite it to the multiplexing scheme, but I understand what you mean
and will take it into account next time I face decision how to use
genetlink.

However I am still wondering, whether the allocation of structures
(genl_family, genl_ops) should not be done by genetlink layer instead of
the user (I mean allocating copy of the struct passed by user). This is
for example, what TPM layer (tpm.c) does. This would come at slight cost
at memory usage and performance, but will protect the caller from
inspecting internal behavior and taking care of that. And internal links
would nicely help in keeping of track of allocated structures. I could
write a patch for this.

>> The second "inconvenience" is that for each family I register, I also
>> register basically same ops (basically means, the definitions, and doit,
>> dumpit handlers are same, though the structures are at different
>> addresses for reasons described above). When the handler receives the
>> message it needs to associate the message with the actual device it is
>> handling. This could be done through family lookup (using
>> nlmsghdr::nlmsg_type), but I wondered if it would make sense to extend
>> genl_family for user custom data pointer and then pass this custom data
>> (or genl_family reference) to each handler (for example inside
>> genl_info). It is already parsed by genetlink layer, so it should not
>> slow things down.
> 
> That's not a bad idea, although I think we should try and keep the
> generic netlink part as simple as possible. There is a family specific
> header, referred to as user header in genl_info which is basically
> what you're looking for with the custom header. I believe making the
> generic netlink family aware of anything beyond family id and operations
> id only complicates things.

Ok, this was just an idea ;-) - probably important only for high
performance genetlink users.

Richard

^ permalink raw reply

* Re: drivers/infiniband/mlx/mad.c misplaced ;
From: Heiko Carstens @ 2007-08-16  8:46 UTC (permalink / raw)
  To: Joe Perches
  Cc: Dave Jones, Linux Kernel, rolandd, Chas Williams, Paul Mundt,
	isdn4linux, mikep, netdev, swen, linux390, linux-s390, jdike,
	user-mode-linux-devel, user-mode-linux-user, netfilter-devel,
	coreteam
In-Reply-To: <1187224811.5906.55.camel@localhost>

On Wed, Aug 15, 2007 at 05:40:11PM -0700, Joe Perches wrote:
> On Wed, 2007-08-15 at 19:58 -0400, Dave Jones wrote:
> > Signed-off-by: Dave Jones <davej@redhat.com>
> > 
> > diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
> > index 3330917..0ed02b7 100644
> > --- a/drivers/infiniband/hw/mlx4/mad.c
> > +++ b/drivers/infiniband/hw/mlx4/mad.c
> > @@ -109,7 +109,7 @@ int mlx4_MAD_IFC(struct mlx4_ib_dev *dev, int ignore_mkey, int ignore_bkey,
> >  			   in_modifier, op_modifier,
> >  			   MLX4_CMD_MAD_IFC, MLX4_CMD_TIME_CLASS_C);
> >  
> > -	if (!err);
> > +	if (!err)
> 
> There's more than a few of these (not inspected).
> 
> $ egrep -r --include=*.c "\bif[[:space:]]*\([^\)]*\)[[:space:]]*\;" * 

[...]

> drivers/s390/scsi/zfcp_erp.c:           if (status == ZFCP_ERP_SUCCEEDED) ;     /* no further action */

At least this one is not a bug. But I'm going to add a "break" there, so it
doesn't look that strange anymore. Thanks!

^ permalink raw reply

* Re: [patch 4/4] Update e1000 driver to use devres.
From: Waskiewicz Jr, Peter P @ 2007-08-16  8:38 UTC (permalink / raw)
  To: Brandon Philips, netdev, e1000-devel; +Cc: teheo
In-Reply-To: <20070815190014.GE7294@ifup.org>

> Index: linux-2.6/drivers/net/e1000/e1000_main.c
> ===================================================================
> --- linux-2.6.orig/drivers/net/e1000/e1000_main.c
> +++ linux-2.6/drivers/net/e1000/e1000_main.c
> @@ -860,15 +860,14 @@ e1000_probe(struct pci_dev *pdev,  {
>  	struct net_device *netdev;
>  	struct e1000_adapter *adapter;
> -	unsigned long mmio_start, mmio_len;
> -	unsigned long flash_start, flash_len;
> +	unsigned long mmio_len, flash_len;
>  
>  	static int cards_found = 0;
>  	static int global_quad_port_a = 0; /* global ksp3 port 
> a indication */
>  	int i, err, pci_using_dac;
>  	uint16_t eeprom_data = 0;
>  	uint16_t eeprom_apme_mask = E1000_EEPROM_APME;
> -	if ((err = pci_enable_device(pdev)))
> +	if ((err = pcim_enable_device(pdev)))
>  		return err;
>  
>  	if (!(err = pci_set_dma_mask(pdev, DMA_64BIT_MASK)) && 
> @@ -878,20 +877,19 @@ e1000_probe(struct pci_dev *pdev,
>  		if ((err = pci_set_dma_mask(pdev, DMA_32BIT_MASK)) &&
>  		    (err = pci_set_consistent_dma_mask(pdev, 
> DMA_32BIT_MASK))) {
>  			E1000_ERR("No usable DMA configuration, 
> aborting\n");
> -			goto err_dma;
> +			return err;
>  		}
>  		pci_using_dac = 0;
>  	}
>  
>  	if ((err = pci_request_regions(pdev, e1000_driver_name)))
> -		goto err_pci_reg;
> +		return err;
>  
>  	pci_set_master(pdev);
>  
> -	err = -ENOMEM;
> -	netdev = alloc_etherdev(sizeof(struct e1000_adapter));
> +	netdev = devm_alloc_etherdev(&pdev->dev, sizeof(struct 
> +e1000_adapter));
>  	if (!netdev)
> -		goto err_alloc_etherdev;
> +		return -ENOMEM;

I'm a bit confused why you removed the goto's, and then removed all the
target unwinding code at the bottom of e1000_probe().   Those labels
clean up resources if something fails, like the err_sw_init label.  I
don't see anything in the devres code that jumps out at me that explains
why we can do away with these cleanup routines.  Thoughts?

Thanks,
-PJ Waskiewicz

-------------------------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc.
Still grepping through log files to find problems?  Stop.
Now Search log events and configuration files using AJAX and a browser.
Download your FREE copy of Splunk now >>  http://get.splunk.com/

^ permalink raw reply

* Re: 2.6.23-rc3 and SKY2 driver issue
From: Michal Piotrowski @ 2007-08-16  8:25 UTC (permalink / raw)
  To: James Corey; +Cc: linux-kernel, Netdev, Stephen Hemminger
In-Reply-To: <663186.13204.qm@web90403.mail.mud.yahoo.com>

[Adding Stephen and netdev to CC]

On 15/08/07, James Corey <ploversegg@yahoo.com> wrote:
>
> I tried running a D-link gig card on kernel 2.6.21.1
> and it came up fine, but when I did a sftp of
> an linux dvd ISO to it, the interface would lock
> up hard with the error
>
> eth1: hw csum failure.
>
> Call Trace:
>  <IRQ>  [<ffffffff804d3e31>]
> __skb_checksum_complete_head+0x46/0x5f
>  [<ffffffff804d3e56>] __skb_checksum_complete+0xc/0x11
>  [<ffffffff805046c7>] tcp_v4_rcv+0x157/0x810
>  [<ffffffff804d8a6f>] dev_queue_xmit+0x237/0x260
>  [<ffffffff80229990>] find_busiest_group+0x252/0x684
>  [<ffffffff804ead50>] ip_local_deliver+0xca/0x14c
>  [<ffffffff804eb24a>] ip_rcv+0x478/0x4ba
>  [<ffffffff803ec7d3>] sky2_poll+0x6f9/0x9b9
>  [<ffffffff8022bd5d>]
> run_rebalance_domains+0x13e/0x408
>  [<ffffffff804d877a>] net_rx_action+0xa8/0x166
>  [<ffffffff80235d62>] __do_softirq+0x55/0xc3
>  [<ffffffff8020a4ec>] call_softirq+0x1c/0x28
>  [<ffffffff8020b611>] do_softirq+0x2c/0x7d
>  [<ffffffff8020b8cf>] do_IRQ+0x13e/0x15f
>  [<ffffffff802086ce>] mwait_idle+0x0/0x46
>  [<ffffffff80209871>] ret_from_intr+0x0/0xa
>  <EOI>  [<ffffffff80208710>] mwait_idle+0x42/0x46
>  [<ffffffff80208666>] cpu_idle+0x8c/0xaf
>  [<ffffffff8078174e>] start_kernel+0x2ac/0x2b8
>  [<ffffffff80781140>] _sinittext+0x140/0x144
>
>
> So I tried running the latest snapshot 2.6.23-rc3
> and the almost the same thing happens. Only
> difference is that now the entire box locks up.
> The error is almost the same
>
> eth1: hw csum failure.
>
> Call Trace:
>  <IRQ>  [<ffffffff804779b6>]
> __skb_checksum_complete_head+0x43/0x56
>  [<ffffffff804779d5>] __skb_checksum_complete+0xc/0x11
>  [<ffffffff804a989d>] tcp_v4_rcv+0x14e/0x801
>  [<ffffffff8048ff84>] ip_local_deliver+0xca/0x14c
>  [<ffffffff80490472>] ip_rcv+0x46c/0x4ae
>  [<ffffffff880060f9>] :sky2:sky2_poll+0x72b/0x9c7
>  [<ffffffff8047c934>] net_rx_action+0xa8/0x166
>  [<ffffffff80235ced>] __do_softirq+0x55/0xc4
>  [<ffffffff8020c5cc>] call_softirq+0x1c/0x28
>  [<ffffffff8020d6fd>] do_softirq+0x2c/0x7d
>  [<ffffffff8020d9bb>] do_IRQ+0x13e/0x15f
>  [<ffffffff8020a780>] mwait_idle+0x0/0x48
>  [<ffffffff8020b951>] ret_from_intr+0x0/0xa
>  <EOI>  [<ffffffff880063e7>]
> :sky2:sky2_xmit_frame+0x0/0x41d
>  [<ffffffff8020a7c2>] mwait_idle+0x42/0x48
>  [<ffffffff8020a718>] cpu_idle+0xbd/0xe0
>  [<ffffffff80704a5a>] start_kernel+0x2ac/0x2b8
>  [<ffffffff80704140>] _sinittext+0x140/0x144
>
>
> I see that the new kernel includes some sort of
> SKY2 DEBUG stuff. I would be happy to rerun
> the test with that turned on, given some minor
> direction.
>

Regards,
Michal

-- 
LOG
http://www.stardust.webpages.pl/log/

^ permalink raw reply

* [PATCH 2/2] [DM9000] External PHY support
From: Laurent Pinchart @ 2007-08-16  8:16 UTC (permalink / raw)
  To: netdev; +Cc: Ben Dooks

This patch adds a flag to the DM9000 platform data which, when set,
configures the device to use an external PHY.

Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
---
 drivers/net/dm9000.c   |    6 ++++++
 include/linux/dm9000.h |    1 +
 2 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/drivers/net/dm9000.c b/drivers/net/dm9000.c
index a424810..a86984e 100644
--- a/drivers/net/dm9000.c
+++ b/drivers/net/dm9000.c
@@ -136,6 +136,7 @@ typedef struct board_info {
 	u16 dbug_cnt;
 	u8 io_mode;		/* 0:word, 2:byte */
 	u8 phy_addr;
+	unsigned int flags;
 
 	void (*inblk)(void __iomem *port, void *data, int length);
 	void (*outblk)(void __iomem *port, void *data, int length);
@@ -528,6 +529,8 @@ dm9000_probe(struct platform_device *pdev)
 
 		if (pdata->dumpblk != NULL)
 			db->dumpblk = pdata->dumpblk;
+
+		db->flags = pdata->flags;
 	}
 
 	dm9000_reset(db);
@@ -670,6 +673,9 @@ dm9000_init_dm9000(struct net_device *dev)
 	iow(db, DM9000_GPCR, GPCR_GEP_CNTL);	/* Let GPIO0 output */
 	iow(db, DM9000_GPR, 0);	/* Enable PHY */
 
+	if (db->flags & DM9000_PLATF_EXT_PHY)
+		iow(db, DM9000_NCR, NCR_EXT_PHY);
+
 	/* Program operating register */
 	iow(db, DM9000_TCR, 0);	        /* TX Polling clear */
 	iow(db, DM9000_BPTR, 0x3f);	/* Less 3Kb, 200us */
diff --git a/include/linux/dm9000.h b/include/linux/dm9000.h
index 0008e2a..ea530fd 100644
--- a/include/linux/dm9000.h
+++ b/include/linux/dm9000.h
@@ -19,6 +19,7 @@
 #define DM9000_PLATF_8BITONLY	(0x0001)
 #define DM9000_PLATF_16BITONLY	(0x0002)
 #define DM9000_PLATF_32BITONLY	(0x0004)
+#define DM9000_PLATF_EXT_PHY	(0x0008)
 
 /* platfrom data for platfrom device structure's platfrom_data field */
 
-- 
1.5.0

-- 
Laurent Pinchart
CSE Semaphore Belgium

Chaussée de Bruxelles, 732A
B-1410 Waterloo
Belgium

T +32 (2) 387 42 59
F +32 (2) 387 42 75

^ permalink raw reply related

* [PATCH 1/2] [DM9000] Added support for big-endian hosts
From: Laurent Pinchart @ 2007-08-16  8:15 UTC (permalink / raw)
  To: netdev; +Cc: Ben Dooks

This patch splits the receive status in 8bit wide fields and convert the
packet length from little endian to CPU byte order.

Signed-off-by: Laurent Pinchart <laurentp@cse-semaphore.com>
---
 drivers/net/dm9000.c |   13 +++++++------
 1 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/drivers/net/dm9000.c b/drivers/net/dm9000.c
index c3de81b..a424810 100644
--- a/drivers/net/dm9000.c
+++ b/drivers/net/dm9000.c
@@ -894,7 +894,8 @@ dm9000_timer(unsigned long data)
 }
 
 struct dm9000_rxhdr {
-	u16	RxStatus;
+	u8	RxPktReady;
+	u8	RxStatus;
 	u16	RxLen;
 } __attribute__((__packed__));
 
@@ -935,7 +936,7 @@ dm9000_rx(struct net_device *dev)
 
 		(db->inblk)(db->io_data, &rxhdr, sizeof(rxhdr));
 
-		RxLen = rxhdr.RxLen;
+		RxLen = le16_to_cpu(rxhdr.RxLen);
 
 		/* Packet Status check */
 		if (RxLen < 0x40) {
@@ -947,17 +948,17 @@ dm9000_rx(struct net_device *dev)
 			PRINTK1("RST: RX Len:%x\n", RxLen);
 		}
 
-		if (rxhdr.RxStatus & 0xbf00) {
+		if (rxhdr.RxStatus & 0xbf) {
 			GoodPacket = false;
-			if (rxhdr.RxStatus & 0x100) {
+			if (rxhdr.RxStatus & 0x01) {
 				PRINTK1("fifo error\n");
 				db->stats.rx_fifo_errors++;
 			}
-			if (rxhdr.RxStatus & 0x200) {
+			if (rxhdr.RxStatus & 0x02) {
 				PRINTK1("crc error\n");
 				db->stats.rx_crc_errors++;
 			}
-			if (rxhdr.RxStatus & 0x8000) {
+			if (rxhdr.RxStatus & 0x80) {
 				PRINTK1("length error\n");
 				db->stats.rx_length_errors++;
 			}
-- 
1.5.0

-- 
Laurent Pinchart
CSE Semaphore Belgium

Chaussée de Bruxelles, 732A
B-1410 Waterloo
Belgium

T +32 (2) 387 42 59
F +32 (2) 387 42 75

^ permalink raw reply related

* [PATCH 0/2] [DM9000] Support for big endian hosts and external PHY
From: Laurent Pinchart @ 2007-08-16  8:14 UTC (permalink / raw)
  To: netdev; +Cc: Ben Dooks

These patches update the DM9000 driver to add support for big endian hosts and 
external PHY.

They are resubmissions of patches based on 2.6.22, that I have now updated to 
2.6.23-rc3.

-- 
Laurent Pinchart
CSE Semaphore Belgium

Chaussée de Bruxelles, 732A
B-1410 Waterloo
Belgium

T +32 (2) 387 42 59
F +32 (2) 387 42 75

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Herbert Xu @ 2007-08-16  8:10 UTC (permalink / raw)
  To: Stefan Richter
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <46C40587.7050708@s5r6.in-berlin.de>

On Thu, Aug 16, 2007 at 10:06:31AM +0200, Stefan Richter wrote:
> > 
> > Do you (or anyone else for that matter) have an example of this?
> 
> The only code I somewhat know, the ieee1394 subsystem, was perhaps
> authored and is currently maintained with the expectation that each
> occurrence of atomic_read actually results in a load operation, i.e. is
> not optimized away.  This means all atomic_t (bus generation, packet and
> buffer refcounts, and some other state variables)* and likewise all
> atomic bitops in that subsystem.

Can you find an actual atomic_read code snippet there that is
broken without the volatile modifier?

Thanks,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH] [DM9000] Add support for big-endian hosts
From: Laurent Pinchart @ 2007-08-16  8:10 UTC (permalink / raw)
  To: Ben Dooks; +Cc: netdev
In-Reply-To: <20070814182558.GA12547@fluff.org.uk>

On Tuesday 14 August 2007 20:25, Ben Dooks wrote:
> On Tue, Aug 14, 2007 at 01:22:32PM +0200, Laurent Pinchart wrote:
> > This patch splits the receive status in 8bit wide fields and convert the
> > packet length from little endian to CPU byte order.
>
> Which version of the the kernel was this against, it applies with
> fuzz to 2.6.23-rc3:
>
> $ patch -p1 < ~/dm9000-fix-endianness.patch
> patching file drivers/net/dm9000.c
> Hunk #1 succeeded at 894 (offset 15 lines).
> Hunk #2 succeeded at 936 (offset 15 lines).
> Hunk #3 succeeded at 948 (offset 15 lines).

That was against 2.6.22. Sorry. I'll post updated patches against 2.6.23-rc3.

Best regards.

-- 
Laurent Pinchart
CSE Semaphore Belgium

Chaussée de Bruxelles, 732A
B-1410 Waterloo
Belgium

T +32 (2) 387 42 59
F +32 (2) 387 42 75

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Stefan Richter @ 2007-08-16  8:06 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Paul Mackerras, Satyam Sharma, Christoph Lameter,
	Paul E. McKenney, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816070907.GA964@gondor.apana.org.au>

Herbert Xu wrote:
> On Thu, Aug 16, 2007 at 04:56:21PM +1000, Paul Mackerras wrote:
>>
>> Note that I said these are the cases _where one might want to allow
>> caching_, so of course adding volatile doesn't help _these_ cases.
>> There are of course other cases where one definitely doesn't want to
>> allow the compiler to cache the value, such as when polling an atomic
>> variable waiting for another CPU to change it, and from my inspection
>> so far these cases seem to be the majority.
> 
> We've been through that already.  If it's a busy-wait it
> should use cpu_relax.  If it's scheduling away that already
> forces the compiler to reread anyway.
> 
> Do you have an actual example where volatile is needed?
> 
>> - It matches the normal expectation based on the name "atomic_read"
>> - It matches the behaviour of the other atomic_* primitives
> 
> Can't argue since you left out what those expectations
> or properties are.

We use atomic_t for data that is concurrently locklessly written and
read at arbitrary times.  My naive expectation as driver author (driver
maintainer) is that all atomic_t accessors, including atomic_read, (and
atomic bitops) work with the then current value of the atomic data.

>> - It avoids bugs in the cases where "volatile" behaviour is required
> 
> Do you (or anyone else for that matter) have an example of this?

The only code I somewhat know, the ieee1394 subsystem, was perhaps
authored and is currently maintained with the expectation that each
occurrence of atomic_read actually results in a load operation, i.e. is
not optimized away.  This means all atomic_t (bus generation, packet and
buffer refcounts, and some other state variables)* and likewise all
atomic bitops in that subsystem.

If that assumption is wrong, then what is the API or language primitive
to force a load operation to occur?


*)  Interesting what a quick LXR session in search for all atomic_t
usages in 'my' subsystem brings to light.  I now noticed an apparently
unused struct member in the bitrotting pcilynx driver, and more
importantly, a pairing of two atomic_t variables in raw1394 that should
be audited for race conditions and for possible replacement by plain int.
-- 
Stefan Richter
-=====-=-=== =--- =----
http://arcgraph.de/sr/

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Herbert Xu @ 2007-08-16  7:09 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <18115.62741.807704.969977@cargo.ozlabs.ibm.com>

On Thu, Aug 16, 2007 at 04:56:21PM +1000, Paul Mackerras wrote:
>
> Note that I said these are the cases _where one might want to allow
> caching_, so of course adding volatile doesn't help _these_ cases.
> There are of course other cases where one definitely doesn't want to
> allow the compiler to cache the value, such as when polling an atomic
> variable waiting for another CPU to change it, and from my inspection
> so far these cases seem to be the majority.

We've been through that already.  If it's a busy-wait it
should use cpu_relax.  If it's scheduling away that already
forces the compiler to reread anyway.

Do you have an actual example where volatile is needed?

> - It matches the normal expectation based on the name "atomic_read"
> - It matches the behaviour of the other atomic_* primitives

Can't argue since you left out what those expectations
or properties are.

> - It avoids bugs in the cases where "volatile" behaviour is required

Do you (or anyone else for that matter) have an example of this?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  6:56 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816053945.GB32442@gondor.apana.org.au>

Herbert Xu writes:

> On Thu, Aug 16, 2007 at 02:11:43PM +1000, Paul Mackerras wrote:
> >
> > The uses of atomic_read where one might want it to allow caching of
> > the result seem to me to fall into 3 categories:
> > 
> > 1. Places that are buggy because of a race arising from the way it's
> >    used.
> > 
> > 2. Places where there is a race but it doesn't matter because we're
> >    doing some clever trick.
> > 
> > 3. Places where there is some locking in place that eliminates any
> >    potential race.
> 
> Agreed.
> 
> > In case 1, adding volatile won't solve the race, of course, but it's
> > hard to argue that we shouldn't do something because it will slow down
> > buggy code.  Case 2 is hopefully pretty rare and accompanied by large
> > comment blocks, and in those cases caching the result of atomic_read
> > explicitly in a local variable would probably make the code clearer.
> > And in case 3 there is no reason to use atomic_t at all; we might as
> > well just use an int.
> 
> Since adding volatile doesn't help any of the 3 cases, and
> takes away optimisations from both 2 and 3, I wonder what
> is the point of the addition after all?

Note that I said these are the cases _where one might want to allow
caching_, so of course adding volatile doesn't help _these_ cases.
There are of course other cases where one definitely doesn't want to
allow the compiler to cache the value, such as when polling an atomic
variable waiting for another CPU to change it, and from my inspection
so far these cases seem to be the majority.

The reasons for having "volatile" behaviour of atomic_read (whether or
not that is achieved by use of the "volatile" C keyword) are

- It matches the normal expectation based on the name "atomic_read"
- It matches the behaviour of the other atomic_* primitives
- It avoids bugs in the cases where "volatile" behaviour is required

To my mind these outweigh the small benefit for some code of the
non-volatile (caching-allowed) behaviour.  In fact it's pretty minor
either way, and since x86[-64] has this behaviour, one can expect the
potential bugs in generic code to have mostly been found, although
perhaps not all of them since x86[-64] has less aggressive reordering
of memory accesses and fewer registers in which to cache things than
some other architectures.

Paul.

^ permalink raw reply

* RE: drivers/net/tokenring/3c359.c
From: Sivakumar Subramani @ 2007-08-16  6:43 UTC (permalink / raw)
  To: Jeff Garzik, surya.prabhakar
  Cc: mikep, netdev, linux-tr, Linux Kernel, kernel-janitors
In-Reply-To: <46C14297.1050800@garzik.org>

What exactly the difference between kzalloc and kcalloc? From the
definition, I could see that kcalloc should be used for array
allocation. But I could see kzalloc is used for allocation arrays as in
the below patch.

Any coding standard (or) developers can use kzalloc and kcalloc as per
their coding practice??

Thanks,
~Siva
-----Original Message-----
From: netdev-owner@vger.kernel.org [mailto:netdev-owner@vger.kernel.org]
On Behalf Of Jeff Garzik
Sent: Tuesday, August 14, 2007 11:20 AM
To: surya.prabhakar@wipro.com
Cc: mikep@linuxtr.net; netdev@vger.kernel.org; linux-tr@linuxtr.net;
Linux Kernel; kernel-janitors
Subject: Re: drivers/net/tokenring/3c359.c

Surya Prabhakar N wrote:
> Hi,
>    Replacing kmalloc with kzalloc and cleaning up memset in 
> drivers/net/tokenring/3c359.c
> 
> 
> Signed-off-by: Surya Prabhakar <surya.prabhakar@wipro.com>

applied


-
To unsubscribe from this list: send the line "unsubscribe netdev" in the
body of a message to majordomo@vger.kernel.org More majordomo info at
http://vger.kernel.org/majordomo-info.html

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  6:00 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816053741.GA32442@gondor.apana.org.au>

Herbert Xu writes:

> It doesn't matter.  The memory pressure flag is an *advisory*
> flag.  If we get it wrong the worst that'll happen is that we'd
> waste some time doing work that'll be thrown away.

Ah, so it's the "racy but I don't care because it's only an
optimization" case.  That's fine.  Somehow I find it hard to believe
that all the racy uses of atomic_read in the kernel are like that,
though. :)

> In any case, this actually illustrates why the addition of
> volatile is completely pointless.  Even if this code was
> broken, which it definitely is not, having the volatile
> there wouldn't have helped at all.

Yes, adding volatile to racy code doesn't somehow make it race-free.
Neither does using atomic_t, despite what some seem to believe.

I have actually started going through all the uses of atomic_read in
the kernel.  So far out of the first 100 I have found none where we
have two atomic_reads of the same variable and the compiler could
usefully use the value from the first as the result of the second.
But there's still > 2500 to go...

Paul.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma @ 2007-08-16  5:57 UTC (permalink / raw)
  To: Bill Fink
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney
In-Reply-To: <alpine.LFD.0.999.0708161045360.25951@enigma.security.iitk.ac.in>



On Thu, 16 Aug 2007, Satyam Sharma wrote:

> Hi Bill,
> 
> 
> On Wed, 15 Aug 2007, Bill Fink wrote:
> 
> > On Wed, 15 Aug 2007, Satyam Sharma wrote:
> > 
> > > (C)
> > > $ cat tp3.c
> > > int a;
> > > 
> > > void func(void)
> > > {
> > > 	*(volatile int *)&a = 10;
> > > 	*(volatile int *)&a = 20;
> > > }
> > > $ gcc -Os -S tp3.c
> > > $ cat tp3.s
> > > ...
> > > movl    $10, a
> > > movl    $20, a
> > > ...
> > 
> > I'm curious about one minor tangential point.  Why, instead of:
> > 
> > 	b = *(volatile int *)&a;
> > 
> > why can't this just be expressed as:
> > 
> > 	b = (volatile int)a;
> > 
> > Isn't it the contents of a that's volatile, i.e. it's value can change
> > invisibly to the compiler, and that's why you want to force a read from
> > memory?  Why do you need the "*(volatile int *)&" construct?
> 
> "b = (volatile int)a;" doesn't help us because a cast to a qualified type
> has the same effect as a cast to an unqualified version of that type, as
> mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
> is a type-qualifier, not a type itself, so a cast of the _object_ itself
> to a qualified-type i.e. (volatile int) would not make the access itself
> volatile-qualified.
> 
> To serve our purposes, it is necessary for us to take the address of this
> (non-volatile) object, cast the resulting _pointer_ to the corresponding
> volatile-qualified pointer-type, and then dereference it. This makes that
> particular _access_ be volatile-qualified, without the object itself being
> such. Also note that the (dereferenced) result is also a valid lvalue and
> hence can be used in "*(volatile int *)&a = b;" kind of construction
> (which we use for the atomic_set case).

Here, I should obviously admit that the semantics of *(volatile int *)&
aren't any neater or well-defined in the _language standard_ at all. The
standard does say (verbatim) "precisely what constitutes as access to
object of volatile-qualified type is implementation-defined", but GCC
does help us out here by doing the right thing. Accessing the non-volatile
object there using the volatile-qualified pointer-type cast makes GCC
treat the object stored at that memory address itself as if it were a 
volatile object, thus making the access end up having what we're calling
as "volatility" semantics here.

Honestly, given such confusion, and the propensity of the "volatile"
type-qualifier keyword to be ill-defined (or at least poorly understood,
often inconsistently implemented), I'd (again) express my opinion that it
would be best to avoid its usage, given other alternatives do exist.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Herbert Xu @ 2007-08-16  5:39 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Satyam Sharma, Christoph Lameter, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <18115.52863.638655.658466@cargo.ozlabs.ibm.com>

On Thu, Aug 16, 2007 at 02:11:43PM +1000, Paul Mackerras wrote:
>
> The uses of atomic_read where one might want it to allow caching of
> the result seem to me to fall into 3 categories:
> 
> 1. Places that are buggy because of a race arising from the way it's
>    used.
> 
> 2. Places where there is a race but it doesn't matter because we're
>    doing some clever trick.
> 
> 3. Places where there is some locking in place that eliminates any
>    potential race.

Agreed.

> In case 1, adding volatile won't solve the race, of course, but it's
> hard to argue that we shouldn't do something because it will slow down
> buggy code.  Case 2 is hopefully pretty rare and accompanied by large
> comment blocks, and in those cases caching the result of atomic_read
> explicitly in a local variable would probably make the code clearer.
> And in case 3 there is no reason to use atomic_t at all; we might as
> well just use an int.

Since adding volatile doesn't help any of the 3 cases, and
takes away optimisations from both 2 and 3, I wonder what
is the point of the addition after all?

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Herbert Xu @ 2007-08-16  5:37 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <18115.54225.644905.463771@cargo.ozlabs.ibm.com>

On Thu, Aug 16, 2007 at 02:34:25PM +1000, Paul Mackerras wrote:
>
> I'm talking about this situation:
> 
> CPU 0 comes into __sk_stream_mem_reclaim, reads memory_allocated, but
> then before it can do the store to *memory_pressure, CPUs 1-1023 all
> go through sk_stream_mem_schedule, collectively increase
> memory_allocated to more than sysctl_mem[2] and set *memory_pressure.
> Finally CPU 0 gets to do its store and it sets *memory_pressure back
> to 0, but by this stage memory_allocated is way larger than
> sysctl_mem[2].

It doesn't matter.  The memory pressure flag is an *advisory*
flag.  If we get it wrong the worst that'll happen is that we'd
waste some time doing work that'll be thrown away.

Please look at the places where it's used before jumping to
conclusions.

> Now, maybe it's the case that it doesn't really matter whether
> *->memory_pressure is 0 or 1.  But if so, why bother computing it at
> all?

As long as we get it right most of the time (and I think you
would agree that we do get it right most of the time), then
this flag has achieved its purpose.

> People seem to think that using atomic_t means they don't need to use
> a spinlock.  That's fine if there is only one variable involved, but
> as soon as there's more than one, there's the possibility of a race,
> whether or not you use atomic_t, and whether or not atomic_read has
> "volatile" behaviour.

In any case, this actually illustrates why the addition of
volatile is completely pointless.  Even if this code was
broken, which it definitely is not, having the volatile
there wouldn't have helped at all.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Satyam Sharma @ 2007-08-16  5:20 UTC (permalink / raw)
  To: Bill Fink
  Cc: Stefan Richter, Christoph Lameter, Chris Snook,
	Linux Kernel Mailing List, linux-arch, Linus Torvalds, netdev,
	Andrew Morton, ak, heiko.carstens, davem, schwidefsky, wensong,
	horms, wjiang, cfriesen, zlynx, rpjday, jesper.juhl, segher,
	Herbert Xu, Paul E. McKenney
In-Reply-To: <20070815233721.91032366.billfink@mindspring.com>

Hi Bill,


On Wed, 15 Aug 2007, Bill Fink wrote:

> On Wed, 15 Aug 2007, Satyam Sharma wrote:
> 
> > (C)
> > $ cat tp3.c
> > int a;
> > 
> > void func(void)
> > {
> > 	*(volatile int *)&a = 10;
> > 	*(volatile int *)&a = 20;
> > }
> > $ gcc -Os -S tp3.c
> > $ cat tp3.s
> > ...
> > movl    $10, a
> > movl    $20, a
> > ...
> 
> I'm curious about one minor tangential point.  Why, instead of:
> 
> 	b = *(volatile int *)&a;
> 
> why can't this just be expressed as:
> 
> 	b = (volatile int)a;
> 
> Isn't it the contents of a that's volatile, i.e. it's value can change
> invisibly to the compiler, and that's why you want to force a read from
> memory?  Why do you need the "*(volatile int *)&" construct?

"b = (volatile int)a;" doesn't help us because a cast to a qualified type
has the same effect as a cast to an unqualified version of that type, as
mentioned in 6.5.4:4 (footnote 86) of the standard. Note that "volatile"
is a type-qualifier, not a type itself, so a cast of the _object_ itself
to a qualified-type i.e. (volatile int) would not make the access itself
volatile-qualified.

To serve our purposes, it is necessary for us to take the address of this
(non-volatile) object, cast the resulting _pointer_ to the corresponding
volatile-qualified pointer-type, and then dereference it. This makes that
particular _access_ be volatile-qualified, without the object itself being
such. Also note that the (dereferenced) result is also a valid lvalue and
hence can be used in "*(volatile int *)&a = b;" kind of construction
(which we use for the atomic_set case).


Satyam

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  4:34 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816040308.GA32133@gondor.apana.org.au>

Herbert Xu writes:

> > You mean it's intended that *sk->sk_prot->memory_pressure can end up
> > as 1 when sk->sk_prot->memory_allocated is small (less than
> > ->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
> > than ->sysctl_mem[2])?  Because that's the effect of the current code.
> > If so I wonder why you bother computing it.
> 
> You need to remember that there are three different limits:
> minimum, pressure, and maximum.  By default we should never
> be in a situation where what you say can occur.
> 
> If you set all three limits to the same thing, then yes it
> won't work as intended but it's still well-behaved.

I'm not talking about setting all three limits to the same thing.

I'm talking about this situation:

CPU 0 comes into __sk_stream_mem_reclaim, reads memory_allocated, but
then before it can do the store to *memory_pressure, CPUs 1-1023 all
go through sk_stream_mem_schedule, collectively increase
memory_allocated to more than sysctl_mem[2] and set *memory_pressure.
Finally CPU 0 gets to do its store and it sets *memory_pressure back
to 0, but by this stage memory_allocated is way larger than
sysctl_mem[2].

Yes, it's unlikely, but that is the nature of race conditions - they
are unlikely, and only show up at inconvenient times, never when
someone who could fix the bug is watching. :)

Similarly it would be possible for other CPUs to decrease
memory_allocated from greater than sysctl_mem[2] to less than
sysctl_mem[0] in the interval between when we read memory_allocated
and set *memory_pressure to 1.  And it's quite possible for their
setting of *memory_pressure to 0 to happen before our setting of it to
1, so that it ends up at 1 when it should be 0.

Now, maybe it's the case that it doesn't really matter whether
*->memory_pressure is 0 or 1.  But if so, why bother computing it at
all?

People seem to think that using atomic_t means they don't need to use
a spinlock.  That's fine if there is only one variable involved, but
as soon as there's more than one, there's the possibility of a race,
whether or not you use atomic_t, and whether or not atomic_read has
"volatile" behaviour.

Paul.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  4:11 UTC (permalink / raw)
  To: Satyam Sharma
  Cc: Herbert Xu, Christoph Lameter, Paul E. McKenney, Stefan Richter,
	Chris Snook, Linux Kernel Mailing List, linux-arch,
	Linus Torvalds, netdev, Andrew Morton, ak, heiko.carstens, davem,
	schwidefsky, wensong, horms, wjiang, cfriesen, zlynx, rpjday,
	jesper.juhl, segher
In-Reply-To: <alpine.LFD.0.999.0708160815510.24380@enigma.security.iitk.ac.in>

Satyam Sharma writes:

> Anyway, the problem, of course, is that this conversion to a stronger /
> safer-by-default behaviour doesn't happen with zero cost to performance.
> Converting atomic ops to "volatile" behaviour did add ~2K to kernel text
> for archs such as i386 (possibly to important codepaths) that didn't have
> those semantics already so it would be constructive to actually look at
> those differences and see if there were really any heisenbugs that got
> rectified. Or if there were legitimate optimizations that got wrongly
> disabled. Onus lies on those proposing the modifications, I'd say ;-)

The uses of atomic_read where one might want it to allow caching of
the result seem to me to fall into 3 categories:

1. Places that are buggy because of a race arising from the way it's
   used.

2. Places where there is a race but it doesn't matter because we're
   doing some clever trick.

3. Places where there is some locking in place that eliminates any
   potential race.

In case 1, adding volatile won't solve the race, of course, but it's
hard to argue that we shouldn't do something because it will slow down
buggy code.  Case 2 is hopefully pretty rare and accompanied by large
comment blocks, and in those cases caching the result of atomic_read
explicitly in a local variable would probably make the code clearer.
And in case 3 there is no reason to use atomic_t at all; we might as
well just use an int.

So I don't see any good reason to make the atomic API more complex by
having "volatile" and "non-volatile" versions of atomic_read.  It
should just have the "volatile" behaviour.

Paul.

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Herbert Xu @ 2007-08-16  4:03 UTC (permalink / raw)
  To: Paul Mackerras
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <18115.51472.408193.332905@cargo.ozlabs.ibm.com>

On Thu, Aug 16, 2007 at 01:48:32PM +1000, Paul Mackerras wrote:
> Herbert Xu writes:
> 
> > If you're referring to the code in sk_stream_mem_schedule
> > then it's working as intended.  The atomicity guarantees
> 
> You mean it's intended that *sk->sk_prot->memory_pressure can end up
> as 1 when sk->sk_prot->memory_allocated is small (less than
> ->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
> than ->sysctl_mem[2])?  Because that's the effect of the current code.
> If so I wonder why you bother computing it.

You need to remember that there are three different limits:
minimum, pressure, and maximum.  By default we should never
be in a situation where what you say can occur.

If you set all three limits to the same thing, then yes it
won't work as intended but it's still well-behaved.

Cheers,
-- 
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 0/24] make atomic_read() behave consistently across all architectures
From: Paul Mackerras @ 2007-08-16  3:48 UTC (permalink / raw)
  To: Herbert Xu
  Cc: Christoph Lameter, Satyam Sharma, Paul E. McKenney,
	Stefan Richter, Chris Snook, Linux Kernel Mailing List,
	linux-arch, Linus Torvalds, netdev, Andrew Morton, ak,
	heiko.carstens, davem, schwidefsky, wensong, horms, wjiang,
	cfriesen, zlynx, rpjday, jesper.juhl, segher
In-Reply-To: <20070816033343.GA31844@gondor.apana.org.au>

Herbert Xu writes:

> If you're referring to the code in sk_stream_mem_schedule
> then it's working as intended.  The atomicity guarantees

You mean it's intended that *sk->sk_prot->memory_pressure can end up
as 1 when sk->sk_prot->memory_allocated is small (less than
->sysctl_mem[0]), or as 0 when ->memory_allocated is large (greater
than ->sysctl_mem[2])?  Because that's the effect of the current code.
If so I wonder why you bother computing it.

> that the atomic_add/atomic_sub won't be seen in parts by
> other readers.
> 
> We certainly do not need to see other atomic_add/atomic_sub
> operations immediately.
> 
> If you're referring to another code snippet please cite.
> 
> > I'd go so far as to say that anywhere where you want a non-"volatile"
> > atomic_read, either your code is buggy, or else an int would work just
> > as well.
> 
> An int won't work here because += and -= do not have the
> atomicity guarantees that atomic_add/atomic_sub do.  In
> particular, this may cause an atomic_read on another CPU
> to give a bogus reading.

The point is that guaranteeing the atomicity of the increment or
decrement does not suffice to make the code race-free.  In this case
the race arises from the fact that reading ->memory_allocated and
setting *->memory_pressure are separate operations.  To make that code
work properly you need a lock.  And once you have the lock an ordinary
int would suffice for ->memory_allocated.

Paul.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox