* coding style lesson: iwlwifi vs. endianness
@ 2007-11-27 18:44 Johannes Berg
2007-11-28 18:20 ` Tomas Winkler
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2007-11-27 18:44 UTC (permalink / raw)
To: linux-wireless; +Cc: Tomas Winkler, Zhu Yi, Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 5911 bytes --]
Today's lesson is brought to you by the motivation to teach you how to
write simpler code since apparently the only way you found so far to get
drivers working on big endian platforms is to use a big endian
conversion hammer (ambiguity intentional :) ). Which is not at all
necessary. Maybe you don't want to do this as it's a lot of work to
start with, but the current way you do things more than suboptimal.
But let me explain. And since examples are always good, I picked one at
random from the iwl4965 driver.
Consider the phy_flags field in the iwl4965_rx_phy_res structure which
is defined as follows:
__le16 phy_flags; /* general phy flags: band, modulation, ... */
along with these definitions:
#define RX_RES_PHY_FLAGS_BAND_24_MSK __constant_cpu_to_le16(1 << 0)
#define RX_RES_PHY_FLAGS_MOD_CCK_MSK __constant_cpu_to_le16(1 << 1)
#define RX_RES_PHY_FLAGS_SHORT_PREAMBLE_MSK __constant_cpu_to_le16(1 << 2)
#define RX_RES_PHY_FLAGS_NARROW_BAND_MSK __constant_cpu_to_le16(1 << 3)
#define RX_RES_PHY_FLAGS_ANTENNA_MSK __constant_cpu_to_le16(0xf0) [1]
This is very complex. Why? Because you continually have to use __le16
for any phy flags field. To extract the antenna, here's what you need to
do:
__le16 phy_flags_hw = phy_res->phy_flags;
[...]
antenna = le16_to_cpu(phy_flags_hw & ANTENNA_MSK) >> 4;
Also, on a big-endian architecture that expands to this beast:
rt_antenna =
(__builtin_constant_p((__u16)(( __u16)(__le16)(phy_flags_hw &
(( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
(((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) ?
((__u16)( (((__u16)((( __u16)(__le16)(phy_flags_hw &
(( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
(((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) & (__u16)0x00ffU) << 8)
| (((__u16)((( __u16)(__le16)(phy_flags_hw &
(( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
(((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) & (__u16)0xff00U) >>
8) )) : __fswab16((( __u16)(__le16)(phy_flags_hw &
(( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
(((__u16)((0xf0)) & (__u16)0xff00U) >> 8) ))))))) >> 4;
It also results in a compiler warning:
[...] warning: integer overflow in expression
although I have to admit that right now I do not know where in that
beast the compiler thinks it got an overflow. Twice, in fact. In any
case, it's pretty hard on the compiler.
Additionally, doing it this way means that programmers continually need
to think about endianness *all over the code*. Literally *everywhere*
that touches values coming from hardware/going to hardware, which is
pretty much everywhere in a driver.
Now, here's what I want to teach you to do instead.
Keep the phy_flags field defined as it was:
__le16 phy_flags; /* general phy flags: band, modulation, ... */
but do it like everybody else and define the values as they are in the
phy_flags field without thinking about endianness at all:
#define RX_RES_PHY_FLAGS_BAND_24_MSK (1 << 0)
#define RX_RES_PHY_FLAGS_MOD_CCK_MSK (1 << 1)
#define RX_RES_PHY_FLAGS_SHORT_PREAMBLE_MSK (1 << 2)
#define RX_RES_PHY_FLAGS_NARROW_BAND_MSK (1 << 3)
#define RX_RES_PHY_FLAGS_ANTENNA_MSK 0xf0
This is easier for the person writing the definitions and looks much
cleaner to boot.
Then, when it comes to using a value, simply do:
u16 phy_flags = le16_to_cpu(phy_res->phy_flags);
(and if you get it wrong, sparse will warn here.)
Then, you can do the easy:
antenna = (phy_flags & ANTENNA_MASK) >> 4;
without having to think about endianness. And if you use the field again
and again you never have any conversion functions.
Presto. As long as you think in terms of
this is the phy_flags field that contains X, Y and Z at
positions x, y, z
rather than
this is 16 bits that are laid out in such and such way
(which I think everybody except hardware designers does), you win.
As a bonus, your code is much easier to read and much smaller (C/header
file size I mean). Also, it's a lot more efficient for fields that
contain multiple fields like the antenna field (with more than one bit)
because you do the conversion only once.
Think of it this way:
Your current style of taking care of endianness requires thinking about
it everywhere, is thus prone to errors and looks ugly.
The way I'm suggesting is to convert all data to native (CPU) endianness
_as it enters the system_, i.e. at the driver/hardware boundary and then
think in terms of the "field" after that, never bothering to think about
endianness again. In fact, you shouldn't ever need to use cpu_to_le*()
in the RX path, only in any control/TX paths where you need to push a
value down to the hardware.
The iwlwifi code is a big mess this way. I'm willing to help with the
conversion, it should be a pretty mechanical removal of many many
cpu_to_le16/32 calls and some fixups and will definitely make the code
much easier to maintain. In fact, it would probably have been easier if
you'd written the code without respect for endianness and then we'd
simply annotated all structures that are shared with the hardware and
fixed the sparse warnings at the system boundaries.
I hope this helps. Let me know if you have any questions.
I have one for you: Whatever gave you the idea of doing it this way?
Isn't it common sense that converting data at system boundaries is much
easier and less prone to errors than trying to pull it through the whole
second system in a non-native format? Whether it's endianness, character
set conversion, ...
johannes
[1] By the way, using __constant_cpu_to_le16() is not useful because
cpu_to_le16() already checks whether the argument is a constant. You
should only use it if absolutely necessary for some reason. That's why
it has two underscores.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: coding style lesson: iwlwifi vs. endianness
2007-11-27 18:44 coding style lesson: iwlwifi vs. endianness Johannes Berg
@ 2007-11-28 18:20 ` Tomas Winkler
2007-11-28 18:50 ` Johannes Berg
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Tomas Winkler @ 2007-11-28 18:20 UTC (permalink / raw)
To: Johannes Berg; +Cc: linux-wireless, Zhu Yi, Reinette Chatre
If you would compose your email in less arrogant tone I would answer
you why your assumptions are wrong.
I know it is tempting to teach BIG Intel, but please try to keep good
spirit on this mailing list as it was so far.
Thanks
Tomas.
On Nov 27, 2007 8:44 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> Today's lesson is brought to you by the motivation to teach you how to
> write simpler code since apparently the only way you found so far to get
> drivers working on big endian platforms is to use a big endian
> conversion hammer (ambiguity intentional :) ). Which is not at all
> necessary. Maybe you don't want to do this as it's a lot of work to
> start with, but the current way you do things more than suboptimal.
>
> But let me explain. And since examples are always good, I picked one at
> random from the iwl4965 driver.
>
> Consider the phy_flags field in the iwl4965_rx_phy_res structure which
> is defined as follows:
> __le16 phy_flags; /* general phy flags: band, modulation, ... */
>
> along with these definitions:
>
> #define RX_RES_PHY_FLAGS_BAND_24_MSK __constant_cpu_to_le16(1 << 0)
> #define RX_RES_PHY_FLAGS_MOD_CCK_MSK __constant_cpu_to_le16(1 << 1)
> #define RX_RES_PHY_FLAGS_SHORT_PREAMBLE_MSK __constant_cpu_to_le16(1 << 2)
> #define RX_RES_PHY_FLAGS_NARROW_BAND_MSK __constant_cpu_to_le16(1 << 3)
> #define RX_RES_PHY_FLAGS_ANTENNA_MSK __constant_cpu_to_le16(0xf0) [1]
>
> This is very complex. Why? Because you continually have to use __le16
> for any phy flags field. To extract the antenna, here's what you need to
> do:
>
> __le16 phy_flags_hw = phy_res->phy_flags;
> [...]
>
> antenna = le16_to_cpu(phy_flags_hw & ANTENNA_MSK) >> 4;
>
> Also, on a big-endian architecture that expands to this beast:
> rt_antenna =
> (__builtin_constant_p((__u16)(( __u16)(__le16)(phy_flags_hw &
> (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) ?
> ((__u16)( (((__u16)((( __u16)(__le16)(phy_flags_hw &
> (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) & (__u16)0x00ffU) << 8)
> | (((__u16)((( __u16)(__le16)(phy_flags_hw &
> (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) & (__u16)0xff00U) >>
> 8) )) : __fswab16((( __u16)(__le16)(phy_flags_hw &
> (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) ))))))) >> 4;
>
> It also results in a compiler warning:
> [...] warning: integer overflow in expression
>
> although I have to admit that right now I do not know where in that
> beast the compiler thinks it got an overflow. Twice, in fact. In any
> case, it's pretty hard on the compiler.
>
> Additionally, doing it this way means that programmers continually need
> to think about endianness *all over the code*. Literally *everywhere*
> that touches values coming from hardware/going to hardware, which is
> pretty much everywhere in a driver.
>
> Now, here's what I want to teach you to do instead.
>
> Keep the phy_flags field defined as it was:
> __le16 phy_flags; /* general phy flags: band, modulation, ... */
>
> but do it like everybody else and define the values as they are in the
> phy_flags field without thinking about endianness at all:
>
> #define RX_RES_PHY_FLAGS_BAND_24_MSK (1 << 0)
> #define RX_RES_PHY_FLAGS_MOD_CCK_MSK (1 << 1)
> #define RX_RES_PHY_FLAGS_SHORT_PREAMBLE_MSK (1 << 2)
> #define RX_RES_PHY_FLAGS_NARROW_BAND_MSK (1 << 3)
> #define RX_RES_PHY_FLAGS_ANTENNA_MSK 0xf0
>
> This is easier for the person writing the definitions and looks much
> cleaner to boot.
>
> Then, when it comes to using a value, simply do:
>
> u16 phy_flags = le16_to_cpu(phy_res->phy_flags);
>
> (and if you get it wrong, sparse will warn here.)
>
> Then, you can do the easy:
> antenna = (phy_flags & ANTENNA_MASK) >> 4;
>
> without having to think about endianness. And if you use the field again
> and again you never have any conversion functions.
>
> Presto. As long as you think in terms of
> this is the phy_flags field that contains X, Y and Z at
> positions x, y, z
>
> rather than
>
> this is 16 bits that are laid out in such and such way
>
> (which I think everybody except hardware designers does), you win.
>
> As a bonus, your code is much easier to read and much smaller (C/header
> file size I mean). Also, it's a lot more efficient for fields that
> contain multiple fields like the antenna field (with more than one bit)
> because you do the conversion only once.
>
> Think of it this way:
>
> Your current style of taking care of endianness requires thinking about
> it everywhere, is thus prone to errors and looks ugly.
>
> The way I'm suggesting is to convert all data to native (CPU) endianness
> _as it enters the system_, i.e. at the driver/hardware boundary and then
> think in terms of the "field" after that, never bothering to think about
> endianness again. In fact, you shouldn't ever need to use cpu_to_le*()
> in the RX path, only in any control/TX paths where you need to push a
> value down to the hardware.
>
> The iwlwifi code is a big mess this way. I'm willing to help with the
> conversion, it should be a pretty mechanical removal of many many
> cpu_to_le16/32 calls and some fixups and will definitely make the code
> much easier to maintain. In fact, it would probably have been easier if
> you'd written the code without respect for endianness and then we'd
> simply annotated all structures that are shared with the hardware and
> fixed the sparse warnings at the system boundaries.
>
> I hope this helps. Let me know if you have any questions.
>
> I have one for you: Whatever gave you the idea of doing it this way?
> Isn't it common sense that converting data at system boundaries is much
> easier and less prone to errors than trying to pull it through the whole
> second system in a non-native format? Whether it's endianness, character
> set conversion, ...
>
> johannes
>
> [1] By the way, using __constant_cpu_to_le16() is not useful because
> cpu_to_le16() already checks whether the argument is a constant. You
> should only use it if absolutely necessary for some reason. That's why
> it has two underscores.
>
^ permalink raw reply [flat|nested] 19+ messages in thread* Re: coding style lesson: iwlwifi vs. endianness
2007-11-28 18:20 ` Tomas Winkler
@ 2007-11-28 18:50 ` Johannes Berg
2007-11-28 21:43 ` Dan Williams
2007-11-29 9:03 ` Holger Schurig
2 siblings, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2007-11-28 18:50 UTC (permalink / raw)
To: Tomas Winkler; +Cc: linux-wireless, Zhu Yi, Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 440 bytes --]
Apologies. It wasn't my intention to piss anyone off. I did write it in
a rather pissed-off mood though because I was reading iwlwifi code and
it keeps telling me how mac80211 sucks (we work around this here, we
work around this there, ...). It'd be nice if the people writing the
driver could post their comments on the mailing list rather than writing
crappy workarounds for the sake of pushing code out to us.
Thanks,
Johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-11-28 18:20 ` Tomas Winkler
2007-11-28 18:50 ` Johannes Berg
@ 2007-11-28 21:43 ` Dan Williams
2007-11-29 0:58 ` John W. Linville
2007-11-29 9:03 ` Holger Schurig
2 siblings, 1 reply; 19+ messages in thread
From: Dan Williams @ 2007-11-28 21:43 UTC (permalink / raw)
To: Tomas Winkler; +Cc: Johannes Berg, linux-wireless, Zhu Yi, Reinette Chatre
On Wed, 2007-11-28 at 20:20 +0200, Tomas Winkler wrote:
> If you would compose your email in less arrogant tone I would answer
> you why your assumptions are wrong.
> I know it is tempting to teach BIG Intel, but please try to keep good
> spirit on this mailing list as it was so far.
> Thanks
> Tomas.
Ignoring the email tones and focusing on the problem, could you
elaborate your reasons? Doing endian conversions at the boundaries is
quite a bit simpler and does lead to cleaner, more readable code. The
bulk of the work being done with a softmac card is in the driver +
stack, so the clearer those are, the better for everyone.
Dan
> On Nov 27, 2007 8:44 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
> > Today's lesson is brought to you by the motivation to teach you how to
> > write simpler code since apparently the only way you found so far to get
> > drivers working on big endian platforms is to use a big endian
> > conversion hammer (ambiguity intentional :) ). Which is not at all
> > necessary. Maybe you don't want to do this as it's a lot of work to
> > start with, but the current way you do things more than suboptimal.
> >
> > But let me explain. And since examples are always good, I picked one at
> > random from the iwl4965 driver.
> >
> > Consider the phy_flags field in the iwl4965_rx_phy_res structure which
> > is defined as follows:
> > __le16 phy_flags; /* general phy flags: band, modulation, ... */
> >
> > along with these definitions:
> >
> > #define RX_RES_PHY_FLAGS_BAND_24_MSK __constant_cpu_to_le16(1 << 0)
> > #define RX_RES_PHY_FLAGS_MOD_CCK_MSK __constant_cpu_to_le16(1 << 1)
> > #define RX_RES_PHY_FLAGS_SHORT_PREAMBLE_MSK __constant_cpu_to_le16(1 << 2)
> > #define RX_RES_PHY_FLAGS_NARROW_BAND_MSK __constant_cpu_to_le16(1 << 3)
> > #define RX_RES_PHY_FLAGS_ANTENNA_MSK __constant_cpu_to_le16(0xf0) [1]
> >
> > This is very complex. Why? Because you continually have to use __le16
> > for any phy flags field. To extract the antenna, here's what you need to
> > do:
> >
> > __le16 phy_flags_hw = phy_res->phy_flags;
> > [...]
> >
> > antenna = le16_to_cpu(phy_flags_hw & ANTENNA_MSK) >> 4;
> >
> > Also, on a big-endian architecture that expands to this beast:
> > rt_antenna =
> > (__builtin_constant_p((__u16)(( __u16)(__le16)(phy_flags_hw &
> > (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> > (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) ?
> > ((__u16)( (((__u16)((( __u16)(__le16)(phy_flags_hw &
> > (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> > (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) & (__u16)0x00ffU) << 8)
> > | (((__u16)((( __u16)(__le16)(phy_flags_hw &
> > (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> > (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) )))))) & (__u16)0xff00U) >>
> > 8) )) : __fswab16((( __u16)(__le16)(phy_flags_hw &
> > (( __le16)((__u16)( (((__u16)((0xf0)) & (__u16)0x00ffU) << 8) |
> > (((__u16)((0xf0)) & (__u16)0xff00U) >> 8) ))))))) >> 4;
> >
> > It also results in a compiler warning:
> > [...] warning: integer overflow in expression
> >
> > although I have to admit that right now I do not know where in that
> > beast the compiler thinks it got an overflow. Twice, in fact. In any
> > case, it's pretty hard on the compiler.
> >
> > Additionally, doing it this way means that programmers continually need
> > to think about endianness *all over the code*. Literally *everywhere*
> > that touches values coming from hardware/going to hardware, which is
> > pretty much everywhere in a driver.
> >
> > Now, here's what I want to teach you to do instead.
> >
> > Keep the phy_flags field defined as it was:
> > __le16 phy_flags; /* general phy flags: band, modulation, ... */
> >
> > but do it like everybody else and define the values as they are in the
> > phy_flags field without thinking about endianness at all:
> >
> > #define RX_RES_PHY_FLAGS_BAND_24_MSK (1 << 0)
> > #define RX_RES_PHY_FLAGS_MOD_CCK_MSK (1 << 1)
> > #define RX_RES_PHY_FLAGS_SHORT_PREAMBLE_MSK (1 << 2)
> > #define RX_RES_PHY_FLAGS_NARROW_BAND_MSK (1 << 3)
> > #define RX_RES_PHY_FLAGS_ANTENNA_MSK 0xf0
> >
> > This is easier for the person writing the definitions and looks much
> > cleaner to boot.
> >
> > Then, when it comes to using a value, simply do:
> >
> > u16 phy_flags = le16_to_cpu(phy_res->phy_flags);
> >
> > (and if you get it wrong, sparse will warn here.)
> >
> > Then, you can do the easy:
> > antenna = (phy_flags & ANTENNA_MASK) >> 4;
> >
> > without having to think about endianness. And if you use the field again
> > and again you never have any conversion functions.
> >
> > Presto. As long as you think in terms of
> > this is the phy_flags field that contains X, Y and Z at
> > positions x, y, z
> >
> > rather than
> >
> > this is 16 bits that are laid out in such and such way
> >
> > (which I think everybody except hardware designers does), you win.
> >
> > As a bonus, your code is much easier to read and much smaller (C/header
> > file size I mean). Also, it's a lot more efficient for fields that
> > contain multiple fields like the antenna field (with more than one bit)
> > because you do the conversion only once.
> >
> > Think of it this way:
> >
> > Your current style of taking care of endianness requires thinking about
> > it everywhere, is thus prone to errors and looks ugly.
> >
> > The way I'm suggesting is to convert all data to native (CPU) endianness
> > _as it enters the system_, i.e. at the driver/hardware boundary and then
> > think in terms of the "field" after that, never bothering to think about
> > endianness again. In fact, you shouldn't ever need to use cpu_to_le*()
> > in the RX path, only in any control/TX paths where you need to push a
> > value down to the hardware.
> >
> > The iwlwifi code is a big mess this way. I'm willing to help with the
> > conversion, it should be a pretty mechanical removal of many many
> > cpu_to_le16/32 calls and some fixups and will definitely make the code
> > much easier to maintain. In fact, it would probably have been easier if
> > you'd written the code without respect for endianness and then we'd
> > simply annotated all structures that are shared with the hardware and
> > fixed the sparse warnings at the system boundaries.
> >
> > I hope this helps. Let me know if you have any questions.
> >
> > I have one for you: Whatever gave you the idea of doing it this way?
> > Isn't it common sense that converting data at system boundaries is much
> > easier and less prone to errors than trying to pull it through the whole
> > second system in a non-native format? Whether it's endianness, character
> > set conversion, ...
> >
> > johannes
> >
> > [1] By the way, using __constant_cpu_to_le16() is not useful because
> > cpu_to_le16() already checks whether the argument is a constant. You
> > should only use it if absolutely necessary for some reason. That's why
> > it has two underscores.
> >
> -
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-11-28 21:43 ` Dan Williams
@ 2007-11-29 0:58 ` John W. Linville
2007-11-29 23:02 ` Tomas Winkler
0 siblings, 1 reply; 19+ messages in thread
From: John W. Linville @ 2007-11-29 0:58 UTC (permalink / raw)
To: Dan Williams
Cc: Tomas Winkler, Johannes Berg, linux-wireless, Zhu Yi,
Reinette Chatre
On Wed, Nov 28, 2007 at 04:43:12PM -0500, Dan Williams wrote:
> On Wed, 2007-11-28 at 20:20 +0200, Tomas Winkler wrote:
> > If you would compose your email in less arrogant tone I would answer
> > you why your assumptions are wrong.
> > I know it is tempting to teach BIG Intel, but please try to keep good
> > spirit on this mailing list as it was so far.
> > Thanks
> > Tomas.
>
> Ignoring the email tones and focusing on the problem, could you
> elaborate your reasons? Doing endian conversions at the boundaries is
> quite a bit simpler and does lead to cleaner, more readable code. The
> bulk of the work being done with a softmac card is in the driver +
> stack, so the clearer those are, the better for everyone.
ACK
While I can understand why you might think that Johannes was being
smug, I doubt if he really meant to be. Please try to presume the
best intent. :-)
As Dan points-out, Johannes main point (i.e. convert data at system
boundaries) makes plenty of sense. Please do take the time to tell
us what he is missing?
Thanks,
John
--
John W. Linville
linville@tuxdriver.com
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-11-29 0:58 ` John W. Linville
@ 2007-11-29 23:02 ` Tomas Winkler
2007-12-10 11:42 ` Johannes Berg
0 siblings, 1 reply; 19+ messages in thread
From: Tomas Winkler @ 2007-11-29 23:02 UTC (permalink / raw)
To: John W. Linville
Cc: Dan Williams, Johannes Berg, linux-wireless, Zhu Yi,
Reinette Chatre
> ACK
>
> While I can understand why you might think that Johannes was being
> smug, I doubt if he really meant to be. Please try to presume the
> best intent. :-)
>
Here I'm speaking for myself here not for Intel.
He is a smug, he has 'everything is a crap' attitude yet that's good
for revolution and I honestly appreciate Johannes' work on mac80211.
Things are finally moving somewhere. Thanks Johannes.
> As Dan points-out, Johannes main point (i.e. convert data at system
> boundaries) makes plenty of sense. Please do take the time to tell
> us what he is missing?
>
Since all of also hold a CS degree the first thing that come to us is
to create a clean cut, to translate the data from native order to
little endian just before submitting to HW and same on the opposite
direction.
Iwlwifi driver talks to firmware using so called host command
structures defined naturally in little endian. Statistically most of
the operations are setting and testing bits which is endianity
agnostic meaning no run time swap operations are needed.
The few computation including the shifting one visible in Johannes
example are the trade off of run time efficiency and 'ugliness'. It
was better bargain for us. Iwlwifi keeps that in host command
structures through the driver operation so we don't need to keep
shadow native layered structures and there is quite a few of them.
Also number swap operation is much lower then the other approach.
We rather think of all constants for host commands as liitle endian.
As we stick consistently to this paradigma it becomes natural.
I'm not sure if this was a good decision for a long run as we are
adding more hardware and we are need more abstraction, at that point
it looked correct.
As Stephan is pressing for splitting the driver. If I look from his
point it seams like a right way but as I see it now from my point he
just made us few weeks of work more. And don't take me wrong I also
share his opinion that those ifdefs were evil.
So I'm not saying that Johannes is wrong in general he just came with
different assumptions.
Thanks for your attentions and I'm waiting for your flames, just
remember I don't answers to insults :)
Thanks
Tomas..
> Thanks,
>
> John
> --
> John W. Linville
> linville@tuxdriver.com
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-11-29 23:02 ` Tomas Winkler
@ 2007-12-10 11:42 ` Johannes Berg
2007-12-10 14:18 ` Tomas Winkler
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2007-12-10 11:42 UTC (permalink / raw)
To: Tomas Winkler
Cc: John W. Linville, Dan Williams, linux-wireless, Zhu Yi,
Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 5240 bytes --]
Hi,
Apologies for the late reply, I was rather busy and wanted to
investigate your claims a bit more in detail.
> Iwlwifi driver talks to firmware using so called host command
> structures defined naturally in little endian.
Yes, I know that. I actually read the code before saying it was ugly.
> Statistically most of
> the operations are setting and testing bits which is endianity
> agnostic meaning no run time swap operations are needed.
It's interesting that you say this, single bits don't need any
endianness conversion at all if you simply define the container types as
u8. Of course, that isn't always how the hardware is documented and not
always the right thing to do, but it is possible.
However, with much of the code it it would actually make sense: For
example, RXON_FLG_* go from bit 0 to bit 7 (a single u8) and then comes
the antenna selection mask. "HT flags" are called "flags" but really are
two- and three-bit values mashed into a single field so the "flags"
argument doesn't apply. Filter flags go from 0-6. Similar for QoS flags.
Key "flags" really aren't completely flags, contain a three-bit field
for the encryption type. etc etc.
> The few computation including the shifting one visible in Johannes
> example are the trade off of run time efficiency and 'ugliness'. It
> was better bargain for us. Iwlwifi keeps that in host command
> structures through the driver operation so we don't need to keep
> shadow native layered structures and there is quite a few of them.
I never said you should keep native layout structures. I merely said
that it leads to better code if the data is converted to CPU endianness
at the system boundary. While the actual system boundary is obviously
the PCI(-E) bus, the logical system boundary may well be at the point
where you load a certain value from the DMA memory into the CPU to
operate with it (whether it's then pushed back out to memory on the
stack is uninteresting). Yes, doing that can lead to larger C files, but
the compile result won't suffer; given a sufficiently smart compiler
it'll even be *completely* equivalent on little-endian platforms
(because nothing is declared volatile so the extra variables containing
the CPU byte order fields can be completely removed by the compiler by
going back to the memory the data came from if not enough registers are
available).
> Also number swap operation is much lower then the other approach.
As I will show, this is not true. At least not if you go away from
"shadow" structures.
> We rather think of all constants for host commands as liitle endian.
> As we stick consistently to this paradigma it becomes natural.
> I'm not sure if this was a good decision for a long run as we are
> adding more hardware and we are need more abstraction, at that point
> it looked correct.
Maybe it's "natural" to Intel people to think in little endian, but I'd
guess most other people don't want to be bothered thinking about
endianness in every single line of code.
Here's my argument for why this is inefficient not only in terms of
comprehensibility but also in terms of runtime penalty.
Let's consider STA flags. I don't know whether they're written or read
more but it hardly matters.
The field is defined as follows:
| bits | 0-7 | 8 | 9-16 | 19-20 | 21 | 22 | 23-25 | ...
| means| - | pwrsave | - | aggsz | fat | mimodisable | mpdu density | -
In order to set all information in this field, you need code like this:
--- >% ---
__le32 staflags = 0;
if (sta power save)
staflags |= STA_FLG_PWR_SAVE_MSK;
staflags |= cpu_to_le32(ampdu_factor << STA_FLG_MAX_AGG_SIZE_POS);
staflags |= cpu_to_le32(ampdu_density << STA_FLG_AGG_MPDU_DENSITY_POS);
[...]
memstruct->sta_flags = staflags;
--- %< ---
(similar code is present in iwl4965_set_ht_add_station)
Let's compare to my approach (assuming the constants are appropriately
redefined):
--- >% ---
u32 staflags = 0;
if (sta power save)
staflags |= STA_FLG_PWR_SAVE_MSK;
staflags |= ampdu_factor << STA_FLG_MAX_AGG_SIZE_POS;
staflags |= ampdu_density << STA_FLG_AGG_MPDU_DENSITY_POS;
[...]
memstruct->sta_flags = cpu_to_le32(staflags);
--- %< ---
Notice how it is
(a) easier to understand (especially when you look at the constants)
and
(b) more (run-time) efficient?
Yes, you can make a case for "flags are nicer this way", but in fact
even when you have flags it doesn't matter much at all because the flags
value is loaded from memory which takes much longer than byte-swapping
it anyway. [1]
The above code becomes (in my opinion) even harder to understand when
you read out these fields, because it requires doing something like
this:
ampdu_factor = le32_to_cpu(staflags) >> STA_FLG_MAX_AGG_SIZE_POS;
[etc]
which is similar to the example I quoted in my first mail and absolutely
not intuitive.
johannes
[1] And you could even use the little-endian load macros (like
le32_to_cpup, beware of alignment though) that powerpc for example
optimises into a single "lwbrx" statement, sparc64 does a similar thing.
For little endian processors it doesn't make a difference at all.
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 11:42 ` Johannes Berg
@ 2007-12-10 14:18 ` Tomas Winkler
2007-12-10 15:18 ` Johannes Berg
0 siblings, 1 reply; 19+ messages in thread
From: Tomas Winkler @ 2007-12-10 14:18 UTC (permalink / raw)
To: Johannes Berg
Cc: John W. Linville, Dan Williams, linux-wireless, Zhu Yi,
Reinette Chatre
> It's interesting that you say this, single bits don't need any
> endianness conversion at all if you simply define the container types as
> u8. Of course, that isn't always how the hardware is documented and not
> always the right thing to do, but it is possible.
>
You are right it will be much more damange for us not to be not
aligned with the HW/Firmware documentation. It is still pretty much
alive. It's hard to talk to HW people if you don't use the same
language. Sometimes the bit can move across the u8 boundaries and it
becomes nasty.
> However, with much of the code it it would actually make sense:
We may consider this as it get more stable.
> I never said you should keep native layout structures. I merely said
> that it leads to better code if the data is converted to CPU endianness
> at the system boundary. While the actual system boundary is obviously
> the PCI(-E) bus, the logical system boundary may well be at the point
> where you load a certain value from the DMA memory into the CPU to
> operate with it (whether it's then pushed back out to memory on the
> stack is uninteresting).
I'm not buying this logical boundary definition if you don't do clean
cut you still have to be endian aware across the whole code so what's
a difference.
Yes, doing that can lead to larger C files, but
> the compile result won't suffer; given a sufficiently smart compiler
> it'll even be *completely* equivalent on little-endian platforms
> (because nothing is declared volatile so the extra variables containing
> the CPU byte order fields can be completely removed by the compiler by
> going back to the memory the data came from if not enough registers are
> available).
b = le_to_cpu(a)
b != FLAG
a = cpu_to_le(b)
is probably longer then
a |= FL:AG_LE;
n matter what.
I didn't follow latest compiler optimizations but that way I don't
relay on them. I will try to look at the assembly what's difference..
> > Also number swap operation is much lower then the other approach.
>
> As I will show, this is not true. At least not if you go away from
> "shadow" structures.
>
I appreciate your effort but for sake of your time don't bother to do
any dramatic changes we are adding new HW and it this course will
clean up the code, naturaly mostly in host commands
> > We rather think of all constants for host commands as liitle endian.
> > As we stick consistently to this paradigma it becomes natural.
> > I'm not sure if this was a good decision for a long run as we are
> > adding more hardware and we are need more abstraction, at that point
> > it looked correct.
>
> Maybe it's "natural" to Intel people to think in little endian, but I'd
> guess most other people don't want to be bothered thinking about
> endianness in every single line of code
I still argue that you will have more swap operation in the code
meaning that you also will have to be aware of endianity at least with
your definition of logical boundaries.
> Here's my argument for why this is inefficient not only in terms of
> comprehensibility but also in terms of runtime penalty.
>
> Let's consider STA flags. I don't know whether they're written or read
> more but it hardly matters.
>
> The field is defined as follows:
> | bits | 0-7 | 8 | 9-16 | 19-20 | 21 | 22 | 23-25 | ...
> | means| - | pwrsave | - | aggsz | fat | mimodisable | mpdu density | -
>
> In order to set all information in this field, you need code like this:
>
> --- >% ---
> __le32 staflags = 0;
>
> if (sta power save)
> staflags |= STA_FLG_PWR_SAVE_MSK;
>
> staflags |= cpu_to_le32(ampdu_factor << STA_FLG_MAX_AGG_SIZE_POS);
> staflags |= cpu_to_le32(ampdu_density << STA_FLG_AGG_MPDU_DENSITY_POS);
>
> [...]
>
> memstruct->sta_flags = staflags;
> --- %< ---
>
> (similar code is present in iwl4965_set_ht_add_station)
>
This I will fix. This just another of exemption of our approach.
> [1] And you could even use the little-endian load macros (like
> le32_to_cpup, beware of alignment though) that powerpc for example
> optimises into a single "lwbrx" statement, sparc64 does a similar thing.
> For little endian processors it doesn't make a difference at all.
Hmm I wasn't aware. I will look at it.
Thanks
Tomas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 14:18 ` Tomas Winkler
@ 2007-12-10 15:18 ` Johannes Berg
2007-12-10 15:30 ` Michael Buesch
2007-12-10 16:10 ` Tomas Winkler
0 siblings, 2 replies; 19+ messages in thread
From: Johannes Berg @ 2007-12-10 15:18 UTC (permalink / raw)
To: Tomas Winkler
Cc: John W. Linville, Dan Williams, linux-wireless, Zhu Yi,
Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 2797 bytes --]
> > I never said you should keep native layout structures. I merely said
> > that it leads to better code if the data is converted to CPU endianness
> > at the system boundary. While the actual system boundary is obviously
> > the PCI(-E) bus, the logical system boundary may well be at the point
> > where you load a certain value from the DMA memory into the CPU to
> > operate with it (whether it's then pushed back out to memory on the
> > stack is uninteresting).
>
> I'm not buying this logical boundary definition if you don't do clean
> cut you still have to be endian aware across the whole code so what's
> a difference.
No, you only have to be endian aware when you do a load from a structure
you *know* was/is DMAed from/to the card. That's a world of a
difference.
> Yes, doing that can lead to larger C files, but
> > the compile result won't suffer; given a sufficiently smart compiler
> > it'll even be *completely* equivalent on little-endian platforms
> > (because nothing is declared volatile so the extra variables containing
> > the CPU byte order fields can be completely removed by the compiler by
> > going back to the memory the data came from if not enough registers are
> > available).
>
> b = le_to_cpu(a)
> b != FLAG
> a = cpu_to_le(b)
> is probably longer then
> a |= FL:AG_LE;
> n matter what.
Obviously, with a flag this is true. But if you see my example below
then you'll realise that most of the time you don't just do a single
flag in a but multiple multi-bit values.
> > Maybe it's "natural" to Intel people to think in little endian, but I'd
> > guess most other people don't want to be bothered thinking about
> > endianness in every single line of code
>
> I still argue that you will have more swap operation in the code
> meaning that you also will have to be aware of endianity at least with
> your definition of logical boundaries.
Not considering single bits, you will have fewer swap operations as I've
shown in my example code. And you're only hiding the fact that you have
many more logical swap operations, most of which are of course constant,
when you do consider single bits.
> This I will fix. This just another of exemption of our approach.
There's nothing you can fix here. It's not an exemption of your
approach, it's a logical consequence of your approach. Making exemptions
for things like this in your approach will imho most likely make the
code even worse. But I'm interested to see how you want to "fix" this
without going to CPU-defined constants instead of LE-defined constants.
And once you start *mixing* the two approaches things will *really* be
confusing, so you really do need to write it the way I wrote (almost
quoted but simplified).
johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 15:18 ` Johannes Berg
@ 2007-12-10 15:30 ` Michael Buesch
2007-12-10 16:18 ` Johannes Berg
2007-12-10 16:21 ` Tomas Winkler
2007-12-10 16:10 ` Tomas Winkler
1 sibling, 2 replies; 19+ messages in thread
From: Michael Buesch @ 2007-12-10 15:30 UTC (permalink / raw)
To: Johannes Berg
Cc: Tomas Winkler, John W. Linville, Dan Williams, linux-wireless,
Zhu Yi, Reinette Chatre
On Monday 10 December 2007 16:18:57 Johannes Berg wrote:
>
> > > I never said you should keep native layout structures. I merely said
> > > that it leads to better code if the data is converted to CPU endianness
> > > at the system boundary. While the actual system boundary is obviously
> > > the PCI(-E) bus, the logical system boundary may well be at the point
> > > where you load a certain value from the DMA memory into the CPU to
> > > operate with it (whether it's then pushed back out to memory on the
> > > stack is uninteresting).
> >
> > I'm not buying this logical boundary definition if you don't do clean
> > cut you still have to be endian aware across the whole code so what's
> > a difference.
>
> No, you only have to be endian aware when you do a load from a structure
> you *know* was/is DMAed from/to the card. That's a world of a
> difference.
I agree that it's a lot easier to read and maintain, if you convert _all_
your data structures (also single integers) to and from device endianness
at the time when you write/read them to the device. It unifies the
workflow a lot and results in much less swappings.
my_data = read_drom_device()
swap_to_cpu(my_data)
do_whatever(my_data)
store the data somewhere else in the device structs for
later use, maybe.
do_something_else(my_data)
You see that now you have only _one_ place that you have to care about.
And if you have to write your data back at some point, simply do it
just before the write.
And, little endian intel guys, this will simplify your life a lot by
removing lots of cpu_to_xxx stuff _and_ result
in no performance loss, as the swap will be optimized away. :)
That's a deal, eh?
And we big endian people, we don't care about one or two swap instructions.
It won't hurt performance in any way.
(Johannes also showed that in most cases this approach doesn't even
add extra instructions).
--
Greetings Michael.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 15:30 ` Michael Buesch
@ 2007-12-10 16:18 ` Johannes Berg
2007-12-10 16:48 ` Michael Buesch
2007-12-10 16:21 ` Tomas Winkler
1 sibling, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2007-12-10 16:18 UTC (permalink / raw)
To: Michael Buesch
Cc: Tomas Winkler, John W. Linville, Dan Williams, linux-wireless,
Zhu Yi, Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 1600 bytes --]
> my_data = read_drom_device()
> swap_to_cpu(my_data)
> do_whatever(my_data)
> store the data somewhere else in the device structs for
> later use, maybe.
> do_something_else(my_data)
>
> You see that now you have only _one_ place that you have to care about.
> And if you have to write your data back at some point, simply do it
> just before the write.
The thing is that Tomas is saying that because they don't have a
function to "read_from_device()" but that is rather only
"shared_structure->something" it is special and completely different
than regular drivers.
The absolute worst thing imo is in the current radiotap code (before my
patch) where it passes around a value as "u16" but then needs to convert
it to __le16 before accessing bits in it because things are defined as
LE.
Also, let me reiterate the argument that this is not using more
code/run-time-memory size:
Consider
u16 phyflags = le16_to_cpu(shared_struct->phy_flags);
[...]
if (phyflags & ...)
[...]
vs.
[...]
if (shared_struct->phy_flags & ...)
[...]
If you have a LE machine, then the compiler should in both cases emit
the same code. If you have BE machine, you have a *single* "superfluous"
byteswap, and if there are multi-bit fields you have far *fewer*
byteswaps. If you instead wrote
u16 phyflags = le16_to_cpup(&shared_struct->phy_flags);
then most likely the byteswap would be microcoded/done by hardware
during the load.
The actual runtime penalty of byteswaps is insignificant since fetching
the value from memory already takes forever.
johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 16:18 ` Johannes Berg
@ 2007-12-10 16:48 ` Michael Buesch
0 siblings, 0 replies; 19+ messages in thread
From: Michael Buesch @ 2007-12-10 16:48 UTC (permalink / raw)
To: Johannes Berg
Cc: Tomas Winkler, John W. Linville, Dan Williams, linux-wireless,
Zhu Yi, Reinette Chatre
On Monday 10 December 2007 17:18:43 Johannes Berg wrote:
>
> > my_data = read_drom_device()
> > swap_to_cpu(my_data)
> > do_whatever(my_data)
> > store the data somewhere else in the device structs for
> > later use, maybe.
> > do_something_else(my_data)
> >
> > You see that now you have only _one_ place that you have to care about.
> > And if you have to write your data back at some point, simply do it
> > just before the write.
>
> The thing is that Tomas is saying that because they don't have a
> function to "read_from_device()" but that is rather only
> "shared_structure->something" it is special and completely different
> than regular drivers.
The read_from_device wasn't supposed to be a function call, actually.
It was more a "get data from the device, somehow". That could be reading
a structure that we just received via DMA.
> The actual runtime penalty of byteswaps is insignificant since fetching
> the value from memory already takes forever.
Right, as per definition these values can't be in the cache. :)
--
Greetings Michael.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 15:30 ` Michael Buesch
2007-12-10 16:18 ` Johannes Berg
@ 2007-12-10 16:21 ` Tomas Winkler
2007-12-10 16:25 ` Johannes Berg
1 sibling, 1 reply; 19+ messages in thread
From: Tomas Winkler @ 2007-12-10 16:21 UTC (permalink / raw)
To: Michael Buesch
Cc: Johannes Berg, John W. Linville, Dan Williams, linux-wireless,
Zhu Yi, Reinette Chatre
> my_data = read_drom_device()
> swap_to_cpu(my_data)
> do_whatever(my_data)
> store the data somewhere else in the device structs for
> later use, maybe.
> do_something_else(my_data)
>
> You see that now you have only _one_ place that you have to care about.
> And if you have to write your data back at some point, simply do it
> just before the write.
>
Correct xxcept that this is not the case.
Our case is
Loop:
my_big_little_endian_struct.
set little tiny bit in it
send to HW
end
> And, little endian intel guys, this will simplify your life a lot by
> removing lots of cpu_to_xxx stuff _and_ result
> in no performance loss, as the swap will be optimized away. :)
> That's a deal, eh?
> And we big endian people, we don't care about one or two swap instructions.
> It won't hurt performance in any way.
I care. I'm running it on big endian, truly. Yes I cannot really say
how the performance be hurt, but I still prefer my way.
> (Johannes also showed that in most cases this approach doesn't even
> add extra instructions).
>
> --
> Greetings Michael.
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 16:21 ` Tomas Winkler
@ 2007-12-10 16:25 ` Johannes Berg
2007-12-10 21:18 ` Tomas Winkler
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Berg @ 2007-12-10 16:25 UTC (permalink / raw)
To: Tomas Winkler
Cc: Michael Buesch, John W. Linville, Dan Williams, linux-wireless,
Zhu Yi, Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 483 bytes --]
> Our case is
>
> Loop:
> my_big_little_endian_struct.
> set little tiny bit in it
> send to HW
> end
You could even code the "set little tiny bit in it" as
struct->flags |= cpu_to_le32(flag value);
which all other drivers do.
It boils down to what I just said:
Most people prefer making endian conversions explicit in the code while
you're hiding it in defines which is unintuitive if only because it
differs from what everybody else does.
johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 16:25 ` Johannes Berg
@ 2007-12-10 21:18 ` Tomas Winkler
0 siblings, 0 replies; 19+ messages in thread
From: Tomas Winkler @ 2007-12-10 21:18 UTC (permalink / raw)
To: Johannes Berg
Cc: Michael Buesch, John W. Linville, Dan Williams, linux-wireless,
Zhu Yi, Reinette Chatre
On Dec 10, 2007 6:25 PM, Johannes Berg <johannes@sipsolutions.net> wrote:
>
> You could even code the "set little tiny bit in it" as
>
> struct->flags |= cpu_to_le32(flag value);
>
> which all other drivers do.
I hope you grep the kernel before you wrote this :). No iwlwifi is not
the only driver that does that. I cannot claim invented this.
> It boils down to what I just said:
>
> Most people prefer making endian conversions explicit in the code while
> you're hiding it in defines which is unintuitive if only because it
> differs from what everybody else does.
We did it because it's more readable for us you are free to think
differently. If you referring to everybody else as the code of other
drivers then they are different because the HW is different.
Just make it clear I'm part of this argument because I believe I'm
correct from technical point.. We didn't hide anything and we are glad
that other people contribute to the iwlwifi. any other suggestions are
pure speculations.
Personally I think this conversation was saturated enough. I'll
reconsider all your suggestions mainly the cpup stuff that I wasn't
aware of.
Thanks
Tomas
> johannes
>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 15:18 ` Johannes Berg
2007-12-10 15:30 ` Michael Buesch
@ 2007-12-10 16:10 ` Tomas Winkler
2007-12-10 16:17 ` Johannes Berg
2007-12-10 16:23 ` Johannes Berg
1 sibling, 2 replies; 19+ messages in thread
From: Tomas Winkler @ 2007-12-10 16:10 UTC (permalink / raw)
To: Johannes Berg
Cc: John W. Linville, Dan Williams, linux-wireless, Zhu Yi,
Reinette Chatre
> > I'm not buying this logical boundary definition if you don't do clean
> > cut you still have to be endian aware across the whole code so what's
> > a difference.
>
> No, you only have to be endian aware when you do a load from a structure
> you *know* was/is DMAed from/to the card. That's a world of a
> difference.
I'm not only loading the structure I'm using it all over the code. I
don't have clean cut. You can blame me for that but that's another
discussion.
> > b = le_to_cpu(a)
> > b != FLAG
> > a = cpu_to_le(b)
> > is probably longer then
> > a |= FL:AG_LE;
> > n matter what.
>
> Obviously, with a flag this is true. But if you see my example below
> then you'll realise that most of the time you don't just do a single
> flag in a but multiple multi-bit values.
>
We are going around the same issue all gain. I already agreed that
your approach will be correct if we had more code of your example type
then setting bits. But we are mostly setting bits.
> Not considering single bits, you will have fewer swap operations as I've
> shown in my example code. And you're only hiding the fact that you have
> many more logical swap operations, most of which are of course constant,
> when you do consider single bits.
>
Again we have mostly bit setting.
> There's nothing you can fix here. It's not an exemption of your
> approach, it's a logical consequence of your approach. Making exemptions
> for things like this in your approach will imho most likely make the
> code even worse. But I'm interested to see how you want to "fix" this
> without going to CPU-defined constants instead of LE-defined constants.
_POS are (shift constants) are native defined.
> And once you start *mixing* the two approaches things will *really* be
> confusing, so you really do need to write it the way I wrote (almost
> quoted but simplified).
Yes, something like that.
>
This is really arguing about taste and feel. I don't find it more
confusing then any mixture of native and some-endian platform. I don't
know how to measure it but I didn't see any endianity bug for more
then 6 months so I guess my approach is working.
In addition it all may change as we cleanup the code.
Tomas
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 16:10 ` Tomas Winkler
@ 2007-12-10 16:17 ` Johannes Berg
2007-12-10 16:23 ` Johannes Berg
1 sibling, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2007-12-10 16:17 UTC (permalink / raw)
To: Tomas Winkler
Cc: John W. Linville, Dan Williams, linux-wireless, Zhu Yi,
Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 361 bytes --]
> This is really arguing about taste and feel. I don't find it more
> confusing then any mixture of native and some-endian platform. I don't
> know how to measure it but I didn't see any endianity bug for more
> then 6 months so I guess my approach is working.
I recently fixed one due to this style in the ieee80211 TKIP (or QoS?)
code.
johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-12-10 16:10 ` Tomas Winkler
2007-12-10 16:17 ` Johannes Berg
@ 2007-12-10 16:23 ` Johannes Berg
1 sibling, 0 replies; 19+ messages in thread
From: Johannes Berg @ 2007-12-10 16:23 UTC (permalink / raw)
To: Tomas Winkler
Cc: John W. Linville, Dan Williams, linux-wireless, Zhu Yi,
Reinette Chatre
[-- Attachment #1: Type: text/plain, Size: 1067 bytes --]
> I'm not only loading the structure I'm using it all over the code. I
> don't have clean cut. You can blame me for that but that's another
> discussion.
I actually thought that was intentional to reduce the number of local
variables.
> We are going around the same issue all gain. I already agreed that
> your approach will be correct if we had more code of your example type
> then setting bits. But we are mostly setting bits.
Going from the definitions in the header files that's not true. I'm too
lazy to do actual stats though. Yes, there are many bits, but very often
you have two/three-bit values intermingled with flags.
> This is really arguing about taste and feel.
Sort of. Maybe all I'm saying is that you're (artificially) limiting the
number of people who can work on your drivers because all non-Intel
drivers use the opposing style of defining things in native byte-order
while you're hiding the endian conversion in the defines rather than
making it explicit in the code which is much easier to understand.
johannes
[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 828 bytes --]
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: coding style lesson: iwlwifi vs. endianness
2007-11-28 18:20 ` Tomas Winkler
2007-11-28 18:50 ` Johannes Berg
2007-11-28 21:43 ` Dan Williams
@ 2007-11-29 9:03 ` Holger Schurig
2 siblings, 0 replies; 19+ messages in thread
From: Holger Schurig @ 2007-11-29 9:03 UTC (permalink / raw)
To: linux-wireless; +Cc: Tomas Winkler, Johannes Berg, Zhu Yi, Reinette Chatre
> I know it is tempting to teach BIG Intel
Hmm, Intel is not the enemy or foe. It's actually quite liked,
because Intel is playing in many ways quite open and helpful
towards the OSS community. Having experienced openly
downloadable manuals for PXA processors from Intel and now the
barricaded website from Marvell, after they took over
PXA255/PXA270 ...
But back to the topic:
A driver that I happen to know quite well now is the
(non-mac80211) libertas in drivers/net/wireless/libertas/. And
this driver happens to use le16_to_cpu() and cpu_to_le16() all
over the place. Both in it's library part and in it's hardware
interface module (if_usb.c, if_cf.c, if_sdio.c).
That makes it easy to keep the thing "sparse"-clean. I wrote the
compact-flash code on x86. And others used that later on ARM and
MIPS (I'm going to use it on PXA as well). They didn't have a
endianness problem so far.
So, using that programming idiom paid well back.
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2007-12-10 21:18 UTC | newest]
Thread overview: 19+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2007-11-27 18:44 coding style lesson: iwlwifi vs. endianness Johannes Berg
2007-11-28 18:20 ` Tomas Winkler
2007-11-28 18:50 ` Johannes Berg
2007-11-28 21:43 ` Dan Williams
2007-11-29 0:58 ` John W. Linville
2007-11-29 23:02 ` Tomas Winkler
2007-12-10 11:42 ` Johannes Berg
2007-12-10 14:18 ` Tomas Winkler
2007-12-10 15:18 ` Johannes Berg
2007-12-10 15:30 ` Michael Buesch
2007-12-10 16:18 ` Johannes Berg
2007-12-10 16:48 ` Michael Buesch
2007-12-10 16:21 ` Tomas Winkler
2007-12-10 16:25 ` Johannes Berg
2007-12-10 21:18 ` Tomas Winkler
2007-12-10 16:10 ` Tomas Winkler
2007-12-10 16:17 ` Johannes Berg
2007-12-10 16:23 ` Johannes Berg
2007-11-29 9:03 ` Holger Schurig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).