From: Jerome Pouiller <Jerome.Pouiller@silabs.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jules Irenge <jbi.octave@gmail.com>,
"gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"devel@driverdev.osuosl.org" <devel@driverdev.osuosl.org>,
"Boqun.Feng@microsoft.com" <Boqun.Feng@microsoft.com>
Subject: Re: [PATCH] staging: wfx: add gcc extension __force cast
Date: Tue, 12 Nov 2019 11:32:07 +0000 [thread overview]
Message-ID: <2852964.2jKPfdd9jE@pc-42> (raw)
In-Reply-To: <20191111202852.GX26530@ZenIV.linux.org.uk>
Hello Al,
Thank you for your extensive review.
On Monday 11 November 2019 21:28:52 CET Al Viro wrote:
> On Mon, Nov 11, 2019 at 01:51:33PM +0000, Jules Irenge wrote:
> >
> > > NAK. force-cast (and it's not a gcc extension, BTW - it's sparse) is basically
> > > "I know better; the code is right, so STFU already". *IF* counters.count_...
> > > is really little-endian 32bit, then why isn't it declared that way? And if
> > > it's host-endian, you've just papered over a real bug here.
> > >
> > > As a general rule "fix" doesn't mean "tell it to shut up"...
> > >
> >
> > Thanks for the comments, I have updated but I have a mixed mind on the
> > __le32. I have to read more about it.
> >
> > I would appreciate if you can comment again on the update.
>
> From the look at the driver, it seems that all these counters are a part of
> structure that is read from the hardware and only used as little-endian.
> So just declare all fields in struct hif_mib_extended_count_table as
> __le32; easy enough. Looking a bit further, the same goes for
> struct hif_mib_bcn_filter_table ->num_of_info_elmts
> struct hif_mib_keep_alive_period ->keep_alive_period (__le16)
> struct hif_mib_template_frame ->frame_length (__le16)
> struct hif_mib_set_association_mode ->basic_rate_set (__le32)
> struct hif_req_update_ie ->num_i_es (__le16)
> struct hif_req_write_mib ->mib_id, ->length (__le16 both)
> struct hif_req_read_mib ->mib_id (__le16)
> struct hif_req_configuration ->length (__le16)
Indeed, structs declared in hif_api* are shared with the hardware and
should use __le16/__le32. However, as you noticed below, these structs
are sometime used in other parts of the code that are not related to
the hardware.
I have in my local queue a set of patches that improve the situation.
Objective is to limit usage of hif structs to hif_tx.c, hif_tx_mib.c
and hif_rx.c (which are correct places to handle hardware
communication). I hope to be able to submit these patches in 2 weeks.
[...]
> and that's where the real bugs start to show up; leaving the misbegotten
> forest of macros in misbegotten tracing shite aside, we have this:
>
> static const struct ieee80211_supported_band wfx_band_2ghz = {
> .channels = wfx_2ghz_chantable,
> .n_channels = ARRAY_SIZE(wfx_2ghz_chantable),
> .bitrates = wfx_rates,
> .n_bitrates = ARRAY_SIZE(wfx_rates),
> .ht_cap = {
> // Receive caps
> .cap = IEEE80211_HT_CAP_GRN_FLD | IEEE80211_HT_CAP_SGI_20 |
> IEEE80211_HT_CAP_MAX_AMSDU | (1 << IEEE80211_HT_CAP_RX_STBC_SHIFT),
> .ht_supported = 1,
> .ampdu_factor = IEEE80211_HT_MAX_AMPDU_16K,
> .ampdu_density = IEEE80211_HT_MPDU_DENSITY_NONE,
> .mcs = {
> .rx_mask = { 0xFF }, // MCS0 to MCS7
> .rx_highest = 65,
> drivers/staging/wfx/main.c:108:39: refering to this initializer.
> Sparse say that it expects rx_highest to be __le16. And that's
> not a driver-specific structure; it's generic ieee80211 one. Which
> says
> struct ieee80211_mcs_info {
> u8 rx_mask[IEEE80211_HT_MCS_MASK_LEN];
> __le16 rx_highest;
> u8 tx_params;
> u8 reserved[3];
> } __packed;
> and grepping for rx_highest through the tree shows that everything else
> is treating it as little-endian 16bit.
>
> Almost certainly a bug on big-endian hosts; should be .rx_highest = cpu_to_le16(65),
> instead.
Agree.
> Looking for more low-hanging fruits, we have
> static int indirect_read32_locked(struct wfx_dev *wdev, int reg, u32 addr, u32 *val)
> {
> int ret;
> __le32 *tmp = kmalloc(sizeof(u32), GFP_KERNEL);
>
> if (!tmp)
> return -ENOMEM;
> wdev->hwbus_ops->lock(wdev->hwbus_priv);
> ret = indirect_read(wdev, reg, addr, tmp, sizeof(u32));
> *val = cpu_to_le32(*tmp);
> _trace_io_ind_read32(reg, addr, *val);
> wdev->hwbus_ops->unlock(wdev->hwbus_priv);
> kfree(tmp);
> return ret;
> }
> with warnings about val = cpu_to_le32(*tmp); fair enough, since *val is
> host-endian (u32) and *tmp - little-endian. Trivial misannotation -
> it should've been le32_to_cpu(), not cpu_to_le32(). Same mapping on
> all CPUs we are ever likely to support, so it's just a misannotation,
> not a bug per se.
Agree.
> drivers/staging/wfx/hif_tx_mib.h:34:38: warning: incorrect type in initializer (different base types)
> drivers/staging/wfx/hif_tx_mib.h:34:38: expected unsigned char [usertype] wakeup_period_max
> drivers/staging/wfx/hif_tx_mib.h:34:38: got restricted __le16 [usertype]
>
> is about
> static inline int hif_set_beacon_wakeup_period(struct wfx_vif *wvif,
> unsigned int dtim_interval,
> unsigned int listen_interval)
> {
> struct hif_mib_beacon_wake_up_period val = {
> .wakeup_period_min = dtim_interval,
> .receive_dtim = 0,
> .wakeup_period_max = cpu_to_le16(listen_interval),
> };
> and struct hif_mib_beacon_wake_up_period has wakeup_period_max declared
> as uint8_t. We are shoving a le16 value into it. Almost certain bug -
> that will result in the listen_interval % 256 on litte-endian host and
> listen_interval / 256 on big-endian one. Looking at the callers to
> see what's actually passed as listen_interval shows only
> hif_set_beacon_wakeup_period(wvif, wvif->dtim_period, wvif->dtim_period);
> and dtim_period in *wvif (struct wfx_vif) can be assigned 0, 1 or
> values coming from struct ieee80211_tim_ie ->dtim_period or
> struct ieee80211_bss_conf ->dtim_period, 8bit in either structure.
>
> In other words, the value stored in val.wakeup_period_max will be
> always zero on big-endian hosts. Definitely bogus, should just
> store that (8bit) value as-is; cpu_to_le16() is wrong here.
Absolutely agree.
> Next piece of fun:
> static inline int hif_beacon_filter_control(struct wfx_vif *wvif,
> int enable, int beacon_count)
> {
> struct hif_mib_bcn_filter_enable arg = {
> .enable = cpu_to_le32(enable),
> .bcn_count = cpu_to_le32(beacon_count),
> };
> return hif_write_mib(wvif->wdev, wvif->id,
> HIF_MIB_ID_BEACON_FILTER_ENABLE, &arg, sizeof(arg));
> }
> Sounds like ->enable and ->bcn_count should both be __le32, which makes
> sense since the structs passed to hardware appear to be fixed-endian on
> that thing. However, annotating them as such adds warnigns:
> drivers/staging/wfx/sta.c:246:35: warning: incorrect type in assignment (different base types)
> drivers/staging/wfx/sta.c:246:35: expected restricted __le32 [assigned] [usertype] bcn_count
> drivers/staging/wfx/sta.c:246:35: got int
> drivers/staging/wfx/sta.c:249:32: warning: incorrect type in assignment (different base types)
> drivers/staging/wfx/sta.c:249:32: expected restricted __le32 [assigned] [usertype] enable
> drivers/staging/wfx/sta.c:249:32: got int
> drivers/staging/wfx/sta.c:253:32: warning: incorrect type in assignment (different base types)
> drivers/staging/wfx/sta.c:253:32: expected restricted __le32 [assigned] [usertype] enable
> drivers/staging/wfx/sta.c:253:32: got int
> drivers/staging/wfx/sta.c:262:62: warning: incorrect type in argument 2 (different base types)
> drivers/staging/wfx/sta.c:262:62: expected int enable
> drivers/staging/wfx/sta.c:262:62: got restricted __le32 [assigned] [usertype] enable
> drivers/staging/wfx/sta.c:262:78: warning: incorrect type in argument 3 (different base types)
> drivers/staging/wfx/sta.c:262:78: expected int beacon_count
> drivers/staging/wfx/sta.c:262:78: got restricted __le32 [assigned] [usertype] bcn_count
>
> All in the same function (wfx_update_filtering()) and we really do store
> host-endian values in those (first 3 places). In the last one we pass
> them to hif_beacon_filter_control(), which does expect host-endian.
> And that's the only thing we do to the instance of hif_mib_bcn_filter_enable
> in there...
>
> Possible solutions:
> 1) store them little-endian there, pass to hif_beacon_filter_control()
> already l-e, get rid of cpu_to_le32() in the latter.
> 2) store them little-endian, pass the entire pointer to struct
> instead of forming it again in hif_beacon_filter_control()
> 3) don't pretend that the objects in hif_beacon_filter_control()
> and in wfx_update_filtering() are of the same type (different layouts on
> big-endian) and replace the one in the caller with two local variables.
> My preference would be (3), as in
[...]
> but that's a matter of taste.
Yes, this is one of the difficult parts. I work on it (I opted for
solution 3).
> Next is bx.c warning about __le32; that's about num_tx_count being fed to cpu_to_le32().
> grepping for that thing results in
> drivers/staging/wfx/bh.c:106: release_count = le32_to_cpu(((struct hif_cnf_multi_transmit *)hif->body)->num_tx_confs);
> drivers/staging/wfx/hif_api_cmd.h:316: uint32_t num_tx_confs;
> drivers/staging/wfx/hif_rx.c:78: int count = body->num_tx_confs;
> which is troubling - the first line (in rx_helper()) expects to find
> a little-endian value in that field, while the last (in hif_multi_tx_confirm())
> - a host-endian, with nothing in sight that might account for conversion
> from one to another.
>
> Let's look at the call chains: hif_multi_tx_confirm() is called only as
> hif_handlers[...]->handler(), which happens in in wfx_handle_rx().
> The call is
> hif_handlers[i].handler(wdev, hif, hif->body);
> and hif has come from
> struct hif_msg *hif = (struct hif_msg *) skb->data;
> wfx_handle_rx() is called by the same rx_helper()... skb is created by
> rx_helper() and apparently filled by the call
> if (wfx_data_read(wdev, skb->data, alloc_len))
> goto err;
> right next to the allocation... and prior to the
> release_count = le32_to_cpu(((struct hif_cnf_multi_transmit *)hif->body)->num_tx_confs);
> where we expect little-endian, with nothing to modify the skb contents
> between that and the call of wfx_handle_rx(). hif in rx_helper() points
> to the same place - skb->data. OK, we almost certainly have a bug here.
>
> That thing allocates a packet and fills it with incoming data. Then
> it parses the damn thing, apparently treating the same field of the
> incoming as little-endian in one place and host-endian in another.
> In principle it's possible that the rest of the packet determines
> which one it is, but by the look of that code both places are
> hit if and only if hif->id is equal to HIF_CNF_ID_MULTI_TRANSMIT.
> It *can't* be correct on big-endian. Not even theoretically.
>
> And since it's over-the-wire data, I would expect it to be fixed-endian.
> That needs to be confirmed with the driver's authors and/or direct
> experiment on big-endian host, but I strongly suspect that the right
> fix is to have
> int count = le32_to_cpu(body->num_tx_confs);
> in hif_multi_tx_confirm() (and num_tx_confs annotated as __le32).
Indeed, num_tx_confs is always a le32 value.
> HOWEVER, that opens another nasty can of worms. We have
> struct hif_cnf_tx *buf_loc = (struct hif_cnf_tx *) &body->tx_conf_payload;
> ...
> for (i = 0; i < count; ++i) {
> wfx_tx_confirm_cb(wvif, buf_loc);
> buf_loc++;
> }
> with count derived from the packet and body pointing into the packet. And no
> visible checks that would make sure the loop won't run out of the data we'd
> actually got.
>
> The check in rx_helper() verifies that hif->len matches the amount we'd
> received; the check for ->num_tx_confs in there doesn't look like what
> we'd needed (that would be offset of body.tx_conf_payload in packet +
> num_tx_confs * sizeof(struct hif_cnf_multi_transmit) compared to
> actual size).
>
> So it smells like a remote buffer overrun, little-endian or not.
> And at that point I would start looking for driver original authors with
> some rather pointed questions about the validation of data and lack
> thereof.
There are not so much checks done on data retrieved from the hardware.
I think we can find other similar issues in the driver.
In this particular case, indeed, a little check on length of received
data could be a good idea.
> BTW, if incoming packets are fixed-endian, I would expect more bugs on
> big-endian hosts - wfx_tx_confirm_cb() does things like
> tx_info->status.tx_time =
> arg->media_delay - arg->tx_queue_delay;
> with media_delay and tx_queue_delay both being 32bit fields in the
> incoming packet. So another question to the authors (or documentation,
> or direct experiments) is what endianness do various fields in the incoming
> data have. We can try and guess, but...
Fortunately, answer is simple enough: everything from hardware is
little endian :).
Jules, do you want to take care of fixing theses issues (except the one
about wfx_update_filtering())?
--
Jérôme Pouiller
prev parent reply other threads:[~2019-11-12 11:32 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-11-08 23:38 [PATCH] staging: wfx: add gcc extension __force cast Jules Irenge
2019-11-09 8:36 ` Greg KH
2019-11-09 9:19 ` Al Viro
2019-11-11 13:51 ` Jules Irenge
2019-11-11 20:28 ` Al Viro
2019-11-11 23:16 ` Al Viro
2019-11-12 12:00 ` Jerome Pouiller
2019-11-12 11:32 ` Jerome Pouiller [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2852964.2jKPfdd9jE@pc-42 \
--to=jerome.pouiller@silabs.com \
--cc=Boqun.Feng@microsoft.com \
--cc=devel@driverdev.osuosl.org \
--cc=gregkh@linuxfoundation.org \
--cc=jbi.octave@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.