* [PATCH v2 09/19] ipv6: prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
@ 2018-01-12 0:47 ` Dan Williams
2018-01-12 0:47 ` [PATCH v2 10/19] ipv4: " Dan Williams
` (6 subsequent siblings)
7 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2018-01-12 0:47 UTC (permalink / raw)
To: linux-kernel
Cc: linux-arch, akpm, kernel-hardening, Hideaki YOSHIFUJI, netdev,
Alexey Kuznetsov, tglx, torvalds, David S. Miller,
Elena Reshetova, alan
Static analysis reports that 'offset' may be a user controlled value
that is used as a data dependency reading from a raw6_frag_vec buffer.
In order to avoid potential leaks of kernel memory values, block
speculative execution of the instruction stream that could issue further
reads based on an invalid '*(rfv->c + offset)' value.
Based on an original patch by Elena Reshetova.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
net/ipv6/raw.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index 761a473a07c5..0b7ceeb6f709 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -33,6 +33,7 @@
#include <linux/skbuff.h>
#include <linux/compat.h>
#include <linux/uaccess.h>
+#include <linux/nospec.h>
#include <asm/ioctls.h>
#include <net/net_namespace.h>
@@ -725,17 +726,18 @@ static int raw6_getfrag(void *from, char *to, int offset, int len, int odd,
struct sk_buff *skb)
{
struct raw6_frag_vec *rfv = from;
+ char *rfv_buf;
- if (offset < rfv->hlen) {
+ rfv_buf = array_ptr(rfv->c, offset, rfv->hlen);
+ if (rfv_buf) {
int copy = min(rfv->hlen - offset, len);
if (skb->ip_summed == CHECKSUM_PARTIAL)
- memcpy(to, rfv->c + offset, copy);
+ memcpy(to, rfv_buf, copy);
else
skb->csum = csum_block_add(
skb->csum,
- csum_partial_copy_nocheck(rfv->c + offset,
- to, copy, 0),
+ csum_partial_copy_nocheck(rfv_buf, to, copy, 0),
odd);
odd = 0;
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 10/19] ipv4: prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
2018-01-12 0:47 ` [PATCH v2 09/19] ipv6: " Dan Williams
@ 2018-01-12 0:47 ` Dan Williams
2018-01-12 7:59 ` Greg KH
2018-01-12 0:47 ` [PATCH v2 15/19] carl9170: " Dan Williams
` (5 subsequent siblings)
7 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2018-01-12 0:47 UTC (permalink / raw)
To: linux-kernel
Cc: linux-arch, akpm, kernel-hardening, Hideaki YOSHIFUJI, netdev,
Alexey Kuznetsov, tglx, torvalds, David S. Miller,
Elena Reshetova, alan
Static analysis reports that 'offset' may be a user controlled value
that is used as a data dependency reading from a raw_frag_vec buffer.
In order to avoid potential leaks of kernel memory values, block
speculative execution of the instruction stream that could issue further
reads based on an invalid '*(rfv->c + offset)' value.
Based on an original patch by Elena Reshetova.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: netdev@vger.kernel.org
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
net/ipv4/raw.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 125c1eab3eaa..91091a10294f 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -57,6 +57,7 @@
#include <linux/in_route.h>
#include <linux/route.h>
#include <linux/skbuff.h>
+#include <linux/nospec.h>
#include <linux/igmp.h>
#include <net/net_namespace.h>
#include <net/dst.h>
@@ -472,17 +473,18 @@ static int raw_getfrag(void *from, char *to, int offset, int len, int odd,
struct sk_buff *skb)
{
struct raw_frag_vec *rfv = from;
+ char *rfv_buf;
- if (offset < rfv->hlen) {
+ rfv_buf = array_ptr(rfv->hdr.c, offset, rfv->hlen);
+ if (rfv_buf) {
int copy = min(rfv->hlen - offset, len);
if (skb->ip_summed == CHECKSUM_PARTIAL)
- memcpy(to, rfv->hdr.c + offset, copy);
+ memcpy(to, rfv_buf, copy);
else
skb->csum = csum_block_add(
skb->csum,
- csum_partial_copy_nocheck(rfv->hdr.c + offset,
- to, copy, 0),
+ csum_partial_copy_nocheck(rfv_buf, to, copy, 0),
odd);
odd = 0;
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 10/19] ipv4: prevent bounds-check bypass via speculative execution
2018-01-12 0:47 ` [PATCH v2 10/19] ipv4: " Dan Williams
@ 2018-01-12 7:59 ` Greg KH
2018-01-12 18:47 ` Dan Williams
0 siblings, 1 reply; 24+ messages in thread
From: Greg KH @ 2018-01-12 7:59 UTC (permalink / raw)
To: Dan Williams
Cc: linux-kernel, linux-arch, akpm, kernel-hardening,
Hideaki YOSHIFUJI, netdev, Alexey Kuznetsov, tglx, torvalds,
David S. Miller, Elena Reshetova, alan
On Thu, Jan 11, 2018 at 04:47:18PM -0800, Dan Williams wrote:
> Static analysis reports that 'offset' may be a user controlled value
> that is used as a data dependency reading from a raw_frag_vec buffer.
> In order to avoid potential leaks of kernel memory values, block
> speculative execution of the instruction stream that could issue further
> reads based on an invalid '*(rfv->c + offset)' value.
>
> Based on an original patch by Elena Reshetova.
There is the "Co-Developed-by:" tag now, if you want to use it...
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> Cc: netdev@vger.kernel.org
> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
> net/ipv4/raw.c | 10 ++++++----
> 1 file changed, 6 insertions(+), 4 deletions(-)
Ugh, what is this, the 4th time I've said "I don't think this is an
issue, so why are you changing this code." to this patch. To be
followed by a "oh yeah, you are right, I'll drop it", only to see it
show back up in the next time this patch series is sent out?
Same for the other patches in this series that I have reviewed 4, maybe
5, times already. The "v2" is not very true here...
thanks,
greg k-h
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 10/19] ipv4: prevent bounds-check bypass via speculative execution
2018-01-12 7:59 ` Greg KH
@ 2018-01-12 18:47 ` Dan Williams
2018-01-13 8:56 ` Greg KH
0 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2018-01-12 18:47 UTC (permalink / raw)
To: Greg KH
Cc: Linux Kernel Mailing List, linux-arch, Andrew Morton,
kernel-hardening, Hideaki YOSHIFUJI, Netdev, Alexey Kuznetsov,
Thomas Gleixner, Linus Torvalds, David S. Miller, Elena Reshetova,
Alan Cox
On Thu, Jan 11, 2018 at 11:59 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> On Thu, Jan 11, 2018 at 04:47:18PM -0800, Dan Williams wrote:
>> Static analysis reports that 'offset' may be a user controlled value
>> that is used as a data dependency reading from a raw_frag_vec buffer.
>> In order to avoid potential leaks of kernel memory values, block
>> speculative execution of the instruction stream that could issue further
>> reads based on an invalid '*(rfv->c + offset)' value.
>>
>> Based on an original patch by Elena Reshetova.
>
> There is the "Co-Developed-by:" tag now, if you want to use it...
Ok, thanks.
>
>> Cc: "David S. Miller" <davem@davemloft.net>
>> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
>> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
>> Cc: netdev@vger.kernel.org
>> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
>> net/ipv4/raw.c | 10 ++++++----
>> 1 file changed, 6 insertions(+), 4 deletions(-)
>
> Ugh, what is this, the 4th time I've said "I don't think this is an
> issue, so why are you changing this code." to this patch. To be
> followed by a "oh yeah, you are right, I'll drop it", only to see it
> show back up in the next time this patch series is sent out?
>
> Same for the other patches in this series that I have reviewed 4, maybe
> 5, times already. The "v2" is not very true here...
The theme of the review feedback on v1 was 'don't put ifence in any
net/ code', and that was addressed.
I honestly thought the new definition of array_ptr() changed the
calculus on this patch. Given the same pattern appears in the ipv6
case, and I have yet to hear that we should drop the ipv6 patch, make
the code symmetric just for readability purposes. Otherwise we need a
comment saying why this is safe for ipv4, but maybe not safe for ipv6,
I think 'array_ptr' is effectively that comment. I.e. 'array_ptr()' is
designed to be low impact for instrumenting false positives. If that
new argument does not hold water I will definitely drop this patch.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 10/19] ipv4: prevent bounds-check bypass via speculative execution
2018-01-12 18:47 ` Dan Williams
@ 2018-01-13 8:56 ` Greg KH
0 siblings, 0 replies; 24+ messages in thread
From: Greg KH @ 2018-01-13 8:56 UTC (permalink / raw)
To: Dan Williams
Cc: Linux Kernel Mailing List, linux-arch, Andrew Morton,
kernel-hardening, Hideaki YOSHIFUJI, Netdev, Alexey Kuznetsov,
Thomas Gleixner, Linus Torvalds, David S. Miller, Elena Reshetova,
Alan Cox
On Fri, Jan 12, 2018 at 10:47:44AM -0800, Dan Williams wrote:
> On Thu, Jan 11, 2018 at 11:59 PM, Greg KH <gregkh@linuxfoundation.org> wrote:
> >> Cc: "David S. Miller" <davem@davemloft.net>
> >> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
> >> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
> >> Cc: netdev@vger.kernel.org
> >> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> >> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >> ---
> >> net/ipv4/raw.c | 10 ++++++----
> >> 1 file changed, 6 insertions(+), 4 deletions(-)
> >
> > Ugh, what is this, the 4th time I've said "I don't think this is an
> > issue, so why are you changing this code." to this patch. To be
> > followed by a "oh yeah, you are right, I'll drop it", only to see it
> > show back up in the next time this patch series is sent out?
> >
> > Same for the other patches in this series that I have reviewed 4, maybe
> > 5, times already. The "v2" is not very true here...
>
> The theme of the review feedback on v1 was 'don't put ifence in any
> net/ code', and that was addressed.
>
> I honestly thought the new definition of array_ptr() changed the
> calculus on this patch. Given the same pattern appears in the ipv6
> case, and I have yet to hear that we should drop the ipv6 patch, make
> the code symmetric just for readability purposes. Otherwise we need a
> comment saying why this is safe for ipv4, but maybe not safe for ipv6,
> I think 'array_ptr' is effectively that comment. I.e. 'array_ptr()' is
> designed to be low impact for instrumenting false positives. If that
> new argument does not hold water I will definitely drop this patch.
I also argued, again in an older review of this same patch series, that
the ipv6 patch should be dropped as well for this same exact reason.
I didn't think you wanted to hear me rant about the same thing on both
patches :)
greg k-h
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2 15/19] carl9170: prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
2018-01-12 0:47 ` [PATCH v2 09/19] ipv6: " Dan Williams
2018-01-12 0:47 ` [PATCH v2 10/19] ipv4: " Dan Williams
@ 2018-01-12 0:47 ` Dan Williams
2018-01-12 14:42 ` Christian Lamparter
2018-01-12 0:47 ` [PATCH v2 16/19] p54: " Dan Williams
` (4 subsequent siblings)
7 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2018-01-12 0:47 UTC (permalink / raw)
To: linux-kernel
Cc: linux-arch, kernel-hardening, netdev, linux-wireless,
Elena Reshetova, Christian Lamparter, tglx, torvalds, akpm,
Kalle Valo, alan
Static analysis reports that 'queue' may be a user controlled value that
is used as a data dependency to read from the 'ar9170_qmap' array. In
order to avoid potential leaks of kernel memory values, block
speculative execution of the instruction stream that could issue reads
based on an invalid result of 'ar9170_qmap[queue]'. In this case the
value of 'ar9170_qmap[queue]' is immediately reused as an index to the
'ar->edcf' array.
Based on an original patch by Elena Reshetova.
Cc: Christian Lamparter <chunkeey@googlemail.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/net/wireless/ath/carl9170/main.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/carl9170/main.c b/drivers/net/wireless/ath/carl9170/main.c
index 988c8857d78c..0acfa8c22b7d 100644
--- a/drivers/net/wireless/ath/carl9170/main.c
+++ b/drivers/net/wireless/ath/carl9170/main.c
@@ -41,6 +41,7 @@
#include <linux/module.h>
#include <linux/etherdevice.h>
#include <linux/random.h>
+#include <linux/nospec.h>
#include <net/mac80211.h>
#include <net/cfg80211.h>
#include "hw.h"
@@ -1384,11 +1385,13 @@ static int carl9170_op_conf_tx(struct ieee80211_hw *hw,
const struct ieee80211_tx_queue_params *param)
{
struct ar9170 *ar = hw->priv;
+ const u8 *elem;
int ret;
mutex_lock(&ar->mutex);
- if (queue < ar->hw->queues) {
- memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
+ elem = array_ptr(ar9170_qmap, queue, ar->hw->queues);
+ if (elem) {
+ memcpy(&ar->edcf[*elem], param, sizeof(*param));
ret = carl9170_set_qos(ar);
} else {
ret = -EINVAL;
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 15/19] carl9170: prevent bounds-check bypass via speculative execution
2018-01-12 0:47 ` [PATCH v2 15/19] carl9170: " Dan Williams
@ 2018-01-12 14:42 ` Christian Lamparter
2018-01-12 18:39 ` Dan Williams
0 siblings, 1 reply; 24+ messages in thread
From: Christian Lamparter @ 2018-01-12 14:42 UTC (permalink / raw)
To: Dan Williams
Cc: linux-kernel, linux-arch, kernel-hardening, netdev,
linux-wireless, Elena Reshetova, tglx, torvalds, akpm, Kalle Valo,
alan
On Friday, January 12, 2018 1:47:46 AM CET Dan Williams wrote:
> Static analysis reports that 'queue' may be a user controlled value that
> is used as a data dependency to read from the 'ar9170_qmap' array. In
> order to avoid potential leaks of kernel memory values, block
> speculative execution of the instruction stream that could issue reads
> based on an invalid result of 'ar9170_qmap[queue]'. In this case the
> value of 'ar9170_qmap[queue]' is immediately reused as an index to the
> 'ar->edcf' array.
>
> Based on an original patch by Elena Reshetova.
>
> Cc: Christian Lamparter <chunkeey@googlemail.com>
> Cc: Kalle Valo <kvalo@codeaurora.org>
> Cc: linux-wireless@vger.kernel.org
> Cc: netdev@vger.kernel.org
> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> ---
This patch (and p54, cw1200) look like the same patch?!
Can you tell me what happend to:
On Saturday, January 6, 2018 5:34:03 PM CET Dan Williams wrote:
> On Sat, Jan 6, 2018 at 6:23 AM, Christian Lamparter <chunkeey@gmail.com> wrote:
> > And Furthermore a invalid queue (param->ac) would cause a crash in
> > this line in mac80211 before it even reaches the driver [1]:
> > | sdata->tx_conf[params->ac] = p;
> > | ^^^^^^^^
> > | if (drv_conf_tx(local, sdata, >>>> params->ac <<<<, &p)) {
> > | ^^ (this is a wrapper for the *_op_conf_tx)
> >
> > I don't think these chin-up exercises are needed.
>
> Quite the contrary, you've identified a better place in the call stack
> to sanitize the input and disable speculation. Then we can kill the
> whole class of the wireless driver reports at once it seems.
<https://www.spinics.net/lists/netdev/msg476353.html>
Anyway, I think there's an easy way to solve this: remove the
"if (queue < ar->hw->queues)" check altogether. It's no longer needed
anymore as the "queue" value is validated long before the driver code
gets called. And from my understanding, this will fix the "In this case
the value of 'ar9170_qmap[queue]' is immediately reused as an index to
the 'ar->edcf' array." gripe your tool complains about.
This is here's a quick test-case for carl9170.:
---
diff --git a/drivers/net/wireless/ath/carl9170/main.c b/drivers/net/wireless/ath/carl9170/main.c
index 988c8857d78c..2d3afb15bb62 100644
--- a/drivers/net/wireless/ath/carl9170/main.c
+++ b/drivers/net/wireless/ath/carl9170/main.c
@@ -1387,13 +1387,8 @@ static int carl9170_op_conf_tx(struct ieee80211_hw *hw,
int ret;
mutex_lock(&ar->mutex);
- if (queue < ar->hw->queues) {
- memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
- ret = carl9170_set_qos(ar);
- } else {
- ret = -EINVAL;
- }
-
+ memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
+ ret = carl9170_set_qos(ar);
mutex_unlock(&ar->mutex);
return ret;
}
---
What does your tool say about this?
(If necessary, the "queue" value could be even sanitized with a
queue %= ARRAY_SIZE(ar9170_qmap); before the mutex_lock.)
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 15/19] carl9170: prevent bounds-check bypass via speculative execution
2018-01-12 14:42 ` Christian Lamparter
@ 2018-01-12 18:39 ` Dan Williams
2018-01-12 20:01 ` Christian Lamparter
0 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2018-01-12 18:39 UTC (permalink / raw)
To: Christian Lamparter
Cc: Linux Kernel Mailing List, linux-arch, kernel-hardening, Netdev,
Linux Wireless List, Elena Reshetova, Thomas Gleixner,
Linus Torvalds, Andrew Morton, Kalle Valo, Alan Cox
On Fri, Jan 12, 2018 at 6:42 AM, Christian Lamparter <chunkeey@gmail.com> wrote:
> On Friday, January 12, 2018 1:47:46 AM CET Dan Williams wrote:
>> Static analysis reports that 'queue' may be a user controlled value that
>> is used as a data dependency to read from the 'ar9170_qmap' array. In
>> order to avoid potential leaks of kernel memory values, block
>> speculative execution of the instruction stream that could issue reads
>> based on an invalid result of 'ar9170_qmap[queue]'. In this case the
>> value of 'ar9170_qmap[queue]' is immediately reused as an index to the
>> 'ar->edcf' array.
>>
>> Based on an original patch by Elena Reshetova.
>>
>> Cc: Christian Lamparter <chunkeey@googlemail.com>
>> Cc: Kalle Valo <kvalo@codeaurora.org>
>> Cc: linux-wireless@vger.kernel.org
>> Cc: netdev@vger.kernel.org
>> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
>> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> ---
> This patch (and p54, cw1200) look like the same patch?!
> Can you tell me what happend to:
>
> On Saturday, January 6, 2018 5:34:03 PM CET Dan Williams wrote:
>> On Sat, Jan 6, 2018 at 6:23 AM, Christian Lamparter <chunkeey@gmail.com> wrote:
>> > And Furthermore a invalid queue (param->ac) would cause a crash in
>> > this line in mac80211 before it even reaches the driver [1]:
>> > | sdata->tx_conf[params->ac] = p;
>> > | ^^^^^^^^
>> > | if (drv_conf_tx(local, sdata, >>>> params->ac <<<<, &p)) {
>> > | ^^ (this is a wrapper for the *_op_conf_tx)
>> >
>> > I don't think these chin-up exercises are needed.
>>
>> Quite the contrary, you've identified a better place in the call stack
>> to sanitize the input and disable speculation. Then we can kill the
>> whole class of the wireless driver reports at once it seems.
> <https://www.spinics.net/lists/netdev/msg476353.html>
I didn't see where ac is being validated against the driver specific
'queues' value in that earlier patch.
>
> Anyway, I think there's an easy way to solve this: remove the
> "if (queue < ar->hw->queues)" check altogether. It's no longer needed
> anymore as the "queue" value is validated long before the driver code
> gets called.
Can you point me to where that validation happens?
> And from my understanding, this will fix the "In this case
> the value of 'ar9170_qmap[queue]' is immediately reused as an index to
> the 'ar->edcf' array." gripe your tool complains about.
>
> This is here's a quick test-case for carl9170.:
> ---
> diff --git a/drivers/net/wireless/ath/carl9170/main.c b/drivers/net/wireless/ath/carl9170/main.c
> index 988c8857d78c..2d3afb15bb62 100644
> --- a/drivers/net/wireless/ath/carl9170/main.c
> +++ b/drivers/net/wireless/ath/carl9170/main.c
> @@ -1387,13 +1387,8 @@ static int carl9170_op_conf_tx(struct ieee80211_hw *hw,
> int ret;
>
> mutex_lock(&ar->mutex);
> - if (queue < ar->hw->queues) {
> - memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
> - ret = carl9170_set_qos(ar);
> - } else {
> - ret = -EINVAL;
> - }
> -
> + memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
> + ret = carl9170_set_qos(ar);
> mutex_unlock(&ar->mutex);
> return ret;
> }
> ---
> What does your tool say about this?
If you take away the 'if' then I don't the tool will report on this.
> (If necessary, the "queue" value could be even sanitized with a
> queue %= ARRAY_SIZE(ar9170_qmap); before the mutex_lock.)
That is what array_ptr() is doing in a more sophisticated way.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 15/19] carl9170: prevent bounds-check bypass via speculative execution
2018-01-12 18:39 ` Dan Williams
@ 2018-01-12 20:01 ` Christian Lamparter
2018-01-12 23:05 ` Dan Williams
0 siblings, 1 reply; 24+ messages in thread
From: Christian Lamparter @ 2018-01-12 20:01 UTC (permalink / raw)
To: Dan Williams
Cc: Linux Kernel Mailing List, linux-arch, kernel-hardening, Netdev,
Linux Wireless List, Elena Reshetova, Thomas Gleixner,
Linus Torvalds, Andrew Morton, Kalle Valo, Alan Cox
On Friday, January 12, 2018 7:39:50 PM CET Dan Williams wrote:
> On Fri, Jan 12, 2018 at 6:42 AM, Christian Lamparter <chunkeey@gmail.com> wrote:
> > On Friday, January 12, 2018 1:47:46 AM CET Dan Williams wrote:
> >> Static analysis reports that 'queue' may be a user controlled value that
> >> is used as a data dependency to read from the 'ar9170_qmap' array. In
> >> order to avoid potential leaks of kernel memory values, block
> >> speculative execution of the instruction stream that could issue reads
> >> based on an invalid result of 'ar9170_qmap[queue]'. In this case the
> >> value of 'ar9170_qmap[queue]' is immediately reused as an index to the
> >> 'ar->edcf' array.
> >>
> >> Based on an original patch by Elena Reshetova.
> >>
> >> Cc: Christian Lamparter <chunkeey@googlemail.com>
> >> Cc: Kalle Valo <kvalo@codeaurora.org>
> >> Cc: linux-wireless@vger.kernel.org
> >> Cc: netdev@vger.kernel.org
> >> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
> >> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >> ---
> > This patch (and p54, cw1200) look like the same patch?!
> > Can you tell me what happend to:
> >
> > On Saturday, January 6, 2018 5:34:03 PM CET Dan Williams wrote:
> >> On Sat, Jan 6, 2018 at 6:23 AM, Christian Lamparter <chunkeey@gmail.com> wrote:
> >> > And Furthermore a invalid queue (param->ac) would cause a crash in
> >> > this line in mac80211 before it even reaches the driver [1]:
> >> > | sdata->tx_conf[params->ac] = p;
> >> > | ^^^^^^^^
> >> > | if (drv_conf_tx(local, sdata, >>>> params->ac <<<<, &p)) {
> >> > | ^^ (this is a wrapper for the *_op_conf_tx)
> >> >
> >> > I don't think these chin-up exercises are needed.
> >>
> >> Quite the contrary, you've identified a better place in the call stack
> >> to sanitize the input and disable speculation. Then we can kill the
> >> whole class of the wireless driver reports at once it seems.
> > <https://www.spinics.net/lists/netdev/msg476353.html>
>
> I didn't see where ac is being validated against the driver specific
> 'queues' value in that earlier patch.
The link to the check is right there in the earlier post. It's in
parse_txq_params():
<https://github.com/torvalds/linux/blob/master/net/wireless/nl80211.c#L2070>
| if (txq_params->ac >= NL80211_NUM_ACS)
| return -EINVAL;
NL80211_NUM_ACS is 4
<http://elixir.free-electrons.com/linux/v4.15-rc7/source/include/uapi/linux/nl80211.h#L3748>
This check was added ever since mac80211's ieee80211_set_txq_params():
| sdata->tx_conf[params->ac] = p;
For cw1200: the driver just sets the dev->queue to 4.
In carl9170 dev->queues is set to __AR9170_NUM_TXQ and
p54 uses P54_QUEUE_AC_NUM.
Both __AR9170_NUM_TXQ and P54_QUEUE_AC_NUM are 4.
And this is not going to change since all drivers
have to follow mac80211's queue API:
<https://wireless.wiki.kernel.org/en/developers/documentation/mac80211/queues>
Some background:
In the old days (linux 2.6 and early 3.x), the parse_txq_params() function did
not verify the "queue" value. That's why these drivers had to do it.
Here's the relevant code from 2.6.39:
<http://elixir.free-electrons.com/linux/v2.6.39/source/net/wireless/nl80211.c#L879>
You'll notice that the check is missing there.
Here's mac80211's ieee80211_set_txq_params for reference:
<http://elixir.free-electrons.com/linux/v2.6.39/source/net/mac80211/cfg.c#L1197>
However over time, the check in the driver has become redundant.
> > Anyway, I think there's an easy way to solve this: remove the
> > "if (queue < ar->hw->queues)" check altogether. It's no longer needed
> > anymore as the "queue" value is validated long before the driver code
> > gets called.
> >
> > And from my understanding, this will fix the "In this case
> > the value of 'ar9170_qmap[queue]' is immediately reused as an index to
> > the 'ar->edcf' array." gripe your tool complains about.
> >
> > This is here's a quick test-case for carl9170.:
> > ---
> > diff --git a/drivers/net/wireless/ath/carl9170/main.c b/drivers/net/wireless/ath/carl9170/main.c
> > index 988c8857d78c..2d3afb15bb62 100644
> > --- a/drivers/net/wireless/ath/carl9170/main.c
> > +++ b/drivers/net/wireless/ath/carl9170/main.c
> > @@ -1387,13 +1387,8 @@ static int carl9170_op_conf_tx(struct ieee80211_hw *hw,
> > int ret;
> >
> > mutex_lock(&ar->mutex);
> > - if (queue < ar->hw->queues) {
> > - memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
> > - ret = carl9170_set_qos(ar);
> > - } else {
> > - ret = -EINVAL;
> > - }
> > -
> > + memcpy(&ar->edcf[ar9170_qmap[queue]], param, sizeof(*param));
> > + ret = carl9170_set_qos(ar);
> > mutex_unlock(&ar->mutex);
> > return ret;
> > }
> > ---
> > What does your tool say about this?
>
> If you take away the 'if' then I don't the tool will report on this.
>
> > (If necessary, the "queue" value could be even sanitized with a
> > queue %= ARRAY_SIZE(ar9170_qmap); before the mutex_lock.)
>
> That is what array_ptr() is doing in a more sophisticated way.
I think it's a very roundabout way :D. In any case the queue %= ...
could also be replaced by:
BUILD_BUG_ON(ARRAY_SIZE(ar9170_qmap) != NL80211_NUM_ACS));
(And the equivalent for p54, cw1200)
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 15/19] carl9170: prevent bounds-check bypass via speculative execution
2018-01-12 20:01 ` Christian Lamparter
@ 2018-01-12 23:05 ` Dan Williams
0 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2018-01-12 23:05 UTC (permalink / raw)
To: Christian Lamparter
Cc: Linux Kernel Mailing List, linux-arch, kernel-hardening, Netdev,
Linux Wireless List, Elena Reshetova, Thomas Gleixner,
Linus Torvalds, Andrew Morton, Kalle Valo, Alan Cox
On Fri, Jan 12, 2018 at 12:01 PM, Christian Lamparter
<chunkeey@gmail.com> wrote:
> On Friday, January 12, 2018 7:39:50 PM CET Dan Williams wrote:
>> On Fri, Jan 12, 2018 at 6:42 AM, Christian Lamparter <chunkeey@gmail.com> wrote:
>> > On Friday, January 12, 2018 1:47:46 AM CET Dan Williams wrote:
>> >> Static analysis reports that 'queue' may be a user controlled value that
>> >> is used as a data dependency to read from the 'ar9170_qmap' array. In
>> >> order to avoid potential leaks of kernel memory values, block
>> >> speculative execution of the instruction stream that could issue reads
>> >> based on an invalid result of 'ar9170_qmap[queue]'. In this case the
>> >> value of 'ar9170_qmap[queue]' is immediately reused as an index to the
>> >> 'ar->edcf' array.
>> >>
>> >> Based on an original patch by Elena Reshetova.
>> >>
>> >> Cc: Christian Lamparter <chunkeey@googlemail.com>
>> >> Cc: Kalle Valo <kvalo@codeaurora.org>
>> >> Cc: linux-wireless@vger.kernel.org
>> >> Cc: netdev@vger.kernel.org
>> >> Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
>> >> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
>> >> ---
>> > This patch (and p54, cw1200) look like the same patch?!
>> > Can you tell me what happend to:
>> >
>> > On Saturday, January 6, 2018 5:34:03 PM CET Dan Williams wrote:
>> >> On Sat, Jan 6, 2018 at 6:23 AM, Christian Lamparter <chunkeey@gmail.com> wrote:
>> >> > And Furthermore a invalid queue (param->ac) would cause a crash in
>> >> > this line in mac80211 before it even reaches the driver [1]:
>> >> > | sdata->tx_conf[params->ac] = p;
>> >> > | ^^^^^^^^
>> >> > | if (drv_conf_tx(local, sdata, >>>> params->ac <<<<, &p)) {
>> >> > | ^^ (this is a wrapper for the *_op_conf_tx)
>> >> >
>> >> > I don't think these chin-up exercises are needed.
>> >>
>> >> Quite the contrary, you've identified a better place in the call stack
>> >> to sanitize the input and disable speculation. Then we can kill the
>> >> whole class of the wireless driver reports at once it seems.
>> > <https://www.spinics.net/lists/netdev/msg476353.html>
>>
>> I didn't see where ac is being validated against the driver specific
>> 'queues' value in that earlier patch.
> The link to the check is right there in the earlier post. It's in
> parse_txq_params():
> <https://github.com/torvalds/linux/blob/master/net/wireless/nl80211.c#L2070>
> | if (txq_params->ac >= NL80211_NUM_ACS)
> | return -EINVAL;
>
> NL80211_NUM_ACS is 4
> <http://elixir.free-electrons.com/linux/v4.15-rc7/source/include/uapi/linux/nl80211.h#L3748>
>
> This check was added ever since mac80211's ieee80211_set_txq_params():
> | sdata->tx_conf[params->ac] = p;
>
> For cw1200: the driver just sets the dev->queue to 4.
> In carl9170 dev->queues is set to __AR9170_NUM_TXQ and
> p54 uses P54_QUEUE_AC_NUM.
>
> Both __AR9170_NUM_TXQ and P54_QUEUE_AC_NUM are 4.
> And this is not going to change since all drivers
> have to follow mac80211's queue API:
> <https://wireless.wiki.kernel.org/en/developers/documentation/mac80211/queues>
>
> Some background:
> In the old days (linux 2.6 and early 3.x), the parse_txq_params() function did
> not verify the "queue" value. That's why these drivers had to do it.
>
> Here's the relevant code from 2.6.39:
> <http://elixir.free-electrons.com/linux/v2.6.39/source/net/wireless/nl80211.c#L879>
> You'll notice that the check is missing there.
> Here's mac80211's ieee80211_set_txq_params for reference:
> <http://elixir.free-electrons.com/linux/v2.6.39/source/net/mac80211/cfg.c#L1197>
>
> However over time, the check in the driver has become redundant.
>
Excellent, thank you for pointing that out and the background so clearly.
What this tells me though is that we want to inject an ifence() at
this input validation point, i.e.:
if (txq_params->ac >= NL80211_NUM_ACS) {
ifence();
return -EINVAL;
}
...but the kernel, in these patches, only has ifence() defined for
x86. The only way we can sanitize the 'txq_params->ac' value without
ifence() is to do it at array access time, but then we're stuck
touching all drivers when standard kernel development practice says
'refactor common code out of drivers'.
Ugh, the bigger concern is that this driver is being flagged and not
that initial bounds check. Imagine if cw1200 and p54 had already been
converted to assume that they can just trust 'queue'. It would never
have been flagged.
I think we should focus on the get_user path and __fcheck_files for v3.
^ permalink raw reply [flat|nested] 24+ messages in thread
* [PATCH v2 16/19] p54: prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
` (2 preceding siblings ...)
2018-01-12 0:47 ` [PATCH v2 15/19] carl9170: " Dan Williams
@ 2018-01-12 0:47 ` Dan Williams
2018-01-12 0:48 ` [PATCH v2 18/19] cw1200: " Dan Williams
` (3 subsequent siblings)
7 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2018-01-12 0:47 UTC (permalink / raw)
To: linux-kernel
Cc: linux-arch, kernel-hardening, netdev, linux-wireless,
Elena Reshetova, Christian Lamparter, tglx, torvalds, akpm,
Kalle Valo, alan
Static analysis reports that 'queue' may be a user controlled value that
is used as a data dependency to read from the 'priv->qos_params' array.
In order to avoid potential leaks of kernel memory values, block
speculative execution of the instruction stream that could issue reads
based on an invalid result of 'priv->qos_params[queue]'.
Based on an original patch by Elena Reshetova.
Cc: Christian Lamparter <chunkeey@googlemail.com>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/net/wireless/intersil/p54/main.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/intersil/p54/main.c b/drivers/net/wireless/intersil/p54/main.c
index ab6d39e12069..5ce693ff547e 100644
--- a/drivers/net/wireless/intersil/p54/main.c
+++ b/drivers/net/wireless/intersil/p54/main.c
@@ -20,6 +20,7 @@
#include <linux/firmware.h>
#include <linux/etherdevice.h>
#include <linux/module.h>
+#include <linux/nospec.h>
#include <net/mac80211.h>
@@ -411,12 +412,14 @@ static int p54_conf_tx(struct ieee80211_hw *dev,
const struct ieee80211_tx_queue_params *params)
{
struct p54_common *priv = dev->priv;
+ struct p54_edcf_queue_param *p54_q;
int ret;
mutex_lock(&priv->conf_mutex);
- if (queue < dev->queues) {
- P54_SET_QUEUE(priv->qos_params[queue], params->aifs,
- params->cw_min, params->cw_max, params->txop);
+ p54_q = array_ptr(priv->qos_params, queue, dev->queues);
+ if (p54_q) {
+ P54_SET_QUEUE(p54_q[0], params->aifs, params->cw_min,
+ params->cw_max, params->txop);
ret = p54_set_edcf(priv);
} else
ret = -EINVAL;
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 18/19] cw1200: prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
` (3 preceding siblings ...)
2018-01-12 0:47 ` [PATCH v2 16/19] p54: " Dan Williams
@ 2018-01-12 0:48 ` Dan Williams
2018-01-12 0:48 ` [PATCH v2 19/19] net: mpls: " Dan Williams
` (2 subsequent siblings)
7 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2018-01-12 0:48 UTC (permalink / raw)
To: linux-kernel
Cc: linux-arch, kernel-hardening, netdev, linux-wireless,
Elena Reshetova, Solomon Peachy, tglx, torvalds, akpm, Kalle Valo,
alan
Static analysis reports that 'queue' may be a user controlled value that
is used as a data dependency to read 'txq_params' from the
'priv->tx_queue_params.params' array. In order to avoid potential leaks
of kernel memory values, block speculative execution of the instruction
stream that could issue reads based on an invalid value of 'txq_params'.
In this case 'txq_params' is referenced later in the function.
Based on an original patch by Elena Reshetova.
Cc: Solomon Peachy <pizza@shaftnet.org>
Cc: Kalle Valo <kvalo@codeaurora.org>
Cc: linux-wireless@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
drivers/net/wireless/st/cw1200/sta.c | 11 +++++++----
drivers/net/wireless/st/cw1200/wsm.h | 4 +---
2 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/net/wireless/st/cw1200/sta.c b/drivers/net/wireless/st/cw1200/sta.c
index 38678e9a0562..7521077e50a4 100644
--- a/drivers/net/wireless/st/cw1200/sta.c
+++ b/drivers/net/wireless/st/cw1200/sta.c
@@ -14,6 +14,7 @@
#include <linux/firmware.h>
#include <linux/module.h>
#include <linux/etherdevice.h>
+#include <linux/nospec.h>
#include "cw1200.h"
#include "sta.h"
@@ -612,18 +613,20 @@ int cw1200_conf_tx(struct ieee80211_hw *dev, struct ieee80211_vif *vif,
u16 queue, const struct ieee80211_tx_queue_params *params)
{
struct cw1200_common *priv = dev->priv;
+ struct wsm_set_tx_queue_params *txq_params;
int ret = 0;
/* To prevent re-applying PM request OID again and again*/
bool old_uapsd_flags;
mutex_lock(&priv->conf_mutex);
- if (queue < dev->queues) {
+ txq_params = array_ptr(priv->tx_queue_params.params, queue,
+ dev->queues);
+ if (txq_params) {
old_uapsd_flags = le16_to_cpu(priv->uapsd_info.uapsd_flags);
- WSM_TX_QUEUE_SET(&priv->tx_queue_params, queue, 0, 0, 0);
- ret = wsm_set_tx_queue_params(priv,
- &priv->tx_queue_params.params[queue], queue);
+ WSM_TX_QUEUE_SET(txq_params, 0, 0, 0);
+ ret = wsm_set_tx_queue_params(priv, txq_params, queue);
if (ret) {
ret = -EINVAL;
goto out;
diff --git a/drivers/net/wireless/st/cw1200/wsm.h b/drivers/net/wireless/st/cw1200/wsm.h
index 48086e849515..8c8d9191e233 100644
--- a/drivers/net/wireless/st/cw1200/wsm.h
+++ b/drivers/net/wireless/st/cw1200/wsm.h
@@ -1099,10 +1099,8 @@ struct wsm_tx_queue_params {
};
-#define WSM_TX_QUEUE_SET(queue_params, queue, ack_policy, allowed_time,\
- max_life_time) \
+#define WSM_TX_QUEUE_SET(p, ack_policy, allowed_time, max_life_time) \
do { \
- struct wsm_set_tx_queue_params *p = &(queue_params)->params[queue]; \
p->ackPolicy = (ack_policy); \
p->allowedMediumTime = (allowed_time); \
p->maxTransmitLifetime = (max_life_time); \
^ permalink raw reply related [flat|nested] 24+ messages in thread
* [PATCH v2 19/19] net: mpls: prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
` (4 preceding siblings ...)
2018-01-12 0:48 ` [PATCH v2 18/19] cw1200: " Dan Williams
@ 2018-01-12 0:48 ` Dan Williams
2018-01-12 1:19 ` [PATCH v2 00/19] " Linus Torvalds
2018-01-12 10:02 ` Russell King - ARM Linux
7 siblings, 0 replies; 24+ messages in thread
From: Dan Williams @ 2018-01-12 0:48 UTC (permalink / raw)
To: linux-kernel
Cc: linux-arch, akpm, kernel-hardening, netdev, Eric W. Biederman,
tglx, torvalds, David S. Miller, Elena Reshetova, alan
Static analysis reports that 'index' may be a user controlled value that
is used as a data dependency reading 'rt' from the 'platform_label'
array. In order to avoid potential leaks of kernel memory values, block
speculative execution of the instruction stream that could issue further
reads based on an invalid 'rt' value.
Based on an original patch by Elena Reshetova.
Eric notes:
"
When val is a pointer not an integer.
Then
array2[val] = y;
/* or */
y = array2[va];
Won't happen.
val->field;
Will happen.
Which looks similar. However the address space of pointers is too
large. Making it impossible for an attack to know where to look in
the cache to see if "val->field" happened. At least on the
assumption that val is an arbitrary value.
Further mpls_forward is small enough the entire scope of "rt" the
value read possibly past the bound check is auditable without too
much trouble. I have looked and I don't see anything that could
possibly allow the value of "rt" to be exfitrated. The problem
continuing to be that it is a pointer and the only operation on the
pointer besides derferencing it is testing if it is NULL.
Other types of timing attacks are very hard if not impossible
because any packet presenting with a value outside the bounds check
will be dropped. So it will hard if not impossible to find
something to time to see how long it took to drop the packet.
"
The motivation of resending this patch despite the NAK is to
continue a community wide discussion on the bar for judging Spectre
changes. I.e. is any user controlled speculative pointer in the
kernel a pointer too far, especially given the current array_ptr()
implementation.
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: netdev@vger.kernel.org
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
net/mpls/af_mpls.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/mpls/af_mpls.c b/net/mpls/af_mpls.c
index 8ca9915befc8..c92b1033adc2 100644
--- a/net/mpls/af_mpls.c
+++ b/net/mpls/af_mpls.c
@@ -8,6 +8,7 @@
#include <linux/ipv6.h>
#include <linux/mpls.h>
#include <linux/netconf.h>
+#include <linux/nospec.h>
#include <linux/vmalloc.h>
#include <linux/percpu.h>
#include <net/ip.h>
@@ -77,12 +78,13 @@ static void rtmsg_lfib(int event, u32 label, struct mpls_route *rt,
static struct mpls_route *mpls_route_input_rcu(struct net *net, unsigned index)
{
struct mpls_route *rt = NULL;
+ struct mpls_route __rcu **platform_label =
+ rcu_dereference(net->mpls.platform_label);
+ struct mpls_route __rcu **rtp;
- if (index < net->mpls.platform_labels) {
- struct mpls_route __rcu **platform_label =
- rcu_dereference(net->mpls.platform_label);
- rt = rcu_dereference(platform_label[index]);
- }
+ rtp = array_ptr(platform_label, index, net->mpls.platform_labels);
+ if (rtp)
+ rt = rcu_dereference(*rtp);
return rt;
}
^ permalink raw reply related [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
` (5 preceding siblings ...)
2018-01-12 0:48 ` [PATCH v2 19/19] net: mpls: " Dan Williams
@ 2018-01-12 1:19 ` Linus Torvalds
2018-01-12 1:41 ` Dan Williams
2018-01-13 0:15 ` Tony Luck
2018-01-12 10:02 ` Russell King - ARM Linux
7 siblings, 2 replies; 24+ messages in thread
From: Linus Torvalds @ 2018-01-12 1:19 UTC (permalink / raw)
To: Dan Williams
Cc: Linux Kernel Mailing List, Mark Rutland, kernel-hardening,
Peter Zijlstra, Alan Cox, Will Deacon, Alexei Starovoitov,
Solomon Peachy, H. Peter Anvin, Christian Lamparter,
Elena Reshetova, linux-arch, Andi Kleen, James E.J. Bottomley,
Linux SCSI List, Jonathan Corbet, the arch/x86 maintainers,
Russell King, Ingo Molnar, Catalin
On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>
> This series incorporates Mark Rutland's latest ARM changes and adds
> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> based approach is provided as an opt-in fallback, but the default
> mitigation, '__array_ptr', uses a 'mask' approach that removes
> conditional branches instructions, and otherwise aims to redirect
> speculation to use a NULL pointer rather than a user controlled value.
Do you have any performance numbers and perhaps example code
generation? Is this noticeable? Are there any microbenchmarks showing
the difference between lfence use and the masking model?
Having both seems good for testing, but wouldn't we want to pick one in the end?
Also, I do think that there is one particular array load that would
seem to be pretty obvious: the system call function pointer array.
Yes, yes, the actual call is now behind a retpoline, but that protects
against a speculative BTB access, it's not obvious that it protects
against the mispredict of the __NR_syscall_max comparison in
arch/x86/entry/entry_64.S.
The act of fetching code is a kind of read too. And retpoline protects
against BTB stuffing etc, but what if the _actual_ system call
function address is wrong (due to mis-prediction of the system call
index check)?
Should the array access in entry_SYSCALL_64_fastpath be made to use
the masking approach?
Linus
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-12 1:19 ` [PATCH v2 00/19] " Linus Torvalds
@ 2018-01-12 1:41 ` Dan Williams
2018-01-18 13:18 ` Will Deacon
2018-01-13 0:15 ` Tony Luck
1 sibling, 1 reply; 24+ messages in thread
From: Dan Williams @ 2018-01-12 1:41 UTC (permalink / raw)
To: Linus Torvalds
Cc: Linux Kernel Mailing List, Mark Rutland, kernel-hardening,
Peter Zijlstra, Alan Cox, Will Deacon, Alexei Starovoitov,
Solomon Peachy, H. Peter Anvin, Christian Lamparter,
Elena Reshetova, linux-arch, Andi Kleen, James E.J. Bottomley,
Linux SCSI List, Jonathan Corbet, the arch/x86 maintainers,
Russell King, Ingo Molnar, Catalin
On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>>
>> This series incorporates Mark Rutland's latest ARM changes and adds
>> the x86 specific implementation of 'ifence_array_ptr'. That ifence
>> based approach is provided as an opt-in fallback, but the default
>> mitigation, '__array_ptr', uses a 'mask' approach that removes
>> conditional branches instructions, and otherwise aims to redirect
>> speculation to use a NULL pointer rather than a user controlled value.
>
> Do you have any performance numbers and perhaps example code
> generation? Is this noticeable? Are there any microbenchmarks showing
> the difference between lfence use and the masking model?
I don't have performance numbers, but here's a sample code generation
from __fcheck_files, where the 'and; lea; and' sequence is portion of
array_ptr() after the mask generation with 'sbb'.
fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
8e7: 8b 02 mov (%rdx),%eax
8e9: 48 39 c7 cmp %rax,%rdi
8ec: 48 19 c9 sbb %rcx,%rcx
8ef: 48 8b 42 08 mov 0x8(%rdx),%rax
8f3: 48 89 fe mov %rdi,%rsi
8f6: 48 21 ce and %rcx,%rsi
8f9: 48 8d 04 f0 lea (%rax,%rsi,8),%rax
8fd: 48 21 c8 and %rcx,%rax
> Having both seems good for testing, but wouldn't we want to pick one in the end?
I was thinking we'd keep it as a 'just in case' sort of thing, at
least until the 'probably safe' assumption of the 'mask' approach has
more time to settle out.
>
> Also, I do think that there is one particular array load that would
> seem to be pretty obvious: the system call function pointer array.
>
> Yes, yes, the actual call is now behind a retpoline, but that protects
> against a speculative BTB access, it's not obvious that it protects
> against the mispredict of the __NR_syscall_max comparison in
> arch/x86/entry/entry_64.S.
>
> The act of fetching code is a kind of read too. And retpoline protects
> against BTB stuffing etc, but what if the _actual_ system call
> function address is wrong (due to mis-prediction of the system call
> index check)?
>
> Should the array access in entry_SYSCALL_64_fastpath be made to use
> the masking approach?
I'll take a look. I'm firmly in the 'patch first / worry later' stance
on these investigations.
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-12 1:41 ` Dan Williams
@ 2018-01-18 13:18 ` Will Deacon
2018-01-18 16:58 ` Dan Williams
0 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2018-01-18 13:18 UTC (permalink / raw)
To: Dan Williams
Cc: Linus Torvalds, Linux Kernel Mailing List, Mark Rutland,
kernel-hardening, Peter Zijlstra, Alan Cox, Alexei Starovoitov,
Solomon Peachy, H. Peter Anvin, Christian Lamparter,
Elena Reshetova, linux-arch, Andi Kleen, James E.J. Bottomley,
Linux SCSI List, Jonathan Corbet, the arch/x86 maintainers,
Russell King, Ingo Molnar, Catalin Marinas, Alexey Kuznetsov,
Linux Media Mailing List
Hi Dan, Linus,
On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> > On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> >>
> >> This series incorporates Mark Rutland's latest ARM changes and adds
> >> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> >> based approach is provided as an opt-in fallback, but the default
> >> mitigation, '__array_ptr', uses a 'mask' approach that removes
> >> conditional branches instructions, and otherwise aims to redirect
> >> speculation to use a NULL pointer rather than a user controlled value.
> >
> > Do you have any performance numbers and perhaps example code
> > generation? Is this noticeable? Are there any microbenchmarks showing
> > the difference between lfence use and the masking model?
>
> I don't have performance numbers, but here's a sample code generation
> from __fcheck_files, where the 'and; lea; and' sequence is portion of
> array_ptr() after the mask generation with 'sbb'.
>
> fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
> 8e7: 8b 02 mov (%rdx),%eax
> 8e9: 48 39 c7 cmp %rax,%rdi
> 8ec: 48 19 c9 sbb %rcx,%rcx
> 8ef: 48 8b 42 08 mov 0x8(%rdx),%rax
> 8f3: 48 89 fe mov %rdi,%rsi
> 8f6: 48 21 ce and %rcx,%rsi
> 8f9: 48 8d 04 f0 lea (%rax,%rsi,8),%rax
> 8fd: 48 21 c8 and %rcx,%rax
>
>
> > Having both seems good for testing, but wouldn't we want to pick one in the end?
>
> I was thinking we'd keep it as a 'just in case' sort of thing, at
> least until the 'probably safe' assumption of the 'mask' approach has
> more time to settle out.
>From the arm64 side, the only concern I have (and this actually applies to
our CSDB sequence as well) is the calculation of the array size by the
caller. As Linus mentioned at the end of [1], if the determination of the
size argument is based on a conditional branch, then masking doesn't help
because you bound within the wrong range under speculation.
We ran into this when trying to use masking to protect our uaccess routines
where the conditional bound is either KERNEL_DS or USER_DS. It's possible
that a prior conditional set_fs(KERNEL_DS) could defeat the masking and so
we'd need to throw some heavy barriers in set_fs to make it robust.
The question then is whether or not we're likely to run into this elsewhere,
and I don't have a good feel for that.
Will
[1] http://lkml.kernel.org/r/CA+55aFz0tsreoa=5Ud2noFCpng-dizLBhT9WU9asyhpLfjdcYA@mail.gmail.com
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-18 13:18 ` Will Deacon
@ 2018-01-18 16:58 ` Dan Williams
2018-01-18 17:05 ` Will Deacon
0 siblings, 1 reply; 24+ messages in thread
From: Dan Williams @ 2018-01-18 16:58 UTC (permalink / raw)
To: Will Deacon
Cc: Linus Torvalds, Linux Kernel Mailing List, Mark Rutland,
kernel-hardening, Peter Zijlstra, Alan Cox, Alexei Starovoitov,
Solomon Peachy, H. Peter Anvin, Christian Lamparter,
Elena Reshetova, linux-arch, Andi Kleen, James E.J. Bottomley,
Linux SCSI List, Jonathan Corbet, the arch/x86 maintainers,
Russell King, Ingo Molnar
On Thu, Jan 18, 2018 at 5:18 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Dan, Linus,
>
> On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
>> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
>> <torvalds@linux-foundation.org> wrote:
>> > On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> >>
>> >> This series incorporates Mark Rutland's latest ARM changes and adds
>> >> the x86 specific implementation of 'ifence_array_ptr'. That ifence
>> >> based approach is provided as an opt-in fallback, but the default
>> >> mitigation, '__array_ptr', uses a 'mask' approach that removes
>> >> conditional branches instructions, and otherwise aims to redirect
>> >> speculation to use a NULL pointer rather than a user controlled value.
>> >
>> > Do you have any performance numbers and perhaps example code
>> > generation? Is this noticeable? Are there any microbenchmarks showing
>> > the difference between lfence use and the masking model?
>>
>> I don't have performance numbers, but here's a sample code generation
>> from __fcheck_files, where the 'and; lea; and' sequence is portion of
>> array_ptr() after the mask generation with 'sbb'.
>>
>> fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
>> 8e7: 8b 02 mov (%rdx),%eax
>> 8e9: 48 39 c7 cmp %rax,%rdi
>> 8ec: 48 19 c9 sbb %rcx,%rcx
>> 8ef: 48 8b 42 08 mov 0x8(%rdx),%rax
>> 8f3: 48 89 fe mov %rdi,%rsi
>> 8f6: 48 21 ce and %rcx,%rsi
>> 8f9: 48 8d 04 f0 lea (%rax,%rsi,8),%rax
>> 8fd: 48 21 c8 and %rcx,%rax
>>
>>
>> > Having both seems good for testing, but wouldn't we want to pick one in the end?
>>
>> I was thinking we'd keep it as a 'just in case' sort of thing, at
>> least until the 'probably safe' assumption of the 'mask' approach has
>> more time to settle out.
>
> From the arm64 side, the only concern I have (and this actually applies to
> our CSDB sequence as well) is the calculation of the array size by the
> caller. As Linus mentioned at the end of [1], if the determination of the
> size argument is based on a conditional branch, then masking doesn't help
> because you bound within the wrong range under speculation.
>
> We ran into this when trying to use masking to protect our uaccess routines
> where the conditional bound is either KERNEL_DS or USER_DS. It's possible
> that a prior conditional set_fs(KERNEL_DS) could defeat the masking and so
> we'd need to throw some heavy barriers in set_fs to make it robust.
At least in the conditional mask case near set_fs() usage the approach
we are taking is to use a barrier. I.e. the following guidance from
Linus:
"Basically, the rule is trivial: find all 'stac' users, and use address
masking if those users already integrate the limit check, and lfence
they don't."
...which translates to narrow the pointer for get_user() and use a
barrier for __get_user().
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-18 16:58 ` Dan Williams
@ 2018-01-18 17:05 ` Will Deacon
2018-01-18 21:41 ` Laurent Pinchart
0 siblings, 1 reply; 24+ messages in thread
From: Will Deacon @ 2018-01-18 17:05 UTC (permalink / raw)
To: Dan Williams
Cc: Linus Torvalds, Linux Kernel Mailing List, Mark Rutland,
kernel-hardening, Peter Zijlstra, Alan Cox, Alexei Starovoitov,
Solomon Peachy, H. Peter Anvin, Christian Lamparter,
Elena Reshetova, linux-arch, Andi Kleen, James E.J. Bottomley,
Linux SCSI List, Jonathan Corbet, the arch/x86 maintainers,
Russell King, Ingo Molnar, Catalin Marinas, Alexey Kuznetsov,
Linux Media Mailing List
On Thu, Jan 18, 2018 at 08:58:08AM -0800, Dan Williams wrote:
> On Thu, Jan 18, 2018 at 5:18 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
> >> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
> >> <torvalds@linux-foundation.org> wrote:
> >> > On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> >> >>
> >> >> This series incorporates Mark Rutland's latest ARM changes and adds
> >> >> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> >> >> based approach is provided as an opt-in fallback, but the default
> >> >> mitigation, '__array_ptr', uses a 'mask' approach that removes
> >> >> conditional branches instructions, and otherwise aims to redirect
> >> >> speculation to use a NULL pointer rather than a user controlled value.
> >> >
> >> > Do you have any performance numbers and perhaps example code
> >> > generation? Is this noticeable? Are there any microbenchmarks showing
> >> > the difference between lfence use and the masking model?
> >>
> >> I don't have performance numbers, but here's a sample code generation
> >> from __fcheck_files, where the 'and; lea; and' sequence is portion of
> >> array_ptr() after the mask generation with 'sbb'.
> >>
> >> fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
> >> 8e7: 8b 02 mov (%rdx),%eax
> >> 8e9: 48 39 c7 cmp %rax,%rdi
> >> 8ec: 48 19 c9 sbb %rcx,%rcx
> >> 8ef: 48 8b 42 08 mov 0x8(%rdx),%rax
> >> 8f3: 48 89 fe mov %rdi,%rsi
> >> 8f6: 48 21 ce and %rcx,%rsi
> >> 8f9: 48 8d 04 f0 lea (%rax,%rsi,8),%rax
> >> 8fd: 48 21 c8 and %rcx,%rax
> >>
> >>
> >> > Having both seems good for testing, but wouldn't we want to pick one in the end?
> >>
> >> I was thinking we'd keep it as a 'just in case' sort of thing, at
> >> least until the 'probably safe' assumption of the 'mask' approach has
> >> more time to settle out.
> >
> > From the arm64 side, the only concern I have (and this actually applies to
> > our CSDB sequence as well) is the calculation of the array size by the
> > caller. As Linus mentioned at the end of [1], if the determination of the
> > size argument is based on a conditional branch, then masking doesn't help
> > because you bound within the wrong range under speculation.
> >
> > We ran into this when trying to use masking to protect our uaccess routines
> > where the conditional bound is either KERNEL_DS or USER_DS. It's possible
> > that a prior conditional set_fs(KERNEL_DS) could defeat the masking and so
> > we'd need to throw some heavy barriers in set_fs to make it robust.
>
> At least in the conditional mask case near set_fs() usage the approach
> we are taking is to use a barrier. I.e. the following guidance from
> Linus:
>
> "Basically, the rule is trivial: find all 'stac' users, and use address
> masking if those users already integrate the limit check, and lfence
> they don't."
>
> ...which translates to narrow the pointer for get_user() and use a
> barrier for __get_user().
Great, that matches my thinking re set_fs but I'm still worried about
finding all the places where the bound is conditional for other users
of the macro. Then again, finding the places that need this macro in the
first place is tough enough so perhaps analysing the bound calculation
doesn't make it much worse.
Will
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-18 17:05 ` Will Deacon
@ 2018-01-18 21:41 ` Laurent Pinchart
0 siblings, 0 replies; 24+ messages in thread
From: Laurent Pinchart @ 2018-01-18 21:41 UTC (permalink / raw)
To: Will Deacon
Cc: Dan Williams, Linus Torvalds, Linux Kernel Mailing List,
Mark Rutland, kernel-hardening, Peter Zijlstra, Alan Cox,
Alexei Starovoitov, Solomon Peachy, H. Peter Anvin,
Christian Lamparter, Elena Reshetova, linux-arch, Andi Kleen,
James E.J. Bottomley, Linux SCSI List, Jonathan Corbet,
the arch/x86 maintainers, Russell King
Hi Will,
On Thursday, 18 January 2018 19:05:47 EET Will Deacon wrote:
> On Thu, Jan 18, 2018 at 08:58:08AM -0800, Dan Williams wrote:
> > On Thu, Jan 18, 2018 at 5:18 AM, Will Deacon wrote:
> >> On Thu, Jan 11, 2018 at 05:41:08PM -0800, Dan Williams wrote:
> >>> On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds wrote:
> >>>> On Thu, Jan 11, 2018 at 4:46 PM, Dan Williams wrote:
> >>>>> This series incorporates Mark Rutland's latest ARM changes and adds
> >>>>> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> >>>>> based approach is provided as an opt-in fallback, but the default
> >>>>> mitigation, '__array_ptr', uses a 'mask' approach that removes
> >>>>> conditional branches instructions, and otherwise aims to redirect
> >>>>> speculation to use a NULL pointer rather than a user controlled
> >>>>> value.
> >>>>
> >>>> Do you have any performance numbers and perhaps example code
> >>>> generation? Is this noticeable? Are there any microbenchmarks showing
> >>>> the difference between lfence use and the masking model?
> >>>
> >>> I don't have performance numbers, but here's a sample code generation
> >>> from __fcheck_files, where the 'and; lea; and' sequence is portion of
> >>> array_ptr() after the mask generation with 'sbb'.
> >>>
> >>> fdp = array_ptr(fdt->fd, fd, fdt->max_fds);
> >>>
> >>> 8e7: 8b 02 mov (%rdx),%eax
> >>> 8e9: 48 39 c7 cmp %rax,%rdi
> >>> 8ec: 48 19 c9 sbb %rcx,%rcx
> >>> 8ef: 48 8b 42 08 mov 0x8(%rdx),%rax
> >>> 8f3: 48 89 fe mov %rdi,%rsi
> >>> 8f6: 48 21 ce and %rcx,%rsi
> >>> 8f9: 48 8d 04 f0 lea (%rax,%rsi,8),%rax
> >>> 8fd: 48 21 c8 and %rcx,%rax
> >>>>
> >>>> Having both seems good for testing, but wouldn't we want to pick one
> >>>> in the end?
> >>>
> >>> I was thinking we'd keep it as a 'just in case' sort of thing, at
> >>> least until the 'probably safe' assumption of the 'mask' approach has
> >>> more time to settle out.
> >>
> >> From the arm64 side, the only concern I have (and this actually applies
> >> to our CSDB sequence as well) is the calculation of the array size by
> >> the caller. As Linus mentioned at the end of [1], if the determination
> >> of the size argument is based on a conditional branch, then masking
> >> doesn't help because you bound within the wrong range under speculation.
> >>
> >> We ran into this when trying to use masking to protect our uaccess
> >> routines where the conditional bound is either KERNEL_DS or USER_DS.
> >> It's possible that a prior conditional set_fs(KERNEL_DS) could defeat
> >> the masking and so we'd need to throw some heavy barriers in set_fs to
> >> make it robust.
> >
> > At least in the conditional mask case near set_fs() usage the approach
> > we are taking is to use a barrier. I.e. the following guidance from
> > Linus:
> >
> > "Basically, the rule is trivial: find all 'stac' users, and use address
> > masking if those users already integrate the limit check, and lfence
> > they don't."
> >
> > ...which translates to narrow the pointer for get_user() and use a
> > barrier for __get_user().
>
> Great, that matches my thinking re set_fs but I'm still worried about
> finding all the places where the bound is conditional for other users
> of the macro. Then again, finding the places that need this macro in the
> first place is tough enough so perhaps analysing the bound calculation
> doesn't make it much worse.
It might not now, but if the bound calculation changes later, I'm pretty sure
we'll forget to update the speculation barrier macro at least in some cases.
Without the help of static (or possibly dynamic) code analysis I think we're
bound to reintroduce problems over time, but that's true for finding places
where the barrier is needed, not just for barrier selection based on bound
calculation.
--
Regards,
Laurent Pinchart
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-12 1:19 ` [PATCH v2 00/19] " Linus Torvalds
2018-01-12 1:41 ` Dan Williams
@ 2018-01-13 0:15 ` Tony Luck
2018-01-13 18:51 ` Linus Torvalds
1 sibling, 1 reply; 24+ messages in thread
From: Tony Luck @ 2018-01-13 0:15 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dan Williams, Linux Kernel Mailing List, Mark Rutland,
kernel-hardening, Peter Zijlstra, Alan Cox, Will Deacon,
Alexei Starovoitov, Solomon Peachy, H. Peter Anvin,
Christian Lamparter, Elena Reshetova, linux-arch@vger.kernel.org,
Andi Kleen, James E.J. Bottomley, Linux SCSI List,
Jonathan Corbet, the arch/x86 maintainers, Russell
On Thu, Jan 11, 2018 at 5:19 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> Should the array access in entry_SYSCALL_64_fastpath be made to use
> the masking approach?
That one has a bounds check for an inline constant.
cmpq $__NR_syscall_max, %rax
so should be safe.
The classic Spectre variant #1 code sequence is:
int array_size;
if (x < array_size) {
something with array[x]
}
which runs into problems because the array_size variable may not
be in cache, and while the CPU core is waiting for the value it
speculates inside the "if" body.
The syscall entry is more like:
#define ARRAY_SIZE 10
if (x < ARRAY_SIZE) {
something with array[x]
}
Here there isn't any reason for speculation. The core has the
value of 'x' in a register and the upper bound encoded into the
"cmp" instruction. Both are right there, no waiting, no speculation.
-Tony
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-13 0:15 ` Tony Luck
@ 2018-01-13 18:51 ` Linus Torvalds
2018-01-16 19:21 ` Tony Luck
0 siblings, 1 reply; 24+ messages in thread
From: Linus Torvalds @ 2018-01-13 18:51 UTC (permalink / raw)
To: Tony Luck
Cc: Dan Williams, Linux Kernel Mailing List, Mark Rutland,
kernel-hardening, Peter Zijlstra, Alan Cox, Will Deacon,
Alexei Starovoitov, Solomon Peachy, H. Peter Anvin,
Christian Lamparter, Elena Reshetova, linux-arch@vger.kernel.org,
Andi Kleen, James E.J. Bottomley, Linux SCSI List,
Jonathan Corbet, the arch/x86 maintainers, Russell
On Fri, Jan 12, 2018 at 4:15 PM, Tony Luck <tony.luck@gmail.com> wrote:
>
> Here there isn't any reason for speculation. The core has the
> value of 'x' in a register and the upper bound encoded into the
> "cmp" instruction. Both are right there, no waiting, no speculation.
So this is an argument I haven't seen before (although it was brought
up in private long ago), but that is very relevant: the actual scope
and depth of speculation.
Your argument basically depends on just what gets speculated, and on
the _actual_ order of execution.
So your argument depends on "the uarch will actually run the code in
order if there are no events that block the pipeline".
Or at least it depends on a certain latency of the killing of any OoO
execution being low enough that the cache access doesn't even begin.
I realize that that is very much a particular microarchitectural
detail, but it's actually a *big* deal. Do we have a set of rules for
what is not a worry, simply because the speculated accesses get killed
early enough?
Apparently "test a register value against a constant" is good enough,
assuming that register is also needed for the address of the access.
Linus
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-13 18:51 ` Linus Torvalds
@ 2018-01-16 19:21 ` Tony Luck
0 siblings, 0 replies; 24+ messages in thread
From: Tony Luck @ 2018-01-16 19:21 UTC (permalink / raw)
To: Linus Torvalds
Cc: Dan Williams, Linux Kernel Mailing List, Mark Rutland,
kernel-hardening, Peter Zijlstra, Alan Cox, Will Deacon,
Alexei Starovoitov, Solomon Peachy, H. Peter Anvin,
Christian Lamparter, Elena Reshetova, linux-arch@vger.kernel.org,
Andi Kleen, James E.J. Bottomley, Linux SCSI List,
Jonathan Corbet, the arch/x86 maintainers, Russell
On Sat, Jan 13, 2018 at 10:51 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Jan 12, 2018 at 4:15 PM, Tony Luck <tony.luck@gmail.com> wrote:
> So your argument depends on "the uarch will actually run the code in
> order if there are no events that block the pipeline".
And might be bogus ... I'm a software person not a u-arch expert. That
sounded good in my head, but the level of parallelism may be greater
than I can imagine.
> Or at least it depends on a certain latency of the killing of any OoO
> execution being low enough that the cache access doesn't even begin.
>
> I realize that that is very much a particular microarchitectural
> detail, but it's actually a *big* deal. Do we have a set of rules for
> what is not a worry, simply because the speculated accesses get killed
> early enough?
>
> Apparently "test a register value against a constant" is good enough,
> assuming that register is also needed for the address of the access.
People who do understand this are working on what can be guaranteed.
For now don't make big plans based on my ramblings.
-Tony
^ permalink raw reply [flat|nested] 24+ messages in thread
* Re: [PATCH v2 00/19] prevent bounds-check bypass via speculative execution
2018-01-12 0:46 [PATCH v2 00/19] prevent bounds-check bypass via speculative execution Dan Williams
` (6 preceding siblings ...)
2018-01-12 1:19 ` [PATCH v2 00/19] " Linus Torvalds
@ 2018-01-12 10:02 ` Russell King - ARM Linux
7 siblings, 0 replies; 24+ messages in thread
From: Russell King - ARM Linux @ 2018-01-12 10:02 UTC (permalink / raw)
To: Dan Williams
Cc: linux-kernel, Mark Rutland, kernel-hardening, Peter Zijlstra,
Alan Cox, Will Deacon, Alexei Starovoitov, Solomon Peachy,
H. Peter Anvin, Christian Lamparter, Elena Reshetova, linux-arch,
Andi Kleen, James E.J. Bottomley, linux-scsi, Jonathan Corbet,
x86, Ingo Molnar, Catalin Marinas, Alexey Kuznetsov, linux-media,
Tom Lendacky <tho
Do you think that the appropriate patches could be copied to the
appropriate people please?
On Thu, Jan 11, 2018 at 04:46:24PM -0800, Dan Williams wrote:
> Changes since v1 [1]:
> * fixup the ifence definition to use alternative_2 per recent AMD
> changes in tip/x86/pti (Tom)
>
> * drop 'nospec_ptr' (Linus, Mark)
>
> * rename 'nospec_array_ptr' to 'array_ptr' (Alexei)
>
> * rename 'nospec_barrier' to 'ifence' (Peter, Ingo)
>
> * clean up occasions of 'variable assignment in if()' (Sergei, Stephen)
>
> * make 'array_ptr' use a mask instead of an architectural ifence by
> default (Linus, Alexei)
>
> * provide a command line and compile-time opt-in to the ifence
> mechanism, if an architecture provides 'ifence_array_ptr'.
>
> * provide an optimized mask generation helper, 'array_ptr_mask', for
> x86 (Linus)
>
> * move 'get_user' hardening from '__range_not_ok' to '__uaccess_begin'
> (Linus)
>
> * drop "Thermal/int340x: prevent bounds-check..." since userspace does
> not have arbitrary control over the 'trip' index (Srinivas)
>
> * update the changelog of "net: mpls: prevent bounds-check..." and keep
> it in the series to continue the debate about Spectre hygiene patches.
> (Eric).
>
> * record a reviewed-by from Laurent on "[media] uvcvideo: prevent
> bounds-check..."
>
> * update the cover letter
>
> [1]: https://lwn.net/Articles/743376/
>
> ---
>
> Quoting Mark's original RFC:
>
> "Recently, Google Project Zero discovered several classes of attack
> against speculative execution. One of these, known as variant-1, allows
> explicit bounds checks to be bypassed under speculation, providing an
> arbitrary read gadget. Further details can be found on the GPZ blog [2]
> and the Documentation patch in this series."
>
> This series incorporates Mark Rutland's latest ARM changes and adds
> the x86 specific implementation of 'ifence_array_ptr'. That ifence
> based approach is provided as an opt-in fallback, but the default
> mitigation, '__array_ptr', uses a 'mask' approach that removes
> conditional branches instructions, and otherwise aims to redirect
> speculation to use a NULL pointer rather than a user controlled value.
>
> The mask is generated by the following from Alexei, and Linus:
>
> mask = ~(long)(_i | (_s - 1 - _i)) >> (BITS_PER_LONG - 1);
>
> ...and Linus provided an optimized mask generation helper for x86:
>
> asm ("cmpq %1,%2; sbbq %0,%0;"
> :"=r" (mask)
> :"r"(sz),"r" (idx)
> :"cc");
>
> The 'array_ptr' mechanism can be switched between 'mask' and 'ifence'
> via the spectre_v1={mask,ifence} command line option, and the
> compile-time default is set by selecting either CONFIG_SPECTRE1_MASK or
> CONFIG_SPECTRE1_IFENCE.
>
> The 'array_ptr' infrastructure is the primary focus this patch set. The
> individual patches that perform 'array_ptr' conversions are a point in
> time (i.e. earlier kernel, early analysis tooling, x86 only etc...)
> start at finding some of these gadgets.
>
> Another consideration for reviewing these patches is the 'hygiene'
> argument. When a patch refers to hygiene it is concerned with stopping
> speculation on an unconstrained or insufficiently constrained pointer
> value under userspace control. That by itself is not sufficient for
> attack (per current understanding) [3], but it is a necessary
> pre-condition. So 'hygiene' refers to cleaning up those suspect
> pointers regardless of whether they are usable as a gadget.
>
> These patches are also be available via the 'nospec-v2' git branch
> here:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/djbw/linux nospec-v2
>
> Note that the BPF fix for Spectre variant1 is merged in the bpf.git
> tree [4], and is not included in this branch.
>
> [2]: https://googleprojectzero.blogspot.co.uk/2018/01/reading-privileged-memory-with-side.html
> [3]: https://spectreattack.com/spectre.pdf
> [4]: https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git/commit/?id=b2157399cc98
>
> ---
>
> Dan Williams (16):
> x86: implement ifence()
> x86: implement ifence_array_ptr() and array_ptr_mask()
> asm-generic/barrier: mask speculative execution flows
> x86: introduce __uaccess_begin_nospec and ASM_IFENCE
> x86: use __uaccess_begin_nospec and ASM_IFENCE in get_user paths
> ipv6: prevent bounds-check bypass via speculative execution
> ipv4: prevent bounds-check bypass via speculative execution
> vfs, fdtable: prevent bounds-check bypass via speculative execution
> userns: prevent bounds-check bypass via speculative execution
> udf: prevent bounds-check bypass via speculative execution
> [media] uvcvideo: prevent bounds-check bypass via speculative execution
> carl9170: prevent bounds-check bypass via speculative execution
> p54: prevent bounds-check bypass via speculative execution
> qla2xxx: prevent bounds-check bypass via speculative execution
> cw1200: prevent bounds-check bypass via speculative execution
> net: mpls: prevent bounds-check bypass via speculative execution
>
> Mark Rutland (3):
> Documentation: document array_ptr
> arm64: implement ifence_array_ptr()
> arm: implement ifence_array_ptr()
>
> Documentation/speculation.txt | 142 ++++++++++++++++++++++++++++++
> arch/arm/Kconfig | 1
> arch/arm/include/asm/barrier.h | 24 +++++
> arch/arm64/Kconfig | 1
> arch/arm64/include/asm/barrier.h | 24 +++++
> arch/x86/Kconfig | 3 +
> arch/x86/include/asm/barrier.h | 46 ++++++++++
> arch/x86/include/asm/msr.h | 3 -
> arch/x86/include/asm/smap.h | 4 +
> arch/x86/include/asm/uaccess.h | 16 +++
> arch/x86/include/asm/uaccess_32.h | 6 +
> arch/x86/include/asm/uaccess_64.h | 12 +--
> arch/x86/lib/copy_user_64.S | 3 +
> arch/x86/lib/usercopy_32.c | 8 +-
> drivers/media/usb/uvc/uvc_v4l2.c | 9 +-
> drivers/net/wireless/ath/carl9170/main.c | 7 +
> drivers/net/wireless/intersil/p54/main.c | 9 +-
> drivers/net/wireless/st/cw1200/sta.c | 11 +-
> drivers/net/wireless/st/cw1200/wsm.h | 4 -
> drivers/scsi/qla2xxx/qla_mr.c | 17 ++--
> fs/udf/misc.c | 40 +++++---
> include/linux/fdtable.h | 7 +
> include/linux/nospec.h | 71 +++++++++++++++
> kernel/Kconfig.nospec | 31 +++++++
> kernel/Makefile | 1
> kernel/nospec.c | 52 +++++++++++
> kernel/user_namespace.c | 11 +-
> lib/Kconfig | 3 +
> net/ipv4/raw.c | 10 +-
> net/ipv6/raw.c | 10 +-
> net/mpls/af_mpls.c | 12 +--
> 31 files changed, 521 insertions(+), 77 deletions(-)
> create mode 100644 Documentation/speculation.txt
> create mode 100644 include/linux/nospec.h
> create mode 100644 kernel/Kconfig.nospec
> create mode 100644 kernel/nospec.c
>
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
^ permalink raw reply [flat|nested] 24+ messages in thread